In the context of the Nordic HPC community, I’ve been involved in some discussions on the applicability of cloud computing to HPC. The subject is also receiving some attention in the blogosphere (e.g. at bigdatamatters.com and hpcwire.com).
Here are some comments of mine:
A few times recently I’ve come across an argument that can be summarized by the following imaginary quote: “Cloud proponents want to ditch the old grid/batch infrastructure built over the last decade in European academia and either outsource everything to big (American) commercial players or replace it with technology that’s brand new and completely unproven in the context of HPC.”
While statements like this are clearly FUD, they are being made and deserve a rebuttal. Notice the word “replace”. Nobody in his right mind, be it a user, sysadmin or resource owner, would do that. Cloud technology is a new technology that may or may not be interesting for HPC. The only way to find out is to give it a try with some limited resources and manpower. This does not necessitate throwing out any working systems.
Cloud computing is about commoditization and economy of scale. It is IMO unlikely that any commercial/public vendor will offer InfiniBand or other high-end interconnects any time soon. The market is not there. *** Correction, 6/7-2011: well, I was wrong – Amazon has offered this for some time now. Whaddayaknow ***
For HPC, commercial cloud offerings may be useful for scaling out serial jobs in peak-load situations, but what’s really interesting is the emerging open-source cloud technology. This technology will hopefully provide two game-changing features: 1) extreme server room automation, 2) an easy way to offer an EC2-like interface to users.
If 1 and 2 materialize, cloud technology deployed in HPC data centers could potentially allow such centers to serve not only a small club of elite scientists, but also a whole other range of academic computing consumers. Think of the secretary tasked with creating an institute web site (fire up a machine with Joomla or Drupal preinstalled and configured with the university templates), the student who needs to run 10×12 hours of MATLAB computations, or the ATLAS physicist who needs to run 500×4 hours of Athena reconstruction.
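To put rough numbers on the two compute examples above, here is a minimal back-of-the-envelope sketch. It assumes each job occupies exactly one core and that jobs run in full "waves" on whatever cores are free; both assumptions, the function names, and the 100-core pool are mine, purely for illustration.

```python
def core_hours(jobs, hours_per_job, cores_per_job=1):
    """Total core-hours consumed by a batch of identical jobs.

    cores_per_job=1 is an assumption; the post does not say how many
    cores a MATLAB or Athena job uses.
    """
    return jobs * hours_per_job * cores_per_job


def wall_clock_hours(jobs, hours_per_job, cores_available, cores_per_job=1):
    """Wall-clock time if identical jobs run in waves on a fixed core pool."""
    slots = cores_available // cores_per_job  # jobs that fit at the same time
    if slots < 1:
        raise ValueError("not enough cores to run even one job")
    waves = -(-jobs // slots)  # ceiling division
    return waves * hours_per_job


# The two examples from the text: (number of jobs, hours per job).
STUDENT_MATLAB = (10, 12)
ATLAS_ATHENA = (500, 4)

student_total = core_hours(*STUDENT_MATLAB)  # 120 core-hours
atlas_total = core_hours(*ATLAS_ATHENA)      # 2000 core-hours
```

On a hypothetical pool of 100 free cores, `wall_clock_hours(10, 12, 100)` gives 12 hours (a single wave) and `wall_clock_hours(500, 4, 100)` gives 20 hours (five waves). The point is only that both workloads are small by HPC standards, which is why a self-service interface might suit them.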
It’s all about money: the promise of technology like Eucalyptus is to allow servicing more users with the same hardware and manpower. The HPC community should not uncritically embrace this technology, but give it a serious try and see if it goes in the right direction to deliver on its promises.
Interestingly, in the Baltic countries, some effort is already being put into this. It will be very interesting to follow this project – in particular: Who will the users be? Will an EC2-like interface (a root shell on a multi-core machine) be more interesting than a traditional batch/grid interface to any HPC users? Will cloud technology allow more efficient server room administration?