High Performance Computing

Introduction to High Performance Computing Clusters

For many years, researchers faced with computationally demanding problems would turn to powerful bespoke systems called supercomputers to solve their problems more quickly (a classic example is shown left). These were often confined to regional and national computing centres and offered only limited availability. More recently, systems called high performance computing (HPC) clusters (or just clusters) have taken over much of the work done by supercomputers. These often form part of central university or departmental IT services. Access to these clusters is usually more open but, like their supercomputer counterparts, they are still based on fairly exotic hardware and require specialised software to make best use of them.

The cluster concept was first developed by enthusiasts as a way of providing high performance computing at more affordable prices. Early systems consisted of PCs linked together with standard network (ethernet) cable and, to keep prices down, they employed the freely available Linux operating system (along with other free software such as the gcc compiler). These systems (dubbed Beowulf clusters) rapidly became popular amongst research groups, and traditional large scale computer suppliers such as IBM and Hewlett Packard soon decided to get in on the act and create their own cluster systems.

Building on years of expertise in commercial computing, these heavyweight suppliers quickly made big improvements to the performance of clusters whilst keeping the overall cost down to a fraction of that of a traditional supercomputer. The result was faster and larger disk storage systems, faster networking and the inclusion of specialised software for parallel computing. Today, clusters are sophisticated assemblies of hardware and software components which bear little relation to their do-it-yourself predecessors (for example, compare the system below to the Beowulf cluster). The overall organisation of clusters (often called the system architecture) has remained much the same, however, and is described on the Cluster Hardware web page.
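To give a flavour of the "specialised software for parallel computing" mentioned above: cluster programs are typically written in a message-passing style, where independent processes each compute part of a result and exchange explicit messages to combine their answers (on real clusters this is usually done with a library such as MPI, which the text does not name). The sketch below imitates that pattern on a single machine using Python's standard multiprocessing module; it is an illustrative toy, not actual cluster software.

```python
from multiprocessing import Process, Pipe

def worker(conn, rank, chunk):
    # Each "node" sums its own slice of the numbers 0..n_procs*chunk-1
    # and sends the partial result back as a message.
    partial = sum(range(rank * chunk, (rank + 1) * chunk))
    conn.send(partial)
    conn.close()

def parallel_sum(n_procs=4, chunk=100):
    pipes, procs = [], []
    for rank in range(n_procs):
        parent, child = Pipe()
        p = Process(target=worker, args=(child, rank, chunk))
        p.start()
        pipes.append(parent)
        procs.append(p)
    # The "master" process gathers the partial sums and combines them,
    # much as rank 0 would in an MPI reduction.
    total = sum(conn.recv() for conn in pipes)
    for p in procs:
        p.join()
    return total

if __name__ == "__main__":
    print(parallel_sum())  # 0 + 1 + ... + 399 = 79800
```

The key idea carried over from real cluster software is that the processes share no memory: all cooperation happens through explicit sends and receives, which is what lets the same style scale from one machine to thousands of networked nodes.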