I recall some years ago porting an application code I worked with, which was developed and used almost exclusively on a high end supercomputer, to my PC. Naively (I was young), I was shocked to find that, per-processor, the code ran (much) faster on my PC than on the supercomputer. With very little optimization effort.
How could this be – this desktop machine costing only a few hundred pounds was matching the performance of a four processor HPC node costing many times that? Since I was also starting to get involved in HPC procurements, I naturally asked why we spend millions on special supercomputers, when for a twentieth of the price, we’d get the same throughput from a bunch of high-spec PCs?
The answer then (and now) was that I was extrapolating from only one application, and that application could be run as lots of separate test cases with no reduction in capability (i.e. we didn’t need large memory etc, just lots of parameter space). However, the other major workload (which I also ported and also ran fast on the PC) would not have been able to do the size of problem we wanted on a PC – we needed the larger memory and extra grunt from parallel processing. (We did look at the newfangled Network Of Workstations emerging at the time but decided it might be a wolf in sheep’s clothing. Sorry.)
In the end, we had to find a balance between (a) speed at lowest cost for the one application; (b) the best capability for the other application (i.e. fastest solution time for the largest problems); (c) ease of programming – to get a good enough (fast-enough) code developed with the limited developer effort and funding we had; and (d) whole life affordability.
Why do I foist this reminiscence on you? Because the current GPU crisis (maybe “crisis” is a bit strong – "PR storm" perhaps?) looks very much the same to me. The desktop HPC surprise of my youth has evolved into the dominant HPC processor and so for some years now, we have been developing and running our applications on clusters of general purpose processors – and a new upstart is trying to muscle in with the same tactic – “look how fast and how cheap” – the GPU (or similar technologies – e.g. Larrabee, sorry Knights-thingy).
The issues are the same: (a) for some applications, GPUs offer substantial performance improvements for considerably less cost than a “normal” HPC processor; (b) for other applications, the limits such as off-card bandwidth etc mean that GPU’s cannot deliver the required capability; (c) the underlying concern is ease of programming for GPUs; (d) affordability – sure GPU’s are cheap to buy, but what about power costs when in bulk, or code porting costs, etc?
Maybe the result will be the same as when commodity processors and clusters eventually exploded to leave custom supercomputer hardware as the minority solution. At first the uptake (now) is tentative - and painful. Some will have great success stories, many will get burnt. But in a few years time, we might well look back with nostalgia to when GPU’s were not the dominant processor for HPC systems.
I’ll continue on the future of HPC in my next blog in a few days, including an idea of what/who will emerge as the dominant solution ...