What does the SMP core do?

Our SMP core is very different from the way other distributed computing projects handle multi-core CPUs, and I thought it might be interesting for the FAH community to hear about the differences, pro and con. As I think most people interested in computers know, Moore's law, which states that the transistor count in CPUs doubles roughly every 1.5 years, has held for decades. Most people think of Moore's law in terms of the speed of CPUs, but this isn't what Moore originally had in mind. In the past, more transistors led to greater CPU speeds, but that essentially ended (at least for traditional CPUs) a few years ago.

But if Moore's law is still marching along (as it is), what do all those transistors do? Over the last few years, more transistors have translated into more CPU cores, i.e. more CPUs on a chip. While this is not what we wanted, it is not necessarily a disaster, provided one can use these multiple CPUs to get faster calculations. If we simply do more calculations (i.e. multiple Work Units, or WUs, simultaneously) rather than faster calculations (a WU completed in less time), distributed computing will run into the same problem that faces supercomputers: how to scale to lots and lots of processors, i.e. how to use all these processors to do a calculation faster overall.

In FAH, we've taken a different approach to multi-core CPUs. Instead of just doing more WUs (e.g. doing 8 WUs simultaneously), we are applying methods to do a single WU faster. This is typically much more valuable to a scientific project, and it's important to us. However, it comes with new challenges. Getting a calculation to scale to lots of cores can be a challenge, as can running complex multi-core calculations originally meant for supercomputers on operating systems not designed for them (e.g. Windows).
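
To make the trade-off concrete, here is a rough sketch in Python that compares the two strategies on an 8-core machine. It is not FAH code. It assumes Amdahl's law as a simple scaling model (our choice for illustration, not something the SMP core is built around), and the function names, the 8-day single-core WU time, and the 95% parallel fraction are all made-up values.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Speedup of ONE job spread over `cores` cores, per Amdahl's law.

    `parallel_fraction` is the share of the work that can run in parallel.
    Both the model and the parameter are illustrative assumptions.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def compare_strategies(cores: int, single_core_days: float,
                       parallel_fraction: float) -> dict:
    """Compare 'one WU per core' against 'one WU across all cores'."""
    speedup = amdahl_speedup(cores, parallel_fraction)
    return {
        # One WU per core: lots of results, but each still takes the
        # full single-core time to come back.
        "many_wus_days_per_result": single_core_days,
        "many_wus_results_per_day": cores / single_core_days,
        # One WU across all cores: fewer results per day, but each
        # result comes back sooner.
        "one_wu_days_per_result": single_core_days / speedup,
        "one_wu_results_per_day": speedup / single_core_days,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 8 cores, 8 days per WU on one core, 95% parallel.
    for name, value in compare_strategies(8, 8.0, 0.95).items():
        print(f"{name}: {value:.2f}")
```

Under these assumed numbers, running one WU per core gives more total results per day, while spreading one WU across all cores returns each result much sooner. The second property is what matters when the next step of a simulation depends on the previous one, and the gap between the ideal 8x and the modeled speedup is one way to picture the scaling challenge.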

Right now, our SMP client seems to be running fairly well under Linux and OS X, operating systems based on UNIX, like those found on supercomputers. We use a standard supercomputing library (MPI) to run these WUs, and MPI behaves well on Unix-based machines. MPI does not run well on Windows, and we've been running into problems there. However, as Windows MPI implementations mature, our SMP/Windows app will behave better. Along the way, we also have a few tricks up our sleeve which may help as well. If we can't get it to run as well as we'd like on Windows, though, we may choose to overhaul the whole code, as we did with the GPU1 client (which was really hard to run).

We're very excited about what the SMP client has been able to do so far. One of our recent papers (#53 on our papers web page, http://folding.stanford.edu/English/Papers) would have been impossible without the SMP client and represents a landmark calculation in the simulation of protein folding. We're looking forward to more exciting results like that in the years to come!