Wednesday, 15 June 2016

Analysis of performance optimisation service requests: what kind of codes are we helping as part of POP CoE?

by Sally Bridgwater, NAG HPC Application Analyst

NAG is a partner in the Performance Optimisation and Productivity Centre of Excellence (POP). POP was created to boost the productivity of EU research and industry by providing free-of-charge services that advise on improving the performance of parallel high performance computing (HPC) software.

The POP team consists of six partner organisations from Germany, France, Spain and the UK. Over 30 codes have applied for the POP service since its kick-off in October 2015. I decided to look into the details of what types of codes POP is working with and see if any interesting themes emerge. Since this is quite early in the project, it will be useful to revisit the data and see how it evolves over time.

First I decided to look at what languages all of the codes were written in. From my experience in Physics, I generally assumed that Fortran was the most prevalent language in academic/scientific applications.

This seems to be the case so far in the POP project, and it fits with the larger number of academic codes that have been investigated. However, to my surprise, C++ is not far behind, followed closely by the combination of C and Fortran. A small subset of applications use Python in some form, though often not as the main backbone of the application. The “Others” category includes a small number of other combinations of C, C++ and Fortran.

The most common parallelization model so far in the POP project has been hybrid MPI+OpenMP. 

This was quite surprising to me at first, since a hybrid model can be complicated to add to an application and is often hard to get working well. However, this may be exactly why POP sees it more often. Developers who are keen on performance and scalability may be more likely to try it, and may also want advice on how to make the best use of it and ensure it is working efficiently, which is where POP comes in. The “Other” schemes include Java, Pthreads and GASPI.

The application areas of the codes worked on show that Engineering and Earth & Atmospheric sciences are by far the most prevalent at the moment, but there is still a good breadth of subject areas, which will hopefully increase as the project continues.

The POP project has funding to work with EU organisations, so I had a look at which country each code came from. Since some of them are large collaborative works, I chose the country of their main participant.

Germany and the UK are the largest contributors so far. This may not be surprising, as NAG, based in the UK, is leading the customer-finding efforts and there are three German partners on the POP team. Germany is also one of the leading countries in the EU for engineering and HPC, which may explain the large number of engineering-based applications we have worked with so far: around 60% of the engineering codes in POP are from Germany.

Looking at this data as the POP project evolves can give us an interesting view of HPC in the EU, and it also shows us where the POP project needs to focus more attention. For example, we should make sure we reach out to all EU countries equally, not just the ones that are easiest for us. The POP project already covers a wide breadth of application areas, but it would be great to expand it even further, because the service can benefit anyone in any field working with parallel code for HPC.

Friday, 3 June 2016

Improved Accessibility for NAG’s Mathematical and Statistical Routines for Python Data Scientists

By John Muddle, NAG Technical Sales Support Engineer

NAG and Continuum have partnered to provide conda packages for the NAG Library for Python (nag4py), the Python bindings for the NAG C Library. Users wishing to use the NAG Library with Anaconda can now install the bindings with a single command (conda install -c nag nag4py) or through the Anaconda Navigator GUI.

For those of us who use Anaconda, the Open Data Science platform, for package management and virtual environments, this enhancement provides immediate access to the 1,500+ numerical algorithms in the NAG Library. It also means that you can automatically download any future NAG Library updates as they are published on the NAG channel in Anaconda Cloud.

To illustrate how to use the NAG Library for Python, I have created an IPython Notebook that demonstrates the use of NAG’s implementation of the PELT algorithm to identify the changepoints of a stock whose price history has been stored in a MongoDB database. Using the example of Volkswagen (VOW), you can clearly see that a changepoint occurred when the news about the recent emissions scandal broke. This is an unsurprising result in this case, but in general, it will not always be as clear when and where a changepoint occurs.
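For readers who want a feel for what PELT does before opening the notebook, here is a minimal sketch of the idea in plain NumPy. It is not NAG's implementation and does not use the nag4py API. The function names, the choice of a simple change-in-mean cost (sum of squared deviations), and the penalty value are all assumptions made for illustration.

```python
import numpy as np


def segment_cost(cumsum, cumsum_sq, start, end):
    """Sum of squared deviations from the mean over data[start:end]."""
    n = end - start
    total = cumsum[end] - cumsum[start]
    total_sq = cumsum_sq[end] - cumsum_sq[start]
    return total_sq - total * total / n


def pelt_mean_shift(data, penalty):
    """Illustrative PELT for changes in mean (hypothetical helper, not a NAG routine).

    Returns sorted 0-based indices at which a new segment starts.
    """
    data = np.asarray(data, dtype=float)
    n = len(data)
    cumsum = np.concatenate(([0.0], np.cumsum(data)))
    cumsum_sq = np.concatenate(([0.0], np.cumsum(data ** 2)))

    best_cost = np.empty(n + 1)
    best_cost[0] = -penalty  # so the first segment is not charged a penalty
    last_change = np.zeros(n + 1, dtype=int)
    candidates = [0]  # possible positions of the most recent changepoint

    for end in range(1, n + 1):
        costs = [
            best_cost[s] + segment_cost(cumsum, cumsum_sq, s, end) + penalty
            for s in candidates
        ]
        best = int(np.argmin(costs))
        best_cost[end] = costs[best]
        last_change[end] = candidates[best]
        # Pruning step: drop candidates that can never be optimal again.
        candidates = [
            s for s, c in zip(candidates, costs) if c - penalty <= best_cost[end]
        ]
        candidates.append(end)

    # Walk back through the stored "last changepoint" links.
    changepoints = []
    idx = n
    while idx > 0:
        idx = int(last_change[idx])
        if idx > 0:
            changepoints.append(idx)
    return sorted(changepoints)


# Synthetic series (made up for illustration): the mean jumps at index 20.
series = np.concatenate([np.zeros(20), np.full(20, 10.0)])
detected = pelt_mean_shift(series, penalty=5.0)
print(detected)
```

The penalty controls how readily a new changepoint is accepted: a larger value gives fewer changepoints. Real price data is noisy, so the right penalty, and possibly a different cost function, would need to be chosen for the series at hand.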

So far, conda packages for the NAG Library for Python have been made available for 64-bit Linux, Mac and Windows platforms. On Linux and Mac, a conda package for the NAG C Library will automatically be installed alongside the Python bindings, so no further configuration is necessary. A Windows conda package for the NAG C Library is coming soon. Until then, a separate installation of the NAG C Library is required. In all cases, the Python bindings require NumPy, so that will also be installed by conda if necessary.

Use of the NAG C Library requires a valid licence key, which is available here: The NAG Library is also available for a 30-day trial.