Monday, 16 November 2015

Diversity at NAG: Vision for our future and where we are today

A short time ago my colleague Andrew Jones, VP HPC Business, introduced me to John West, Director of Strategic Initiatives at the Texas Advanced Computing Center (TACC). Andrew found John’s blog posts “Diversity in HPC won't improve until we start counting” and its follow-up “How One HPC Center Learned to Count” inspiring, and suggested we get involved. We were all equally encouraged by John’s work, and it gave us the impetus to get behind his initiative by investigating NAG’s Gender Diversity figures and publishing them at SC15.

(We drill down further into NAG’s Gender Diversity figures here.)

NAG believes that Diversity among its workforce brings a broader range of ideas and in turn helps us develop more innovative products and services. If our workforce is diverse, just like the population, then we understand market needs better and can offer first class support. Ultimately, promoting and living diversity in the organisation ensures the best possible outcomes for NAG.

Our Diversity figures show that, as an organisation, NAG is not where it would like to be, but we are using them to cement our commitment to creating a truly diverse workplace in the future. We continue to work on new initiatives to encourage girls and boys alike to be interested in coding and computing.

There’s a host of Diversity discussions going on at SC15 that we are proud to be part of. Learn more here



Friday, 6 November 2015

Finding the safe path through the HPC fog

The use of HPC (or supercomputing or big data computing or advanced technical computing or …) delivers real and significant benefits to those organizations that can exploit it effectively. However, the world of HPC technology is a complex space, with a plethora of technology options, continuous rapid change, ambiguous long-term trends, diverse business models, and potentially disruptive technologies.

This presents challenges for buyers, managers, programmers and users of HPC facilities:
  • finding the right technology to accelerate your modelling, simulation and big data processing needs;
  • determining the best time to adopt new technologies (stability vs. first-mover benefits vs. mature ecosystem vs. migration cost vs. missed performance potential, ...);
  • reducing the risks in evaluating potential new technologies;
  • minimizing the time, effort and cost of finding the important information among the clutter in the HPC market;
  • developing confidence in solutions that will improve cost-effectiveness, business impact, or application performance.
NAG, together with our colleagues at Red Oak Consulting, continually explore and analyze this complex space on behalf of our customers. Our HPC experts:
  • regularly attend a wide range of HPC and other technology conferences around the world.
  • maintain a broad network of other HPC technology users and draw on their real-world experiences.
  • keep up to date with the many excellent sources of HPC news and analysis that are available.
  • engage with technology providers, liaise with researchers in academia, national labs and industry.
  • undertake in-house or collaborative mini-research projects and technology evaluations.
Most importantly, because HPC technology consulting is a core business for us, we have the time, effort and expertise to engage with all of these sources and to analyse the information to extract insight from the noise.

We have delivered this insight to our consulting customers for several years, with around 50 successful strategy, technology evaluation or procurement consulting projects.

Today, we are announcing a new service to deliver this impartial intelligence and analysis to HPC buyers, managers, programmers and users. The NAG HPC Technology Intelligence Service, in partnership with Red Oak Consulting, will deliver technology insight and risk reduction to help HPC buyers and users make better decisions and optimize their HPC investments. The Service will help you find what you need to know about HPC technology with less time, effort and cost on your part, enabling better decisions by focusing on the important information among the clutter in the HPC market.

The central feature of the Service is our Technology Intelligence Report, issued three times per year to subscribers, with fresh intelligence and insight. Each report will have 3 or 4 focus areas that will be explored in depth, an update on the HPC technology space in general, a wildcard topic, and a community news section. Our launch report has four focus areas: compute, programming, storage, and Total-Cost-of-Ownership models. Each focus area chapter explores a particular theme in depth, covering the main technology options and issues, the short term outlook (what will affect your plans this year/next year), the long term outlook (how the future changes might affect your decisions and planning now), and includes our “what this means for you” summary.

For more information visit

We also invite you to speak to the NAG and Red Oak Consulting team at booth #1619 at SC15 in Austin.

Watch this space in the next few days for extracts from the first report!

Friday, 2 October 2015

Travelling Rugby Fan Problem

The Rugby World Cup is well under way here in England. So far, I have been lucky enough to witness Japan beat South Africa at the Community Stadium in Brighton, and next on my list is the nail-biting encounter between England and Australia at Twickenham in South West London (Update: even though it was disappointing to see England lose and go out of the tournament, it was a good day out). All this travelling got me thinking: what would be the fastest circular route to visit all the stadia hosting a Rugby World Cup game, if one was lucky enough to have tickets?

The naive approach is to go to Google Maps and choose the best order yourself. However, and this is where the Numerical Algorithms Group comes in, algorithms exist to solve this type of problem. In fact, this is quite a famous problem and is commonly referred to as the Travelling Salesman Problem (TSP). A good introduction to the TSP can be found here.

So what can we use instead? The NAG Library contains a routine, H03BB, which can provide an approximate solution to the symmetric TSP. I was drawn to this routine for several reasons: it has recently been added to the Library at Mark 25, and it simulates an interesting physical process called annealing.

Annealing is the process of heating a metal to a certain temperature and then cooling it slowly. The reason for doing this is to remove any stresses that have developed during the original formation. On the molecular scale, energetic atoms are free to rearrange themselves into the lowest energy state. This reduces impurities in the crystalline structure.

Simulated annealing is inspired by the above physical process and attempts to solve global optimization problems such as the TSP. At higher ‘temperatures’ the solution space is traversed in large jumps, which help identify regions of low energy. Fundamental to the routine is the ability to jump to ‘higher’ energy states, as this allows a greater search of the solution space. Overall, the algorithm favours moving to states of lower ‘energy’, rather than to nearby states of higher ‘energy’. Finally, as the ‘material’ cools, the jumps reduce in size. More information can be found here.
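The acceptance rule just described (often called the Metropolis criterion) can be sketched in a few lines of Python. This is an illustrative fragment only, not code from H03BB:

```python
import math
import random

def accept_move(delta_energy, temperature, rng=random):
    """Metropolis acceptance rule: always take a downhill move;
    take an uphill move with probability exp(-delta/T)."""
    if delta_energy <= 0:
        return True  # a lower-'energy' state is always preferred
    return rng.random() < math.exp(-delta_energy / temperature)
```

Downhill moves are always taken; uphill moves survive with a probability that shrinks both as the energy increase grows and as the temperature falls, which is exactly why the search ranges widely when ‘hot’ and settles down as it ‘cools’.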

Before we can use the H03BB algorithm, we need some input data. At this point, I remembered that I had seen a similar problem on Reddit. Dr. Randy Olson used a genetic algorithm to solve the TSP; his blog post can be found here. On his blog he also provided all the necessary code, in Python, to download a matrix of distances between the points from Google Maps using the Google Maps API. Details on how to get this matrix can be found here. He also produced an HTML file that can be used to display the solution on a webpage. That code can be found here.

So, we have an algorithm to solve the TSP, a method of getting the input data and a way of displaying the results. It’s now time to go ahead and solve the problem. The following stadia are hosting at least one game during the Rugby World Cup:

  1. Brighton Community Stadium, Brighton
  2. Elland Road, Leeds
  3. Kingsholm Stadium, Gloucester
  4. Leicester City Stadium, Leicester
  5. Millennium Stadium, Cardiff
  6. Manchester City Stadium, Manchester
  7. Stadium MK, Milton Keynes
  8. Olympic Stadium, London
  9. Sandy Park, Exeter
  10. St James’ Park, Newcastle
  11. Villa Park, Birmingham
  12. Wembley Stadium, London
  13. Twickenham Stadium, London

Using Randy’s code, I was able to get a matrix of journey durations between all the different stadia. This is a 13 × 13 symmetric matrix with zeros along the diagonal. I then used the NAG4PY wrappers to call the NAG C Library, solve the TSP and obtain an approximation to the quickest route.
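As a flavour of what such a solve looks like, here is a self-contained pure-Python simulated annealing loop on a toy 4 × 4 duration matrix. The actual post used the nag4py wrapper for H03BB (whose call signature is not shown here), so treat this as an illustrative stand-in rather than the real code:

```python
import math
import random

def tour_length(tour, dist):
    """Total duration of a closed tour visiting every point once."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def anneal_tsp(dist, n_iter=20000, t_start=100.0, t_end=0.01, seed=0):
    """Approximate the symmetric TSP by simulated annealing
    (an illustrative stand-in for the NAG routine H03BB)."""
    rng = random.Random(seed)
    n = len(dist)
    tour = list(range(n))
    best = tour[:]
    for k in range(n_iter):
        t = t_start * (t_end / t_start) ** (k / n_iter)   # geometric cooling schedule
        i, j = sorted(rng.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]  # reverse one segment
        delta = tour_length(cand, dist) - tour_length(tour, dist)
        # Metropolis rule: always accept improvements, sometimes accept uphill moves
        if delta < 0 or rng.random() < math.exp(-delta / t):
            tour = cand
        if tour_length(tour, dist) < tour_length(best, dist):
            best = tour[:]
    return best, tour_length(best, dist)

# Toy symmetric matrix of journey durations (zeros along the diagonal)
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
best_tour, best_len = anneal_tsp(dist)
```

For the real problem, `dist` would simply be the 13 × 13 matrix of journey durations obtained from the Google Maps API.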

A more detailed version of the above code can be found here.

Finally, using Randy’s HTML file I was able to produce the route below. For this particular route, I decided that cycling would be my preferred mode of transport. As well as visiting all 13 stadia, the route passes through three national parks and eleven areas of outstanding natural beauty, and through London, taking in: Canary Wharf, Queen Elizabeth Olympic Park, The Royal Courts of Justice, Trafalgar Square, The Mall, Buckingham Palace, Hyde Park and Kew Gardens.

The TSP routine is one of many different optimization routines that can be found in the NAG Library. Please feel free to contact us directly or look in our documentation to find out more.

Thursday, 24 September 2015

Life Service Award 2015 – David Sayers

The NAG Life Service Recognition Award has been running for 5 years – it was created in NAG’s 40th year as a way of showing gratitude to people who have dedicated immense time and commitment to the company.

It was a pleasure to see David Sayers, NAG Honorarium, receive the award at our recent AGM. As has become customary, James Davenport, Professor of Information Technology, University of Bath, presented David with the award and said a colleague wrote of David as follows, “David has been an outstanding ambassador for NAG for over forty years. For many customers, he has been the face of NAG - someone that they recognise and relate to, because of his many trips outside the office in the role of sales support. His interactions with NAG collaborators have been many and varied."

A code contributor wrote “David is, I believe, the main reason many of us contributed software to the early library. He came to our meetings, he listened and encouraged us, indeed he was one of us".

Professor James Davenport presenting David Sayers with the NAG Life Service Recognition Award 2015

Friday, 11 September 2015

Introducing the team: Kathy Godwin, Quality Manager and Libraries Project Manager

Kathy, describe your role at NAG?

I am now the Quality Manager and Libraries Project Manager.  

As Quality Manager, I try to keep everyone on their toes by checking 'everything' ... how it was done, why it was done, when it was done, ..., was it done to spec, was it thoroughly and successfully tested/checked, could we have done it any better, ..., is the customer happy, is the Developer happy, is the team happy. I am pleased to say that all at NAG take the quality of the products we produce and the services we provide very seriously.  

As Libraries Project Manager, I am instructed as to the new functionality, enhancements and corrections required for future releases of the NAG Library. In collaboration with NAG Developers, I schedule and manage the work to ensure that it progresses through development, peer-review, quality assurance and on to the Implementations Group so that they have the documentation and code they need for new releases to be made available to customers as soon as possible. 

Can you give examples of the customer problems you help solve via NAG Technical Support?

I don't work directly on NAG Technical Support, but I do monitor the progress of Technical Support queries to ensure that they are answered in a timely manner, that enough information is provided and that the customer has been given an answer. If they are not, I check that there has been a follow-up action to any shortfall in our performance and that a satisfactory solution is provided.

Tell us something special/unusual about NAG?

I have worked at NAG almost continuously since 1979, experiencing many management styles. Our current style of management is the most open, honest, trusting and flexible that I have experienced. It makes for a truly relaxed, but inspiring place to work. So much more is achieved because we have the opportunity to work to our full potential. The people at NAG all contribute to a friendly and pleasant place to be.

Why have you worked at NAG so long?

I first started work at NAG as Mark 6 of the Library was due to go out. I spent the first year or so keying every single routine document, Chapter Introduction and Introductory document for our first automatically generated release of the Fortran Library. I have been involved in every Mark since, helping to develop the NAG style, introducing standards and consistency, and helping improve efficiency of production. Using TSSD, TeX, SGML and now XSLT and XML has made working at NAG varied and interesting. The Library is like an old friend and we have grown up together.

Developers have come and gone and each have had their own unique personality - they have been fun and interesting to get to know. They have all made important contributions to the Library and should be proud of that.

There are many benefits to working at NAG, on top of those I have already mentioned, so I have never 'really' wanted to leave.
Tell us something special/unusual about yourself?

I don't think that there is anything special or unusual about myself - I tend to do everything to extremes and I'll give anything a go. As long as I can see improvement or some advantage and, I guess, receive some positive feedback, I'll keep going ... which is probably another reason for still being here. I just like to get things done.

Thursday, 9 July 2015

NAG Linear Regression on Apache Spark

This is a brief summary of a talk I gave recently at the Chicago Apache Spark Users Group Meetup. In the talk, I presented many of the problems and successes encountered when using the NAG Library distributed across Apache Spark worker nodes. The slides are available here.

The Linear Regression Problem

In this post we test the scalability and performance of using the NAG Library for Java to solve a large-scale multi-linear regression problem on Spark. The example data ranges from 2 gigabytes up to 64 gigabytes, in the form of

label x1 x2 x3 x4
68050.9 42.3 12.1 2.32 1
87565 47.3 19.5 7.19 2
65151.5 47.3 7.4 1.68 0
78564.1 53.2 11.4 1.74 1
56556.7 34.9 10.7 6.84 0

We solve this problem using the normal equations. This method allows us to map the sum-of-squares matrix computation across the worker nodes. The reduce phase of Spark then aggregates these partial matrices pairwise. In the final step, a NAG linear regression routine is called on the master node to calculate the regression coefficients. All of this happens in one pass over the data - no iterative methods needed!
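The scheme can be sketched without Spark at all: each "worker" reduces its partition of rows to the small sufficient statistics X'X and X'y, the partials are summed in a reduce step, and the coefficients are recovered on the "master" from the normal equations. The sketch below uses NumPy's solver in place of the NAG regression routine, with fabricated data:

```python
from functools import reduce
import numpy as np

def partial_crossproducts(rows):
    """Map step: collapse one partition of (label, x1, x2, ...) rows
    into the sufficient statistics X'X and X'y."""
    X = np.array([r[1:] for r in rows], dtype=float)
    y = np.array([r[0] for r in rows], dtype=float)
    Xi = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    return Xi.T @ Xi, Xi.T @ y

def combine(a, b):
    """Reduce step: sufficient statistics simply add across partitions."""
    return a[0] + b[0], a[1] + b[1]

# Fabricated partitions of (label, x1, x2) rows standing in for an RDD;
# the labels satisfy label = 5 + 2*x1 + 1*x2 exactly.
partitions = [
    [(9.0, 1.0, 2.0), (10.0, 2.0, 1.0)],
    [(14.0, 3.0, 3.0), (15.0, 4.0, 2.0)],
]
XtX, Xty = reduce(combine, (partial_crossproducts(p) for p in partitions))
beta = np.linalg.solve(XtX, Xty)  # coefficients from the normal equations
```

Because only the small X'X and X'y matrices travel between nodes, the full data never needs to be gathered in one place, which is what makes the single pass possible.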

Example output from the linear regression is the following:

Time taken for NAG analysis: 236.964 seconds
Number of Independent Variables: 5
Total Number of Points: 334800000
R-Squared = 0.699
Var Coef SE t-value
Intcp 12723.3 2.2 5783.3
0 989.7 0.02 47392.5
1 503.4 0.04 11866.5
2 491.1 0.1 4911.9
3 7859.1 0.9 8732.3

Predicting 4 points
Prediction: 57634.9 Actual: 60293.6
Prediction: 32746.6 Actual: 35155.5
Prediction: 49917.5 Actual: 52085.3
Prediction: 82413.2 Actual: 82900.3


Using the above method, the NAG linear regression algorithm is able to compute the exact values for the regression coefficients, t-values, and errors. Below is a log-log plot of runtime vs. size of input data for the NAG routines on an 8-slave Amazon EC2 cluster.

Not only is it important that the NAG routines run fast, but also that they scale efficiently as you increase the number of worker nodes. Let's look at how the 16 GB data set scales as we vary the number of slaves.

Comparison To MLlib

We tried running the MLlib Linear Regression algorithm on the same data, but were unable to get meaningful results. The MLlib algorithm uses stochastic gradient descent to find the optimal coefficients, but the last stochastic steps always seemed to return NaNs (we would be happy to share sample data ... let us know if you can solve the problem!).

More Information

For more information including examples of the NAG Library on Spark, contact us at support(at)

Thursday, 25 June 2015

Helping primary school children achieve the best in their SATs

My wife Caroline is a teacher in a primary school near where we live in Buckinghamshire. She teaches year 6, that is to say, ten to eleven year old children.

Now, year 6 is a rather important year in British schools. Children following the National Curriculum in state-funded schools are subject to SATs (standard assessment tests) before they move on to secondary school, and the results of these tests are used to create the dreaded league tables which are supposed to help parents make an informed choice about which school to send their children to. Importantly, the number of pupils can affect the amount of funding a school gets.

Year 6 teachers are therefore under a lot of pressure to help their pupils achieve good SATs results, so extra sessions and booster groups are common. This year Caroline asked if I could go into school to spend a few hours with her top maths set - the pupils who had been judged good enough to be entered for the harder (Level 6) papers.

I'm used to teaching adults on NAG training courses, but this was the first time I'd been into a primary school to teach. It turned out to be good fun! I had four students, all of whom had already passed their "eleven plus" and would be going to grammar school. Topics we covered included straight line graphs, probability and algebra. The kids were enthusiastic, and not too shy. Many of the problems tackled turned out to involve rearrangement of equations, so drilling into them that if you do the same thing to both sides of an equation (with some caveats of course) then the equality still holds seemed quite important.

Whether the short time I spent with them actually made any difference is hard to know, but I really enjoyed the interaction and the children seemed to also. The SATs results are not yet out, and scoring a Level 6 in Mathematics or English is quite difficult (the average 11 year old is expected to achieve Level 4) - so there will be no dishonour in not achieving it, but I'm looking forward to hearing how they get on. And I'm hoping that next year, as part of NAG's Community Outreach Programme, I'll be able to go into school again, and maybe for more hours this time around.

Thursday, 18 June 2015

Wilkinson Prize for Numerical Software 2015

by Mike Dewar, Chief Technical Officer, Numerical Algorithms Group (NAG)

James Hardy Wilkinson
NAG, along with Argonne National Laboratory and the National Physical Laboratory (NPL), awards the Wilkinson Prize every four years to the authors of an outstanding piece of numerical software. The prize honours the pioneering work of Dr James Hardy Wilkinson in the field of numerical algorithms, work which had a fundamental impact on NAG in our early years and which still resonates today. The Wilkinson Prize is unique in that it is awarded on the basis of the quality of a piece of software, not just for the mathematics behind it, and the judges consider the documentation and test suites as well as the software itself.  

This year the winners are P.E. Farrell (University of Oxford), S.W. Funke (Simula Research Laboratory), D.A. Ham (Imperial College London), and M.E. Rognes (Simula Research Laboratory) for the development of dolfin-adjoint, a package which automatically derives and solves adjoint and tangent linear equations from high-level mathematical specifications of finite element discretisations of partial differential equations. The prize will be presented at ICIAM in Beijing in August, during the SIAM Awards Lunch.

More details about the Wilkinson Prize and past winners can be found here.   

Thursday, 21 May 2015

Introducing the team: Gail Austin, Customer Support Coordinator, Team Leader

Gail, describe your role at NAG?

I manage the Technical Support Service, the Sales Operations Team, NAG Oxford Reception and I am website editor for the marketing area of the website.

Can you give examples of the customer problems you help solve via NAG Technical Support?

Rather than solving users' technical issues, my role is connected to directing incoming support queries and to the issuing and renewal of software licences. Over time I have come to understand the installation of our software, and so can help users with some of their licensing and software applicability issues.

Tell us something special/unusual about NAG?

I believe that NAG truly cares about giving our customers the best possible service.  This comes across again and again in the positive feedback we receive following the closure of technical support queries.

Tell us something special/unusual about yourself? 

I’m just a little bit proud of the fact that when helping Mick Pont with some tests he was carrying out for a presentation of NAG’s Random Number Generator, I was found to be more random!

Are you attending any exhibitions or conferences this year?

I’m not, I’m too busy in the office, helping our customers to get the best from NAG.

What’s been the highlight of your time working at NAG so far?

That’s a difficult one - I don’t think I can put my finger on any one thing. I think it’s enough to say that I hope to stay at NAG for the remainder of my career. I love my job here and I consider myself really lucky to have finally found my niche; I’m as at home here as I am ……… at home :-)

Wednesday, 29 April 2015

Professor Mike Powell F.R.S

We are pained to pass on the sad news of the death on 19th April of Prof Mike Powell F.R.S, a brilliant numerical analyst specialising in numerical optimisation and approximation theory.  

Mike touched many lives with his work. NAG and NAG's community benefitted greatly because he gave his code freely to NAG from the very beginning. The more mature may remember the routine E04DCF, an implementation of his hybrid method for unconstrained optimisation. More recently his work on derivative-free optimisation led to E04JCF/e04jcc entering our Libraries as implementations of BOBYQA.

As a founder member of the IMA, Mike took an interest in the IMANA Newsletter and encouraged NAG to contribute regularly to that until the Newsletter's role was subsumed by the internet.  

The Wikipedia page gives more detail of Mike's many achievements.
He will be greatly missed by NAG, especially past and present colleagues who had the pleasure of knowing him personally. We pass on our love, condolences and very best wishes to his family.


Wednesday, 15 April 2015

Introducing the team: Craig Lucas, Senior Technical Consultant

Craig, describe your role at NAG?

I am a Senior Technical Consultant based in NAG’s Manchester Office. Here I manage part of NAG’s High Performance Computing group. I come from a numerical linear algebra background. For my MSc dissertation I looked at nearest correlation matrix problems, something I still work on today, and this year I have an MSc student looking at more new algorithms.  Another big interest of mine is shared memory parallelism with OpenMP, and in particular helping users to better exploit it.

Can you give examples of the customer problems you help solve via NAG Technical Support?

Technical Support is a great way to interact with our users. Queries can range from something very simple, helping people link their program to the NAG Library for the first time, to an in-depth discussion about the accuracy of an optimization routine. It is often challenging and always an opportunity to learn something new about the mathematics of the Library. Sometimes I need to read our documentation to help me get up to speed!

Tell us something special/unusual about NAG?

Working at NAG is very collaborative, whether that is working with universities or with other NAG colleagues. One day I might be listening to a Manchester student talking about their new research, another I might be at the whiteboard in the office discussing the nuances of a new NAG algorithm.

Tell us something special/unusual about yourself?

I came to Mathematics relatively late and was 27 when I went to university. I had somehow drifted into Civil Engineering as a teenager, working in the area of land reclamation. Much of my work was dealing with closed coal mines; this was Staffordshire in the late 1980s. Perhaps I had some sort of epiphany; I certainly realised that a childhood interest needed some proper exploration. I will always be grateful to Graham Bowtell at City University, who decided to take a risk on a mature student without any proper mathematics qualifications.

Are you attending any exhibitions or conferences this year?

My next conference is The University of Manchester SIAM Student Chapter Conference 2015. This annual event is another great opportunity to hear what young researchers are working on. I co-judge the Best talk and Best Poster prizes. It is a nice way to give back to a community that NAG really relies on.

Thursday, 26 March 2015

Introducing the team: Mick Pont, NAG Principal Technical Consultant

Over the next few months, scattered in between our technical blog posts, we are going to publish interviews with NAG colleagues. Our first interview is with Mick Pont.

Mick, what is your role at NAG? 

I'm a Principal Technical Consultant, and Deputy Manager of the Development Division.

I'm involved in the development and peer review of new NAG Library software and documentation, and in the scheduling of software production in line with company targets.

In "project management" speak, I'm named as "Executive" for the NAG Library Mark 26 project (Mark 25 is due for launch in April 2015), which means that I'm supposed to "provide support and advice to the project manager, and ensure that the project outcome is good and provides NAG with increased value".

I also produce the NAG Technical Support rotas (most of the NAG development team are on one or other of these rotas regularly). Handling support calls is a good way to get to know our customers, understand the issues they face and help solve them.

I build and test most of the Windows-based libraries that NAG produces, but I'm an operating system agnostic - I like to use Linux and other systems too.

I often go on customer site visits with my sales colleagues, as technical backup.

I also teach various training courses for NAG software, including the NAG Toolbox for MATLAB® and the NAG Library; these training days usually take place at customer sites.

Can you give examples of the customer problems you help solve when you’re in your support role?

Helping users get their NAG software and licence keys installed correctly; helping them migrate from one version of the NAG Library to another; advising on the best NAG routine to solve a particular problem.

Tell us something special/unusual about life at NAG?

We try to celebrate all of the world's festivals where appropriate. Icelandic cream bun day is a big favourite of mine.

Tell us something special/unusual about yourself?

I'm a solipsist, and I choose not to believe in this question.

What industry events are you going to be attending this year?

Lots of training days at customer sites around the UK, and later in the year I will be attending SC15 in Austin, Texas.

Thursday, 5 March 2015

Optimization, NAG and Me - 5 Years and Counting

Authored by Jan Fiala, NAG Numerical Software Developer

It feels like only yesterday that I joined NAG, so it is hard to believe that it has been 5 years already. Looking back, I see a long path I would like to tell you a bit about, just in case you were curious about one of the people answering your support queries.

I was always a blend of a computer scientist and a mathematician. Computers and programming were my hobby but it did not feel quite right to choose either as the main subject at university so I picked mathematics. Thereafter began my hunt to get as close to computers as possible. Numerical analysis and mathematical programming/optimization were my lucky answer.

I really enjoyed my university years; however, an academic career is not for everyone. The necessity to write papers ("publish or perish"), going through a couple of postdocs before getting tenure, fundraising, teaching, and going ever deeper into a very narrow research field are some of the disadvantages. I love to learn, to play and to hack things. To make them work. So, when I got the opportunity, I joined NAG.

NAG is a great place to work if you find scientific computing interesting. There is a very long history; in fact, the first NAG Library was distributed to users even before I was born! NAG today is not confined to numerical software (the NAG Libraries): the organisation has expanded into numerical and HPC services, compilers and more. However, my job still consists mainly of work on the Libraries, developing new optimization solvers and maintaining the existing ones.

The best thing is the challenge -- you need to get your hands dirty in various fields: from specialized solvers for derivative-free optimization, through classical convex optimization, to nonlinear semidefinite programming. I went from small pet projects (have you read my AMPL tutorial?) through bespoke versions of our solvers tailored to particular customers' data, to a 50,000-line beast: a new semidefinite programming solver which will be released soon.

In addition, you mustn't forget to help customers via our friendly support desk, write documentation and examples, and keep an eye on current algorithmic developments. I can still attend conferences as before, only the focus has changed from an academic to a more pragmatic one. The freedom is almost limitless, as my ideas feed into the development roadmap and thus into future development. I won't lie to you: the time was not all green and great -- writing test programs or bug fixes can be tedious, but they have to be done. At least it makes you look forward to the next new challenge waiting just around the corner.

If my story sounds familiar, you might be pleased to know that NAG is expanding our optimization team. So ladies and gentlemen, what are you waiting for? Apply now and help us make the next big project happen.

Wednesday, 25 February 2015

Advanced Analytics on Apache Spark

Developed in AMPLab at UC Berkeley, Apache Spark has become an increasingly popular platform to perform large scale analysis on Big Data. With run-times up to 100x faster than MapReduce, Spark is well suited for machine learning applications.

Spark is written in Scala but has APIs for Java and Python. As the NAG Library is accessible from both Java and Python, this allows Spark users access to over 1600 high quality mathematical routines. The NAG Library covers areas such as:
  • Machine Learning including
    • Linear regression (with constraints)
    • Logistic regression (with constraints)
    • Principal Component Analysis (A good article relating Machine Learning and PCA can be found here)
    • Hierarchical cluster analysis
    • K-means
  • Statistics including
    • Summary information (mean, variance, etc)
    • Correlation
    • Probabilities and deviates for normal, student-t, chi-squared, beta, and many more distributions
    • Random number generation
    • Quantiles
  • Optimization including
    • Linear, nonlinear, quadratic, and sum of squares for the objective function
    • Constraints can be simple bounds, linear, or even nonlinear
  • Matrix functions
    • Inversion
    • Nearest correlation
    • Eigenvalues + eigenvectors

Calling the NAG Library on Spark

The fundamental datatype used in Spark is the Resilient Distributed Dataset (RDD). An RDD acts as a pointer to your distributed data on the filesystem. It has intuitive methods (count, sample, filter, map/reduce, etc.) and lazy evaluation, which allow for fast and easy manipulation of distributed data.

Below is a simple Python example of using the NAG Library in Spark to calculate the cumulative Normal distribution function on a set of numbers (the message passing output from Spark has been omitted):

SparkContext available as sc
>>>  from nag4py.s import nag_cumul_normal
>>>  myNumbers = sc.parallelize( [-2.0, -1.0, 0.0, 1.0, 2.0] )
>>>  myNumbers.takeSample(False, 5, 0)
[ 0.0, -2.0, -1.0, 1.0, 2.0] 

>>> myNumbers.map( nag_cumul_normal ).takeSample(False, 5, 0)
[0.5, 0.02275, 0.15865, 0.84134, 0.97725]

It should be noted that the vast majority of the algorithms employed in the NAG Library require all relevant data to be held in memory. This may seem to deviate from the Spark ecosystem; however, when working with large datasets, two usage scenarios are commonly seen:
  1. The full dataset is split into subsets, for example a dataset covering the whole world may be split by country, county and city and an independent analysis carried out on each subset. In such cases all the relevant data for a given analysis may be held on a single node and therefore can be processed directly by NAG library routines.
  2. A single analysis is required that utilizes the full dataset. In this case it is sometimes possible to reformulate the problem. For example many statistical techniques can be reformulated as a maximum likelihood optimization problem. The objective function of such an optimization (the likelihood) can then be evaluated using the standard Spark map/reduce functions and the results fed back to one of the robust optimization routines available in the NAG library.
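The second scenario can be sketched in plain Python. Everything here is illustrative rather than NAG or Spark API: plain `map`/`reduce` calls stand in for `rdd.map(...).reduce(...)`, the data and the fixed-variance normal model are made up, and a tiny ternary search stands in for a robust optimization routine.

```python
# Sketch of scenario 2: a statistical fit reformulated as a maximum-likelihood
# optimization, with the likelihood evaluated by map/reduce.
from functools import reduce

data = [1.2, 0.7, 2.1, 1.5, 0.9, 1.8]   # pretend this is distributed
sigma = 1.0                              # assume the variance is known

def neg_log_likelihood(mu):
    # Each worker would compute its partition's squared residuals (map)...
    sq = map(lambda x: (x - mu) ** 2, data)
    # ...and the partial sums would be combined on the driver (reduce)
    total = reduce(lambda a, b: a + b, sq)
    return total / (2.0 * sigma ** 2)    # additive constants dropped

def ternary_min(f, lo, hi, iters=200):
    # Tiny derivative-free minimizer standing in for a robust optimizer
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

mu_hat = ternary_min(neg_log_likelihood, 0.0, 3.0)
# For this model the MLE coincides with the sample mean
```

On Spark only the data handling changes: `data` becomes an RDD and the map/reduce calls become RDD methods, while the optimizer itself runs on the driver.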

For more information on using the NAG Library in Spark, or any of the other topics touched upon in this article, please contact NAG.

Thursday, 22 January 2015

Adding a Slider Widget to Implied Volatility

In the last post on Implied Volatility, we downloaded real options data from the CBOE and calculated the volatility curves/surface. We saw 30,000 implied volatilities calculated in roughly 10 seconds.

In this post we concentrate on the speed of calculating implied volatility via a variety of different methods. We look at the volatility curve/surface using Python's Scipy, the NAG Library for Python, and the NAG C Library. In addition, we've added a slider widget to the Python graphs from before to see the real-time effects of changing the interest and dividend rates (see the video below). All the code can be downloaded to produce the graphs, and a NAG license is not required for the case using scipy.optimize.fsolve.
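For readers who want to try the slider idea themselves, here is a minimal sketch using matplotlib's Slider widget. It is not the article's implied-volatility code: the toy discounted-payoff curve, the variable names, and the slider range are all illustrative assumptions; only the widget wiring pattern carries over.

```python
# Sketch: wiring a matplotlib Slider so a curve redraws when the interest
# rate changes. The plotted quantity (a toy discounted put payoff) is
# illustrative only.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

x = np.linspace(80, 120, 200)            # strikes
fig, ax = plt.subplots()
plt.subplots_adjust(bottom=0.25)         # leave room for the slider
line, = ax.plot(x, np.exp(-0.05) * np.maximum(100 - x, 0))

# Slider axes: [left, bottom, width, height] in figure coordinates
ax_r = fig.add_axes([0.2, 0.1, 0.6, 0.03])
s_r = Slider(ax_r, "rate", 0.0, 0.10, valinit=0.05)

def update(val):
    # Recompute the curve with the new rate and redraw
    line.set_ydata(np.exp(-s_r.val) * np.maximum(100 - x, 0))
    fig.canvas.draw_idle()

s_r.on_changed(update)
```

Calling `s_r.set_val(...)` (or dragging the slider interactively) triggers `update`, which is the same mechanism used for the interest and dividend rate sliders in the video.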

The script and utility methods can be downloaded from here. The script begins by generating sample option prices. These are fed through different root finding methods (chosen by the user) to back out the implied volatilities. 

The methods tested include:
1) scipy (scipy.optimize.fsolve) - A wrapper around the hybrd and hybrj algorithms in the MINPACK Fortran library, followed by a Python Black-Scholes formula.
2) nag4py (The NAG Library for Python) - A wrapper (nag_zero_cont_funct_brent) around Brent's method for the root-finding, followed by nag_bsm_price for the Black-Scholes formula.
3) ctypes (The NAG C Library, Mark 23) - The same NAG functions as (2), but the looping and calculations are done directly in C, rather than through the nag4py wrapper layer. This requires building a shared library on your own machine, which is then loaded from the main script.
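The core calculation shared by all three methods can be sketched in pure Python. Simple bisection stands in here for the Brent (nag4py) and hybrd (scipy) root-finders named above, and all parameter values are illustrative; the point is only that backing out an implied volatility is a one-dimensional root-find on the Black-Scholes formula.

```python
# Sketch: backing out implied volatility by root-finding on a pure-Python
# Black-Scholes formula. Bisection stands in for Brent's method / hybrd.
import math

def bs_call(S, K, T, r, sigma):
    # Black-Scholes price of a European call (no dividends)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, lo=1e-4, hi=5.0, tol=1e-10):
    # The call price is monotone increasing in sigma, so bisection is safe
    f = lambda sigma: bs_call(S, K, T, r, sigma) - price
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round-trip check: price an option at sigma = 0.2, then recover it
price = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
vol = implied_vol(price, 100.0, 100.0, 1.0, 0.05)
```

The three methods in the script differ only in which root-finder replaces the bisection loop and whether the loop over strikes/expirations runs in Python or in C.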

Running the Script

You can run the script using the following command, where a_method is one of {scipy, nag4py, ctypes}:
$ python --method a_method

Note that option (3) (ctypes) requires you to build a shared library. The build command for Windows, Linux and Mac can be found below. NAGCDIR should be the directory of your NAG C Library installation. The C code (nag_imp_vol.c) is included in the download.

    Linux:
    gcc -Wl,--no-undefined -fPIC -shared nag_imp_vol.c -INAGCDIR/include NAGCDIR/lib/libnagc_nag.so -o nag_imp_vol.so

    Mac:
    gcc -fPIC -shared nag_imp_vol.c -INAGCDIR/include NAGCDIR/lib/libnagc_nag.dylib -o nag_imp_vol.dylib

    Windows:
    cl /LD /DLL /MD -I"NAGCDIR\include" nag_imp_vol.c /link /LIBPATH:"NAGCDIR\lib" "NAGCDIR\lib\nagc_nag_MD.lib"
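Once built, the shared library is loaded from Python with ctypes. As a minimal sketch of that pattern, the example below loads the system math library and declares a C prototype; the NAG shared library built above would be loaded the same way (e.g. `ctypes.CDLL("./nag_imp_vol.so")`, or `.dylib`/`.dll` on Mac/Windows), with argtypes/restype matching its own C functions. `sqrt` here is just a stand-in.

```python
# Sketch: loading a shared library with ctypes and declaring a function's
# C signature. libm's sqrt stands in for the functions in nag_imp_vol.c.
import ctypes
import ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library("m"))

# ctypes assumes int arguments/returns unless told otherwise, so declare
# the C prototype double sqrt(double) explicitly:
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

root = libm.sqrt(2.0)
```

Declaring `argtypes`/`restype` is the step that usually bites people: without it, passing a Python float to a C `double` parameter silently corrupts the call.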

Running the script with --method ctypes produces the following result:

Timing the methods

We've added a timer around the first call when calculating the implied volatilities. You can also change n_strikes in the script to alter the total number of calculations. The base case uses 50 strikes, 5 expirations, and 2 option types (Call and Put) for a total of N = 50 * 5 * 2 = 500 implied volatilities.

The approximate times in seconds of each method are displayed below. (N is the total number of implied volatilities calculated.)

N      scipy   nag4py   NAG C Library
500    1.40    0.26     0.008
1000   2.84    0.50     0.016
2000   5.74    1.00     0.020
4000   11.40   1.99     0.033
Some notes:
  • While scipy.fsolve looks considerably slower than nag4py, this is not a fair comparison of the root-finders themselves. The difference in speed comes from calling a pure-Python Black-Scholes formula vs. using nag4py's nag_bsm_price function.
  • The times for scipy and nag4py scale roughly linearly, while the NAG C shared library's do not. I suspect this is due to the overhead of preparing the data in numpy arrays before calling the shared library.
  • If you encounter dependency issues with Scipy or Matplotlib, we recommend switching to a Python distribution such as Anaconda or Canopy.
  • The code uses serial implementations in NAG and Python. You could look at a more advanced version that uses multi-threading or call a function that solves a system of equations (scipy.optimize.root or nag_zero_nonlin_eqns_easy).
  • The above analysis is a very oversimplified model of implied volatility. In practice, one should use a more complex model and look at other root-finding methods (such as the rational approximation).
  • You can also increase the number of calculations past 4000, but Python seems to have trouble plotting and updating the graphs.