Thursday, 21 May 2015

Introducing the team: Gail Austin, Customer Support Coordinator, Team Leader

Gail, describe your role at NAG?

I manage the Technical Support Service, the Sales Operations Team, NAG Oxford Reception and I am website editor for the marketing area of the website.

Can you give examples of the customer problems you help solve via NAG Technical Support?

Rather than solve users technical issues my role is connected to the direction of incoming support queries and the issuing and renewal of software licences. Over time I have come to understand the installation of our software and so can help users with some of their licensing and software applicability issues.

Tell us something special/unusual about NAG?

I believe that NAG truly cares about giving our customers the best possible service.  This comes across again and again in the positive feedback we receive following the closure of technical support queries.

Tell us something special/unusual about yourself? 

I’m just a little bit proud of the fact that when helping Mick Pont with some tests he was carrying out for a presentation of NAG’s Random Number Generator, I was found to be more random!

Are you attending any exhibitions or conferences this year?

I’m not, I’m too busy in the office, helping our customers to get the best from NAG.

What’s been the highlight of your time working at NAG so far?

That’s a difficult one, I don’t think I can put my finger on any one thing.  I think it’s enough to say that I hope to stay at NAG for the remainder of my career, I love my job here and I consider myself really lucky to have finally found my niche, I’m as at home here as I am ……… at home :-)

Wednesday, 29 April 2015

Professor Mike Powell F.R.S

We are pained to pass on the sad news of the death on 19th April of Prof Mike Powell F.R.S, a brilliant numerical analyst specialising in numerical optimisation and approximation theory.  

Mike touched many lives with his work. NAG and NAG's community benefitted greatly because he gave freely his code to NAG from the very beginning. The more mature may remember the routine E04DCF, an implementation of his hybrid method for unconstrained optimisation. More recently his work on derivative-free optimisation led to E04JCF/e04jcc entering our Libraries as implementations of BOBYQA.

As a founder member of the IMA, Mike took an interest in the IMANA Newsletter and encouraged NAG to contribute regularly to that until the Newsletter's role was subsumed by the internet.  

The Wikipedia page http://en.wikipedia.org/wiki/Michael_J._D._Powell gives more detail of Mike's many achievements.
 
He will be greatly missed by NAG, especially past and present colleagues who had the pleasure of knowing him personally. We pass on our love, condolences and very best wishes to his family.

 

Wednesday, 15 April 2015

Introducing the team: Craig Lucas, Senior Technical Consultant


Craig, describe your role at NAG?

I am a Senior Technical Consultant based in NAG’s Manchester Office. Here I manage part of NAG’s High Performance Computing group. I come from a numerical linear algebra background. For my MSc dissertation I looked at nearest correlation matrix problems, something I still work on today, and this year I have an MSc student looking at more new algorithms.  Another big interest of mine is shared memory parallelism with OpenMP, and in particular helping users to better exploit it.

Can you give examples of the customer problems you help solve via NAG Technical Support?

Technical Support is a great way to interact with our users. Queries can range from something very simple, helping people link their program to the NAG Library for the first time, to an in-depth discussion about the accuracy of an optimization routine. It is often challenging and always an opportunity to learn something new about the mathematics of the Library. Sometimes I need to read our documentation to help me get standard!

Tell us something special/unusual about NAG?

Working at NAG is very collaborative, whether that is working with universities or with other NAG colleagues. One day I might be listening to a Manchester student talking about their new research, another I might be at the whiteboard in the office discussing the nuances of a new NAG algorithm.

Tell us something special/unusual about yourself?

I came to Mathematics relatively late and was 27 when I went to university. I had somehow drifted into Civil Engineering as a teenager, working in the area of land reclamation. Much of my work was dealing with closed coal mines; this was Staffordshire in the late 1980’s. Perhaps I had some sort of epiphany, I certainly realised that a childhood interest needed some proper exploration.  I will always be grateful to Graham Bowtell at City University who decided to take a risk on a mature student without any proper mathematics qualifications.

Are you attending any exhibitions or conferences this year?

My next conference is The University of Manchester SIAM Student Chapter Conference 2015. This annual event is another great opportunity to hear what young researchers are working on. I co-judge the Best talk and Best Poster prizes. It is a nice way to give back to a community that NAG really relies on.

Thursday, 26 March 2015

Introducing the team: Mick Pont, NAG Principal Technical Consultant


Over the next few months, scattered in between our technical blog posts, we are going to publish interviews with NAG colleagues. Our first interview is with Mick Pont.

Mick, what is your role at NAG? 

I'm a Principal Technical Consultant, and Deputy Manager of the Development Division.

I'm involved in the development and peer review of new NAG Library software and documentation, and in the scheduling of software production in line with company targets.

In "project management" speak, I'm named as "Executive" for the NAG Mark 26 Library (the NAG Library, Mark 25 is due for launch in April 2015) project, which means that I'm supposed to "provide support and advice to the project manager, and ensure that the project outcome is good and provides NAG with increased value".

I also produce the NAG Technical Support rotas (most of the NAG development team are on one or other of these rotas regularly). Handling support calls is a good way to get to know our customers, understand the issues they face and help solve them.

I build and test most of the Windows-based libraries that NAG produces, but I'm an operating system agnostic - I like to use Linux and other systems too.

I often go on customer site visits with my sales colleagues, as technical backup.

I also teach various training courses for NAG software, including the NAG Toolbox for MATLAB® and the NAG Library; these training days usually take place at customer sites.

Can you give examples of the customer problems you help solve when you’re in your support role?

Helping users get their NAG software and licence keys installed correctly; helping them migrate from one version of the NAG Library to another; advising on the best NAG routine to solve a particular problem.

Tell us something special/unusual about life at NAG?

We try to celebrate all of the world's festivals where appropriate. Icelandic cream bun day is a big favourite of mine.

Tell us something special/unusual about yourself?

I'm a solipsist, and I choose not to believe in this question.

What industry events are you going to be attending this year?

Lots of training days at customer sites around the UK, and later in the year I will be attending SC15 in Austin, Texas.

Thursday, 5 March 2015

Optimization, NAG and Me - 5 Years and Counting

Authored by Jan Fiala, NAG Numerical Software Developer

It feels almost like yesterday since I joined NAG so it is hard to believe that it has been 5 years already. Looking back, I see a long path I would like to tell you a bit about, just in case you were curious about one of the people answering your support queries.

I was always a blend of a computer scientist and a mathematician. Computers and programming were my hobby but it did not feel quite right to choose either as the main subject at university so I picked mathematics. Thereafter began my hunt to get as close to computers as possible. Numerical analysis and mathematical programming/optimization were my lucky answer.

I really enjoyed my university years, however an academic career is not for everyone. The necessity to write papers ("publish or perish"), going through a couple of postdocs before getting a tenure, fund raising, teaching and going ever deeper and deeper into a very narrow research field are some of the disadvantages. I love to learn, play and to hack things. To make them work. So, when I got the opportunity, I joined NAG.

NAG is a great place to work if you find scientific computing interesting. There is a very long history, in fact, the first NAG Library was distributed to the users even before I was born! NAG today is not confined to numerical software (NAG Libraries). The organisation has expanded to numerical and HPC services, compilers and more. However my job still consists mainly of work on the Libraries, developing new optimization solvers and maintaining the existing ones.

The best thing is the challenge -- you need get your hands dirty in various fields: from specialized solvers for derivative free optimization, through classical convex optimization to nonlinear semidefinite programming. I went through small pet projects (have you read my AMPL tutorial) through bespoke versions of our solvers tailored to special customers' data to a 50000 lines beast, a new semidefinite programming solver which will be released soon.

In addition, you mustn't forget to help the customers via our friendly support desk, write documentation and examples and keep an eye on the up-to-date algorithmic developments. I can still attend conferences as before, only the focus has changed from an academic to a more pragmatic approach. The relative freedom of the focus is almost limitless, as my ideas contribute to the development roadmap and thence to the future development. I won't lie to you. The time was not all green and great -- writing test programs or bug fixes can be tedious but they have to be done. At least it makes you look forward to the next new challenge which is waiting just around the corner.

If my story sounds familiar, you might be pleased to know that NAG is expanding our optimization team. So ladies and gentlemen, why are you waiting? Apply now and help us make the next big project happen.

Wednesday, 25 February 2015

Advanced Analytics on Apache Spark

Developed in AMPLab at UC Berkeley, Apache Spark has become an increasingly popular platform to perform large scale analysis on Big Data. With run-times up to 100x faster than MapReduce, Spark is well suited for machine learning applications.

Spark is written in Scala but has APIs for Java and Python. As the NAG Library is accessible from both Java and Python, this allows Spark users access to over 1600 high quality mathematical routines. The NAG Library covers areas such as:
  • Machine Learning including
    • Linear regression (with constraints)
    • Logistic regression (with constraints)
    • Principal Component Analysis (A good article relating Machine Learning and PCA can be found here)
    • Hierarchical cluster analysis
    • K-means
  • Statistics including
    • Summary information (mean, variance, etc)
    • Correlation
    • Probabilities and deviates for normal, student-t, chi-squared, beta, and many more distributions
    • Random number generation
    • Quantiles
  • Optimization including
    • Linear, nonlinear, quadratic, and sum of squares for the objective function
    • Constraints can be simple bounds, linear, or even nonlinear
  • Matrix functions
    • Inversion
    • Nearest correlation
    • Eigenvalues + eigenvectors

Calling the NAG Library on Spark

The fundamental datatype used in Spark is the Resilient Distributed Dataset (RDD). A RDD acts as a pointer to your distributed data on the filesystem. This object has intuitive methods (count, sample, filter, map/reduce, etc) and lazy evaluation that allow for fast and easy manipulation of distributed data.

Below is a simple Python example of using the NAG Library in Spark to calculate the cumulative Normal distribution function on a set of numbers (the message passing output from Spark has been omitted):

SparkContext available as sc
>>>  from nag4py.s import nag_cumul_normal
>>>  myNumbers = sc.parallelize( [-2.0, -1.0, 0.0, 1.0, 2.0] )
>>>  myNumbers.takeSample(False, 5, 0)
[ 0.0, -2.0, -1.0, 1.0, 2.0] 

>>>  myNumbers.map( nag_cumul_normal ).takeSample(False, 5, 0)
[0.5, 0.02275, 0.15865, 0.84134, .97725]


It should be noted that the vast majority of the algorithms employed in the NAG library require all relevant data to be held in memory. This may seem to deviate from the Spark ecosystem, however when working with large datasets, two usage scenarios are commonly seen:
  1. The full dataset is split into subsets, for example a dataset covering the whole world may be split by country, county and city and an independent analysis carried out on each subset. In such cases all the relevant data for a given analysis may be held on a single node and therefore can be processed directly by NAG library routines.
  2. A single analysis is required that utilizes the full dataset. In this case it is sometimes possible to reformulate the problem. For example many statistical techniques can be reformulated as a maximum likelihood optimization problem. The objective function of such an optimization (the likelihood) can then be evaluated using the standard Spark map/reduce functions and the results fed back to one of the robust optimization routines available in the NAG library.

For more information on using the NAG Library in Spark or any of the other topics touched upon in this article please contact NAG at support@nag.com.

Thursday, 22 January 2015

Adding a Slider Widget to Implied Volatility

In the last post on Implied Volatility, we downloaded real options data from the CBOE and calculated the volatility curves/surface. We saw the calculations of 30,000 implied volatilities in roughly 10 seconds. 

In this post we concentrate on the speed of calculating implied volatility via a variety of different methods. We look at the volatility curve/surface using Python's Scipy, the NAG Library for Python, and the NAG C Library. In addition, we've added a slider widget to the Python graphs from before to see the real-time effects of changing the interest and dividend rates (see the video below). All the code can be downloaded to produce the graphs, and a NAG license is not required for the case using scipy.optimize.fsolve.

The script and utility methods can be downloaded from here. The script begins by generating sample option prices. These are fed through different root finding methods (chosen by the user) to back out the implied volatilities. 

The methods tested include:
1) scipy (scipy.optimize.fsolve) - A wrapper around the hybrd and hybrj algorithms in the MINPACK Fortran library, followed by a Python Black-Scholes formula.
2) nag4py (The NAG Library for Python) - A wrapper (nag_zero_cont_funct_brent) around Brent's method for the root-finding, followed by nag_bsm_price for the Black-Scholes formula.
3) ctypes (The NAG C Library, Mark 23) - The same NAG functions as (2), but the looping and calculations are done directly in C, rather than through the nag4py wrapper layer. This requires building a shared library on your own machine, which is then loaded from the main script.

Running the Script

You can run the script using the following command, where a_method is one of {scipy, nag4py, ctypes}:
$ python implied_volatility.py --method a_method

Note that option (3) (ctypes) requires you to build a shared library. The build command for Windows, Linux and Mac can be found below. NAGCDIR should be the directory of your NAG C Library installation. The C code (nag_imp_vol.c) is included in the download.

Linux
    gcc -Wl,--no-undefined -fPIC -shared nag_imp_vol.c -INAGCDIR/include NAGCDIR/lib/libnagc_nag.so -o nag_imp_vol.so

Mac
    gcc -fPIC -shared nag_imp_vol.c -INAGCDIR/include NAGCDIR/lib/libnagc_nag.dylib -o nag_imp_vol.dylib

Windows
    cl /LD /DLL /MD -I"NAGCDIR\include" nag_imp_vol.c /link /LIBPATH:"NAGCDIR\lib" "NAGCDIR\lib\nagc_nag_MD.lib"

Running the script with --method ctypes produces the following result:



Timing the methods

We've added a timer around the first call when calculating the implied volatilities. You can also change n_strikes in implied_volatility.py to alter the total number of calculations. The base case uses 50 strikes, 5 expirations, and 2 option types (Call and Put) for a total of N = 50 * 5 * 2 = 500 implied volatilities.

The approximate times in seconds of each method are displayed below. (N is the total number of implied volatilities calculated.)






N scipy nag4py NAG C Library
500 1.40 0.26 0.008
1000 2.84 0.50 0.016
2000 5.74 1.00 0.020
4000 11.40 1.99 0.033
Some notes:
  • While scipy.fsolve looks considerable slower than nag4py, this is not the case. The differences in speed are a result of calling a pure python Black-Scholes formula vs. using nag4py's nag_bsm_price function.
  • The times for fsolve and nag4py scale somewhat linearly, while the NAG C shared library doesn't. I suspect this is due to the overhead of preparing the data into numpy arrays before calling the shared library.
  • If you encounter dependencies issues with Scipy or Matplotlib, we recommend switching to a Python distribution such as Anaconda or Canopy.
  • The code uses serial implementations in NAG and Python. You could look at a more advanced version that uses multi-threading or call a function that solves a system of equations (scipy.optimize.root or nag_zero_nonlin_eqns_easy).
  • The above analysis is a very oversimplified model of implied volatility. In practice, one should use a more complex model and look at other root-finding methods (such as the rational approximation).
  • You can also increase the number of calculations past 4000, but Python seems to have trouble plotting and updating the graphs.

Thursday, 2 October 2014

Problem needed for research on Bermudan Option Pricing Algorithms

Introduction

NAG together with Prof. Oosterlee and an MSc student from TU Delft are investigating the recent Stochastic Grid Bundling Method (SGBM) [1,2]. The objective is to compare the performance of SGBM to the well-known Longstaff-Schwartz (least squares method or LSM) in a non-academic setting, i.e. on the pricing of a Bermudan option, with underlying asset(s) driven by a realistic process such as Heston or LMM. We are looking for an interesting case to test these two methods. This includes the type of option, the underlying processes and any other important features or details.

Outline

The well known LSM by Longstaff-Schwartz[3] is the industry standard for pricing multi-dimensional Bermudan options by simulation and regression. LSM is based on the regression now principle, whereas the Stochastic Grid Bundling Method (SGBM) by Jain and Oosterlee applies regression later in order to get more accurate approximations. However, this limits us to apply SGBM to processes where an analytical or approximate expression of the discounted moments are available.

Another advantage of SGBM is its regression on bundles instead of the whole data set in a further attempt to decrease the regression error, and SGBM allows the computation of the Greeks at almost no extra cost.

Numerical results show that, compared to LSM, a higher accuracy can be obtained at comparable computational time. However, these tests were performed on academic problems, using geometric Brownian motion for the underlying assets.

Very recent research (Feng and Oosterlee, Sept 2014) extended SGBM for stochastic volatility and stochastic interest rate dynamics (Heston-Hull-White). This led to the possibility of comparing the performance of SGBM in a non-academic setting and is the cause for our research. We therefore wish to compare LSM and SGBM on a problem which is interesting and relevant to industry. Realistically, handling all the complexities of traded products will probably require more time than we have (around two months), so ideally we seek a problem which captures all the salient features and allowing us to see whether SGBM outperforms LSM.

Industry Involvement

We would like some advice in defining the product to be valued, especially regarding dimensionality, process, correlation structure, payoff and exercise features. Any additional industry involvement will be light: perhaps a few emails to clarify details, a conf call or face to face meeting.

For correspondence or further details please email support@nag.co.uk

References

[1] S. Jain and C. W. Oosterlee, "The Stochastic Grid Bundling Method: Efficient Pricing of Bermudan Options and their Greeks,'' papers, SSRN, Sept. 2013.
[2] S. Jain and C. W. Oosterlee, "Pricing high-dimensional Bermudan options using the stochastic grid method,'' International Journal of Computer Mathematics, 89(9):1186-1211, 2012.
[3] F. Longstaff and E. Schwartz,  "Valuing American Options by Simulation: a Simple Least-squares Approach,''  Review of Financial Studies, vol. 14, no. 1, pp. 113-147, 2001.

Thursday, 21 August 2014

Gaussian Mixture Model

With the release of Mark 24 of the NAG C Library comes a plethora of new functionality including matrix functions, pricing Heston options w/term structure, best subset selection, and element-wise weightings for the nearest correlation matrix.

Among the new routines I was excited to test out was the Gaussian mixture model (g03ga). This routine will take a set of data points and fit a mixture of Gaussians for a given (co)variance structure by maximizing the log-likelihood function. The user inputs the (co)variance structure, number of groups, and (optionally) the initial membership probabilities.

I decided to test out this new functionality, which is also in Mark 24 of the NAG Toolbox for MATLAB. Often I will use MATLAB with the NAG Toolbox before switching to C++ and the NAG C Library for my production code. So I generated some data and tried the routine to see if it could find the covariance structure. You can download the script and try it out for yourself here. The example will generate the test data, run the NAG Gaussian mixture routine and plot the results. An example of the output is given below:


The blue points are the generated data, while the red and yellow ovals show the covariance structure output from NAG Gaussian mixture model (the ovals are contours of ~0.60 density for their respective groups).

While running the example a couple times and re-sampling through the starting values for the initial membership probabilities, I noticed what I thought to be unusual behavior for the routine. Namely, the Gaussian mixture model algorithm isn't able to identify the Gaussian mixtures. The function would occasionally converge to the below structure (run the above script and click 'Resample' 3 times):


It appears the routines has converged to local extrema of the likelihood function. This happens as a result of randomizing the initial membership probabilities.

Since we have the power of the NAG Library at our disposal, I've added a K-means clustering option in the above script to initialize the membership probabilities to a particular cluster before being input into the Gaussian mixture model.


My colleagues tell me that k-means can also get stuck in a local minima and exhibit this 'wrong' behavior as well, thus one should always be careful with initial allocations - luckily the NAG Library provides a generally acceptable default allocation as an option! Many thanks to Martyn Byng and Stephen Langdell for comments on this post.

Tuesday, 17 June 2014

Secrets of HPC Procurement

Liked my article today in HPC Wire "Secrets of the Supercomputers"? I firmly poke fun at various elements of an imaginary supercomputer procurement process. However, I'm sure many readers will also see familiar and painfully serious aspects in the fictional story.

As I mention at the bottom of that article, NAG can help make the process of buying HPC systems much better than the worrying case in the article.

For example, the tutorial I ran at SC13 with Terry Hewitt titled "Effective Procurement of HPC Systems" was very popular (~100 attendees). We have provided similar training as part of consulting engagements and we are now looking at running the tutorial again as an open short course.

Thursday, 10 April 2014

Testing Matrix Function Algorithms Using Identities

Edvin Deadman and Nick Higham (University of Manchester) write:


In a previous blog post we explained how testing new algorithms is difficult. We discussed the forward error (how far from the actual solution are we?) and the backward error (what problem have we actually solved?) and how we'd like the backward error to be close to the unit roundoff, u.

For matrix functions, we also mentioned the idea of using identities such as sin2A + cos2A = I to test algorithms. In practice, rather than I, we might find that we obtain a matrix R close to I, perhaps with ||R-I|| ≈ 10-13. What does this tell us about how the algorithms for sin A and cos A are performing? In particular, does it tell us anything about the backward errors? We've just written a paper which aims to answer these questions. This work is an output of NAG's Knowledge Transfer Partnership with the University of Manchester, so we thought we'd blog about it here.

Let's consider the identity exp(log A) - A = 0. Suppose that when we evaluate the left-hand side in floating point arithmetic we get a nonzero residual R rather than 0. We'll assume that this residual is  caused by some backward errors E1 and E2 so that exp(log(A + E1) + E2) = R. We'd like to investigate how big R can be when E1 and E2 are small, so we expand the left-hand side in a Taylor series to linear order. After a bit of algebra, the result is a linear operator relating R to E1 and E2R = L(E1E2). The operator is different for each identity considered, but it always involves the Fréchet derivatives of the matrix functions in the identity (the full gory details, including formulae for the linear operators associated with various identities, are in our paper).

We now have two options. The first option is to estimate the norm of the linear operator. This enables us to determine the maximum value of ||R|| consistent with backward errors of size u. The second option is to use the linear operator to explicitly estimate E1 and E2. This works just as well, but it is more expensive.

Our paper contains several examples demonstrating our approach in practice, so we'll just pick one here, involving the matrix exponential. The 10 x 10 Forsythe matrix is


where we set the parameter u =  2-53 ≈ 10-16 (the unit roundoff). It's known that the Schur-Parlett general-purpose matrix function algorithm struggles with this matrix, whereas algorithms based on scaling and squaring have no such problems. Does our approach reflect this behaviour? Using the identity ee-A = I, the maximum normwise residual consistent with backward stability is found to be 7.1 x 10-15. The Schur-Parlett approach gives a normwise residual of 7.7 x 10-11, but the scaling and squaring algorithm gives a residual of 2.2 x 10-15. Consistent with this, the backward error estimates are  2.5 x 10-11 and 4.5 x 10-16 respectively. So our approach has correctly identified the preferred algorithm.

There are some subleties which we haven't mentioned (for example certain matrices and identities can produce misleading cancellation effects) but we now have a useful method for testing matrix function algorithms when high-precision arithmetic isn't available. The relevant routines for computing matrix functions and Fréchet derivatives are available in chapter F01 of the NAG Library.