Types in The NAG Toolbox for Matlab®

I was recently asked why the NAG Toolbox for Matlab requires users to be explicit about the types of parameters passed to NAG routines, for example:

[vr, info] = f08nf(int64(1), int64(4), vr, tau);

and similarly why output parameters like info in this example have class int64 as opposed to double.

The original reason for this was efficiency. The NAG Toolbox consists of wrappers to NAG Library routines and there are always issues in transforming data into the correct format when calling between different languages. As well as the top-level call (like the one shown here), we also have to transform data when calling Matlab functions in "call backs", for example when evaluating the objective function during the solution of an optimisation problem. In such cases there may be many thousands of calls and so we want to make the data transformation as efficient as possible. It was for this reason that we decided to put the onus on the user to do any coercions and checks on the validity of those coercions. When we beta tested our original prototype the feedback we got was that this wasn't a big deal.

Back then, all the integers used in the NAG Toolbox were 32 bit, even on 64 bit machines. This was because the version of LAPACK that Matlab used was 32 bit and we needed to be compatible with that since we were calling those routines ourselves. In Matlab, 32 bit integers behave quite nicely: you can perform arithmetic on them and use them as the bounds in loops so users were unlikely to notice the types of objects that we returned. However when Matlab switched to 64 bit integer versions of LAPACK we had to follow, and that's when things started to get harder. Back then it wasn't possible to do arithmetic on 64 bit integers in Matlab or to loop over them, and so programs to call our routines and their associated call-backs started to look quite clumsy with lots of casts between int64 and double. At this point we started to consider whether to accept and return doubles and do the coercions transparently inside our wrappers.

Unfortunately there is a problem with this approach, in that not all 64 bit integers can be represented as doubles. For example consider the following Matlab session:

>> x=2^53
x =
>> x+1-x
ans =
>> n=int64(2)^53
n =
>> n+1-x
ans =

Since there are only 53 bits available to represent the mantissa in a double precision number, once we get beyond 253 we can't represent every integer exactly.  For example in this case we can represent x and x+2, but not x+1.  In many cases integers returned by NAG routines will never be big enough for this to be a problem but there are places where we cannot be sure of this. The good news is that recent versions of Matlab support arithmetic on 64 bit integers although you still cannot loop over them, so the problem has been alleviated but hasn't quite gone away.

There is another issue with having versions of the Toolbox that accept different kinds of integers, which is that it's harder to write portable code. In the next version of the Toolbox we will provide two functions nag_int and nag_int_name to allow users to perform operations like:

[vr, info] = f08nf(nag_int(1), nag_int(4), vr, tau);
a = zeros(100, nag_int_name);

which will work correctly on both 32- and 64-bit toolboxes.

As explained above, we can't always coerce a 64 bit integer to a double so we have no option but to pass integers to call-backs and return integers from calls to NAG routines. However we could give users the option of passing doubles in the place of integer parameters to NAG routines, and returning doubles from call-back functions. With this approach the user could avoid explicit coercions (except when using a 64 bit integer as the end point of a loop) if they wished, but they would incur a performance penalty, which would be noticeable if arrays were involved (both in terms of execution speed and memory usage). While we aren't planning this for the release later this year we would consider it for a future version if users wanted it. I'd love to hear your thoughts.


Popular posts from this blog

Implied Volatility using Python's Pandas Library

C++ wrappers for the NAG C Library

ParaView, VTK files and endianness