Friday, 20 August 2010

What's in a Name?

Newcomers to NAG often ask about the names of NAG routines. Certainly the names appear strange and unfamiliar at first, but to many NAG users the name encapsulates the heritage of quality and precision offered by NAG.

Originally the NAG Libraries were created by contributors from U.K. academia. Some of these wrote in Algol 60 and others in Fortran. It was agreed to have two libraries of identical content, so that every contribution in one language had a counterpart in the other. Thus both communities were appeased. Early printed documentation included both language versions.

It was clearly desirable to be able to distinguish one language version of an algorithm from the same algorithm implemented in the alternate language and in general this is still desirable if a single program needs to call both instances of the algorithm.

Equally the early developers saw that, as new algorithms were developed, replacement routines would be required with perhaps different calling sequences, so that any naming scheme chosen needed to accommodate the ongoing development of the libraries. Another desirable feature would be to naturally group by name routines that lay in the same field of numerical computation.

With these goals in mind the founders of NAG considered two lines of approach.

The first idea looked at the concept of 'meaningful' names, names which might convey the algorithm being used or the problem addressed. Even today this is difficult, but in the early 1970s standard Fortran did not permit more than 6 characters for any routine name. And NAG certainly did not want to have Fortran and Algol names so radically different from each other that the algorithmic connection was obscured. The problem of updating also needed to be solved.

The alternative looked much more appealing. At that time a modified SHARE classification system had been published and NAG decided to base its routine names on that. This gave rise to the names we now see. Each name is 6 characters long, the final letter denoting the language of implementation. In those days 'F' denoted Fortran and 'A' denoted Algol 60. Subsequently this would be extended as NAG produced both single and double precision libraries in Fortran and as NAG produced other language versions such as C. The first characters, generally the first 3, indicated the broad subject ( or chapter) area to which the algorithm belonged. (Characters in positions 2 and 3 are always digits.) Thus E04WDF is a routine in the FORTRAN Library that lies in the E04 chapter. This chapter deals with 'Minimizing or Maximizing a Function'. The letters 'WD' uniquely identify the algorithm.

Of course modern Fortran no longer has a restriction of only 6 characters for routine names and so it is perhaps worthwhile to consider whether the NAG naming scheme can usefully take advantage of this relaxation.

The debate rages within NAG. Certainly the old problems remain if we wish to have 'meaningful' names, since routines will change with time. Do we describe the problem to be tackled or the algorithm used? If the former then NAG may wish to offer several techniques. In the C06 chapter we offer both Weeks' and Crump's method for Inverse Laplace transforms for example. Problem addressed is evidently not enough to uniquely identify an algorithm. Description of the routine by algorithm alone falls down heavily in the optimisation chapter where 'Sequential Quadratic Programming' and 'Quasi-Newton' and 'Modified-Newton' account for the vast majority of routines. We are thus led towards a combination of the two.

Looking in detail at the effect of combining problem and method into a meaningful name we rapidly encounter the very powerful and flexible routines of the NAG optimisation chapter. Here one routine often addresses several related, but different, problems and has both sparse and dense counterparts. Long names might indeed become very long if they are to convey any useful meaning and not deceive.

We see an early attempt by NAG to provide long, meaningful, names in its C Library. Users will be familiar with nag_opt_lin_lsq to denote the C routine e04ncc. This routine also solves convex qp problems, something the reader would not deduce from this name, yet this is probably its most important role.

Proponents of meaningful long names argue that the solution to this is to have separate routines with only a single purpose. Their opponents argue that it is easy for users to effectively choose their own names for routines, given the variety of options open to them: pre-processors, wrappers and aliasing for example.

What do you think? How many characters in a long name would you tolerate and how would you name NAG routines?

No comments:

Post a Comment

NAG moderates all replies and reserves the right to not publish posts that are deemed inappropriate.