Monday, 23 May 2011

Programming gnuplot

One of the side effects of having the word 'visualisation' in your job title is that people expect you to know something about the subject. Accordingly, every now and again I get contacted by one of my colleagues who needs to draw a graph, or reformat an image, or incorporate a visualisation into a document, and is - usually - too busy with other (probably more important) activities to spend a lot of time figuring out how to make a graphics package behave itself and do the right thing.


I received the latest of these queries last week, from a colleague who wanted to reproduce Figure 1 (below) as part of the documentation for the NAG Parallel Library. It's a schematic illustration of the way in which, for certain routines in the Library, elements of a matrix are distributed between compute processors. More specifically, this example shows the distribution of a two-dimensional matrix consisting of twelve rows and columns (each numbered from 1 to 12) across a two by three grid of processors.
The request was to reproduce the figure using gnuplot, a plotting program that can be used to create a wide variety of two- and three-dimensional plots of data, functions and approximations to data. It runs on a variety of platforms, and is freely distributed. Gnuplot's interface is command-driven - i.e. the user either enters commands one at a time at its interactive prompt, or puts the commands into a script, which the program then loads and runs. Several example scripts (and their associated output) are available at the gnuplot site - for example, these illustrate the range of the program's functionality for plotting analytical functions. We use the program a lot for the display of the results from the example programs which form an important part of NAG Library documentation; thus, Figure 2 is a plot of the output from the c06ab routine, which is used in the numerical summation of series.
Whilst producing this sort of picture in gnuplot is comparatively straightforward, and can be done with a handful of commands, the diagram in Figure 1 presents a few more challenges. However, the elementary components aren't hard to find: thus, gnuplot has a set object rectangle command which creates a rectangle, given the coordinates of its diagonal corners, a set arrow command which creates an arrow (or a line) between two endpoints, and a set label command which adds text at a specified location. So I started writing the script in this fashion:
# draw box.
set object 1 rect from 0,0 to 6,8.2 

# add label.
set label "{1,0}" at 3,4.1 center

# draw y boundaries.
set arrow 2 from 1.5,0 to 1.5,8.2 nohead
set arrow 3 from 3.0,0 to 3.0,8.2 nohead
set arrow 4 from 4.5,0 to 4.5,8.2 nohead
...
but I hadn't got very far before realising that hardwiring all of the coordinate values wasn't a good idea if - say - the size of the boxes, or the spacing between them, needed changing. And I was getting tired of keeping track of the running number of objects, and working out in my head the locations of the boundaries. Fortunately, I recalled that a computer is good at this sort of thing (the clue is in the name, after all). I also discovered that gnuplot allows user-defined variables to be declared, and supports mathematical operations on them. This made writing the script a lot easier, but there's a large degree of repetition in the code which - although straightforward to produce by cutting and pasting - looks inelegant and verbose. In a more fully-featured programming environment, this drawback would be overcome by using control statements such as loops or branches, or by using subprograms. However, these features aren't supported by gnuplot (or if they are, I couldn't find them - and the call of more important activities was becoming too strident to ignore), so we're left with a lengthy script, which is reproduced below in all its resplendence for the curious.

Buried within it (almost at the end) is one technical point that might be of wider interest: because gnuplot is, when all's said and done, a plotting package, each script must contain at least one plot command, otherwise nothing will appear. More specifically in our example, we've used the set command to create items to be displayed by specifying values for its object, arrow and label options, but these won't be drawn on the graph until the plot command is issued. This has a compulsory argument, but having specified everything to be displayed, we don't have anything else left to be explicitly plotted. Other users who have used the command in similar circumstances (for example, here and here) have used the expression 1/0 for this argument; gnuplot evaluates this to NaN, which is interpreted as an undefined point, and is quietly ignored by the plot command when it's displaying the other objects in the figure.
# gnuplot file for figure (data distribution from the 
# processors' point of view) from 'essential introduction 
# to nag parallel library'

# NB: delete '#EPS' or '#PNG' from the following set commands to 
# generate individual image files
#EPSset term postscript eps enhanced monochrome font "Times" 
#EPSset output "fdessint2_fig.ps"
#PNGset term png enhanced medium font "TIMES"
#PNGset output "fdessint2_fig.png"

# set values for total x and y dimensions.
xLength = 25
yLength = 25

# set values for overall offset at left and top, to make room 
# for labels.
xLabelOffset = 1.0
yLabelOffset = 1.0

# set values for x and y box sizes.
xBox = 6.0
yBox = 8.2

# set values for number of cells within each box.
xNumCell = 4
yNumCell = 6

# calcuate offset between each box.  There are 3 boxes along x, 
# and 2 along y.
xBoxOffset = (xLength - xLabelOffset - 3*xBox)/2.0
yBoxOffset = yLength - yLabelOffset - 2*yBox

# calculate spacing between each cell.
xCellSpace = xBox/xNumCell
yCellSpace = yBox/yNumCell

# Specify the extent of the plotting space.
set xrange [0:xLength]
set yrange [0:yLength] 

# specify the aspect ratio of the plot.
set size ratio 0.8

# turn off the key, the axis tics and the border around the plot.
unset key
unset xtics
unset ytics
unset border

# set the fill style (for the boxes).
set style fill empty border rgb "black"

# initialize counter for objects.
nobj = 0

############
# initialize location of box (bottom row). 
rx = xLabelOffset
ry = 0

# draw box.
nobj = nobj+1
set object nobj rect from rx,ry to rx+xBox,ry+yBox 

# add label.
set label "{1,0}" at rx+0.5*xBox,ry+0.5*yBox center

# initialize coordinates for cell boundaries. 
ax = rx
ay = ry

# draw y boundaries.
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead

# draw x boundaries.
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead

# draw next box.
rx = rx+xBox+xBoxOffset 
nobj = nobj+1
set object nobj rect from rx,ry to rx+xBox,ry+yBox 

# add label.
set label "{1,1}" at rx+0.5*xBox,ry+0.5*yBox center

# initialize coordinates for cell boundaries. 
ax = rx
ay = ry

# draw y boundaries.
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead

# draw x boundaries.
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead

# draw next box.
rx = rx+xBox+xBoxOffset 
nobj = nobj+1
set object nobj rect from rx,ry to rx+xBox,ry+yBox 

# add label.
set label "{1,2}" at rx+0.5*xBox,ry+0.5*yBox center

# initialize coordinates for cell boundaries. 
ax = rx
ay = ry

# draw y boundaries.
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead

# draw x boundaries.
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead

############
# initialize location of box (top row). 
rx = xLabelOffset
ry = yBox + yBoxOffset

# draw next box.
nobj = nobj+1
set object nobj rect from rx,ry to rx+xBox,ry+yBox 

# add label.
set label "{0,0}" at rx+0.5*xBox,ry+0.5*yBox center

# initialize coordinates for cell boundaries. 
ax = rx
ay = ry

# draw y boundaries.
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead

# draw x boundaries.
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead

# draw next box.
rx = rx+xBox+xBoxOffset 
nobj = nobj+1
set object nobj rect from rx,ry to rx+xBox,ry+yBox 

# add label.
set label "{0,1}" at rx+0.5*xBox,ry+0.5*yBox center

# initialize coordinates for cell boundaries. 
ax = rx
ay = ry

# draw y boundaries.
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead

# draw x boundaries.
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead

# draw next box.
rx = rx+xBox+xBoxOffset 
nobj = nobj+1
set object nobj rect from rx,ry to rx+xBox,ry+yBox 

# add label.
set label "{0,2}" at rx+0.5*xBox,ry+0.5*yBox center

# initialize coordinates for cell boundaries. 
ax = rx
ay = ry

# draw y boundaries.
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead
ax = ax+xCellSpace
nobj = nobj+1; set arrow nobj from ax,ry to ax,ry+yBox nohead

# draw x boundaries.
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead
ay = ay+yCellSpace
nobj = nobj+1; set arrow nobj from rx,ay to rx+xBox,ay nohead

############
# Now draw the arrows between boxes.  
# First, in y direction (between rows).
ax = xLabelOffset + 0.5*xBox

# shave off a bit of the arrow so it's not touching the box 
# (looks nicer).
eps = 0.1*yBoxOffset

nobj = nobj+1 
set arrow nobj from ax,yBox+eps to ax,yBox+yBoxOffset-eps heads
ax = ax + xBox + xBoxOffset 
nobj = nobj+1
set arrow nobj from ax,yBox+eps to ax,yBox+yBoxOffset-eps heads
ax = ax + xBox + xBoxOffset 
nobj = nobj+1
set arrow nobj from ax,yBox+eps to ax,yBox+yBoxOffset-eps heads

# Second, in x direction (between columns - bottom row).
ax = xLabelOffset + xBox
ay = 0.5*yBox
eps = 0.1*xBoxOffset 

nobj = nobj+1
set arrow nobj from ax+eps,ay to ax+xBoxOffset-eps,ay heads
ax = ax + xBox + xBoxOffset 
nobj = nobj+1
set arrow nobj from ax+eps,ay to ax+xBoxOffset-eps,ay heads

# Third, in x direction (between columns - top row).
ax = xLabelOffset + xBox
ay = ay + yBox + yBoxOffset
eps = 0.1*xBoxOffset 

nobj = nobj+1
set arrow nobj from ax+eps,ay to ax+xBoxOffset-eps,ay heads
ax = ax + xBox + xBoxOffset 
nobj = nobj+1
set arrow nobj from ax+eps,ay to ax+xBoxOffset-eps,ay heads

############
# Add labels.  First, left of bottom row.
lx = 0.5*xLabelOffset
ly = 0.5*yCellSpace
set label "12" at lx,ly center  
ly = ly+yCellSpace
set label "10" at lx,ly center
ly = ly+yCellSpace
set label "8" at lx,ly center
ly = ly+yCellSpace
set label "6" at lx,ly center
ly = ly+yCellSpace
set label "4" at lx,ly center
ly = ly+yCellSpace
set label "2" at lx,ly center

# Second, left of top row.
lx = 0.5*xLabelOffset
ly = yBox + yBoxOffset + 0.5*yCellSpace
set label "11" at lx,ly center
ly = ly+yCellSpace
set label "9" at lx,ly center
ly = ly+yCellSpace
set label "7" at lx,ly center
ly = ly+yCellSpace
set label "5" at lx,ly center
ly = ly+yCellSpace
set label "3" at lx,ly center
ly = ly+yCellSpace
set label "1" at lx,ly center

# Third, top of the three columns.
rx = xLabelOffset
ly = yLength - yLabelOffset + 0.5*yCellSpace
lx = rx + 0.5*xCellSpace
set label "1" at lx,ly center
lx = lx+xCellSpace
set label "4" at lx,ly center
lx = lx+xCellSpace
set label "7" at lx,ly center
lx = lx+xCellSpace
set label "10" at lx,ly center

rx = rx+xBox+xBoxOffset 
lx = rx + 0.5*xCellSpace
set label "2" at lx,ly center
lx = lx+xCellSpace
set label "5" at lx,ly center
lx = lx+xCellSpace
set label "8" at lx,ly center
lx = lx+xCellSpace
set label "11" at lx,ly center

rx = rx+xBox+xBoxOffset 
lx = rx + 0.5*xCellSpace
set label "3" at lx,ly center
lx = lx+xCellSpace
set label "6" at lx,ly center
lx = lx+xCellSpace
set label "9" at lx,ly center
lx = lx+xCellSpace
set label "12" at lx,ly center

############
# call plot (with a dummy argument) to display the objects 
# we've defined.
plot 1/0

pause -1

5 comments:

  1. A great technical post, thanks for sharing.

    GNUPlot is a great package and about half of my PhD thesis plots were generated using it.

    For that particular plot,however, I would have probably used Mathematica which I think would have been able to do the same thing with MUCH less code. Of course Mathematica is rather more expensive than GNUPlot :)

    Cheers,
    Mike

    ReplyDelete
  2. Nice article Jeremy - I hadn't realised you can use Gnuplot almost like a programming language.

    I'm curious as to why the spec was to do it in Gnuplot though - wouldn't it have been easier to use a real programming language with loops and subroutines, like PostScript?

    ReplyDelete
  3. Thanks, Mick - I too was surprised that gnuplot commands can be assembled into something that "almost" looks like a program. It's probably asking for too much (and would be getting too far away from its original design) to ask for the extra features - loops, branches, subprograms - that would make it more like a "real" programming language, but it's interesting to see how far you can get (e.g. by unrolling loops) just employing the features it has.

    I think the spec asked for gnuplot because it's used for most (all?) of the other figures in our documentation, and because it's reasonably straightforward to produce images in a variety of formats (e.g. PNG, as well as PostScript) from it. Plus, of course, there's a certain amount of gnuplot experience and expertise (to which, perhaps, this article is a tiny contribution) lying around here.

    ReplyDelete
  4. Many thanks for the helpful comment, Mike - my apologies for the delayed response. Yes, Mathematica (or MATLAB, or Java, or C, or Python, or any similarly-endowed programming environment) would indeed facilitate a much more compact representation of the same program, but the spec called for gnuplot - mostly, I think, for reasons of history and (as you note) economy. Whether or not the latter proves to be false is another question, of course.

    ReplyDelete
  5. Thanks for sharing your gnuplot experience! Would you like to write about a LaTeX graphics and gnuplot contest in your blog, to give your readers a chance to win a gnuplot cookbook? It's here: http://latex-community.org/component/content/article/92-contests/431-gnuplot-book

    ReplyDelete

NAG moderates all replies and reserves the right to not publish posts that are deemed inappropriate.