# Check the dimension of a pdf-figure

#### 30 September 2014

I like consistency, and I like it when the LaTeX text and the included figures appear as one entity. To achieve this you want to set the font in the figures to the same font as in the text, or at least use the same font for every figure; the same applies to the font size. Furthermore, I want the line thickness to be the same in all figures, no matter how big or small the figure is. The colours should be consistent, as should the dashes, and so on. I used little scripts to set my matplotlibrc to achieve this. While doing some research for this blog entry I found out that Matplotlib introduced style sheets with version 1.4, which makes using your own style, or different styles, very easy:

>>> from matplotlib import style
>>> style.use('yourstyle')
>>> print(style.available)


What a great feature! The last command lists the pre-defined styles that are available.
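
As a minimal sketch of how a shared style could look, the snippet below applies a small dictionary of rc settings to a figure through `style.context`. The dictionary name `thesis_style` and its values are placeholders chosen for illustration, not the settings used for the figures in this post; the Agg backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib import style

# Hypothetical house style: the values are placeholders for illustration
thesis_style = {
    "font.size": 10,
    "lines.linewidth": 1.5,
    "axes.labelsize": 10,
}

# Apply the style temporarily; outside the with-block the old settings return
with style.context(thesis_style):
    fig, ax = plt.subplots()
    (line,) = ax.plot([0, 1], [0, 1])
    linewidth_in_style = line.get_linewidth()

plt.close(fig)
print(linewidth_in_style)
```

The same dictionary could also be passed to `style.use` to apply it for the rest of the session.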

There was one thing though that kept bugging me, and that is what I talk about in this fifth post of the aftershock series. I tend to save Matplotlib figures as pdf to include them in the LaTeX document. And I want to include them without specifying width or height, for the same reason as mentioned above: I want the font size and line width to be consistent, which is not the case if you scale your image. However, this requires that the pdf you want to include has the correct dimensions. That is where checksize comes in handy. To use checksize you need to install PyPDF2. I also installed wand, to display pdfs in the notebook for the examples; it is not required for checksize.

(ipynb)$ pip install PyPDF2 wand

## checksize()

The function checksize saves a figure as a pdf with plt.savefig. It then reads the created pdf with PyPDF2 and compares the pdf-size with the specified, desired size. If the difference is bigger than the defined precision, it adjusts the figure-size and calls itself recursively, until the precision is matched.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc
from PyPDF2 import PdfFileReader        # Read pdf
from wand.image import Image as Image   # Plot pdf

# Increase font size, set CM as default text, and use LaTeX
rc('font', **{'size': 16, 'family': 'serif', 'serif': ['Computer Modern Roman']})
rc('text', usetex=True)

# Define colours (taken from http://colorbrewer2.org)
clr = ['#377eb8', '#e41a1c', '#4daf4a', '#984ea3', '#ff7f00', '#ffff33', '#a65628']

### Load the checksize-function

(You can find it in the notebook adashof.ipynb, in the same repo as this notebook).

In [2]:
%load -s checksize adashof.py

In [3]:
def checksize(fhndl, name, dsize, precision=0.01, extent=0.05, kwargs={}, _cf=False):
    """Print figure with 'name.pdf', check size, compare with dsize, and adjust if required

    Parameters
    ----------
    fhndl : figure-handle
        Figure handle of the figure to be saved.
    name : string
        Figure name.
    dsize : list of two floats
        Desired size of pdf in cm.
    precision : float, optional; <0.01>
        Desired precision in cm of the dimension, defaults to 0.1 mm.
    extent : float or list of floats, optional; <0.05>
        - If float, then bbox_inches is set to tight, and pad_inches=extent.
        - If it is an array of two numbers it sets the percentaged extent-width,
          Bbox.expanded.
        - If it is an array of four numbers it sets [x0, y0, x1, y1] of Bbox.
    kwargs : dict
        Other input arguments that will be passed on to plt.savefig; e.g. dpi or facecolor.
    _cf : Internal parameter for recursion and adjustment.
    """

    # Import PyPDF2
    from PyPDF2 import PdfFileReader

    # Check extent input and set bbox_inches and pad_inches accordingly
    if np.size(extent) == 1:
        bbox_inches = 'tight'
        pad_inches = extent
    else:
        fext = fhndl.gca().get_window_extent().transformed(
            fhndl.dpi_scale_trans.inverted())
        if np.size(extent) == 2:
            bbox_inches = fext.expanded(extent[0], extent[1])
        elif np.size(extent) == 4:
            fext.x0, fext.y0, fext.x1, fext.y1 = extent
            extent = [1, 1]  # set extent to [1, 1] for recursion
            bbox_inches = fext
        pad_inches = 0

    # Save the figure
    fhndl.savefig(name+'.pdf', bbox_inches=bbox_inches,
                  pad_inches=pad_inches, **kwargs)

    # Get pdf-dimensions in cm
    pdffile = PdfFileReader(open(name+'.pdf', mode='rb'))
    pdfsize = np.array([float(pdffile.getPage(0).mediaBox[2]),
                        float(pdffile.getPage(0).mediaBox[3])])
    pdfdim = pdfsize*2.54/72.  # points to cm

    # Define print-precision on desired precision
    pprec = abs(int(('%.1e' % precision).split('e')[1]))+1

    # Get difference btw desired and actual size
    diff = dsize-pdfdim

    # If diff>precision, adjust, else finish
    if np.any(abs(diff) > precision):
        if not _cf:
            _cf = [1, 1]

        # Be verbose
        print(' resize...')

        # Adjust width
        if (abs(diff[0]) > precision):
            print(' X-diff:', np.round(diff[0], pprec), 'cm')

            # Set new factor to old factor times (desired size)/(actual size)
            _cf[0] = _cf[0]*dsize[0]/pdfdim[0]

            # Set new figure width
            fhndl.set_figwidth(_cf[0]*dsize[0]/2.54)  # cm2in

        # Adjust height
        if (abs(diff[1]) > precision):
            print(' Y-diff:', np.round(diff[1], pprec), 'cm')

            # Set new factor to old factor times (desired size)/(actual size)
            _cf[1] = _cf[1]*dsize[1]/pdfdim[1]

            # Set new figure height
            fhndl.set_figheight(_cf[1]*dsize[1]/2.54)  # cm2in

        # Call the function again, with new factor _cf
        figsize = checksize(fhndl, name, dsize, precision, extent, kwargs, _cf)

        return figsize

    else:  # Print some info if the desired dimensions are reached

        # Print figure name and pdf dimensions
        print('Figure saved to '+name+'.pdf;',
              np.round(pdfdim[0], pprec), 'x',
              np.round(pdfdim[1], pprec), 'cm.')

        # Print the new figsize if it had to be adjusted
        if _cf:
            print(' => NEW FIG-SIZE: figsize=(' +
                  str(np.round(fhndl.get_size_inches()[0], 2*pprec))+', ' +
                  str(np.round(fhndl.get_size_inches()[1], 2*pprec))+')')

        # Return figsize
        return fhndl.get_size_inches()

### Example

Generate some data to plot.

In [4]:
xdata = np.linspace(0, 2*np.pi, 201)
ydata = np.sin(xdata)
xpdata = np.arange(5)*np.pi/2
ypdata = np.sin(xpdata)

# Small function to plot the data
def plot_data():
    plt.plot(xdata, ydata, c=clr[0], lw=2)
    plt.plot(xpdata, ypdata, 'o', mec='none', mfc=clr[1], ms=10)
    plt.xlabel('x')
    plt.ylabel('sin(x)')
    plt.xticks(xpdata, (r'0', r'$\pi$/2', r'$\pi$', r'3$\pi$/2', r'2$\pi$'))
    plt.yticks([-1, 0, 1])
    plt.axis([0, 2*np.pi, -1.2, 1.2])

# Small function to load pdf and get dimension
def get_pdf_dim(name):
    # Open the pdf; name is the full file name, including '.pdf'
    pdffile = PdfFileReader(open(name, mode='rb'))
    pdfsize = np.array([float(pdffile.getPage(0).mediaBox[2]),
                        float(pdffile.getPage(0).mediaBox[3])])
    return np.round(pdfsize*2.54/72., 3)  # points to cm, rounded


First I create a normal plot. I would like a plot with dimensions of 7 cm by 5 cm. The figsize argument of plt.figure takes inches, hence I have to divide the centimetre values by 2.54.
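
Two unit conversions come up repeatedly here: centimetres to inches for figsize, and pdf points to centimetres when reading the page size (one inch is 2.54 cm or 72 points). A small sketch of both; the helper names `cm_to_inch` and `points_to_cm` are made up for this illustration and are not part of Matplotlib or PyPDF2.

```python
CM_PER_INCH = 2.54
POINTS_PER_INCH = 72.0  # pdf page sizes are given in points


def cm_to_inch(*lengths_cm):
    """Convert lengths in centimetres to a tuple of inches (e.g. for figsize)."""
    return tuple(length / CM_PER_INCH for length in lengths_cm)


def points_to_cm(points):
    """Convert a length in pdf points to centimetres."""
    return points * CM_PER_INCH / POINTS_PER_INCH


# Desired 7 cm by 5 cm figure, expressed in inches
figsize_in = cm_to_inch(7, 5)
print(figsize_in)

# One inch worth of points is 2.54 cm
print(points_to_cm(72.0))
```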

In [5]:
fig1 = plt.figure(figsize=(7/2.54, 5/2.54))
plot_data()


I now save the figure with plt.savefig, with a facecolor in order to see exactly what was saved when I load the figure again. If I check the dimensions of the pdf, I see that the actual size is bigger than the specified size of 7 cm by 5 cm. The actual size depends on the tick labels, the axis labels, titles, and other factors.

In [6]:
# Save the figure to 'orig.pdf'
fig1.savefig('data/checksize/orig.pdf', bbox_inches='tight', facecolor='.9')

# Get pdf-dimensions
pdfdim = get_pdf_dim('data/checksize/orig.pdf')
print('Pdf width x height:', pdfdim[0], 'x', pdfdim[1], 'cm')

Image(filename='data/checksize/orig.pdf')

Pdf width x height: 7.819 x 5.763 cm


Out[6]: [image: rendering of data/checksize/orig.pdf]

I now save the figure with checksize, providing the desired dimensions of 7 cm by 5 cm; I do not specify the precision, so the default precision of 0.01 cm (0.1 mm) is applied.

In [7]:
figsize = checksize(fig1, 'data/checksize/check1', dsize=[7.0, 5.0], kwargs={'facecolor':'.9'})

  resize...
X-diff: -0.565 cm
Y-diff: -0.509 cm
resize...
X-diff: -0.16 cm
Y-diff: -0.151 cm
resize...
X-diff: -0.048 cm
Y-diff: -0.048 cm
resize...
X-diff: -0.015 cm
Y-diff: -0.015 cm
Figure saved to data/checksize/check1.pdf; 7.004 x 5.005 cm.
=> NEW FIG-SIZE: figsize=(2.470962, 1.712652)
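
The final figsize can be reproduced by hand from this log. A minimal sketch, assuming the update rule from checksize (new factor = old factor times desired size divided by actual size, starting from a factor of 1) and using the rounded differences printed above; because of that rounding the result only agrees with the reported figsize to about three decimals. The helper name `replay` is made up for this illustration.

```python
def replay(desired_cm, logged_diffs_cm):
    """Replay the correction-factor updates from a list of logged differences.

    Each logged difference is (desired - actual), so actual = desired - diff.
    Returns the resulting figure dimension in inches.
    """
    factor = 1.0
    for diff in logged_diffs_cm:
        actual_cm = desired_cm - diff
        factor *= desired_cm / actual_cm
    return factor * desired_cm / 2.54  # cm to inches


# Rounded differences as printed in the log above
width_in = replay(7.0, [-0.565, -0.16, -0.048, -0.015])
height_in = replay(5.0, [-0.509, -0.151, -0.048, -0.015])
print(round(width_in, 3), round(height_in, 3))
```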



checksize detects that the output dimensions differ from the desired dimensions, and adjusts the figure size accordingly. If the precision is set too small, this process can take many iterations, and even fall into an endless loop. But one millimetre or a tenth of it is generally not a problem.
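
How the number of iterations depends on the precision can be illustrated with a toy model. The sketch below does not call Matplotlib; it assumes, purely for illustration, that the saved pdf width grows linearly with the figure width (`0.775 * width + 2.14` cm, coefficients picked to roughly mimic the width values in the log above). The function names and the `max_iter` guard are hypothetical and are not part of checksize.

```python
def toy_pdf_width(fig_width_cm):
    """Toy stand-in for 'save the figure and measure the pdf' (an assumption)."""
    return 0.775 * fig_width_cm + 2.14


def count_resizes(desired_cm, precision_cm, max_iter=50):
    """Count the resize steps of the correction-factor iteration until the
    toy pdf width is within precision_cm of desired_cm."""
    factor = 1.0
    for n_resizes in range(max_iter + 1):
        actual_cm = toy_pdf_width(factor * desired_cm)
        if abs(desired_cm - actual_cm) <= precision_cm:
            return n_resizes
        # Same update as in checksize: old factor times desired/actual
        factor *= desired_cm / actual_cm
    raise RuntimeError("no convergence within max_iter resizes")


for precision in (0.1, 0.01, 0.001):
    print(precision, count_resizes(7.0, precision))
```

In this toy model each step shrinks the error by roughly a factor of three, so a ten times stricter precision costs about two more saves. The real relation between figure size and pdf size need not be this well behaved.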

To check if checksize did a proper job, I check the pdf dimensions in the same way as before:

In [8]:
# Get pdf-dimensions
pdfdim = get_pdf_dim('data/checksize/check1.pdf')
print('Pdf width x height:', pdfdim[0], 'x', pdfdim[1], 'cm')

Image(filename='data/checksize/check1.pdf')

Pdf width x height: 7.004 x 5.005 cm


Out[8]: [image: rendering of data/checksize/check1.pdf]

If checksize has to change the dimensions, it prints the new figure size at the end, and also returns the new values. I can now provide the updated figsize when I create the figure:

In [9]:
fig2 = plt.figure(figsize=figsize)
plot_data()
figsize = checksize(fig2, 'data/checksize/check2', dsize=[7.0, 5.0], kwargs={'facecolor':'.9'})

Figure saved to data/checksize/check2.pdf; 7.004 x 5.005 cm.



With the adjusted figsize the figure is saved with the correct dimensions straight away; no looping is required anymore. Of course, the required figsize is likely to change if you make changes to the plot.

You can play around with the extent argument to adjust the space around the figure. For some figures checksize does not work very well, e.g. figures with 3D axes, or when the axes are set to equal. Again, have a go with the extent parameter; it might help in those cases too.
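
For the two-number form of extent, checksize passes the values to `Bbox.expanded`, which scales a bounding box around its centre. A small sketch of that call on its own; the box coordinates and scale factors are made up for illustration and do not come from the figures in this post.

```python
from matplotlib.transforms import Bbox

# Hypothetical axes bounding box in inches: x0, y0, x1, y1
axes_box = Bbox.from_extents(0.5, 0.4, 2.5, 1.6)

# Scale the width by 1.2 and the height by 1.3 around the box centre
padded = axes_box.expanded(1.2, 1.3)

print(axes_box.width, axes_box.height)
print(padded.width, padded.height)
```

Factors above 1 therefore add space around the axes, and factors below 1 crop into them.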

You can find the above notebook Checksize.ipynb on my GitHub page in the blog-notebooks-repo.

## Decay

This fifth post concludes the series; the aftershocks have decayed.

However, should you ever have a question regarding my thesis, either with respect to geophysics, Python, or LaTeX, please drop me a line, and I will see what I can do.