werthmuller.org

Aftershock series

03 August 2014

Over the course of the next few weeks I will write a couple of posts, maybe half a dozen, on creating figures in Python and including them in a LaTeX document. These posts are remnants of my Ph.D. thesis, which I submitted in September 2013; hence the name Aftershock.

I am a geophysicist, and in my thesis, which you can find in the research area, I dealt with the estimation of background resistivities in the search for hydrocarbons. However, this is not of any importance in this series. Besides my interest in geophysics, I am a bit of a perfectionist when it comes to typesetting, and I am also interested in the reproducibility of data and results. I created all the figures in my thesis in Python, and by calling the master script I can reproduce every figure of my thesis in one go. I am also a strong supporter of the open-source movement. My Ph.D. was industry-funded, and unfortunately I had to use some proprietary data and software. However, I made as much as possible available in the research section, including the Python and LaTeX source code.
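To illustrate the idea of a master script, here is a minimal sketch. It is not the script from my thesis; the function name `run_figure_scripts` and the `fig_*.py` naming pattern are assumptions made up for this example. It simply runs every matching figure script in a directory, in sorted order.

```python
import runpy
import tempfile
from pathlib import Path


def run_figure_scripts(script_dir, pattern="fig_*.py"):
    """Run every script matching `pattern` in `script_dir`, in sorted order.

    Hypothetical helper: the name and the file pattern are assumptions.
    Returns the names of the scripts that were executed.
    """
    executed = []
    for script in sorted(Path(script_dir).glob(pattern)):
        # Each script runs in a fresh namespace, as if started on its own.
        runpy.run_path(str(script), run_name="__main__")
        executed.append(script.name)
    return executed


# Demo with two throwaway scripts; real ones would create and save figures.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("fig_01.py", "fig_02.py"):
        Path(tmp, name).write_text("result = 1 + 1\n")
    print(run_figure_scripts(tmp))
```

Keeping one script per figure means a single figure can be regenerated on its own, while the master script rebuilds all of them.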

Reproducing the figures is only one, albeit important, aspect. I also want the font in the figures to be consistent with the font in the thesis. More importantly, I want the font sizes and line thicknesses of the figures, as well as the sizes of the figures themselves, to be consistent throughout the thesis.

This should give you the setting for the posts that are to follow: they are about creating nice, consistent figures in Python for inclusion in a LaTeX document.
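As a rough sketch of what consistent figures can look like in practice, matplotlib's `rcParams` can fix the font, font sizes, line widths, and figure size in one place. The text width, font size, and the helper name `thesis_style` below are assumptions for illustration, not the values from my thesis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Assumed values for illustration; take the real ones from the LaTeX document.
TEXT_WIDTH_IN = 5.8  # width of the text block in inches
FONT_SIZE_PT = 10    # font size of the document body in points


def thesis_style(width_fraction=1.0, aspect=0.62):
    """Return rcParams for a figure spanning `width_fraction` of the text width.

    Hypothetical helper: name and defaults are assumptions.
    """
    width = TEXT_WIDTH_IN * width_fraction
    return {
        "font.family": "serif",
        "font.size": FONT_SIZE_PT,
        "axes.labelsize": FONT_SIZE_PT,
        "xtick.labelsize": FONT_SIZE_PT - 2,
        "ytick.labelsize": FONT_SIZE_PT - 2,
        "legend.fontsize": FONT_SIZE_PT - 2,
        "lines.linewidth": 1.0,
        "axes.linewidth": 0.5,
        "figure.figsize": (width, width * aspect),
    }


# Apply the style once; every figure created afterwards inherits it.
plt.rcParams.update(thesis_style())
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel("x")
ax.set_ylabel("y")
```

If such a figure is saved with `fig.savefig("figure.pdf")` and included in LaTeX at its natural size, without rescaling, the font sizes in the figure stay at the values set here.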

Before I start, a word of caution:

  • I am a scientist, not a software developer. I more often abuse than use my programming (mostly scripting) skills to get my job done. I am sure most of the things I show here could be done more beautifully with a proper implementation within the corresponding Python package.

  • I submitted my thesis over a year ago, and the last few months of the thesis were spent solely on writing. (I have, unfortunately, not used Python in my new job so far.) This means that the posts here are somewhat outdated; given the rapid development of SciPy and co., I would not be surprised if some of the things are implemented by now. The system I used was running Ubuntu 12.04 64-bit, with TeX Live 2009, Python 2.7.3, IPython 0.13.2, NumPy 1.6.1, SciPy 0.9.0, matplotlib 1.1.1rc, and PyMC 2.1alpha.

If you know of a better way to achieve my goals, please share your knowledge by commenting or sending me a message, so I can add it to the post!

I am going to write the posts partly as IPython notebooks, and I will put the notebooks onto my GitHub page. In the following I go briefly through the set-up of the virtual environment I use, show how I include the notebooks in my blog, and include a small example notebook.

Setting up virtualenv with numpy, scipy, matplotlib, and ipython

(Edit: If you want to avoid the trouble of setting up the environment manually, have a look at my next blog post. I learned about the free Anaconda Python distribution, which makes it much easier to set up a virtual environment with numpy and friends.)

I want to run my notebooks in a virtual environment, separated from my daily environment. I had quite some trouble getting scipy and matplotlib to work in a virtual environment. This is how I finally got it going (on Ubuntu 14.04):

  1. Install pip and virtualenv, if you don’t have them already:

    $ sudo apt-get install python-pip
    $ sudo pip install virtualenv
    
  2. Install dependencies for scipy and matplotlib:

    $ sudo apt-get build-dep python3-scipy
    $ sudo apt-get build-dep python-matplotlib
    
  3. Create virtual environment ipynb and activate it:

    $ virtualenv -p /usr/bin/python3 ipynb
    $ source ipynb/bin/activate
    
  4. Install numpy, scipy, matplotlib, and ipython:

    (ipynb)$ pip install numpy scipy matplotlib "ipython[all]"
    

With this I have my virtual environment ipynb, with Python 3.4.0, IPython 2.1.0, NumPy 1.8.1, SciPy 0.14.0, and matplotlib 1.3.1. So I will try to adapt my old scripts to current versions of Python, learning a bit of Python 3 along the way.
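A quick way to check which versions actually ended up in the environment is to ask the packages themselves. This is a small sketch; the helper name `installed_versions` is made up for this example.

```python
import importlib
import sys


def installed_versions(names):
    """Map each module name to its version string, or None if it is missing.

    Hypothetical helper for illustration.
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


print("Python", sys.version.split()[0])
packages = ["numpy", "scipy", "matplotlib", "IPython"]
for name, version in installed_versions(packages).items():
    print(name, version or "not installed")
```

Run inside the activated environment, this lists the interpreter version followed by one line per package.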

For the inclusion of IPython notebooks I use the brilliant Pelican plugin liquid_tags by Jake Vanderplas and contributors. (I abandoned the idea of using the plugin pelican-ipynb mentioned in my previous post in favour of liquid_tags.) The plugin liquid_tags is great! But it puts a massive header into every HTML file, which I did not like. I made some adjustments and created a pull request; however, my adjustments might not be in line with the intentions of the original creator, so the pull request may or may not be merged. Either way, you can find my version in my fork (outsource-css-js-branch).
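For orientation, enabling the notebook tag in Pelican looks roughly like the following fragment of `pelicanconf.py`. Treat it as a sketch under assumptions: the plugin directory name is made up, and depending on the Pelican version the path setting is called `PLUGIN_PATH` or `PLUGIN_PATHS`, so check the documentation of your versions.

```python
# pelicanconf.py (fragment)

# Assumed location of a checkout of the pelican-plugins repository.
PLUGIN_PATHS = ["pelican-plugins"]

# Load only the notebook tag of liquid_tags.
PLUGINS = ["liquid_tags.notebook"]

# Directory, relative to the content folder, where the plugin looks for
# notebooks; "notebooks" is, to my knowledge, also the plugin's default.
NOTEBOOK_DIR = "notebooks"
```

A notebook stored in that directory is then included in a post with a tag of the form `{% notebook demo.ipynb %}`, where `demo.ipynb` is a placeholder file name.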

After including the plugin I had to make some adjustments to my Pelican theme dashof, and I was ready to go! I include here an example notebook to show how smoothly the integration works, starting with the following title:

Demo.py: an example notebook

This is a simple demo notebook to show the inclusion of an IPython notebook in Pelican with the plugin liquid_tags, specifically notebook.py (full blog entry).

\(\LaTeX\) is handled in the text as well as in figures, \[ \int_a^b f(x)\mathrm{d}x \, . \]

For demonstration purposes, I load the integral_demo.py from matplotlib.org, display an image, and print some numbers.

In [1]:
%matplotlib inline
%load http://matplotlib.org/mpl_examples/showcase/integral_demo.py
In [2]:
"""
Plot demonstrating the integral as the area under a curve.

Although this is a simple example, it demonstrates some important tweaks:

    * A simple line plot with custom color and line width.
    * A shaded region created using a Polygon patch.
    * A text label with mathtext rendering.
    * figtext calls to label the x- and y-axes.
    * Use of axis spines to hide the top and right spines.
    * Custom tick placement and labels.
"""
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon


def func(x):
    return (x - 3) * (x - 5) * (x - 7) + 85


a, b = 2, 9 # integral limits
x = np.linspace(0, 10)
y = func(x)

fig, ax = plt.subplots()
plt.plot(x, y, 'r', linewidth=2)
plt.ylim(ymin=0)

# Make the shaded region
ix = np.linspace(a, b)
iy = func(ix)
verts = [(a, 0)] + list(zip(ix, iy)) + [(b, 0)]
poly = Polygon(verts, facecolor='0.9', edgecolor='0.5')
ax.add_patch(poly)

plt.text(0.5 * (a + b), 30, r"$\int_a^b f(x)\mathrm{d}x$",
         horizontalalignment='center', fontsize=20)

plt.figtext(0.9, 0.05, '$x$')
plt.figtext(0.1, 0.9, '$y$')

ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.xaxis.set_ticks_position('bottom')

ax.set_xticks((a, b))
ax.set_xticklabels(('$a$', '$b$'))
ax.set_yticks([])

plt.show()
In [3]:
from IPython.display import Image
Image(url='https://ipython.org/_static/IPy_header.png', width=360)
Out[3]: [image: IPython header logo]
In [4]:
print(y[:10])
[-20.          -6.12644391   6.54863195  18.07622674  28.50733963
  37.89296977  46.28411631  53.73177843  60.28695527  66.00064599]

This is the last line of this IPython notebook example. What follows is again from the regular blog entry.

Back to the blog. You can see the power of liquid_tags from this little example, and how seamlessly the notebook integrates with the other content of the blog. Now all is set to start the Aftershock Series!