Contour Plotting with Matplotlib

Visualising data well is an important part of any analysis, and a good handle on the Python package Matplotlib is essential for any Python data analyst. I hope to provide a few tutorials on some of the more complex concepts in data visualisation and how to produce clear and tidy graphs with Matplotlib. I will assume some basic knowledge of the Matplotlib package but will try to explain the code as clearly as possible. Comments are always welcome!

I am going to begin with this piece on contour plotting, which is an area I have a particular interest in. Specifically, I am interested in plotting parameter spaces for fitted functions, with contours defined by an objective function like $\chi^2$. We will get to an example of this shortly, but first I want to look at how we make contour plots with a simpler example.

Basic Example with Radii

For our simple example we will define variables $x$ and $y$ over a given range and plot, as contour lines, the corresponding radius $Z$ from the origin for each data point $(x, y)$. To define $x$ and $y$ we will use numpy, and this is done like so,
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-10, 20, 1)
y = np.arange(-10, 20, 1)
where we have also imported matplotlib.pyplot in order to produce our contour maps of the radius.

The next stage is to create a np.meshgrid of the $x$ and $y$ data, giving arrays $X$ and $Y$ whose elements correspond to each point in the $x$-$y$ plane. We can then define the radius using the standard definition from polar coordinates, $Z = \sqrt{x^2 + y^2}$,
X, Y = np.meshgrid(x, y)

Z = np.sqrt(X**2 + Y**2)
The matrices $X$, $Y$ and $Z$ all have the same dimensions and the elements of $Z$ correspond to the radius evaluated at different combinations of $x$ and $y$ in the matrices $X$ and $Y$.
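As a quick, purely illustrative check, we can confirm that all three arrays share the same shape,
# x and y each contain 30 values, so each array is 30 x 30
print(X.shape, Y.shape, Z.shape)  # (30, 30) (30, 30) (30, 30)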

There are several different ways to produce the corresponding contour plot and here I will demonstrate two. One will produce a colour map contour plot and the other a line contour map with inline labels. To illustrate their equivalence I will plot them in a side-by-side subplot, which is defined by,
fig, ax = plt.subplots(1, 2, sharex=True, sharey=True, figsize=(10, 5))
The function plt.subplots() returns a fig element, which can be used to set general properties of the plot, and an ax element, which is an array where each entry corresponds to one of the subplots in the figure. Here there are two subplots arranged in 1 row and 2 columns. The arguments 'sharex' and 'sharey' ensure that the x and y scales are the same in both plots. This is not strictly necessary here, as we are plotting the same data in both figures, but it is often good practice in subplot graphs to use these keyword arguments. It also ensures that only the outer axes of the subplot have tick labels, which can be helpful for formatting clear graphs. The argument 'figsize' rescales the graph size from the default. It takes a tuple whose elements correspond to the width and height of the graph in inches respectively.

Now to add the contour plots. For 'ax[0]', corresponding here to the leftmost axis, we have,
cp = ax[0].contour(X, Y, Z)
ax[0].clabel(cp, inline=True, fontsize=10, fmt='%1.1f')
The 'contour' function produces a line contour map of $Z$ in the $X$ and $Y$ space. We can then label each line with the value of $Z$ at that contour using the 'clabel' function. The 'inline' argument places the labels directly on the contour lines and the 'fontsize' argument determines how big they are. 'fmt' determines how the contour values are displayed. Here we are displaying the contour values as floats with 1 decimal place.

For 'ax[1]' we can plot the contours as a colour map using the 'contourf' function,
cp = ax[1].contourf(X, Y, Z, cmap='autumn')
plt.colorbar(cp, label='Radius')
We have defined the colour map to be 'autumn'. The function 'colorbar' adds a colour bar with label 'Radius' which displays the values of $Z$ corresponding to each coloured contour.
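As an aside, plt.colorbar steals space from the current axes, which in this script happens to be ax[1] (the last one created by plt.subplots). A more explicit alternative, equivalent in this case should you prefer it, is to pass the target axes directly,
fig.colorbar(cp, ax=ax[1], label='Radius')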

Finally, we can tidy up the graph for a nicer presentation. The script below wraps the entire figure inside another subplot that has no frame, axis ticks or tick labels. We can then use this subplot to add 'global' labels that cover both subplots.
fig.add_subplot(111, frame_on=False)
plt.tick_params(labelcolor="none", bottom=False, left=False)
plt.xlabel(r'$X$')
plt.ylabel(r'$Y$')
We can then remove the white space between the two figures using the 'subplots_adjust' function and save/show the figure.
plt.subplots_adjust(wspace=0)
plt.savefig('Basic_example.png')
plt.show()
The resultant graph is shown below and illustrates the two approaches to plotting contours. I would recommend building the code and displaying the plot gradually so that you can see what each function and argument is doing. In particular, it might be worth plotting this with and without 'sharex=True' and 'sharey=True' (you can set these to 'False' or remove them completely). It is also worth investigating the 'subplots_adjust' function and the contour label options like 'fmt'. The Matplotlib documentation is very extensive, with a range of examples, and I have not covered all optional arguments for the functions used here. For example, the 'levels' argument in both the 'contour' and 'contourf' functions can be used to adjust the number and position of the contour lines or coloured regions, as sketched below. It is worth exploring these a bit more as well, and there is a link to the documentation at the end of the article.
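For instance, a minimal sketch of the 'levels' argument, where the particular values are chosen purely for illustration, would replace the earlier call to 'contour' with,
# Draw line contours only at radii 2, 5, 10 and 15
cp = ax[0].contour(X, Y, Z, levels=[2, 5, 10, 15])
ax[0].clabel(cp, inline=True, fontsize=10, fmt='%1.1f')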

The colour map figure appears in the figure below to have a shorter width than the line plot. This is a consequence of the subplot environment we have used and the presence of the colour bar. A better visualisation can be achieved by using Matplotlib's 'gridspec' environment. However, this is outside the scope of this article and I may follow it up in a later post.
From the graph we can see that the value of the contour lines gets larger as we move to larger absolute values of $x$ and $y$ which is what we would expect for the radius $Z$.

$\chi^2$ Example

How can we apply this to parameter spaces and $\chi^2$? 

Well first we need to define some data and a fitting function. We define some mock data like so,
x = np.arange(-10, 20, 1)
y = 1.4*x**2 + 0.2*x + 4.3*x**3
In reality we will not necessarily have any prior knowledge of the data set, other than the fact that it looks polynomial in nature. Consequently, we may choose to fit it with a polynomial in $x$ containing powers $1$ and $2$, with two fitting parameters $a$ and $b$. We therefore define our model using the following function,
def fitting(x, a, b):
    y_fit = a*x**2 + b*x
    return y_fit
To fit this function to the data we are going to use the curve_fit function from scipy.optimize, which here uses the Levenberg-Marquardt fitting routine. The algorithm is based on the Gauss-Newton and gradient descent algorithms and is not particularly reliable when a large number of local minima are present. However, for the purposes of this article it is sufficient, and it is implemented like so,
from scipy.optimize import curve_fit
p0 = [1, 1]
parameters, cov = curve_fit(fitting, x, y, p0=p0)
Here 'p0' is the initial parameter guess, 'parameters' are the optimum parameters for the fit and 'cov' is the covariance matrix for the fit. You can read more about the algorithm via the link at the bottom of the article.
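As a quick, purely illustrative check, the fitted values and their approximate one-sigma uncertainties can be read from 'parameters' and the diagonal of 'cov',
# Approximate 1-sigma uncertainties from the covariance diagonal
perr = np.sqrt(np.diag(cov))
print('a = {:.3f} +/- {:.3f}'.format(parameters[0], perr[0]))
print('b = {:.3f} +/- {:.3f}'.format(parameters[1], perr[1]))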

If we want to plot the value of $\chi^2$ in the parameter space, we need to sample the parameters around the optimum values returned by the algorithm. We use np.linspace to produce arrays of parameter values that have length $100$ and extend $50\%$ either side of the optimum values. We then produce a mesh grid of these two parameter arrays and, for each combination of $a$ and $b$ (i.e. 'param1' and 'param2'), we calculate $\chi^2$ and assign these values to the matrix $Z$. This is performed with the following code,
param1 = np.linspace(parameters[0]*0.5, parameters[0]*1.5, 100)
param2 = np.linspace(parameters[1]*0.5, parameters[1]*1.5, 100)

X, Y = np.meshgrid(param1, param2)

Z = np.empty(X.shape)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        Z[i, j] = np.sum((y - fitting(x, X[i, j], Y[i, j]))**2)
Here we have used the definition of $\chi^2$ given by,

$\chi^2 = \sum (y_{\mathrm{data}} - y_{\mathrm{fit}})^2$,

where the data uncertainties have been taken to be unity, so that $\chi^2$ reduces to the sum of squared residuals.
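As an aside, because the fitting function only involves elementwise arithmetic, the double loop can be replaced with a single broadcast over a third axis. The following sketch is equivalent to the loop above,
# Broadcast x against the parameter grids: shape (len(x), 100, 100)
residuals = y[:, None, None] - fitting(x[:, None, None], X, Y)
Z = np.sum(residuals**2, axis=0)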

Finally we can plot the contour map using the following code,
fig = plt.figure()

cp = plt.contourf(X, Y, Z, cmap='jet')
plt.colorbar(cp, label=r'$\chi^2$')

plt.xlabel(r'$a$')
plt.ylabel(r'$b$')
plt.savefig('chi2_example.png')
plt.show()
I have chosen to do this using the colour map method, as I believe it makes for a nicer visualisation of the data. I have also used a different colour map to illustrate how this can be altered. The resultant graph, which shows how the $\chi^2$ value varies with parameters $a$ and $b$, is shown below.
Unsurprisingly, our model produces a bad fit, but the example serves to illustrate how this style of plot can be achieved. The optimum value is found in the centre of the graph, since we have defined our parameter ranges to be $50\%$ either side of the optimum values. However, the graph also tells us that the values of $a$ and $b$ are anti-correlated. What this means is that if we take arbitrary parameters $(a_0, b_0)$ in the space and increase the value of $b_0$, then the value of $a_0$ must decrease in order for the new parameters $(a_1, b_1)$ to produce the same quality of fit as the original parameters. Marking the best-fit point on the map, as sketched below, makes the optimum easy to spot.
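A small, purely illustrative addition, placed before the call to plt.savefig in the plotting code above, achieves this,
# Mark the best-fit parameters on the contour map
plt.plot(parameters[0], parameters[1], 'k+', markersize=10)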

This is a powerful tool for visualising parameter spaces and also for diagnosing relationships between parameters in fitted models.

Thanks for reading! Have fun plotting contours!

Further reading:
