As discussed in some of my previous articles, good visualisation of data is essential to getting the associated message across. One aspect of this is the need to plot multiple data sets, or to visualise the same data set in different ways, on the same figure. For example, we may wish to show our data and the residuals left after subtracting a fit to that data in the same figure.
This can be done effectively using matplotlib and the associated subplots environments. Mastery of these tools comes very much with practice and I do not claim to be an expert. However, I have some experience with the environment and I will share the basics with you in this article.
To use matplotlib we begin by importing its pyplot interface via,
import matplotlib.pyplot as plt
We can then proceed to explore what is on offer.
plt.subplot() and plt.subplots()
plt.subplots() and plt.subplot() are probably where most people begin to learn about combining sub-figures into a single figure. This was certainly my experience when learning Python, and while they are incredibly useful they do have some limitations.
To begin with, let's look at plt.subplot(). We can produce a simple subplot graph using the following code,
plt.subplot(211)
plt.subplot(212)
plt.show()
which displays the figure below. The three digits passed to plt.subplot() give the number of rows, the number of columns and the index of the sub-figure, so plt.subplot(211) is shorthand for plt.subplot(2, 1, 1). The index starts at 1 in the top-left corner and increases from left to right along each row, then continues on the next row down. Underneath each plt.subplot() call we write the relevant plt.plot(), plt.xlabel(), etc. functions that we want to apply to that sub-figure.
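As a minimal sketch of that pattern, the snippet below fills the two sub-figures with a sine and a cosine curve. The data and labels are my own illustrative choices, and plt.figure() is called first only so that the sub-figures land on a fresh figure.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only: one period of a sine and a cosine curve.
x = np.linspace(0, 2 * np.pi, 200)

fig = plt.figure()

# Top panel: 2 rows, 1 column, index 1.
plt.subplot(2, 1, 1)
plt.plot(x, np.sin(x))
plt.ylabel('sin(x)')

# Bottom panel: 2 rows, 1 column, index 2.
plt.subplot(2, 1, 2)
plt.plot(x, np.cos(x))
plt.xlabel('x')
plt.ylabel('cos(x)')
```

Each plt.plot() or plt.ylabel() call acts on whichever sub-figure was created most recently, which is why the order of the calls matters here. Finish with plt.show() to display the result.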
Pretty straightforward, right? You can provide keyword arguments to plt.subplot() as well, and these might be worth investigating further via the link to the matplotlib documentation at the end of the article. However, I find that defining subplots in this manner is not always the best choice and I tend to favour plt.subplots(), which is subtly different.
In my opinion plt.subplots() is much more flexible and usable than the above (although I may be biased, having slightly more experience with this environment). Where the two functions differ is that plt.subplots() produces all of the sub-figures at the same time, whereas with plt.subplot() they have to be created individually. This means we can do the following,
nrows, ncols = 2, 2
fig, axs = plt.subplots(nrows, ncols)
plt.show()
where all 4 of the plots are created together and their axes are placed in an array. This helps me keep track of what is happening with each sub-figure via calls like axs[0, 0].plot() (for a grid with more than one row and more than one column the array is two-dimensional and is indexed by row and then column). You can also provide plt.subplots() with arguments like 'sharex', 'sharey' and 'figsize'. If we leave 'figsize' at the default value then our figure will often appear cramped and won't display our data effectively. 'sharex' and 'sharey' allow you to ensure that all of the axes have the same scales across the subplots by setting their values to True. Alternatively they can be set to 'col' or 'row' to share the axis scales only within each column or each row. An example of this is as follows,
import numpy as np
x = np.linspace(3, 5, 100)
y = x**2
x1 = np.linspace(2, 5, 100)
y1 = x1**3
nrows, ncols = 2, 1
fig, axs = plt.subplots(nrows, ncols, sharex='col')
axs[0].plot(x, y)
axs[1].plot(x1, y1)
plt.show()
where I have used numpy to define some simple power law data. Note that I have deliberately defined the data over different ranges, but the resulting figure below has the same range of x values for both subplots. This is because I have set the 'sharex' argument to 'col', which also removes the x tick labels for the first subplot. Alternatively I could have set 'sharex' equal to True, as in this instance there is only one column. The above example also illustrates how to index into the axs array to give each sub-figure detail. Functions similar to axs[1].plot(), like axs[1].set_xlabel(), can be used to give each sub-figure further specific properties.
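To see 'figsize', 'sharex' and 'sharey' together on a larger grid, here is a rough sketch with a 2 by 2 layout. The sine curves, the 8 by 6 inch figure size and the titles are illustrative assumptions of mine rather than anything from the figures in this article.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only: sine curves of increasing frequency.
x = np.linspace(0, 2 * np.pi, 200)

# figsize is (width, height) in inches; sharex/sharey=True link every panel.
fig, axs = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)

# For a 2x2 grid axs is a two-dimensional array, so axs.flat is used here
# to visit the panels one at a time, row by row.
for k, ax in enumerate(axs.flat, start=1):
    ax.plot(x, np.sin(k * x))
    ax.set_title(f'sin({k}x)')

# A single panel is addressed as axs[row, column].
axs[1, 0].set_xlabel('x')
axs[1, 1].set_xlabel('x')
```

Looping over axs.flat is handy when every panel gets the same treatment, while axs[row, column] is clearer when only one or two panels need something specific. Finish with plt.show() to display the result.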
We might also want to remove the whitespace between the two sub-figures since they share the same x-axis, and this can be done with plt.subplots_adjust(hspace=0). We will also want to add x and y labels to the figure. If the y-axes correspond to different variables, we can do this for each sub-figure as described above using axs[i].set_ylabel(). Alternatively, if the y-axis of both sub-figures represents the same variable, we want a single global label for the pair that sits nicely in the middle of our figure. We can do this by encasing the figure inside a new subplot that has no frame, axis ticks or tick labels but does have axes labels. We use the fig.add_subplot() function to do this, as shown in the code below,
nrows, ncols = 2, 1
fig, axs = plt.subplots(nrows, ncols, sharex='col')
axs[0].plot(x, y)
axs[1].plot(x1, y1)
# Add a frameless subplot covering the whole figure to carry the shared labels
fig.add_subplot(111, frame_on=False)
# Hide its ticks and tick labels so that only the axes labels remain visible
plt.tick_params(labelcolor="none", bottom=False, left=False)
plt.xlabel('x')
plt.ylabel('y')
plt.subplots_adjust(hspace=0)
plt.savefig('Fig2.png')
plt.show()
plt.close()
where the arguments inside add_subplot() and tick_params() ensure that we don't see anything other than the assigned labels for the global figure. We can see the result in the figure below.
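As an aside, more recent versions of matplotlib (3.4 and later, as far as I am aware) also provide figure-level labels through fig.supxlabel() and fig.supylabel(). The sketch below is an alternative to the frameless-subplot approach above, not a replacement for it, and it assumes a new enough matplotlib is installed. The label placement may differ slightly from the figure shown.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same power law data as in the example above.
x = np.linspace(3, 5, 100)
x1 = np.linspace(2, 5, 100)

fig, axs = plt.subplots(2, 1, sharex='col')
axs[0].plot(x, x**2)
axs[1].plot(x1, x1**3)

# Figure-level labels (assumes matplotlib 3.4 or later).
xlabel = fig.supxlabel('x')
ylabel = fig.supylabel('y')

# Remove the vertical gap between the two sub-figures.
fig.subplots_adjust(hspace=0)
```

This avoids the extra invisible subplot, at the cost of requiring a newer matplotlib. Finish with plt.show() to display the result.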
We can create more complex figures like the one below with a bit more work. The code for this figure can be found on my github linked at the end of the article. I have used the plt.subplots() function to define a $4\times4$ array of sub-figures and then removed the axes of the sub-figures in the top right of the graph using axs[j, i].axis('off') for i > j. I have also rotated the x-ticks and adjusted the positioning of the x-label so that it is not obscured.
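The general idea can be sketched as follows. This is not the code behind the figure above (that lives on my github). It is a stripped-down illustration in which the random data, the choice of histograms on the diagonal and scatter plots below it, and the 45 degree tick rotation are all assumptions made for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data: 200 samples of 4 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))
n = data.shape[1]

fig, axs = plt.subplots(n, n, figsize=(8, 8))

for j in range(n):        # row index
    for i in range(n):    # column index
        ax = axs[j, i]
        if i > j:
            # Top-right triangle: hide the axes completely.
            ax.axis('off')
        elif i == j:
            # Diagonal: distribution of a single variable.
            ax.hist(data[:, i], bins=20)
        else:
            # Lower triangle: one variable plotted against another.
            ax.scatter(data[:, i], data[:, j], s=4)
        # Rotate the x tick labels so that they do not overlap.
        ax.tick_params(axis='x', labelrotation=45)
```

Finish with plt.show() to display the result.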
One of the disadvantages of the plt.subplots() environment is that it offers only limited control over the size of individual sub-figures (relative row heights and column widths can be set through its gridspec_kw argument, but a sub-figure cannot span several cells of the grid). For this we can use the GridSpec environment, which I will hopefully cover the basics of in a future post. However, for most purposes the subplots environment is sufficient and can improve the quality of your data visualisation significantly. I hope that this article has been informative. The code to produce the graphs shown can be found here.
Thanks for reading!
Further reading: