# Simple Line Plots: Using matplotlib

This post belongs to a series of tutorials about how to draw simple line plots for academic papers. Information about the data used is available in the first post of the series, and the source code is on GitHub. Here I will focus on matplotlib, a Python plotting library. The other two implementations use PGFPLOTS and ggplot2.

Since the data is split across three files, the first step is to read those files and build lists to hold the data:

import csv
import matplotlib.pyplot as plt

implementations = ["gpu", "cpuParallel", "cpuSerial"]
input_file_suffix = ".csv"
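x_values = list(range(10, 20))
y_values = []
for i in implementations:
    implementation_y_values = []
    with open(i + input_file_suffix) as csv_file:
        csv_reader = csv.reader(csv_file)
        # Skip the first line with the columns' names
        next(csv_reader)
        for row in csv_reader:
            # The first column is parsed as a number so it can be plotted
            implementation_y_values.append(float(row[0]))
    y_values.append(implementation_y_values)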

Now we have two structures with the coordinates for the plot: `x_values` (a list with the dataset sizes, stored as exponents of 2) and `y_values` (a list containing one list of values for each implementation).
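To make the reading step easier to try without the real data files, here is a minimal sketch of the same idea applied to an in-memory file. The helper name `read_first_column` and the sample values are made up for illustration. It assumes, as in the loop above, that the values of interest are in the first column and that the first row is a header.

```python
import csv
import io


def read_first_column(csv_file):
    """Return the first column of a CSV file as floats, skipping the header row."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the line with the columns' names
    return [float(row[0]) for row in reader if row]


# Hypothetical in-memory data standing in for one of the real .csv files
sample_file = io.StringIO("elapsed_time\n0.12\n0.25\n0.51\n")
sample_y_values = read_first_column(sample_file)
print(sample_y_values)
```

Converting to `float` while reading means matplotlib receives numbers rather than strings, which matters once the Y axis is switched to a logarithmic scale.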

Next, a couple of configurations regarding the size and the background of the canvas where the plot will be drawn:

# Format figure
plt.rcParams["figure.figsize"] = [10, 10]
plt.rcParams["figure.facecolor"] = 'white'

After that, the actual drawing is as follows:

# Construct the plot with lines for different implementations
gpuLine, = plt.plot(x_values, y_values[0], '-ko', linewidth=4, markersize=20)
cpuParallelLine, = plt.plot(x_values, y_values[1], '-k^', linewidth=4, markersize=20)
cpuSerialLine, = plt.plot(x_values, y_values[2], '-ks', linewidth=4, markersize=20)

To see how it looks, we can call `plt.show()`. This gives us the following partial result:

Note that each implementation gets its own call to `plot()`, and the calls differ in only two parameters: the values of the Y coordinates and the style of the line and markers (`-ko` for a solid black line with circles, `-k^` for a solid black line with triangles, `-ks` for a solid black line with squares). For all the lines, I chose a thicker line and a larger marker than the default, since I believe this makes the plot easier to read.
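If the compact format strings look cryptic, the sketch below draws the same style twice: once with the shorthand and once with explicit keyword arguments. The data points are made up, and the `Agg` backend is only there so the snippet runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = [10, 11, 12]
y = [0.1, 0.2, 0.4]  # made-up values, for illustration only

fig, ax = plt.subplots()
# Shorthand: '-' solid line, 'k' black, '^' triangle markers
short_line, = ax.plot(x, y, "-k^")
# The same style spelled out with keyword arguments
long_line, = ax.plot(x, y, linestyle="-", color="k", marker="^")

print(short_line.get_linestyle(), short_line.get_marker())
print(long_line.get_linestyle(), long_line.get_marker())
```

The shorthand keeps the three `plot()` calls short, while the keyword form may be easier to read for someone who has not memorized the format characters.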

The next step is to format the axes: set the axis labels and the tick labels, choose the font for both sets of labels, keep only the ticks on the left and the bottom, switch the Y axis to a logarithmic scale, set lower and upper bounds for both axes, and add horizontal grid lines:

# Format axes
plt.ylabel("Elapsed time (s)", fontname='Times New Roman', fontsize=35)
plt.xlabel("|R|", fontname='Times New Roman', fontsize=35)
# Raw strings keep the backslash in \mathregular intact
x_labels = [r"$\mathregular{2^{" + str(label) + r"}}$" for label in x_values]
plt.xticks(x_values, x_labels, fontname='Times New Roman')
plt.yticks(fontname='Times New Roman')
# Booleans replace the 'off' strings accepted by older matplotlib versions
plt.tick_params(axis='y', right=False, which='both', labelsize=35)
plt.tick_params(axis='y', left=False, which='minor')
# This keyword was called nonposy before matplotlib 3.3
plt.yscale('log', nonpositive='clip')
plt.axis([9, 20, 0.05, 50000])
# gca() returns the current axes instead of creating new ones
plt.gca().yaxis.grid()
plt.tight_layout()
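The same kind of formatting can also be written against an explicit `Axes` object instead of the `plt` state machine. The sketch below is one possible way to do it, not the version used in this series. The timings are made up, and fonts and tick sizes are left out to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x_values = list(range(10, 20))
y_values = [0.1 * 2 ** k for k in range(10)]  # made-up timings

fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(x_values, y_values, "-ko", linewidth=4, markersize=20)

ax.set_xlabel("|R|")
ax.set_ylabel("Elapsed time (s)")
ax.set_yscale("log")      # logarithmic Y axis
ax.set_xlim(9, 20)        # lower/upper bounds for X
ax.set_ylim(0.05, 50000)  # lower/upper bounds for Y
ax.set_xticks(x_values)
ax.set_xticklabels([r"$\mathregular{2^{%d}}$" % v for v in x_values])
ax.yaxis.grid(True)       # horizontal grid lines only
fig.tight_layout()
```

Holding on to `fig` and `ax` makes it explicit which plot each call modifies, which helps once a script draws more than one figure.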

It looks almost done, but the legend is still missing:

We can fix that with the following configurations:

# Format legend
plt.legend([cpuSerialLine, cpuParallelLine, gpuLine], ["CPU (Serial)", "CPU (Parallel)", "GPU"],
           frameon=False, numpoints=1, loc='upper left', prop={'family': 'Times New Roman', 'size': 32})
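An alternative worth knowing is to attach the label to each line when it is drawn and let matplotlib collect the handles afterwards. The sketch below uses made-up data and reverses the collected handles to get the serial, parallel, GPU order. It is an illustration of the technique, not the code from the repository.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = [10, 11, 12]  # made-up data, for illustration only

fig, ax = plt.subplots()
ax.plot(x, [0.1, 0.2, 0.4], "-ko", label="GPU")
ax.plot(x, [1.0, 2.0, 4.0], "-k^", label="CPU (Parallel)")
ax.plot(x, [5.0, 10.0, 20.0], "-ks", label="CPU (Serial)")

# Handles and labels come back in drawing order; reverse them for the legend
handles, labels = ax.get_legend_handles_labels()
legend = ax.legend(handles[::-1], labels[::-1],
                   frameon=False, numpoints=1, loc="upper left")

print([text.get_text() for text in legend.get_texts()])
```

With this approach the label stays next to the `plot()` call it describes, so the two lists in the legend call cannot drift out of sync.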

This gives us the look we wanted, and now we can save the plot to a .pdf file:

# Export figure
plt.savefig('linePlot.pdf', format='pdf')

The full source file is available on GitHub. The comment section is open for discussion and suggestions about the design choices for the plot or about how they were implemented in this tutorial.