My first brush with NumPy happened over writing a block of code to make a plot using** pylab**. ⇣

`pylab`

is part of`matplotlib`

(in`matplotlib.pylab`

) and tries to give you a MatLab like environment.`matplotlib`

has a number of dependencies, among them`numpy`

which it imports under the common alias`np`

.`scipy`

is not a dependency of`matplotlib`

.

I had a *tuple* (of lows and highs of temperature) of lengh 2 with 31 entries in each (the number of days in the month of July), parsed from this text file:

Boston July Temperatures | |

------------------------- | |

Day High Low | |

------------ | |

1 91 70 | |

2 84 69 | |

3 86 68 | |

4 84 68 | |

5 83 70 | |

6 80 68 | |

7 86 73 | |

8 89 71 | |

9 84 67 | |

10 83 65 | |

11 80 66 | |

12 86 63 | |

13 90 69 | |

14 91 72 | |

15 91 72 | |

16 88 72 | |

17 97 76 | |

18 89 70 | |

19 74 66 | |

20 71 64 | |

21 74 61 | |

22 84 61 | |

23 86 66 | |

24 91 68 | |

25 83 65 | |

26 84 66 | |

27 79 64 | |

28 72 63 | |

29 73 64 | |

30 81 63 | |

31 73 63 |

Given below, are 2 sets of code that do the same thing; one ** without NumPy** and the other

**. They output the following graph using**

*with*NumPy**PyLab**:

**Code without NumPy**

import pylab | |

def loadfile(): | |

inFile = open('julyTemps.txt', 'r') | |

high =[]; low = [] | |

for line in inFile: | |

fields = line.split() | |

if len(fields) < 3 or not fields[0].isdigit(): | |

pass | |

else: | |

high.append(int(fields[1])) | |

low.append(int(fields[2])) | |

return low, high | |

def producePlot(lowTemps, highTemps): | |

diffTemps = [highTemps[i] - lowTemps[i] for i in range(len(lowTemps))] | |

pylab.title('Day by Day Ranges in Temperature in Boston in July 2012') | |

pylab.xlabel('Days') | |

pylab.ylabel('Temperature Ranges') | |

return pylab.plot(range(1,32),diffTemps) | |

producePlot(loadfile()[1], loadfile()[0]) |

**Code with NumPy**

import pylab | |

import numpy as np | |

def loadFile(): | |

inFile = open('julyTemps.txt') | |

high = [];vlow = [] | |

for line in inFile: | |

fields = line.split() | |

if len(fields) != 3 or 'Boston' == fields[0] or 'Day' == fields[0]: | |

continue | |

else: | |

high.append(int(fields[1])) | |

low.append(int(fields[2])) | |

return (low, high) | |

def producePlot(lowTemps, highTemps): | |

diffTemps = list(np.array(highTemps) - np.array(lowTemps)) | |

pylab.plot(range(1,32), diffTemps) | |

pylab.title('Day by Day Ranges in Temperature in Boston in July 2012') | |

pylab.xlabel('Days') | |

pylab.ylabel('Temperature Ranges') | |

pylab.show() | |

(low, high) = loadFile() | |

producePlot(low, high) |

The difference in code lies in how the variable `diffTemps`

is calculated.

diffTemps = list(np.array(highTemps) - np.array(lowTemps))

seems more readable than

diffTemps = [highTemps[i] - lowTemps[i] for i in range(len(lowTemps))]

**Notice how straight forward it is with NumPy.** At the core of the NumPy package, is the *ndarray* object. This encapsulates n-dimensional arrays of homogeneous data types, with many operations being performed in compiled code for performance. element-by-element operations are the “default mode” when an *ndarray* is involved, but the element-by-element operation is speedily executed by pre-compiled C code.