# Assignment 6 - Numpy and Matplotlib

### Due Thursday October 12

Your assignment should be handed in as an IPython notebook checked into your GitHub repository in a new folder named `assignment_6`. To download this assignment, your best option is to clone the original GitHub repository for the course website:

`git clone https://github.com/rabernat/research_computing.git`

and then navigate to the `assignment` folder.

## 1 Plotting and analyzing ARGO float data

#### 1.1 Import numpy

#### 1.2 Use the shell command `curl` to download an example ARGO float profile from the North Atlantic.

The data file's url is http://www.ldeo.columbia.edu/~rpa/argo_float_4901412.npz

#### 1.3 Load the data file
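As a rough sketch of how `np.load` treats an `.npz` archive, the snippet below writes a tiny stand-in archive to a temporary folder so that it runs without the download. The key names `T` and `level` and all values are invented for illustration; inspect `.files` on the real file to see what it actually contains.

```python
import os
import tempfile

import numpy as np

# Build a tiny stand-in archive (hypothetical keys and values, not the real ARGO data).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example.npz")
np.savez(path, T=np.array([[10.0, np.nan], [8.0, 7.5]]), level=np.arange(2))

# np.load returns a dict-like NpzFile; each array is read when you index by its key.
data = np.load(path)
print(data.files)      # names of the arrays stored in the archive
T_raw = data["T"]
print(T_raw.shape)
```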

#### 1.4 Extract the temperature, salinity and pressure arrays to arrays T, S and P, and mask out invalid data (the nan values from missing points).
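One possible way to mask missing points is NumPy's masked-array module. The sketch below uses a small made-up array in place of the real temperature data; `T_raw` is a hypothetical name.

```python
import numpy as np

# Hypothetical stand-in for a raw temperature array with missing points.
T_raw = np.array([[10.0, np.nan, 9.5],
                  [8.0, 7.5, np.nan]])

# masked_invalid masks every NaN (and inf) entry and leaves the rest untouched.
T = np.ma.masked_invalid(T_raw)

print(T.mask)     # True where the data were missing
print(T.count())  # number of valid (unmasked) values
print(T.mean())   # reductions skip the masked entries
```

Once an array is masked, reductions such as `mean` and `std` ignore the missing points, so there should be no need to filter NaNs by hand later on.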

#### 1.5 Extract the date, lat, lon, and level arrays.

#### 1.5 Note the shapes of T, S and P compared to these arrays. How do they line up?

#### 1.6 Load the necessary package for plotting using pyplot from matplotlib.

#### 1.7 Make a 1 x 3 array of plots for each column of data in T, S and P.

The vertical scale should be the `levels` data. Flip the vertical axis direction so that levels increase downward on the plot. Each plot should have a line for each column of data. It will look messy. Make sure you label the axes and put a title on each subplot.
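The layout described above could be sketched roughly as follows. The data here are synthetic, and the assumption that rows correspond to levels and columns to individual profiles should be checked against the shapes of the real arrays.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins: rows are levels, columns are individual profiles (an assumption).
rng = np.random.default_rng(0)
level = np.arange(20)
n_profiles = 5
T = 20 - 0.8 * level[:, None] + rng.normal(0, 0.5, (level.size, n_profiles))
S = 35 + 0.02 * level[:, None] + rng.normal(0, 0.05, (level.size, n_profiles))
P = 25.0 * level[:, None] + rng.normal(0, 2.0, (level.size, n_profiles))

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(12, 5))
panels = [(T, "Temperature"), (S, "Salinity"), (P, "Pressure")]
for ax, (field, name) in zip(axes, panels):
    # A 2D x with a 1D y draws one line per column of x.
    ax.plot(field, level)
    ax.invert_yaxis()  # levels increase downward
    ax.set_xlabel(name)
    ax.set_ylabel("Level")
    ax.set_title(name + " profiles")
fig.tight_layout()
```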

#### 1.8 Compute the mean and standard deviation of each of T, S and P at each depth in `level`.
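A small sketch of a per-level reduction, assuming each row of the array is one level and each column is one profile. The values are made up, and if the real arrays are laid out the other way the axis argument would need to change.

```python
import numpy as np

# Hypothetical array: 3 levels (rows) by 4 profiles (columns), one missing value.
T = np.ma.masked_invalid(np.array([[10.0, 11.0, 12.0, np.nan],
                                   [8.0, 8.0, 8.0, 8.0],
                                   [4.0, 6.0, 4.0, 6.0]]))

# Reducing over axis=1 collapses the profile dimension: one value per level.
T_mean = T.mean(axis=1)
T_std = T.std(axis=1)

print(T_mean.shape)
print(T_mean)
print(T_std)
```

Reducing over the other axis (`axis=0`) would instead give one value per column.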

#### 1.9 Now make a similar plot, but show only the mean T, S and P at each depth. Show error bars on each plot using the standard deviations.

Again, make sure you label the axes and put a title on each subplot.
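For a profile plot the data vary along the horizontal axis, so the spread belongs in `xerr` rather than `yerr`. A minimal single-panel sketch with made-up numbers:

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic mean profile and spread (made-up values for illustration).
level = np.arange(10)
T_mean = 20 - 1.5 * level
T_std = np.linspace(1.0, 0.2, level.size)

fig, ax = plt.subplots(figsize=(4, 5))
# Horizontal error bars: the standard deviation is passed as xerr.
ax.errorbar(T_mean, level, xerr=T_std, fmt="o-", capsize=3)
ax.invert_yaxis()
ax.set_xlabel("Mean temperature")
ax.set_ylabel("Level")
ax.set_title("Mean temperature profile")
```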

#### 1.10 Compute the mean and standard deviation of each of T, S and P for each time in `date`.

#### 1.11 Plot the mean T, S and P for each entry in *time*, now on a *3 x 1* subplot grid with time on the horizontal axis. Show error bars on each plot using the standard deviations.
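With time on the horizontal axis, the error bars become vertical (`yerr`). The sketch below uses synthetic dates and statistics; the dictionaries `means` and `stds` are hypothetical containers chosen only to keep the loop short.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic time axis and per-time statistics (made-up values for illustration).
days = np.arange(0, 30, 3)
date = np.datetime64("2012-01-01") + days.astype("timedelta64[D]")
means = {"Temperature": 15 + np.sin(days / 5.0),
         "Salinity": 35 + 0.1 * np.cos(days / 5.0),
         "Pressure": 1000 + 5.0 * np.sin(days / 7.0)}
stds = {"Temperature": np.full(days.size, 0.3),
        "Salinity": np.full(days.size, 0.02),
        "Pressure": np.full(days.size, 1.5)}

fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(8, 8), sharex=True)
for ax, name in zip(axes, means):
    # Vertical error bars: the standard deviation is passed as yerr.
    ax.errorbar(date, means[name], yerr=stds[name], fmt="o-", capsize=3)
    ax.set_ylabel(name)
    ax.set_title("Mean " + name.lower() + " over time")
axes[-1].set_xlabel("Date")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```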

#### 1.12 Create a scatter plot of the positions of the ARGO float data. Color the positions by the date. Add a grid overlay.

Don't forget to label the axes!
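`scatter` expects numeric values for its color argument, so one option is to convert the dates to elapsed days first. The track below is synthetic, and `days_elapsed` is a hypothetical helper variable.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic float track (made-up positions and dates for illustration).
days = np.arange(0, 100, 10)
date = np.datetime64("2012-01-01") + days.astype("timedelta64[D]")
lon = -40 + 0.05 * days
lat = 35 + 2 * np.sin(days / 30.0)

# Convert dates to floating-point days since the first profile.
days_elapsed = (date - date[0]) / np.timedelta64(1, "D")

fig, ax = plt.subplots()
sc = ax.scatter(lon, lat, c=days_elapsed, cmap="viridis")
fig.colorbar(sc, ax=ax, label="Days since first profile")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title("Float positions")
ax.grid(True)
```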

## 2 Matrix multiplication revisited

#### 2.1 Create a function called myMatrixMultiply that takes input matrices X and Y and computes their matrix product.

Use the same three-loop formulation from Assignment 5. If you want, you can replace the innermost loop with the sum operation or a matrix dot product, since that may speed things up a bit.
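A three-loop formulation might look like the sketch below. It is not necessarily the version from Assignment 5; the shape check and the small test matrices are additions for illustration.

```python
import numpy as np

def myMatrixMultiply(X, Y):
    """Multiply X (n x m) by Y (m x p) using three explicit loops."""
    n, m = X.shape
    m2, p = Y.shape
    if m != m2:
        raise ValueError("inner dimensions must match")
    Z = np.zeros((n, p))
    for i in range(n):          # rows of X
        for j in range(p):      # columns of Y
            for k in range(m):  # sum over the shared dimension
                Z[i, j] += X[i, k] * Y[k, j]
    return Z

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
print(myMatrixMultiply(A, B))
```

Replacing the innermost loop with `np.sum(X[i, :] * Y[:, j])` is the variation mentioned above.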

#### 2.2 Create ones() square matrices for A and B with n = 100. Use the `%timeit` magic to time the matrix product AB computed with your function `myMatrixMultiply`.

#### 2.3 Now let's see how much faster Numpy's built-in matrix multiplication routine is.

In Numpy, matrix multiplication is done using the `dot()` function. Use the `%timeit` magic to time the matrix product AB for n = 100 computed with `dot()`.
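`%timeit` is an IPython magic, so it only works inside IPython or a Jupyter notebook. In a plain Python script, the standard-library `timeit` module can do a comparable job; the sketch below is one way to set that up, with the repeat counts chosen arbitrarily.

```python
import timeit

import numpy as np

n = 100
A = np.ones((n, n))
B = np.ones((n, n))

# Run the product 100 times per trial, repeat for 5 trials, keep the fastest trial.
trials = timeit.repeat(lambda: np.dot(A, B), number=100, repeat=5)
best_per_call = min(trials) / 100
print(f"best of 5: {best_per_call * 1e6:.1f} microseconds per call")
```

Taking the minimum over trials, rather than the average, reduces the influence of other processes running at the same time.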

Now time how long it takes for n = 1000.

When I ran this on my Mac laptop and used Activity Monitor.app to view the CPU usage of Python, I noticed that it was using up to 400% of my CPU. My laptop has 4 processing cores, so 400% means it was using all four cores to compute the matrix product. In other words, it was using parallel processing to speed up the calculations. Numpy builds such as the one distributed with Anaconda use highly optimized versions of the BLAS linear algebra routines that are part of the Intel Math Kernel Library (MKL). By default, they use a multi-threaded version of the MKL to take advantage of the many processing cores available on modern computers. Let's turn off multithreading and see how much slower it runs.

In your notebook type:

```
import mkl
mkl.set_num_threads(1)
```

Now rerun the n=1000 example using the `dot()` function.
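The `mkl` module used above is provided by the `mkl-service` package and is only available when Numpy is linked against MKL. If `import mkl` fails on your installation, a possible alternative is to limit the thread count through environment variables, as sketched below. This assumes the variables are set before Numpy is first imported, so in a notebook you would restart the kernel first.

```python
import os

# These must be set before NumPy is imported for the first time in the process.
os.environ["MKL_NUM_THREADS"] = "1"       # read by Intel MKL
os.environ["OPENBLAS_NUM_THREADS"] = "1"  # read by OpenBLAS
os.environ["OMP_NUM_THREADS"] = "1"       # read by OpenMP-based libraries

import numpy as np

n = 1000
A = np.ones((n, n))
B = np.ones((n, n))
C = np.dot(A, B)
print(C[0, 0])
```

Which variable takes effect depends on the BLAS library your Numpy build uses, so setting all three is a cautious default rather than a requirement.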