Speed, Performance, and Profiling in Python

1) Write a function that finds the maximum of a list of numbers using only core Python (no NumPy).
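
One possible starting point is sketched below. The name `find_max` and the choice to raise `ValueError` on empty input (mirroring the built-in `max`) are assumptions made for illustration, not part of the exercise.

```python
def find_max(numbers):
    """Return the largest item of a non-empty iterable using a plain loop."""
    iterator = iter(numbers)
    try:
        largest = next(iterator)  # start from the first element
    except StopIteration:
        raise ValueError("find_max() got an empty sequence") from None
    for value in iterator:
        if value > largest:
            largest = value
    return largest


print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))  # 9
```

Iterating instead of indexing lets the same function accept lists, tuples, or generators.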

In [ ]:
 

2) Apply this function to lists of 100, 1000, 10000, 100000, and 1000000 randomly generated numbers. Use the %timeit magic to measure the execution speed.

(You can use numpy to generate the random numbers.)
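
In a notebook, a single measurement is just `%timeit find_max(data)`. To collect the numbers programmatically, a rough equivalent can be built on the standard-library `timeit` module, as in the sketch below. The helper `best_time`, the dictionary `python_times`, and the power-of-ten sizes are illustrative assumptions.

```python
import timeit

import numpy as np


def find_max(numbers):
    """Pure-Python maximum of a non-empty sequence."""
    iterator = iter(numbers)
    largest = next(iterator)
    for value in iterator:
        if value > largest:
            largest = value
    return largest


def best_time(func, arg, number=3, repeat=3):
    """Best average seconds per call; a rough stand-in for %timeit."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


rng = np.random.default_rng(seed=0)
sizes = [100, 1_000, 10_000, 100_000, 1_000_000]

python_times = {}
for n in sizes:
    data = rng.random(n).tolist()  # plain Python list of floats
    python_times[n] = best_time(find_max, data)
    print(f"n={n:>9,d}  {python_times[n]:.2e} s per call")
```

Converting to a list with `tolist()` keeps the comparison fair: looping over a NumPy array element by element in Python is considerably slower than looping over a list.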

In [ ]:
 

3) Repeat the timing for NumPy's max function.
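
A minimal sketch of the NumPy version follows, again using `timeit` in place of the `%timeit` magic so the results can be stored. The names `numpy_times` and `sizes` are assumptions chosen for illustration.

```python
import timeit

import numpy as np

rng = np.random.default_rng(seed=0)
sizes = [100, 1_000, 10_000, 100_000, 1_000_000]

numpy_times = {}
for n in sizes:
    arr = rng.random(n)  # keep the data as a NumPy array this time
    number = 20
    timer = timeit.Timer(lambda: np.max(arr))
    # Best of three runs, averaged over `number` calls each
    numpy_times[n] = min(timer.repeat(repeat=3, number=number)) / number
    print(f"n={n:>9,d}  {numpy_times[n]:.2e} s per call")
```

In a notebook cell the equivalent one-liner is `%timeit np.max(arr)`.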

In [ ]:
 

4) Put all of the timings above into a pandas DataFrame and plot them.
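
The sketch below shows one way to arrange the timings: one row per list size, one column per implementation, plotted on log-log axes. It re-measures the timings so that it runs on its own; the column names and the helper `best_time` are assumptions, and the plot requires matplotlib to be installed.

```python
import timeit

import numpy as np
import pandas as pd


def find_max(numbers):
    """Pure-Python maximum of a non-empty sequence."""
    iterator = iter(numbers)
    largest = next(iterator)
    for value in iterator:
        if value > largest:
            largest = value
    return largest


def best_time(func, arg, number=3, repeat=3):
    """Best average seconds per call."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


rng = np.random.default_rng(seed=0)
sizes = [100, 1_000, 10_000, 100_000, 1_000_000]
arrays = [rng.random(n) for n in sizes]

df = pd.DataFrame(
    {
        "pure_python_s": [best_time(find_max, a.tolist()) for a in arrays],
        "numpy_s": [best_time(np.max, a) for a in arrays],
    },
    index=pd.Index(sizes, name="n_elements"),
)
print(df)

# Both axes span several orders of magnitude, so use log-log scaling.
ax = df.plot(loglog=True, marker="o")
ax.set_ylabel("seconds per call")
```

On log-log axes, a straight line with slope 1 indicates run time proportional to the number of elements.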

In [ ]:
 

5) Now do the same thing with dask

Use array sizes from 10,000 to 100,000,000 and chunk sizes from 1,000 to 1,000,000. Only test combinations where the chunk size is less than the array size.
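
The loop over size and chunk combinations could be sketched as follows. To keep the example quick, it deliberately uses a reduced grid (arrays up to 1,000,000 elements, chunks up to 100,000); the full ranges from the exercise can be substituted in `array_sizes` and `chunk_sizes`. The helper name `time_dask_max` and the choice to wrap an in-memory NumPy array with `da.from_array` are assumptions.

```python
import time

import dask.array as da
import numpy as np


def time_dask_max(n, chunk, repeat=3):
    """Best wall-clock seconds to compute the max of n values with a given chunk size."""
    data = np.random.default_rng(seed=0).random(n)
    arr = da.from_array(data, chunks=chunk)  # lazy, chunked view of the data
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        arr.max().compute()  # builds and runs the task graph
        best = min(best, time.perf_counter() - start)
    return best


# Reduced grid for illustration; extend toward the sizes in the exercise.
array_sizes = [10_000, 100_000, 1_000_000]
chunk_sizes = [1_000, 10_000, 100_000]

results = {
    (n, c): time_dask_max(n, c)
    for n in array_sizes
    for c in chunk_sizes
    if c < n  # only chunk sizes smaller than the array
}

for (n, c), seconds in results.items():
    print(f"n={n:>9,d}  chunk={c:>7,d}  {seconds:.2e} s")
```

Small chunks produce many tasks, so scheduler overhead tends to dominate; timing several chunk sizes for each array makes that trade-off visible.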

In [ ]:
 

(Extra) Play around with this dataset

In [ ]:
import intake

# Open the Pangeo catalog and load the dataset lazily with dask
catalog_url = 'https://github.com/pangeo-data/pangeo/raw/master/gce/catalog.yaml'
ds = intake.Catalog(catalog_url).newmann_zarr.to_dask()
ds