In [1]:
# do imports here

I saved you some time by pre-downloading some data in .csv format from the USGS Earthquakes Database. It is located at:

http://www.ldeo.columbia.edu/~rpa/usgs_earthquakes_2014.csv

You don't even need to download it. You can open it directly with Pandas.

Part I: Read Earthquake Data

We don't need any groupby to do Part I.

1) Use Pandas' read_csv function directly on this URL to open it as a DataFrame

(Don't use any special options). Display the first few rows and the DataFrame info.
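As a rough sketch of what this step looks like, the block below reads a tiny in-memory CSV with default options. The three rows are made up for illustration, and the column names (time, latitude, longitude, mag, id, updated, place) are an assumption about the file layout, so check them against the real file. With the real data you would pass the URL string to read_csv instead of the buffer.

```python
import io

import pandas as pd

# Made-up rows standing in for the real file. The column names are an
# assumption about the file layout, not taken from the real data.
sample_csv = io.StringIO(
    "time,latitude,longitude,mag,id,updated,place\n"
    '2014-01-01 00:10:00,61.0,-150.0,2.5,ex0001,2014-01-02 08:00:00,"10km NE of Exampletown, Alaska"\n'
    '2014-01-01 03:20:00,36.5,140.2,5.1,ex0002,2014-01-02 09:30:00,"52km W of Sampleville, Japan"\n'
    '2014-01-01 07:45:00,-20.3,-70.1,6.4,ex0003,2014-01-03 11:15:00,"80km SW of Testport, Chile"\n'
)

# With the real data, pass the URL string here instead of the buffer.
df = pd.read_csv(sample_csv)

print(df.head())  # first few rows
df.info()         # column dtypes and non-null counts
```

In the info output, the time and updated columns should show up as plain text rather than datetimes.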

In [ ]:
 
In [ ]:
 

You should have seen that the dates were not automatically parsed into datetime types.

2) Re-read the data in such a way that all date columns are identified as dates and the earthquake id is used as the index

Verify that this worked using the head and info functions.
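One possible way to do this is with the parse_dates and index_col arguments of read_csv. The sketch below again uses made-up rows, and it assumes the date columns are called time and updated and the id column is called id; adjust the names to whatever the real file uses.

```python
import io

import pandas as pd

# Made-up rows; the column names time, updated and id are assumptions.
sample_csv = io.StringIO(
    "time,latitude,longitude,mag,id,updated,place\n"
    '2014-01-01 00:10:00,61.0,-150.0,2.5,ex0001,2014-01-02 08:00:00,"10km NE of Exampletown, Alaska"\n'
    '2014-01-01 03:20:00,36.5,140.2,5.1,ex0002,2014-01-02 09:30:00,"52km W of Sampleville, Japan"\n'
    '2014-01-01 07:45:00,-20.3,-70.1,6.4,ex0003,2014-01-03 11:15:00,"80km SW of Testport, Chile"\n'
)

df = pd.read_csv(
    sample_csv,
    parse_dates=["time", "updated"],  # convert these columns to datetimes
    index_col="id",                   # use the earthquake id as the row index
)

print(df.head())
df.info()
```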

In [ ]:
 
In [ ]:
 

3) Use describe to get the basic statistics of all the columns

Note the highest and lowest magnitude of earthquakes in the database.
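For reference, describe returns a DataFrame of summary statistics, so the extremes can be read straight from its max and min rows. This is a small sketch on made-up numbers; the column name mag is an assumption.

```python
import pandas as pd

# Made-up values for illustration; "mag" is an assumed column name.
df = pd.DataFrame(
    {
        "latitude": [61.0, 36.5, -20.3, 40.1],
        "mag": [2.5, 5.1, 6.4, 1.2],
    },
    index=pd.Index(["ex0001", "ex0002", "ex0003", "ex0004"], name="id"),
)

stats = df.describe()  # count, mean, std, min, quartiles, max per numeric column
print(stats)

print("highest magnitude:", stats.loc["max", "mag"])
print("lowest magnitude:", stats.loc["min", "mag"])
```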

In [ ]:
 

4) Use nlargest to get the top 20 earthquakes by magnitude

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.nlargest.html
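The link above documents the Series version. A DataFrame has an nlargest method too, which keeps all columns of the selected rows. The sketch below shows both on made-up data, assuming the magnitude column is called mag. If there are fewer than 20 rows, all of them are returned in descending order.

```python
import pandas as pd

# Made-up values for illustration; "mag" and "place" are assumed column names.
df = pd.DataFrame(
    {
        "mag": [2.5, 5.1, 6.4, 1.2, 4.8],
        "place": ["A", "B", "C", "D", "E"],
    },
    index=pd.Index(["ex0001", "ex0002", "ex0003", "ex0004", "ex0005"], name="id"),
)

# DataFrame version: whole rows, sorted by descending magnitude.
top = df.nlargest(20, "mag")
print(top)

# Series version: only the magnitudes, same ordering.
top_mags = df["mag"].nlargest(20)
print(top_mags)
```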

In [ ]:
 

Examine the structure of the place column. The country information seems to be in there. How would you get it out?

5) Extract the country using Pandas text data functions

Add it as a new column to the dataframe.
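One simple approach, sketched below, assumes that the country or state is whatever follows the last comma in the place string. That is only a guess about the format: the place values here are made up, and entries without a comma are kept whole, so inspect the real column before relying on it.

```python
import pandas as pd

# Made-up place strings; the "<distance> of <town>, <country>" pattern is an assumption.
df = pd.DataFrame(
    {
        "place": [
            "10km NE of Exampletown, Alaska",
            "52km W of Sampleville, Japan",
            "Southern Mid-Atlantic Ridge",  # no comma: kept as-is
        ]
    }
)

# Split on commas, take the last piece, and trim surrounding whitespace.
df["country"] = df["place"].str.split(",").str[-1].str.strip()
print(df)
```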

In [ ]:
 
In [ ]:
 

7) Make a bar chart of the top 5 earthquake magnitudes vs country/state
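A minimal sketch of one way to read this task: pick the five largest magnitudes with nlargest and draw them as bars labeled by country. The data is made up, and the column names mag and country are assumptions (country being the column added in the previous step).

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values for illustration; "mag" and "country" are assumed column names.
df = pd.DataFrame(
    {
        "mag": [8.2, 7.9, 7.7, 7.6, 7.5, 6.0, 5.5],
        "country": ["Chile", "Alaska", "Chile", "Solomon Islands",
                    "Papua New Guinea", "Japan", "Mexico"],
    }
)

top5 = df.nlargest(5, "mag")  # five strongest events

ax = top5.plot.bar(x="country", y="mag", legend=False)
ax.set_xlabel("country/state")
ax.set_ylabel("magnitude")
plt.tight_layout()
```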

In [ ]:
 

8) Create a filtered dataset that only has earthquakes of magnitude 4 or larger
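Filtering by magnitude can be done with a boolean mask. The sketch below uses made-up values and assumes the magnitude column is called mag; df_big is just an illustrative name.

```python
import pandas as pd

# Made-up values for illustration; "mag" is an assumed column name.
df = pd.DataFrame({"mag": [2.5, 4.0, 6.4, 1.2, 4.8, 3.9]})

# Boolean mask: True for rows with magnitude 4 or larger.
df_big = df[df["mag"] >= 4]
print(df_big)
```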

In [ ]:
 

9) Make a histogram of the distribution of the earthquake magnitudes

https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.DataFrame.hist.html

https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html

Do one subplot for the filtered and one for the unfiltered dataset. Use a logarithmic scale. What sort of relationship do you see?
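The layout could look roughly like the sketch below: two side-by-side axes, each with a histogram whose count axis is logarithmic (log=True is passed through to matplotlib's hist). The magnitudes are synthetic random numbers, not the real catalog, and the names mag and df_big are assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic magnitudes for illustration only, not the real catalog.
rng = np.random.default_rng(0)
df = pd.DataFrame({"mag": rng.exponential(scale=1.2, size=2000)})
df_big = df[df["mag"] >= 4]  # filtered dataset

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# log=True puts the count axis on a logarithmic scale.
df["mag"].hist(ax=axes[0], bins=30, log=True)
axes[0].set_title("all earthquakes")

df_big["mag"].hist(ax=axes[1], bins=30, log=True)
axes[1].set_title("magnitude 4 or larger")

for ax in axes:
    ax.set_xlabel("magnitude")
    ax.set_ylabel("count (log scale)")
fig.tight_layout()
```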

In [ ]:
 

11) Visualize the locations of earthquakes by making a scatterplot of their latitude and longitude.

Do both the filtered and unfiltered datasets. Color the points by magnitude. Make it pretty.

What difference do you note between the filtered and unfiltered datasets?
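As a starting point, the sketch below draws one scatterplot per dataset with matplotlib, coloring each point by magnitude and adding a colorbar. The coordinates and magnitudes are random synthetic values, and the column names latitude, longitude and mag are assumptions. The styling choices (colormap, marker size) are only suggestions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic locations and magnitudes for illustration only.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame(
    {
        "latitude": rng.uniform(-90, 90, n),
        "longitude": rng.uniform(-180, 180, n),
        "mag": rng.exponential(1.2, n),
    }
)
df_big = df[df["mag"] >= 4]  # filtered dataset

fig, axes = plt.subplots(2, 1, figsize=(10, 9), sharex=True)
panels = [(df, "all earthquakes"), (df_big, "magnitude 4 or larger")]

for ax, (data, title) in zip(axes, panels):
    # c= maps each point's magnitude onto the colormap.
    sc = ax.scatter(data["longitude"], data["latitude"],
                    c=data["mag"], s=8, cmap="viridis")
    fig.colorbar(sc, ax=ax, label="magnitude")
    ax.set_title(title)
    ax.set_ylabel("latitude")

axes[-1].set_xlabel("longitude")
```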

In [ ]:
 
In [ ]:
 
In [ ]: