Xarray Tips and Tricks
Build a multi-file dataset from an OPeNDAP server
One thing we love about xarray is the open_mfdataset function, which combines many netCDF files into a single xarray Dataset.
But what if the files are stored on a remote server and accessed over OPeNDAP? An example can be found in NOAA's NCEP Reanalysis catalog.
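As a rough sketch of the idea, assuming the remote files follow a one-file-per-year naming pattern: OPeNDAP endpoints are addressed by URL, so a list of URLs can be passed to open_mfdataset in place of local file paths. The server address, the variable name, and the helper names build_urls and open_remote_dataset below are hypothetical placeholders, not entries from the NOAA catalog.

```python
def build_urls(base_url, variable, years):
    """Return one URL per year, following <base_url>/<variable>.<year>.nc."""
    return [f"{base_url}/{variable}.{year}.nc" for year in years]


def open_remote_dataset(urls):
    """Lazily combine the remote files into a single xarray Dataset."""
    import xarray as xr  # imported here so build_urls works without xarray

    # combine="by_coords" lets xarray order the files using their coordinates
    return xr.open_mfdataset(urls, combine="by_coords")


# Hypothetical server and variable name, for illustration only
urls = build_urls("https://example.org/thredds/dodsC/reanalysis", "air", range(2000, 2003))
print(urls)
# To actually open the data (needs xarray, dask, and a real server):
# ds = open_remote_dataset(urls)
```

Note that open_mfdataset relies on dask, so both packages need to be installed before calling open_remote_dataset.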
more ...
Assignment 10: Maps with Cartopy
Homework 10: Cartopy
1) Plot data from NARR
NARR is NCEP's North American Regional Reanalysis, a widely used product for studying the weather and climate of the continental US. The data is available from NOAA's Earth System Research Laboratory via OPeNDAP, meaning that xarray can open the data "remotely" without downloading a file.
more ...
Maps with Cartopy
Maps in Scientific Python
Making maps is a fundamental part of geoscience research. Maps differ from regular figures in the following principal ways:
- Maps require a projection of geographic coordinates on the 3D Earth to the 2D space of your figure.
- Maps often include extra decorations besides just our data (e.g. continents, country borders, etc.) more ...
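To make the first point concrete, here is a minimal sketch of what a projection does, using the spherical Mercator formula as an example. In practice Cartopy handles this for you; the function name mercator and the choice of a spherical Earth with a mean radius are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def mercator(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to Mercator x/y in meters."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    # y grows without bound toward the poles, so keep |lat| below 90 degrees
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


# Three points on the same meridian, then one on the equator
for lon, lat in [(0, -45), (0, 0), (0, 45), (180, 0)]:
    print((lon, lat), mercator(lon, lat))
```

Equal steps in latitude map to increasingly large steps in y away from the equator, which is why high-latitude regions look stretched on a Mercator map.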
Assignment 9: Performance and Profiling
Dask for Parallel Computing and Big Data
Dask for Parallel Computing in Python
In past lectures, we learned how to use numpy, pandas, and xarray to analyze various types of geoscience data. In this lecture, we address an increasingly common problem: what happens if the data we wish to analyze is "big data"?
Aside: What is "Big Data"?
There is a lot of hype around the buzzword "big data" today. Some people may associate "big data" with specific software platforms (e.g. "Hadoop", "Spark"), while, for others, "big data" means specific machine learning techniques. But I think Wikipedia's definition more ...
Python Environments
Note: many elements in this guide are adapted from Daniel Rothenberg's excellent getting started guide.
Python
Python and nearly all of the software packages in the scientific python ecosystem are open-source. They are maintained and developed by a community of scientists and programmers, some of whose work is supported …
more ...
Assignment 8: Xarray for ENSO
Assignment 8: Xarray Groupby
Here we will calculate the NINO 3.4 index of El Niño variability and use it to analyze datasets.
First read this page from NOAA. It tells you the following.
- The Nino 3.4 region is defined as the region between +/- 5 deg. lat, 170 W - 120 W lon. more ...
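The region definition above can be sketched as a simple point-in-box test. This is only an illustration in plain Python, not the assignment's xarray solution; the function name in_nino34 is hypothetical. Longitudes are converted to the 0 to 360 convention so that 170 W and 120 W become 190 and 240.

```python
def in_nino34(lat, lon):
    """Return True if (lat, lon) falls inside the Nino 3.4 box.

    lat is in degrees north; lon is in degrees east and may be given
    either as -180..180 or as 0..360.
    """
    lon360 = lon % 360  # e.g. -170 (170 W) -> 190, -120 (120 W) -> 240
    return -5 <= lat <= 5 and 190 <= lon360 <= 240


print(in_nino34(0, -150))   # central equatorial Pacific
print(in_nino34(0, 150))    # 150 E, west of the box
print(in_nino34(10, -150))  # too far north
```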
Intermediate Xarray
Assignment 7: Xarray Fundamentals
Assignment 7: Xarray
In this assignment, we will use Xarray to analyze top-of-atmosphere radiation data from NASA's CERES project.
Public domain, by NASA, from Wikimedia Commons
I have pre-downloaded and subsetted a portion of this dataset for use in our class. You can download it here: http://ldeo.columbia.edu/~rpa/CERES_EBAF-TOA_Edition4.0_200003-201701.condensed.nc. The size of the data file is 702.53 MB. It will take a minute or two to download.
more ...
Xarray Fundamentals
Xarray for multidimensional gridded data
In last week's lecture, we saw how Pandas provided a way to keep track of additional "metadata" surrounding tabular datasets, including "indexes" for each row and labels for each column. These features, together with Pandas' many useful routines for all kinds of data munging and analysis, have made Pandas one of the most popular python packages in the world.
more ...
Assignment 6: Pandas Groupby
Assignment 6: Pandas Groupby with Hurricane Data
Import pandas and matplotlib
Groupby in Pandas
Pandas: Groupby
groupby is an amazingly powerful function in pandas. But it is also complicated to use and understand.
The point of this lesson is to make you feel confident in using groupby and its cousins, resample and rolling.
These notes are loosely based on the Pandas GroupBy Documentation.
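To build some intuition first, here is a plain-Python sketch of the two ideas: splitting records into groups and averaging each group, and averaging over a sliding window. This is not how pandas implements groupby or rolling, just the concept in miniature. The helper names and the sample records are made up for illustration.

```python
from collections import defaultdict


def group_mean(records, key, value):
    """Split records by records[key], then average records[value] per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[value])
    return {k: sum(v) / len(v) for k, v in groups.items()}


def rolling_mean(values, window):
    """Mean over each full window; unlike pandas, partial windows are dropped."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Made-up sample data, for illustration only
storms = [
    {"basin": "A", "wind": 10},
    {"basin": "A", "wind": 30},
    {"basin": "B", "wind": 50},
]
print(group_mean(storms, "basin", "wind"))
print(rolling_mean([1, 2, 3, 4], window=2))
```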
Imports:
Assignment 5: Pandas
Pandas for Tabular Data
Pandas
Pandas is an open source library providing high-performance, easy-to-use data structures and data analysis tools. Pandas is particularly suited to the analysis of tabular data, i.e. data that can go into a table. In other words, if you can imagine the data in an Excel spreadsheet, then Pandas is the tool for the job.
more ...
Assignment 4: More Matplotlib
The goal here is to replicate the figures you see as closely as possible.
In order to get some data, you will have to run the code in the cells below. Don't worry about how this code works. In the end, it will give you some numpy arrays, which you will use in your plots. You are not allowed to use any packages other than numpy and matplotlib to complete your assignment.
more ...
More Matplotlib
Matplotlib is the dominant plotting / visualization package in python. It is important to learn to use it well. In the last lecture, we saw some basic examples in the context of learning numpy. This week, we dive much deeper. The goal is to understand how matplotlib represents figures internally.
more ...
Assignment 3 - Numpy and Matplotlib
First import numpy and matplotlib
Numpy and Matplotlib
These are two of the most fundamental parts of the scientific python "ecosystem". Most everything else is built on top of them.
Assignment 2: Python Functions, Classes, and Modules
Assignment 2: Python Functions and Classes
Part I: Exploring the Python Standard Library
Skim the documentation for the datetime module
1. Import the datetime module
Functions, Classes, and Modules
Python Functions, Classes, and Modules
For longer and more complex tasks, it is important to organize your code into reusable elements. For example, if you find yourself cutting and pasting the same or similar lines of code over and over, you probably need to define a function to encapsulate that code and make it reusable. An important principle in programming is DRY more ...
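A small sketch of the DRY ("don't repeat yourself") idea: instead of pasting the same formula several times, wrap it in a function and call it. The temperature conversion and the sample readings are just an illustrative choice.

```python
# Repetitive version: the same formula pasted for every value
# f1 = 0.0 * 9 / 5 + 32
# f2 = 25.0 * 9 / 5 + 32
# f3 = 100.0 * 9 / 5 + 32


def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


# DRY version: the formula lives in one place and is reused
readings_c = [0.0, 25.0, 100.0]
readings_f = [celsius_to_fahrenheit(t) for t in readings_c]
print(readings_f)
```

If the formula ever needs to change, only the function body has to be edited.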
Assignment 1: Unix, Git, Basic Python
Assignment 1
Part 1: Files, Git, GitHub
Use JupyterLab to launch a terminal and use the terminal to do the following tasks:
- Create a new directory called resume within your home directory
- Create an empty file within this directory called Readme.md
Now use JupyterLab to edit the file:
- Navigate to the directory in the file browser more ...
Introduction to Python
Core Python Language
Mostly copied from the official python tutorial
Invoking Python
There are three main ways to use python.
- By running a python file, e.g. python myscript.py
- Through an interactive console (python interpreter or ipython shell)
- In an interactive IPython notebook
We will be using the IPython notebook.
Python Versions
There are two versions of the python language out there: python 2 and python 3. Python 2 is more common in the wild but is deprecated. The community is moving to python 3. As new python learners, you should learn python 3. But it is important to be aware that python 2 exists. It is possible that a package you want to use is only supported in python 2. In general, it is pretty easy to switch between them.
more ...
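If you are unsure which version you are running, a quick check like the following can help. The helper name running_python3 is made up for this sketch; the division example shows one well-known behavior that differs between the two versions.

```python
import sys


def running_python3():
    """Return True when the interpreter is Python 3 or newer."""
    return sys.version_info.major >= 3


print(sys.version_info.major, sys.version_info.minor)

# In Python 3, / is true division and // is floor division.
# (In Python 2, 7 / 2 with two integers gives 3.)
print(7 / 2, 7 // 2)
```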