Part 1 - Equivalent potential temperature

Prepared by Robin Noyelle.

The objectives of this part are the following:

open the data files and understand their structure
compute equivalent potential temperature for one timestep of the data
plot the result and visually identify frontal structures

Preparation

Create a new notebook in the code folder and make sure you select ipp_analysis as kernel. Rename the notebook with a well-chosen name (see the introduction).
Convert the first cell to Markdown and add a title, e.g. # Frontal structures in CESM2 and add your name on a new line.
Add a new cell and import the required packages. We will need numpy, matplotlib.pyplot and xarray (use the standard abbreviations: np, plt and xr).

Reading data

The first step in a scientific investigation is usually to load the data we want to analyse. Here we read the data located at ../data. This is a file path relative to the location of the notebook, .. means ‘one folder upwards’.

Create a new code cell. Define the filename and open the historical data using xarray.

path = "../data/"
t700 = xr.open_dataset(path + "t700_2015-2024.nc")
q700 = xr.open_dataset(path + "q700_2015-2024.nc")

Create a new code cell. Look at the representation of t700 - i.e. write t700 in the new code cell and execute it. What variables - data_vars - are on the file? What dimensions? What time period does the dataset cover?
Select the first day using t700.isel(...) (remember: python is 0-based), select the variable T700 and create a plot using .plot(). Do the same for q700. Can you already see some frontal structures? How do you recognize them?

Computing equivalent potential temperature

The identification of weather fronts is easier when using equivalent potential temperature $\theta_e$ rather than working directly on temperature and specific humidity. An approximate formula for computing equivalent potential temperature is:

$$\theta_e \simeq ( T + \frac{L}{c_{pd}} r ) ( \frac{p_0}{p} )^{ \frac{R_d}{c_{pd}} } $$

where $T$ is the temperature, $L=2450$ kJ/kg is the latent heat of evaporation, $c_{pd}=1006$ J/(kg $\cdot$ K) is the specific heat of dry air at constant pressure, $r = \frac{q}{1-q}$ is the water vapour mixing ratio, $p_0=1000$ hPa is a reference pressure, $p$ is the pressure and $R_d=287$ J/(kg $\cdot$ K) is the specific gas constant for dry air.

Can you remeber the physical definition of $\theta_e$? What is the key property of this quantity? Can you guess how this key property may be useful to identify frontal structures?

Execute the following code, can you guess what it does?

    tmp = t700.isel(time=0, lat=0, lon=0)
    p = 700e2
    rho = p / (8.314 * tmp)

Now copy and execute the following code, what does it do? What is the difference to the code above?

def air_density( p , T ):
"""calculate air pressure density

    Parameters
    ----------
    p : float
        Air pressure
    T : float
        Air temperature
"""

    # gas constant
    R = 8.314

    return p / (R * T)

air_density(p, temp)

Write a python function (using the example from above as a starting point) which takes as input a temperature, a specific humidity and a pressure level and returns the corresponding equivalent potential temperature from the preceding formula. What are the IS units of the input and output quantities of this function? Try your function by passing the temperature, specific humidity and pressure level of the first lat-lon grid point of the first day of our data. Does the value you obtain make sense?
Now that we have a function to compute $\theta_e$ at one grid point, we need to do the same at every grid point. Write a second function that receives as input lat-lon fields of temperature and specific humidity and loops explicitly (i.e. using an actual python loop) over all grid points to compute $\theta_e$. To loop over the latitude grid points you can write:
```
n_lat = data['lat'].size
for i in range(n_lat):
    data_gp = data.isel(lat=i)
```
and similarly for longitude grid points. Try this function by passing the temperature and specific humidity fields of the first time step of our data.
Your last function is probably slow, can you guess why? To measure its speed we can use a very useful so-called ipython magic: timeit. Create a new cell and write %timeit myfunction(arguments_of_my_function), this will give you the mean running time of your function. Can you now estimate how long it would take on your computer to compute $\theta_e$ for the 10 years of our data?
As a general rule, for loops are slow in Python and you should avoid using them as much as possible. One work around is to use the broadcasting property of numpy arrays: numpy is a package which implements fast arrays because it is coded in C, a compiled language which is closer to machine language. When you add or multiply numpy arrays, whatever shape they have, this will be fully transparent for you and it will be much faster than explicitly looping over all elements of the array. The xarray package that we are using now for working with our data is a wrapper around numpy, so that it also has this property. Now try to send the lat-lon data that you sent to your slow function to your original function computing $\theta_e$ for one grid point. If you coded it in a naive way this should work ! Compute the time this function takes compared to your explicit looping function and estimate how much faster it is. How long would the computation over the 10 years take now?
Currently, we can only use the function computing $\theta_e$ in this notebook. If we want to use it again in the next notebook or in a script we need to find a different solution. We will copy the function to a python module - in the simplest case a python module is just a text file with the ending .py.
- Open a python file in the folder code using Jupyter Lab and name it computation.py
- Copy your function inside this file (also you should probably add import numpy as np at the top of the file)
- Import the computation.py module in the notebook you are currently working on - to do so you need to leave the py ending away - i.e., import computation
- Check that you can indeed use your function inside the notebook after the import
- If this does not work you have to restart the notebook (Kernel ‣ Restart Kernel) and run all code cells again

Identifying fronts visually

Now that you have a fast function to compute $\theta_e$, compute it for the first day of our data and plot the results. Make a nice figure with the map of temperature on the first column, the map of specific humidity on the second column and the map of $\theta_e$ on the last column. Use the Robinson projection and save the figure as a .png file. Remember that in a scientific figure, variables usually have a unit and x and y-axes have a name! Do the same for one other date in all seasons of the first year. Can you visually identify the frontal structures in your plots? Can you explain as precisely as possible how you identify them? How would you implement this identification on a computer?