Part 1 - Equivalent potential temperature
Prepared by Robin Noyelle.
The objectives of this part are the following:
open the data files and understand their structure
compute equivalent potential temperature for one timestep of the data
plot the result and visually identify frontal structures
Preparation
Create a new notebook in the
code
folder and make sure you selectipp_analysis
as kernel. Rename the notebook with a well-chosen name (see the introduction).Convert the first cell to Markdown and add a title, e.g. # Frontal structures in CESM2 and add your name on a new line.
Add a new cell and import the required packages. We will need
numpy
,matplotlib.pyplot
andxarray
(use the standard abbreviations:np
,plt
andxr
).
Reading data
The first step in a scientific investigation is usually to load the data we want to analyse. Here we read the data located at ../data
. This is a file path relative to the location of the notebook, ..
means ‘one folder upwards’.
Create a new code cell. Define the filename and open the historical data using
xarray
.path = "../data/" t700 = xr.open_dataset(path + "t700_2015-2024.nc") q700 = xr.open_dataset(path + "q700_2015-2024.nc")
Create a new code cell. Look at the representation of
t700
- i.e. writet700
in the new code cell and execute it. What variables -data_vars
- are on the file? What dimensions? What time period does the dataset cover?Select the first day using
t700.isel(...)
(remember:python
is 0-based), select the variableT700
and create a plot using.plot()
. Do the same forq700
. Can you already see some frontal structures? How do you recognize them?
Computing equivalent potential temperature
The identification of weather fronts is easier when using equivalent potential temperature $\theta_e$ rather than working directly on temperature and specific humidity. An approximate formula for computing equivalent potential temperature is:
$$\theta_e \simeq ( T + \frac{L}{c_{pd}} r ) ( \frac{p_0}{p} )^{ \frac{R_d}{c_{pd}} } $$
where $T$ is the temperature, $L=2450$ kJ/kg is the latent heat of evaporation, $c_{pd}=1006$ J/(kg $\cdot$ K) is the specific heat of dry air at constant pressure, $r = \frac{q}{1-q}$ is the water vapour mixing ratio, $p_0=1000$ hPa is a reference pressure, $p$ is the pressure and $R_d=287$ J/(kg $\cdot$ K) is the specific gas constant for dry air.
Can you remeber the physical definition of $\theta_e$? What is the key property of this quantity? Can you guess how this key property may be useful to identify frontal structures?
Execute the following code, can you guess what it does?
tmp = t700.isel(time=0, lat=0, lon=0) p = 700e2 rho = p / (8.314 * tmp)
Now copy and execute the following code, what does it do? What is the difference to the code above?
def air_density( p , T ): """calculate air pressure density Parameters ---------- p : float Air pressure T : float Air temperature """ # gas constant R = 8.314 return p / (R * T) air_density(p, temp)
Write a
python
function (using the example from above as a starting point) which takes as input a temperature, a specific humidity and a pressure level and returns the corresponding equivalent potential temperature from the preceding formula. What are the IS units of the input and output quantities of this function? Try your function by passing the temperature, specific humidity and pressure level of the first lat-lon grid point of the first day of our data. Does the value you obtain make sense?Now that we have a function to compute $\theta_e$ at one grid point, we need to do the same at every grid point. Write a second function that receives as input lat-lon fields of temperature and specific humidity and loops explicitly (i.e. using an actual
python
loop) over all grid points to compute $\theta_e$. To loop over the latitude grid points you can write:n_lat = data['lat'].size for i in range(n_lat): data_gp = data.isel(lat=i)
and similarly for longitude grid points. Try this function by passing the temperature and specific humidity fields of the first time step of our data.
Your last function is probably slow, can you guess why? To measure its speed we can use a very useful so-called
ipython
magic:timeit
. Create a new cell and write%timeit myfunction(arguments_of_my_function)
, this will give you the mean running time of your function. Can you now estimate how long it would take on your computer to compute $\theta_e$ for the 10 years of our data?As a general rule,
for
loops are slow in Python and you should avoid using them as much as possible. One work around is to use the broadcasting property ofnumpy
arrays:numpy
is a package which implements fast arrays because it is coded in C, a compiled language which is closer to machine language. When you add or multiplynumpy
arrays, whatever shape they have, this will be fully transparent for you and it will be much faster than explicitly looping over all elements of the array. Thexarray
package that we are using now for working with our data is a wrapper aroundnumpy
, so that it also has this property. Now try to send the lat-lon data that you sent to your slow function to your original function computing $\theta_e$ for one grid point. If you coded it in a naive way this should work ! Compute the time this function takes compared to your explicit looping function and estimate how much faster it is. How long would the computation over the 10 years take now?Currently, we can only use the function computing $\theta_e$ in this notebook. If we want to use it again in the next notebook or in a script we need to find a different solution. We will copy the function to a
python
module - in the simplest case apython
module is just a text file with the ending.py
.Open a
python
file in the folder code using Jupyter Lab and name itcomputation.py
Copy your function inside this file (also you should probably add
import numpy as np
at the top of the file)Import the
computation.py
module in the notebook you are currently working on - to do so you need to leave the py ending away - i.e.,import computation
Check that you can indeed use your function inside the notebook after the import
If this does not work you have to restart the notebook (Kernel ‣ Restart Kernel) and run all code cells again
Identifying fronts visually
Now that you have a fast function to compute $\theta_e$, compute it for the first day of our data and plot the results. Make a nice figure with the map of temperature on the first column, the map of specific humidity on the second column and the map of $\theta_e$ on the last column. Use the Robinson projection and save the figure as a .png
file. Remember that in a scientific figure, variables usually have a unit and x and y-axes have a name! Do the same for one other date in all seasons of the first year. Can you visually identify the frontal structures in your plots? Can you explain as precisely as possible how you identify them? How would you implement this identification on a computer?