Part 2 - Reading global mean temperature data

Prepared by Mathias Hauser.

In the first part we computed the global mean temperature from two climate models and learned that they can be very different. Here we expand on this and compare the global mean temperature from more models. However, we will not compute the global mean ourselves but work with pre-computed global mean temperature data, saved in a file with “comma-separated values” (csv).

Learning goals

programming goals
- open a csv file and convert it to a DataArray
- write more python functions
scientific and data analysis goals
- re-iterate the data analysis approach: load data, transform it, and create a figure
- see the difference between absolute temperatures and temperature anomalies
- get a feeling for the inter-model spread, i.e. the differences between individual models

Preparation

Create a new notebook in the code folder and make sure you select ipp_analysis as kernel. Rename the notebook from Untitled.ipynb to p2_tglob_timeseries_name.ipynb where you replace name with your ETH username.
Add a title (in markdown) and your name to the notebook.
Add a new cell and import the required packages. We will need cartopy.crs, numpy, matplotlib.pyplot, and xarray (use the standard abbreviations). Also import computation.

Read global mean temperature from csv

So far we have only loaded data from netCDF files. This is convenient, because it usually “just works”. Here we will open a csv file, which can be tedious, as its format is not standardized.

We will work with the file "../data/cmip6/tas/tas_annmean_globmean.csv".

We read the file using the pandas library. Pandas is a great library for working with tabular data, such as data stored in spreadsheets or databases. Once you open the csv file it will be a pandas DataFrame. A DataFrame is a 2 dimensional data container, replicating the rows and columns of our csv file. We will not cover this in-depth here.

Open the csv-file in a text editor and have a look - how is it structured?
Import pandas at the top of the notebook. Pandas is abbreviated as pd.
Read the data from the csv.
- Use the pandas function pd.read_csv().
- This will not “just work”: you will have to pass some arguments to the function. Check the documentation either online at pandas.read_csv or using pd.read_csv? in a code cell.
- You will have to pass two additional arguments to be able to correctly read the file.
- In addition, pass index_col=["year", "models"] to pd.read_csv (this is important to be able to convert it to an xr.Dataset later).
- Assign the data to the variable df (for DataFrame)
Tip

Check the sep and header arguments of the function.
Look at the repr (representation) of df. How does it look? How many columns and rows does it have?
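To illustrate what the sep and header arguments do, here is a minimal sketch on a made-up, in-memory csv. The layout (two leading comment lines, semicolon-separated columns), the model names, and the name df_demo are assumptions for this example only; the real file may be structured differently, so check it in a text editor first.

```python
import io

import pandas as pd

# Hypothetical file content, NOT the real data file: two leading comment
# lines followed by a semicolon-separated table.
csv_text = """# made-up example
# units: K
year;models;tas
1850;model_a;286.5
1850;model_b;287.9
1851;model_a;286.6
1851;model_b;287.8
"""

df_demo = pd.read_csv(
    io.StringIO(csv_text),         # behaves like an opened file
    sep=";",                       # column separator used in this toy file
    header=2,                      # 0-based row that holds the column names
    index_col=["year", "models"],  # use these two columns as (multi-)index
)

print(df_demo)
```

Passing two columns to index_col creates a pandas MultiIndex, so the only remaining data column here is tas.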

Convert to a Dataset

We could now continue to work with the DataFrame and it could be worthwhile to learn pandas. However, here we will continue to work with xarray and therefore need to convert the DataFrame to an xarray Dataset.

Find out how to convert the DataFrame to a Dataset, call it ds, which should now have the dimensions "year" and "models".
Save ds as netCDF file under the name "../data/cmip6/tas/tas_annmean_globmean.nc". This can be done using ds.to_netcdf(...) - we will use this file in the next part.
Select the variable tas on ds, such that you have a DataArray. Name it tas.
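One possible route for the conversion is the DataFrame method to_xarray(), which turns each index level into a dimension. The sketch below uses a small hand-made DataFrame; the names df_demo, ds_demo, and tas_demo and all values are assumptions for illustration and do not come from the real file.

```python
import pandas as pd

# Hand-made stand-in for the DataFrame read from the csv (made-up values)
index = pd.MultiIndex.from_product(
    [[1850, 1851], ["model_a", "model_b"]], names=["year", "models"]
)
df_demo = pd.DataFrame({"tas": [286.5, 287.9, 286.6, 287.8]}, index=index)

# Each level of the MultiIndex becomes a dimension of the Dataset
ds_demo = df_demo.to_xarray()

# Selecting one variable of a Dataset returns a DataArray
tas_demo = ds_demo["tas"]

print(ds_demo)
```

This only works as intended because "year" and "models" were set as the index when reading the file; columns that are not part of the index become data variables.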

Plot a single model

List all available models using tas.models.
Select a single model by name (e.g. tas.sel(models="BCC-CSM2-MR")) and plot it.

Comparison to ACCESS-CM2

It’s always good to double check that everything went correctly.

Load the gridded tas data for the “ACCESS-CM2” model using xr.open_mfdataset (as in Part 1.).
- Instead of passing a list you can also use a wildcard (*) and use
```
filename = "../data/cmip6/tas/tas_ann_ACCESS-CM2_*_r1i1p1f1_g025.nc"
ds_access = xr.open_mfdataset(filename).load()
```
Note

We need to call load() to avoid opening it as a dask array (see Part 1).
Compute the global mean using computation.global_mean.

Warning

This requires that you already completed Part 1.
Plot the above-computed global mean time series.
Select the model ACCESS-CM2 using tas.sel(models="ACCESS-CM2") and plot the time series in the same code cell.

If everything worked out correctly, the two lines should now lie exactly on top of each other, so you should only see one.

Plotting all models

So far we have not done any new science compared to Part 1. Here we want to extend this by plotting the time series of all models. Before we can do this we quickly need to introduce some new things.

Above we used tas.sel(...) to select and plot a single model. This is too cumbersome to do for all models we have here. So we will use a for loop.
We also look a bit more at the inner workings of the DataArray: tas and tas.models are DataArray objects. However, under the hood they store the data as numpy arrays. These numpy arrays can be accessed using .values.

Add print(tas) and print(tas.values) in a new cell and execute it. Can you see the difference?
For the plot we will have to loop through the model names. Copy the code below to a new cell and execute it. What does it do?
```
for model in tas.models:
    print(model.values)
```
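The following sketch shows the same idea on a tiny hand-made DataArray: iterating over a coordinate yields 0-dimensional DataArray objects, and .values or .item() unwraps them. The array tas_demo and its model names are made up for this example.

```python
import numpy as np
import xarray as xr

# Tiny stand-in for tas (made-up values and model names)
tas_demo = xr.DataArray(
    np.array([[286.5, 287.9], [286.6, 287.8]]),
    dims=("year", "models"),
    coords={"year": [1850, 1851], "models": ["model_a", "model_b"]},
    name="tas",
)

# .values returns the underlying numpy array, without coordinates
print(type(tas_demo.values))

for model in tas_demo.models:
    # model is a 0-dimensional DataArray; .item() returns a plain str
    print(type(model).__name__, model.item())

# Collect the names as an ordinary Python list
names = [model.item() for model in tas_demo.models]
```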

Copy the code below to plot all models individually:

```
for model in tas.models:
    tas.sel(models=model).plot(color="0.5")
```

Change the code so that all models have the same color.
Convert the data from K to °C.
Add a vertical line for the current year using plt.axvline(2025, color="0.1")
Look up the approximate global mean temperature on the internet. How do the models compare to that?
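For the unit conversion, recall that a temperature in °C is the temperature in K minus 273.15. A minimal sketch on made-up values (the names tas_k and tas_c are assumptions for this example):

```python
import numpy as np
import xarray as xr

# Made-up global mean temperatures in Kelvin
tas_k = xr.DataArray(
    np.array([286.65, 287.15, 288.15]),
    dims="year",
    coords={"year": [1850, 1851, 1852]},
    name="tas",
    attrs={"units": "K"},
)

# Arithmetic acts on every element; the coordinates are kept
tas_c = tas_k - 273.15

# Set the unit explicitly so that it matches the converted values
tas_c.attrs["units"] = "°C"

print(tas_c.values)
```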

Calculate and plot anomalies

The models are very different from each other. They have very different global mean temperatures - to make them more comparable we can subtract the mean temperature over a period for each model and calculate anomalies.

Note

The term temperature anomaly means a departure from a reference value or long-term average.

Select the time period 1850 to 1900 from tas and calculate the mean over the time dimension (i.e. over "year").
Subtract the mean over 1850 – 1900 from tas and name the result tas_anom.
Plot all models again
We want to add the multi model mean as well. Calculate the mean over all "models" of tas_anom and add it to the same plot. Color the line black. This needs to be outside of the for loop.
Create a figure with one subplot the way we learned it in the tutorial and pass ax=ax to the plot methods.
Add a title, x- and y- labels to the plot.
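The steps above could look roughly like the following sketch. It runs on synthetic data (three made-up models with linear trends) rather than on the real tas, and the names tas_demo, tas_anom_demo, and mmm are assumptions; the figure setup may differ from the one shown in the tutorial.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Synthetic stand-in for tas: different offsets and trends per model
years = np.arange(1850, 2101)
offsets = np.array([286.0, 287.0, 288.5])  # K, made up
trends = np.array([0.010, 0.015, 0.020])   # K per year, made up
data = offsets + trends * (years[:, np.newaxis] - 1850)

tas_demo = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": ["model_a", "model_b", "model_c"]},
    name="tas",
)

# Mean over the reference period, computed separately for each model
reference = tas_demo.sel(year=slice(1850, 1900)).mean("year")

# Subtracting broadcasts the per-model mean along the year dimension
tas_anom_demo = tas_demo - reference

# Multi-model mean: average over the models dimension
mmm = tas_anom_demo.mean("models")

f, ax = plt.subplots(1, 1)
for model in tas_anom_demo.models:
    tas_anom_demo.sel(models=model).plot(ax=ax, color="0.5")

# Outside of the loop, so it is drawn once and on top
mmm.plot(ax=ax, color="black")

ax.set_title("Global mean temperature anomaly (synthetic data)")
ax.set_xlabel("Year")
ax.set_ylabel("Anomaly relative to 1850–1900 (K)")
```

By construction, each model's anomaly averages to zero over the reference period, which is why the lines start close together even though the absolute temperatures differ.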

We now have a plot of historical and projected global mean temperature change from 1850–2100 for 20 different climate models. What can we learn from this figure? Note your findings in a new cell.

How different are the models at the beginning?
How different are they at the end of the 21st century?
How much did the models approximately warm on average?

Background information

As mentioned, the future part of the simulations features a scenario with high radiative forcing and strong warming. All the simulations were driven with the same boundary conditions. However, the models project very different temperatures at the end of the century, despite being forced with the same conditions. This poses a problem: which model should we believe?

One option is to consider the mean over all models. For example, for the period 2081–2100 and for this specific selection of models, we would get a multi-model mean warming of 4.7°C. For the remainder of this part we will look a bit deeper at models that show different warming and investigate the models with the highest and lowest warming, respectively.

The newest IPCC report takes a different approach to answer what temperature is likely at the end of the 21st century, given a certain scenario. It combines “different lines of evidence”: in addition to model data, it also takes observational data into account. Based on this, it arrives at a best-estimate warming of 4.4°C at the end of the 21st century (with a very likely range of 3.3 to 5.7°C) for this emission scenario.

Note that the chosen scenario, SSP5-8.5, represents “the upper boundary of the range of scenarios described in the literature”. Indeed, the SSP5-8.5 scenario might be too extreme, even if no or few actions are taken to reduce greenhouse gas emissions. (However, a more “realistic” high emission scenario still projects very strong warming.)

Create a function to calculate anomalies

Before we go on to find the models with the most and least warming, we have to take care of something. We will need to calculate anomalies again, so to keep the habit let’s create a function for this:

Write a function with the following signature:
```
def calc_anomaly(ds, period=slice(1850, 1900)):
    ...
```
where ds is the passed Dataset and period is an optional argument.
Copy the code to calculate anomalies from above into a code cell. Make sure you change the variable names and use period instead of the hard-coded years.
Replace the three dots (...) with your actual code.
Re-compute the anomaly, this time using calc_anomaly(tas).
Try another reference period.
Copy the function to computation.py.
Calculate the anomaly again, this time using computation.calc_anomaly(tas).
- For this you will have to restart the kernel (Kernel ‣ Restart Kernel).
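For comparison, one possible form of such a function is sketched below, together with a call on synthetic data. It assumes that the time dimension is called "year"; the array tas_demo and its values are made up, and your own version may look different.

```python
import numpy as np
import xarray as xr


def calc_anomaly(ds, period=slice(1850, 1900)):
    """Subtract the mean over `period` (selected along "year") from ds."""
    return ds - ds.sel(year=period).mean("year")


# Synthetic data to try the function: two made-up models, linear warming
years = np.arange(1850, 2101)
data = np.array([286.0, 288.0]) + 0.02 * (years[:, np.newaxis] - 1850)
tas_demo = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": ["model_a", "model_b"]},
    name="tas",
)

# Default reference period 1850 to 1900
anom_default = calc_anomaly(tas_demo)

# Another reference period, passed as keyword argument
anom_recent = calc_anomaly(tas_demo, period=slice(1995, 2014))

print(float(anom_default.sel(year=2100).mean()))
print(float(anom_recent.sel(year=2100).mean()))
```

A later reference period is warmer in this synthetic example, so the anomalies relative to it are smaller.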

Why did we create another function here? Yes, we will use this function again, but it is only one line, so it might not really be worth it. The actual reason is to practice writing functions. If you learn to create functions, this can simplify your life tremendously.

Find extreme models

Even after subtracting the climatological mean, the models still show different warming. They reach different temperatures at the end of the century. Some of the models have a much smaller climate sensitivity (i.e. how strongly the modelled climate reacts to a change in CO₂). Here, we are interested in finding the models that warmed the most and the least.

Select the last year of tas_anom.
What is the temperature anomaly of the coldest and the warmest model in the year 2100?
Find the names of the two models (hint: idxmin & idxmax).
Select the last 20 years and calculate the mean. Which model has the smallest and which the largest temperature anomaly over the last 20 years (i.e. over the period 2081–2100)?
Are they the same two extreme models of the year 2100?
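The sketch below shows how idxmin and idxmax return the coordinate label (here the model name) of the minimum and maximum. The data are made up so that the warmest model in the single year 2100 differs from the warmest model in the 20-year mean; whether this also happens in the real data is for you to find out.

```python
import numpy as np
import xarray as xr

# Made-up anomalies for 2081 to 2100 and three hypothetical models
years = np.arange(2081, 2101)
data = np.empty((years.size, 3))
data[:, 0] = 3.0   # model_a: always the coldest
data[:, 1] = 5.0   # model_b: warmest on average
data[:, 2] = 4.0   # model_c: in between ...
data[-1, 2] = 5.5  # ... but with one very warm final year

tas_anom_demo = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": ["model_a", "model_b", "model_c"]},
    name="tas",
)

# Last year only
last = tas_anom_demo.isel(year=-1)
coldest_2100 = last.idxmin("models").item()
warmest_2100 = last.idxmax("models").item()

# Mean over the last 20 years
mean_20 = tas_anom_demo.sel(year=slice(2081, 2100)).mean("year")
coldest_mean = mean_20.idxmin("models").item()
warmest_mean = mean_20.idxmax("models").item()

print(coldest_2100, warmest_2100)
print(coldest_mean, warmest_mean)
```

A single year is affected by year-to-year variability, which is one reason to prefer a multi-year mean when ranking models.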

What have you learned so far?

Reflect back on Part 2. Note what you learned at the end of the notebook.

What have you learnt about temperature projections of climate models?

This concludes Part 2.