Part 2 - Reading global mean temperature data

Prepared by Mathias Hauser.

In the first part we computed the global mean temperature from two climate models and learned that they can be very different. Here we expand on this and compare the global mean temperature from more models. However, we will not compute the global mean ourselves but work with pre-computed global mean temperature data, saved in a file with “comma-separated values” (csv).

Learning goals

  • programming goals

    • open a csv file and convert it to a DataArray

    • write more python functions to familiarize yourself with the concept

  • scientific and data analysis goals

    • re-iterate the data analysis approach: load data, transform it, and create a figure

    • see the difference between absolute temperatures and temperature anomalies

    • get a feeling for the inter-model spread, i.e. the differences between individual models

Preparation

  1. Create a new notebook in the code folder and make sure you select ipp_analysis as kernel. Rename the notebook from Untitled.ipynb to p2_tglob_timeseries_name.ipynb where you replace name with your ETH username.

  2. Add a title (in markdown) and your name to the notebook.

  3. Add a new cell and import the required packages. We will need cartopy.crs, numpy, matplotlib.pyplot, and xarray (use the standard abbreviations). Also import computation.

Read global mean temperature from csv

So far we have only loaded data from netCDF files. This is convenient, because it usually “just works”. Here we will open a csv file, which can be tedious, as its format is not standardized.

We will work with the file "../data/cmip6/tas/tas_annmean_globmean.csv".

We read the file using the pandas library. Pandas is a great library to work with tabular data, such as data stored in spreadsheets or databases. If you successfully manage to open the csv file, the result will be a pandas DataFrame. A DataFrame is a two-dimensional data container, replicating the rows and columns of our csv file. We will not cover this in depth here.

  1. Open the csv-file in a text editor and have a look - how is it structured?

  2. Import pandas at the top of the notebook. Pandas is abbreviated as pd.

  3. Read the data from the csv.

    • Use the pandas function pd.read_csv.

    • This will not “just work”: you will have to pass some arguments to the function. Check the documentation either online at pandas.read_csv or using pd.read_csv? in a code cell.

    • You will have to pass two additional arguments to be able to correctly read the file.

    • In addition, pass index_col=["year", "models"] to pd.read_csv (this is important to be able to convert it to an xr.Dataset later).

    • Assign the data to the variable df (for DataFrame)

    Note

    Check the sep and header arguments of the function.

  4. Look at the repr (representation) of df. How does it look? How many columns and rows does it have?
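To illustrate how the sep and header arguments interact with index_col, here is a minimal sketch on a made-up, in-memory csv. The file content, the ";" separator and the two comment lines are assumptions for this example only. The real file may be structured differently, so check it in a text editor first.

```python
import io

import pandas as pd

# Hypothetical file content (NOT the real file): two comment lines,
# followed by ";"-separated columns.
csv_text = """# global mean temperature
# unit: K
year;models;tas
1850;model-A;286.5
1850;model-B;287.1
1851;model-A;286.6
1851;model-B;287.0
"""

df = pd.read_csv(
    io.StringIO(csv_text),         # stands in for the path to a csv file
    sep=";",                       # column separator of this toy example
    header=2,                      # 0-based row that holds the column names
    index_col=["year", "models"],  # use these two columns as (multi-)index
)

print(df)
```

With a wrong sep, pandas would typically put everything into a single column, and with a wrong header the column names "year" and "models" would not be found.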

Convert to a Dataset

We could now continue to work with the DataFrame and it could be worthwhile to learn pandas. However, here we will continue to work with xarray and therefore need to convert the DataFrame to an xarray Dataset.

  1. Find out how to convert the DataFrame to a Dataset, call it ds, which should now have the dimensions "year" and "models".

  2. Save ds as netCDF file under the name "../data/cmip6/tas/tas_annmean_globmean.nc". This can be done using ds.to_netcdf(...) - we will use this file in the next part.

  3. Select the variable tas on ds, such that you have a DataArray. Name it tas.
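One possible route for the conversion is the pandas method DataFrame.to_xarray, which turns each index level into a dimension. The sketch below uses a small made-up DataFrame (the model names and values are invented), so it is an illustration and not the content of the real file.

```python
import pandas as pd
import xarray as xr  # must be installed for DataFrame.to_xarray to work

# Made-up DataFrame with the same kind of index as in the exercise
index = pd.MultiIndex.from_product(
    [[1850, 1851], ["model-A", "model-B"]], names=["year", "models"]
)
df = pd.DataFrame({"tas": [286.5, 287.1, 286.6, 287.0]}, index=index)

# Each index level becomes a dimension, each column a data variable
ds = df.to_xarray()

# Select the variable to get a DataArray
tas = ds["tas"]

print(ds)
```

Because the DataFrame index has the two levels "year" and "models", the resulting Dataset has exactly these two dimensions. This is why index_col was needed when reading the csv.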

Plot a single model

  1. List all available models using tas.models.

  2. Select a single model by name (tas.sel(...)) and plot it.

Comparison to ACCESS-CM2

It’s always good to double check that everything went correctly.

  1. Load the gridded tas data for the “ACCESS-CM2” model using xr.open_mfdataset (as in Part 1.).

    • Instead of passing a list you can also use a wildcard (*) and use

      filename = "../data/cmip6/tas/tas_ann_ACCESS-CM2_*_r1i1p1f1_g025.nc"
      ds_access = xr.open_mfdataset(filename).load()
      

    Note

    We need to call load() to avoid opening it as a dask array (see Part 1).

  2. Compute the global mean using computation.global_mean.

    Warning

    This requires that you already completed Part 1.

  3. Plot the above-computed global mean time series.

  4. Select the model ACCESS-CM2 from tas using tas.sel(...) and plot the time series in the same code cell.

If everything worked out correctly, the two lines should lie exactly on top of each other, so you should only see one.
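A visual check can be complemented with a numerical one. The sketch below compares two made-up time series. The values and the names from_csv and recomputed are assumptions that stand in for the csv data and the result of computation.global_mean.

```python
import numpy as np
import xarray as xr

years = np.arange(1850, 1855)

# Made-up stand-in for the ACCESS-CM2 time series read from the csv file
from_csv = xr.DataArray(
    [286.5, 286.6, 286.4, 286.7, 286.8], dims="year", coords={"year": years}
)

# Made-up stand-in for the re-computed global mean; the tiny offset mimics
# rounding differences
recomputed = from_csv + 1e-7

# Largest absolute difference between the two time series
max_diff = float(abs(from_csv - recomputed).max())
print(max_diff)

# Raises an AssertionError if the two arrays differ by more than a
# small tolerance
xr.testing.assert_allclose(from_csv, recomputed)
```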

Plotting all models

So far we have not done any new science compared to Part 1. Here we want to extend this by plotting the time series of all models. Before we can do this we quickly need to introduce some new things.

  • Above we used tas.sel(...) to select and plot a single model. This is too cumbersome to do for all models we have here. So we will use a for loop.

  • We also look a bit more at the inner workings of the DataArray: tas and tas.models are DataArray objects. However, under the hood they store the data as numpy arrays. These numpy arrays can be accessed using .values.

  1. Add print(tas) and print(tas.values) in a new cell and execute it. Can you see the difference?

  2. For the plot we will have to loop through the model names. Copy the code below to a new cell and execute it. What does it do?

    for model in tas.models:
        print(model.values)
    
  3. Copy the code but instead of the print function select the model from tas and plot it.

  4. Now all models have different colors. We don’t want that. Make sure all models have the same color.

  5. Convert the data from K to °C.

  6. Look up the approximate global mean temperature on the internet. How do the models compare to that?
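The steps above can be sketched as follows, using a small made-up DataArray in place of tas (three invented models, five years). The grey color and the use of .item() to get the model name as a plain string are choices made for this example, not the only way to do it.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Made-up stand-in for tas: 5 years x 3 models, values in K
years = np.arange(1850, 1855)
models = ["model-A", "model-B", "model-C"]
data = 286.0 + 0.1 * np.arange(15).reshape(5, 3)
tas = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": models},
    name="tas",
)

# .values exposes the underlying numpy array
print(type(tas.values))

# Convert from K to °C
tas_degC = tas - 273.15

for model in tas_degC.models:
    # model is a 0-dimensional DataArray; .item() returns the name as str
    name = model.item()
    # the same color for every model
    tas_degC.sel(models=name).plot(color="0.5")
```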

Calculate and plot anomalies

The models are very different from each other. They have very different global mean temperatures - to make them more comparable we can subtract the mean temperature over a common period for each model and calculate anomalies.

  1. Select the time period 1850 to 1900 from tas and calculate the mean over the time (i.e. "year") dimension.

  2. Subtract the mean over 1850 - 1900 from tas and name the result tas_anom.

  3. Plot all models again.

  4. We want to add the multi model mean as well. Calculate the mean over all "models" of tas_anom and add it to the same plot. Color the line black. This needs to be outside of the for loop.

  5. Create a figure with one subplot the way we learned it in the tutorial and pass ax=ax to the plot methods.

  6. Add a title, x- and y- labels to the plot.
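As a rough sketch of these steps, the example below builds made-up data (three invented models with different offsets and linear warming rates), computes anomalies relative to 1850 - 1900 and plots them together with the multi-model mean. All numbers, colors and labels are assumptions for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Made-up stand-in for tas: different offsets and warming rates per model
years = np.arange(1850, 2101)
models = ["model-A", "model-B", "model-C"]
offsets = np.array([286.0, 287.0, 288.0])  # K
trends = np.array([0.010, 0.015, 0.020])   # K per year
data = offsets + trends * (years[:, np.newaxis] - 1850)
tas = xr.DataArray(
    data, dims=("year", "models"), coords={"year": years, "models": models}
)

# Mean over the reference period; label-based slices include both end points
ref = tas.sel(year=slice(1850, 1900)).mean("year")

# Subtracting broadcasts over "year", so each model gets its own reference
tas_anom = tas - ref

fig, ax = plt.subplots()

for model in tas_anom.models:
    tas_anom.sel(models=model.item()).plot(ax=ax, color="0.6")

# Multi-model mean, plotted once, outside of the loop
mmm = tas_anom.mean("models")
mmm.plot(ax=ax, color="black")

ax.set_title("Global mean temperature anomaly")
ax.set_xlabel("Year")
ax.set_ylabel("Anomaly w.r.t. 1850 - 1900 (K)")
```

Note that a temperature difference has the same numerical value in K and in °C, so the anomalies do not depend on which of the two units tas is given in.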

We now have a plot of historical and projected global mean temperature change from 1850-2100 for 20 different climate models. What can we learn from this figure? Note your findings in a new cell.

  1. How different are the models at the beginning?

  2. How different are they at the end of the 21st century?

  3. How much did the models approximately warm on average?

As mentioned, the future part of the simulations features a scenario with high radiative forcing and a strong warming. All the simulations were driven with the same boundary conditions. However, the models project very different temperatures at the end of the century - despite being forced with the same conditions. This poses a problem - which model should we believe?

One option is to consider the mean over all models. For example, for the period 2081-2100 and for this specific selection of models we would get a multi-model mean warming of 4.7°C. For the remainder of this part we will look a bit deeper at models that show different warming and investigate the models with the highest and lowest warming, respectively.

The newest IPCC report takes a different approach to answer what temperature is likely at the end of the 21st century, given a certain scenario. They combine “different lines of evidence” - in addition to model data they also take observational data into account. Based on this they get a best estimate warming of 4.4°C at the end of the 21st century (with a very likely range of 3.3 to 5.7°C).

Note that the chosen scenario - SSP5-8.5 - represents “the upper boundary of the range of scenarios described in the literature”. Indeed, the SSP5-8.5 scenario might be too extreme, even if no or few actions are taken to reduce greenhouse gas emissions. (But even a more “realistic” high emission scenario still projects a very strong warming.)

Create a function to calculate anomalies

Before we go on to find the models with the most/least warming we have to take care of something. We will need to calculate anomalies again, so, to keep up the habit, let’s create a function for this:

  1. Write a function with the following signature:

    def calc_anomaly(ds, period=slice(1850, 1900)):
        ...
    

    where ds is the passed Dataset and period is an optional argument.

  2. Copy the code to calculate anomalies from above into a code cell. Make sure you change the variable names and use period.

  3. Replace the three dots (...) with your actual code.

  4. Re-compute the anomaly, this time using calc_anomaly(tas).

  5. Try another reference period.

  6. Copy the function to computation.py.

  7. Calculate the anomaly again, this time using computation.calc_anomaly(tas).

    • For this you will have to restart the notebook (Kernel ‣ Restart Kernel) or reload computation for your changes to take effect.

      from importlib import reload
      reload(computation)
      

Why did we create another function here? Yes, we will use this function again, but it’s only one line - so it might not really be worth it. The bigger reason here is to practice writing functions. If you learn to create functions, this can simplify your life tremendously.
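One possible way to fill in the function body is sketched below. It assumes that the time dimension is called "year", as in this part, and it is tested on made-up data only. Your own version may look different.

```python
import numpy as np
import xarray as xr


def calc_anomaly(ds, period=slice(1850, 1900)):
    """Subtract the mean over `period` (along "year") from `ds`."""
    # label-based slices include both end points
    return ds - ds.sel(year=period).mean("year")


# Made-up time series: 280.0 K in 1850, warming by 0.1 K per year
years = np.arange(1850, 1861)
tas = xr.DataArray(
    280.0 + 0.1 * np.arange(11), dims="year", coords={"year": years}
)

# Default reference period (here it covers all available years)
anom_default = calc_anomaly(tas)

# Another reference period
anom_other = calc_anomaly(tas, period=slice(1855, 1860))

print(float(anom_default.sel(year=1850)), float(anom_other.sel(year=1850)))
```

Passing period as a keyword argument with a default keeps the common case short (calc_anomaly(tas)) while still allowing other reference periods.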

Find extreme models

Even after subtracting the climatological mean the models still show a different warming. They reach different temperatures at the end of the century. Some of the models have a much smaller climate sensitivity (i.e. how strongly the modelled climate reacts to a change in CO2). Here, we want to find the models that warmed the most and the least.

  1. Select the last year of tas_anom.

  2. What is the temperature anomaly of the coldest and the warmest model in the year 2100?

  3. Find the name of the two models (hint: idxmin & idxmax)

  4. Select the last 20 years and calculate the mean - which model has the smallest/largest temperature anomaly over the last 20 years (i.e. over the period 2081-2100)?

  5. Are they the same two extreme models of the year 2100?
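The following sketch shows how idxmin and idxmax return the coordinate label (here the model name) of the minimum and maximum. The anomalies and model names are made up. With the real data the two extreme models may or may not be the same for the year 2100 and for the 20-year mean.

```python
import numpy as np
import xarray as xr

# Made-up anomalies for 2081-2100: different levels, same linear trend
years = np.arange(2081, 2101)
models = ["model-A", "model-B", "model-C"]
base = np.array([3.0, 4.5, 6.0])
data = base + 0.03 * (years[:, np.newaxis] - 2081)
tas_anom = xr.DataArray(
    data, dims=("year", "models"), coords={"year": years, "models": models}
)

# Last year of the time series
last = tas_anom.isel(year=-1)
print(float(last.min()), float(last.max()))

# idxmin / idxmax return the label along "models", not the value
coldest = last.idxmin("models").item()
warmest = last.idxmax("models").item()
print(coldest, warmest)

# Same for the mean over the last 20 years
mean_20 = tas_anom.sel(year=slice(2081, 2100)).mean("year")
coldest_20 = mean_20.idxmin("models").item()
warmest_20 = mean_20.idxmax("models").item()
print(coldest_20, warmest_20)
```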

What have you learned so far?

Reflect back on Part 2. Note what you learned at the end of the notebook.

  • What have you learnt about temperature projections of climate models?

This concludes Part 2.