Part 2 - Reading global mean temperature data
Prepared by Mathias Hauser.
In the first part we have computed the global mean temperature from two climate models, and learned that they can be very different. Here we expand on this and will compare the global mean temperature from more models. However, we will not compute the global mean ourselves but work with pre-computed global mean temperature data, saved in a file with “comma-separated values” (csv).
Learning goals
programming goals
open a csv file and convert it to a DataArray
write more python functions to familiarize yourself with the concept
scientific and data analysis goals
re-iterate the data analysis approach: load data, transform it, and create a figure
see the difference between absolute temperatures and temperature anomalies
get a feeling for the inter-model spread, i.e. the differences between individual models
Preparation
Create a new notebook in the code folder and make sure you select ipp_analysis as kernel. Rename the notebook from Untitled.ipynb to p2_tglob_timeseries_name.ipynb, where you replace name with your ETH username.
Add a title (in markdown) and your name to the notebook.
Add a new cell and import the required packages. We will need cartopy.crs, numpy, matplotlib.pyplot, and xarray (use the standard abbreviations). Also import computation.
Read global mean temperature from csv
So far we have only loaded data from netCDF files. This is convenient, because it usually “just works”. Here we will open a csv file, which can be tedious, as its format is not standardized.
We will work with the file "../data/cmip6/tas/tas_annmean_globmean.csv".
We read the file using the pandas library. Pandas is a great library to work with tabular data, such as data stored in spreadsheets or databases. If you successfully manage to open the csv file it will be a pandas DataFrame. A DataFrame is a 2 dimensional data container, replicating the rows and columns of our csv file. We will not cover this in-depth here.
Open the csv-file in a text editor and have a look - how is it structured?
Import pandas at the top of the notebook. Pandas is abbreviated as pd.
Read the data from the csv.
Use the pandas function pd.read_csv.
This will not “just work”; you will have to pass some arguments to the function. Check the documentation either online at pandas.read_csv or using pd.read_csv? in a code cell.
You will have to pass two additional arguments to be able to correctly read the file.
In addition pass index_col=["year", "models"] to pd.read_csv (this is important to be able to convert it to an xr.Dataset later).
Assign the data to the variable df (for DataFrame).
Note
Check the sep and header arguments of the function.
Look at the repr (representation) of df. How does it look? How many columns and rows does it have?
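To illustrate how the sep and header arguments interact with index_col, here is a minimal sketch on an invented, in-memory text. The separator (";"), the two descriptive lines at the top and all numbers are assumptions made for this example only; the real file may be laid out differently, so you still need to inspect it yourself.

```python
import io

import pandas as pd

# Invented file content: the separator, the two preamble lines and all
# numbers are assumptions for illustration, not the layout of the real file.
text = """synthetic example file
global mean near-surface air temperature in K
year;models;tas
1850;model_a;286.5
1850;model_b;287.1
1851;model_a;286.6
1851;model_b;287.0
"""

df = pd.read_csv(
    io.StringIO(text),  # a path to a file on disk works the same way
    sep=";",            # column separator used in this toy text
    header=2,           # column names are on the third line (0-based index 2)
    index_col=["year", "models"],  # build a two-level row index
)

print(df)
```

With index_col set this way the DataFrame has a two-level index (year, models) and a single data column, which is the shape that converts cleanly to xarray later on.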
Convert to a Dataset
We could now continue to work with the DataFrame and it could be worthwhile to learn pandas. However, here we will continue to work with xarray and therefore need to convert the DataFrame to an xarray Dataset.
Find out how to convert the DataFrame to a Dataset, call it ds, which should now have the dimensions "year" and "models".
Save ds as a netCDF file under the name "../data/cmip6/tas/tas_annmean_globmean.nc". This can be done using ds.to_netcdf(...) - we will use this file in the next part.
Select the variable tas on ds, such that you have a DataArray. Name it tas.
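One possible route for the conversion is the pandas method DataFrame.to_xarray, which turns each index level into a dimension. The sketch below uses a small invented DataFrame (model names and values are made up) rather than the course data, and leaves out the saving step.

```python
import numpy as np
import pandas as pd

# Toy DataFrame with the same index layout as in the exercise;
# the model names and temperatures are invented.
years = [1850, 1851, 1852]
models = ["model_a", "model_b"]
index = pd.MultiIndex.from_product([years, models], names=["year", "models"])
df = pd.DataFrame({"tas": np.linspace(286.0, 288.5, 6)}, index=index)

# Each level of the MultiIndex becomes a dimension of the Dataset
# (requires xarray to be installed).
ds = df.to_xarray()

# Selecting one variable of a Dataset returns a DataArray.
tas = ds["tas"]

print(ds)
```

Because the index was named when reading the csv, the dimensions come out as "year" and "models" without any renaming.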
Plot a single model
List all available models using tas.models.
Select a single model by name (tas.sel(...)) and plot it.
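As a rough sketch of these two steps, the example below builds a small toy DataArray (invented model names and random values, not the course data), lists its models and selects one of them by name.

```python
import numpy as np
import xarray as xr

# Toy data: model names and values are invented for illustration.
years = np.arange(1850, 1860)
models = ["model_a", "model_b", "model_c"]
rng = np.random.default_rng(0)
data = 287.0 + rng.normal(0, 0.2, (years.size, len(models)))
tas = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": models},
    name="tas",
)

# The coordinate holds the model names.
print(tas.models.values)

# Label-based selection along the "models" dimension drops that dimension.
one_model = tas.sel(models="model_b")

# A 1-dimensional DataArray is drawn as a line (needs matplotlib).
one_model.plot()
```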
Comparison to ACCESS-CM2
It’s always good to double check that everything went correctly.
Load the gridded tas data for the “ACCESS-CM2” model using xr.open_mfdataset (as in Part 1). Instead of passing a list you can also use a wildcard (*) and use
    filename = "../data/cmip6/tas/tas_ann_ACCESS-CM2_*_r1i1p1f1_g025.nc"
    ds_access = xr.open_mfdataset(filename).load()
Note
We need to call load() to avoid opening it as a dask array (see Part 1).
Compute the global mean using computation.global_mean.
Warning
This requires that you already completed Part 1.
Plot the above-computed global mean time series.
Select the model ACCESS-CM2 from tas (using tas.sel(...)) and plot the time series in the same code cell.
If everything worked out correctly the two lines should now lie exactly on top of each other, so that you only see one.
Plotting all models
So far we have not done any new science compared to Part 1. Here we want to extend this by plotting the time series of all models. Before we can do this we quickly need to introduce some new things.
Above we used tas.sel(...) to select and plot a single model. This is too cumbersome to do for all models we have here. So we will use a for loop.
We also look a bit more at the inner workings of the DataArray: tas and tas.models are DataArray objects. However, under the hood they store the data as numpy arrays. These numpy arrays can be accessed using .values.
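The relation between a DataArray and the numpy array it wraps can be checked directly with type. This is a small sketch on invented values, not the course data.

```python
import numpy as np
import xarray as xr

# Tiny toy DataArray; names and numbers are invented.
tas = xr.DataArray(
    np.array([[286.5, 287.1], [286.6, 287.0]]),
    dims=("year", "models"),
    coords={"year": [1850, 1851], "models": ["model_a", "model_b"]},
    name="tas",
)

print(type(tas))          # the DataArray: data plus dims, coords and name
print(type(tas.values))   # the bare numpy array, without any labels
print(type(tas.models))   # a coordinate is itself a DataArray
print(tas.models.values)  # ... which again wraps a numpy array
```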
Add print(tas) and print(tas.values) in a new cell and execute it. Can you see the difference?
For the plot we will have to loop through the model names. Copy the code below to a new cell and execute it. What does it do?
    for model in tas.models:
        print(model.values)
Copy the code but instead of the print function select the model from tas and plot it.
Now all models have different colors. We don’t want that. Make sure all models have the same color.
Convert the data from K to °C.
Look up the approximate global mean temperature on the internet. How do the models compare to that?
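The loop-and-plot pattern from the steps above might look roughly as follows. The data are invented, and the gray color, the line width and the use of model.item() to get the name as a plain string are choices made for this sketch; other ways of passing the name to sel are possible.

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Toy data in K: model names and values are invented.
years = np.arange(1850, 1870)
models = ["model_a", "model_b", "model_c"]
rng = np.random.default_rng(1)
data = 287.0 + rng.normal(0, 0.3, (years.size, len(models)))
tas = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": models},
    name="tas",
)

# Convert from K to °C.
tas_degC = tas - 273.15

f, ax = plt.subplots()

# Iterating over the coordinate yields one 0-dimensional DataArray per model.
for model in tas_degC.models:
    # .item() extracts the model name as a plain Python string
    tas_degC.sel(models=model.item()).plot(ax=ax, color="0.5", lw=0.8)

ax.set_ylabel("temperature (°C)")
```

Passing the same color to every call is what stops matplotlib from cycling through its default colors.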
Calculate and plot anomalies
The models are very different from each other. They have very different global mean temperatures - to make them more comparable we can subtract the mean temperature over a common period for each model and calculate anomalies.
Select the time period 1850 to 1900 from tas and calculate the mean over the time (i.e. "year") dimension.
Subtract the mean over 1850 - 1900 from tas and name the result tas_anom.
Plot all models again.
We want to add the multi model mean as well. Calculate the mean over all "models" of tas_anom and add it to the same plot. Color the line black. This needs to be outside of the for loop.
Create a figure with one subplot the way we learned it in the tutorial and pass ax=ax to the plot methods.
Add a title, x- and y- labels to the plot.
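The anomaly calculation and the multi-model mean can be sketched on synthetic data as follows. The three models, their offsets and their warming rates are invented so that the example runs on its own; the colors and labels are likewise only one possible choice.

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Toy data (invented): three models with different offsets and warming.
years = np.arange(1850, 2101)
models = ["model_a", "model_b", "model_c"]
offsets = np.array([286.5, 287.5, 288.5])  # absolute level in K
rates = np.array([3.0, 4.5, 6.0])          # warming in K between 1850 and 2100
t = (years - 1850) / 250.0
data = offsets[None, :] + rates[None, :] * t[:, None] ** 2
tas = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": models},
    name="tas",
)

# Mean over the reference period, one value per model.
ref = tas.sel(year=slice(1850, 1900)).mean("year")

# xarray broadcasts along "year", so each model gets its own mean subtracted.
tas_anom = tas - ref

# Multi-model mean: average over the "models" dimension.
mmm = tas_anom.mean("models")

f, ax = plt.subplots()
for model in tas_anom.models:
    tas_anom.sel(models=model.item()).plot(ax=ax, color="0.6", lw=0.8)

# Outside the loop: draw the multi-model mean once, in black.
mmm.plot(ax=ax, color="black", lw=2)

ax.set_title("Global mean temperature anomaly (toy data)")
ax.set_xlabel("year")
ax.set_ylabel("anomaly w.r.t. 1850-1900 (°C)")
```

Note how the very different absolute levels (the offsets) disappear in the anomalies, while the different warming rates remain visible.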
We now have a plot of historical and projected global mean temperature change from 1850-2100 for 20 different climate models. What can we learn from this figure? Note your findings in a new cell.
How different are the models at the beginning?
How different are they at the end of the 21st century?
How much did the models approximately warm on average?
As mentioned, the future part of the simulations features a scenario with high radiative forcing and a strong warming. All the simulations were driven with the same boundary conditions. However, the models project very different temperatures at the end of the century - despite being forced with the same conditions. This poses a problem - which model should we believe?
One option is to consider the mean over all models. For example for the period 2081-2100 and for this specific selection of models we would get a multi-model mean warming of 4.7°C. For the remainder of this part we will look a bit deeper at models that show different warming and investigate the models with the highest and lowest warming, respectively.
The newest IPCC report takes a different approach to answer what temperature is likely at the end of the 21st century, given a certain scenario. They combine “different lines of evidence” - in addition to model data they also take observational data into account. Based on this they get a best estimate warming of 4.4°C at the end of the 21st century (with a very likely range of 3.3 to 5.7°C).
Note that the chosen scenario - SSP5-8.5 - represents “the upper boundary of the range of scenarios described in the literature”. Indeed, the SSP5-8.5 scenario might be too extreme, even if no or few actions are taken to reduce greenhouse gas emissions. (But a more “realistic” high emission scenario also still projects a very strong warming.)
Create a function to calculate anomalies
Before we go on to find the models with the most/least warming we have to take care of something. We will need to calculate anomalies again, so to keep up the habit let’s create a function for this:
Write a function with the following signature:
    def calc_anomaly(ds, period=slice(1850, 1900)):
        ...
where ds is the passed Dataset and period is an optional argument.
Copy the code to calculate anomalies from above into a code cell. Make sure you change the variable names and use period.
Replace the three dots (...) with your actual code.
Re-compute the anomaly, this time using calc_anomaly(tas).
Try another reference period.
Copy the function to computation.py.
Calculate the anomaly again, this time using computation.calc_anomaly(tas).
For this you will have to restart the notebook or reload computation for your changes to take effect.
    from importlib import reload
    reload(computation)
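One possible way to fill in the function body is sketched below. It assumes the time dimension is called "year", as in this part; the docstring, the toy data and the second reference period (1995 to 2014) are additions for illustration and not part of the exercise.

```python
import numpy as np
import xarray as xr


def calc_anomaly(ds, period=slice(1850, 1900)):
    """Subtract the mean over `period` (along "year") from `ds`.

    Assumes `ds` has a dimension named "year".
    """
    return ds - ds.sel(year=period).mean("year")


# Toy data (invented) to try the function.
years = np.arange(1850, 2101)
models = ["model_a", "model_b"]
t = (years - 1850) / 250.0
data = np.array([286.5, 288.0])[None, :] + np.array([3.0, 5.0])[None, :] * t[:, None] ** 2
tas = xr.DataArray(
    data,
    dims=("year", "models"),
    coords={"year": years, "models": models},
    name="tas",
)

# Default reference period.
tas_anom = calc_anomaly(tas)

# A different reference period passed as keyword argument.
tas_anom_recent = calc_anomaly(tas, period=slice(1995, 2014))
```

Giving period a default value means the common case needs no extra argument, while other reference periods remain a one-word change.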
Why did we create another function here? Yes, we will use this function again, but it’s only one line - so it might not really be worth it. The bigger reason here is to practice writing functions. If you learn to create functions, this can simplify your life tremendously.
Find extreme models
Even after subtracting the climatological mean the models still show a different warming. They reach different temperatures at the end of the century. Some of the models have a much smaller climate sensitivity (i.e. how strongly the modelled climate reacts to a change in CO2). Here, we are interested in finding the models that warmed the most and the least.
Select the last year of tas_anom.
What is the temperature anomaly of the coldest and the warmest model in the year 2100?
Find the name of the two models (hint: idxmin & idxmax).
Select the last 20 years and calculate the mean - which model has the smallest/largest temperature anomaly over the last 20 years (i.e. over the period 2081-2100)?
Are they the same two extreme models of the year 2100?
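The difference between min/max (which return values) and idxmin/idxmax (which return coordinate labels) is sketched below on synthetic anomalies. The three models and their warming rates are invented, so the names found here say nothing about the real models.

```python
import numpy as np
import xarray as xr

# Toy anomalies (invented): three models that warm at different rates.
years = np.arange(1850, 2101)
models = ["model_a", "model_b", "model_c"]
rates = np.array([3.0, 4.5, 6.0])
t = (years - 1850) / 250.0
tas = xr.DataArray(
    287.0 + rates[None, :] * t[:, None] ** 2,
    dims=("year", "models"),
    coords={"year": years, "models": models},
    name="tas",
)
tas_anom = tas - tas.sel(year=slice(1850, 1900)).mean("year")

# Last year: position-based selection with isel.
last_year = tas_anom.isel(year=-1)

# min/max give the values ...
print(last_year.min().item(), last_year.max().item())

# ... idxmin/idxmax give the label along "models" where they occur.
coldest = last_year.idxmin("models").item()
warmest = last_year.idxmax("models").item()
print(coldest, warmest)

# Same question for the mean over the last 20 years.
mean_last20 = tas_anom.sel(year=slice(2081, 2100)).mean("year")
coldest_20 = mean_last20.idxmin("models").item()
warmest_20 = mean_last20.idxmax("models").item()
print(coldest_20, warmest_20)
```

In this smooth toy data the single year and the 20-year mean pick the same models; with real model output, year-to-year variability can make the two answers differ.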
What have you learned so far?
Reflect back on Part 2. Note what you learned at the end of the notebook.
What have you learnt about temperature projections of climate models?
This concludes Part 2.