Evaluation of model performance

Why measure model performance?

Climate models are the main tools available for investigating the response of the Earth's climate to various external and internal changes (‘forcings’, such as changes to solar radiation or volcanic eruptions), for making climate predictions on seasonal time scales, and for making projections of future climate. For this reason it is important to evaluate the performance of these models.

Model evaluation determines how well climate models represent historical climate and forms an integral part of the confidence-building exercise for climate change projections. The assumption is that the better models perform over the historical (observed) period, the more confidence can be assigned to their projected changes.

What is usually done?

The direct approach to model evaluation is to compare climate model output with observations and analyse the resulting difference. Where possible, averages over the same time period in both models and observations are compared.
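
As a minimal sketch of this direct comparison, the following Python snippet averages a model series and an observed series over the same period and reports the difference (the bias). The function name `mean_bias`, the variable names, and the synthetic annual temperatures are illustrative assumptions, not part of any particular evaluation toolkit.

```python
import numpy as np

def mean_bias(model, obs, model_years, obs_years, start, end):
    """Model-minus-observation difference of the means over start..end (inclusive)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    model_years = np.asarray(model_years)
    obs_years = np.asarray(obs_years)
    # Select the same time period in both data sets before averaging.
    m_sel = (model_years >= start) & (model_years <= end)
    o_sel = (obs_years >= start) & (obs_years <= end)
    return float(model[m_sel].mean() - obs[o_sel].mean())

# Synthetic (made-up) annual mean temperatures in degrees Celsius.
model_years = np.arange(1950, 2001)   # model output covers 1950-2000
obs_years = np.arange(1961, 2011)     # observations cover 1961-2010
model_temp = 14.5 + 0.01 * (model_years - 1950)
obs_temp = 14.0 + 0.01 * (obs_years - 1950)

# Compare only over the period both series share.
bias = mean_bias(model_temp, obs_temp, model_years, obs_years, 1961, 2000)
print(f"Mean bias 1961-2000: {bias:.2f} degrees C")
```

In this made-up case the model is about 0.5 degrees warmer than the observations over the common period. Restricting both series to the shared years keeps the two averages comparable.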

There are several generic model evaluation approaches (see IPCC Working Group I, Chapter 9, 2013):

  1. Evaluation of overall model results: A significant development since the previous CMIP3 model evaluation is the increased use of quantitative statistical measures, referred to as performance metrics. The use of such metrics simplifies model evaluation and enables the quantitative assessment of model improvements over time.
  1. Isolation of climate processes in climate models: To understand the cause of climate model differences, it is necessary to evaluate the processes (e.g. formation of clouds or ocean currents) both in the context of the full model and in isolation. Additionally, model components related to climate processes can be evaluated separately in off-line simulations.
  1. Instrument simulators: Another approach is to calculate what a satellite would show if the satellite system were ‘observing’ the model, recreating the relationship between satellites and the real Earth. This approach is called an ‘instrument simulator’ and allows a better comparison between satellite observations and model output.
  1. Ensemble approaches: Ensemble methods are used to explore the uncertainty in climate model simulations that arises from model internal variability, boundary conditions, model structure, and different model formulations.
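
Two of the approaches listed above, performance metrics and ensemble methods, can be illustrated with a short Python sketch. It assumes root-mean-square error (RMSE) as the example performance metric and uses the ensemble mean and standard deviation as a simple measure of spread. The function names and the tiny synthetic fields are hypothetical, and the unweighted averaging is a simplification (evaluations on a global grid typically weight grid cells by area).

```python
import numpy as np

def rmse(model_field, obs_field):
    """Root-mean-square error between a model field and an observed field."""
    diff = np.asarray(model_field, dtype=float) - np.asarray(obs_field, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def ensemble_summary(members):
    """Return the ensemble mean and sample standard deviation at each point."""
    members = np.asarray(members, dtype=float)
    # Axis 0 indexes ensemble members; ddof=1 gives the sample standard deviation.
    return members.mean(axis=0), members.std(axis=0, ddof=1)

# Synthetic (made-up) values at four locations.
obs = np.array([1.0, 2.0, 3.0, 4.0])
ensemble = np.array([
    [1.0, 2.0, 3.0, 4.0],   # member that matches the observations
    [2.0, 3.0, 4.0, 5.0],   # member offset by +1 everywhere
    [0.0, 1.0, 2.0, 3.0],   # member offset by -1 everywhere
])

# One performance-metric score per ensemble member.
scores = [rmse(member, obs) for member in ensemble]
ens_mean, ens_spread = ensemble_summary(ensemble)

print("RMSE per member:", scores)
print("Ensemble mean:", ens_mean)
print("Ensemble spread:", ens_spread)
```

A single number per model, such as the RMSE here, is what makes it possible to rank models or track improvement between model generations, while the spread across members gives a rough indication of the uncertainty the ensemble is sampling.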

What are the implications?

The evaluation of model simulations of historical climate is of direct relevance to detection and attribution studies since these rely on model-derived patterns of climate response to external forcing, and on the ability of models to simulate decadal and longer-timescale internal variability.

Confidence in climate model projections is based on physical understanding of the climate system and its representation in climate models, and on a demonstration of how well models represent a wide range of processes and climate characteristics on various spatial and temporal scales. A climate model’s credibility is increased if the model is able to simulate past variations in climate.