Seasonal skill scores based on the historical performance of each calibrated NMME model and of their multimodel ensemble (1982-2010).

Skill is mapped by calendar month for seasonal lead times. Lead 1 = months 2-4, Lead 2 = months 3-5, Lead 3 = months 4-6, and Lead 4 = months 5-7 after the forecast is issued. Forecast skill scores pool start times by calendar month across the years 1982 to 2010. The observational reference datasets are CMAP-URD for precipitation and GHCN-CAMS for temperature. The models included in the assessment are: one from the Center for Ocean-Land-Atmosphere Studies/University of Miami (COLA-RSMAS-CCSM4), one from the National Aeronautics and Space Administration (NASA-GMAO-062012), three from the Geophysical Fluid Dynamics Laboratory (GFDL-CM2p1-ae04, GFDL-CM2p5-FLOR-A06, GFDL-CM2p5-FLOR-B01), two from the Canadian Meteorological Center (CMC1-CanCM3, CMC2-CanCM4), one from NOAA’s National Centers for Environmental Prediction (NCEP-CFSv2), and one from the National Center for Atmospheric Research (NCAR-CESM1).
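The lead-time convention above can be illustrated with a short Python sketch. It assumes that the month in which the forecast is issued counts as month 1, so a January forecast at Lead 1 targets February-April; this reading is an assumption, and the function names `target_season` and `season_label` are hypothetical helpers, not part of any NMME tool.

```python
MONTH_INITIALS = "JFMAMJJASOND"

def target_season(issue_month, lead):
    """Return the three calendar months (1-12) targeted by a forecast.

    Assumption: the issue month is counted as month 1, so Lead 1
    (months 2-4) starts in the calendar month after the issue month.
    """
    if not 1 <= issue_month <= 12:
        raise ValueError("issue_month must be between 1 and 12")
    if lead not in (1, 2, 3, 4):
        raise ValueError("lead must be 1, 2, 3 or 4")
    # The first target month is `lead` months after the issue month;
    # the modulo wraps seasons that cross the end of the year.
    return [(issue_month - 1 + lead + k) % 12 + 1 for k in range(3)]

def season_label(issue_month, lead):
    """Return a three-letter season label such as 'FMA'."""
    return "".join(MONTH_INITIALS[m - 1] for m in target_season(issue_month, lead))

if __name__ == "__main__":
    for lead in (1, 2, 3, 4):
        print("January issue, Lead", lead, "->", season_label(1, lead))
```

Under this assumption, a November forecast at Lead 4 targets March-May of the following year, which shows how seasons wrap across the calendar boundary.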

These skill score diagnostic maps give a sense of where and when (that is, for which issue months of the year and for which seasonal lead times) the probabilistic seasonal forecasts have the potential to provide useful information, based on hindcasting.

**Skill score definitions:**

**RPSS**: Ranked Probability Skill Scores (RPSS; Epstein (1969); Murphy (1969, 1971); Weigel et al. (2007)) quantify the extent to which the calibrated tercile-category predictions improve on climatological frequencies. RPSS values tend to be small, even for skillful forecasts. As an approximate guide, an RPSS value of 0.1 corresponds to a correlation of about 0.44 (Tippett et al. 2010).
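As a rough illustration of the definition above, the following Python sketch computes the RPSS of tercile-category probability forecasts against an equal-odds climatological reference (1/3 per category). It uses the common textbook form, RPSS = 1 - RPS(forecast) / RPS(climatology), with scores pooled over all forecasts; it is not necessarily the exact procedure used to produce these maps, and the function names and example probabilities are made up for illustration.

```python
from itertools import accumulate

def rps(probs, obs_cat):
    """Ranked probability score of one forecast.

    probs   : category probabilities, ordered below/near/above normal
    obs_cat : index (0, 1 or 2) of the category that was observed
    """
    cum_fcst = list(accumulate(probs))
    # The cumulative observation is a step function: 0 before the
    # observed category, 1 from the observed category onward.
    cum_obs = [1.0 if k >= obs_cat else 0.0 for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_fcst, cum_obs))

def rpss(forecasts, observed):
    """RPSS of a set of forecasts relative to equal-odds climatology."""
    n_cat = len(forecasts[0])
    clim = [1.0 / n_cat] * n_cat
    rps_fcst = sum(rps(p, o) for p, o in zip(forecasts, observed))
    rps_clim = sum(rps(clim, o) for o in observed)
    return 1.0 - rps_fcst / rps_clim

if __name__ == "__main__":
    # Hypothetical forecasts for three years and the observed categories.
    forecasts = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5], [0.3, 0.4, 0.3]]
    observed = [0, 2, 1]
    print(round(rpss(forecasts, observed), 2))
```

A forecast that always issues climatological probabilities scores 0, a perfect categorical forecast scores 1, and forecasts worse than climatology score below 0.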

**Forecast Skill Scores:** Global 1° Multi-Model Ensemble forecast skill scores per month of the year over the period 1982-2010 are available here.

