Singular value decomposition (SVD) is quite possibly the most widely used multivariate statistical technique in the atmospheric sciences.
The technique was first introduced to meteorology in a 1956 paper by Edward Lorenz, in which he referred to the process as empirical orthogonal function (EOF) analysis.
Today, it is also commonly known as principal-component analysis (PCA). All three names are still used, and refer to the same set of procedures within the Data Library.
The purpose of singular value decomposition is to reduce a dataset containing a large number of values to a dataset containing significantly fewer values,
but which still contains a large fraction of the variability present in the original data.
Data in the atmospheric and geophysical sciences often exhibit large spatial correlations. SVD analysis results in a more compact representation
of these correlations, especially with multivariate datasets, and
can provide insight into the spatial and temporal variations exhibited in the fields of data being analyzed.
There are a few caveats one should be aware of before computing the SVD of a set of data. First, the data must consist of anomalies. Secondly, the data should be de-trended.
When trends in the data exist over time, the first structure often captures them. If the purpose of the analysis is to find spatial correlations independent of trends, the
data should be de-trended before applying SVD analysis.
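The two preparation steps described above can be sketched outside the Data Library as follows. This is a minimal illustration in Python with NumPy, not the Data Library's own procedure. The function names `monthly_anomalies` and `detrend_linear`, the synthetic `field`, and the choice of a straight-line trend are all assumptions made for the example.

```python
import numpy as np

def monthly_anomalies(data, months):
    """Subtract each calendar month's long-term mean from every sample of that month."""
    anomalies = np.empty_like(data, dtype=float)
    for m in np.unique(months):
        sel = months == m
        anomalies[sel] = data[sel] - data[sel].mean(axis=0)
    return anomalies

def detrend_linear(data):
    """Remove the least-squares straight line from each column (grid point)."""
    t = np.arange(data.shape[0], dtype=float)
    design = np.column_stack([t, np.ones_like(t)])   # slope and intercept terms
    coeffs, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ coeffs

# Hypothetical input: 10 years of monthly values at 4 grid points,
# made of a mean, a seasonal cycle, a slow trend, and noise
rng = np.random.default_rng(0)
n_time, n_space = 120, 4
months = np.arange(n_time) % 12
seasonal = np.cos(2 * np.pi * months / 12)[:, None]
trend = 0.01 * np.arange(n_time)[:, None]
field = 15 + 3 * seasonal + trend + rng.normal(scale=0.5, size=(n_time, n_space))

# Anomalies first, then de-trending; the result is ready for an SVD
prepared = detrend_linear(monthly_anomalies(field, months))
```

After both steps, each grid point's series has zero mean and no linear trend, so the leading structure is less likely to be taken up by a trend.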
* * * * * * * * * *
Example: Perform a singular value decomposition of reconstructed sea surface temperature anomaly data in the North Atlantic for the months of December, January, and February from 1870 to 2004.
Locate Dataset and Variable
Compute Monthly Anomalies
Select Temporal and Spatial Domains
The time range entered will select only December, January, and February values for each year.
Compute Singular Value Decomposition
{Y cosd} [X Y] [T] svd
The svd function computes the singular value decomposition of the SST dataset, weighted by the cosine of the latitude. Spatial data are often weighted by the cosine of the latitude because meridians converge toward the poles, so grid cells at higher latitudes cover less area. A weight term, however, is not necessary to complete the SVD analysis.
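For readers who want to see the weighting idea in isolation, here is a rough Python/NumPy sketch. It assumes a common convention in which the data are multiplied by the square root of cos(latitude), so that variances are weighted by cos(latitude) itself. Whether the Data Library applies its weights in exactly this way is not stated here. The grid, the random `anomalies`, and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical grid and anomaly field: 60 time steps on a 5 x 8 (lat x lon) grid
rng = np.random.default_rng(1)
lats = np.linspace(10.0, 70.0, 5)
n_time, n_lat, n_lon = 60, lats.size, 8
anomalies = rng.normal(size=(n_time, n_lat, n_lon))
anomalies -= anomalies.mean(axis=0)   # zero time mean at every grid point

# Area weight per grid row; the square root is applied to the data so that
# variances and covariances end up weighted by cos(latitude) itself
weights = np.cos(np.deg2rad(lats))
weighted = anomalies * np.sqrt(weights)[None, :, None]

# Flatten space into one axis: rows are grid points, columns are times
matrix = weighted.reshape(n_time, n_lat * n_lon).T

# Thin SVD: columns of `structures` are spatial patterns, rows of
# `time_series` are their time evolution, `singular_values` set their scale
structures, singular_values, time_series = np.linalg.svd(matrix, full_matrices=False)
```

Dropping the `np.sqrt(weights)` factor gives the unweighted analysis, which corresponds to omitting the weight term.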
Five new variables appear under the Datasets and Variables subheading: normalized eigenvalues, structures, singular values, time series, and weights.
While all of the variables are associated with the same new coordinate system generated by the SVD, each contains a different piece of information about the system.
View Normalized Eigenvalues
The first normalized eigenvalue is 0.233, the second is 0.151, and the third is 0.139. Recall that a normalized eigenvalue represents the fraction of variance explained by the structure associated with that eigenvalue. Therefore, the first structure explains 23% of the variance, the second structure 15%, and so on. The table lists 402 structures, yet the first three account for over 50% of the variance.
Return to Dataset Page
This will remove the normalized eigenvalues variable selection and return you to the SVD page.
View Structures
This is an image of the first structure, which explains about 23% of the total variance present in the original data.
Recall that the structures have been normalized, and as a result, are unitless quantities.
Note the large negative values off the coast of West Africa. This variability is caused by an ocean-atmosphere coupling system described in the third example.
This is an image of the second structure, which explains 15% of the total variance present in the original data. Notice the large negative values off the east coast of the United States that extend into the Central Atlantic.
These large values may be produced, in part, by the Gulf Stream current, which causes annual variability of SSTs in the region. An image of the Gulf Stream current is provided below. The large values present in the 2nd EOF structure above and the vectors that represent the Gulf Stream current in the image below appear to overlap.
This region is also aligned with the jet stream, a narrow area where weather patterns move off the coast and cause additional variability in SSTs.
The large values in the 2nd structure may also be caused by an atmospheric circulation
pattern known as the North Atlantic Oscillation.
Return to Dataset Page
This will remove the structures variable selection and return you to the SVD page.
View Time Series
*NOTE: The singular values variable can be accessed the same way as the other three variables shown above.
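The relationship between singular values and the normalized eigenvalues viewed earlier in this example can be sketched in a few lines of Python. The sketch assumes the usual convention that each structure's variance is proportional to the square of its singular value. The helper name `normalized_eigenvalues` is hypothetical. The three numbers are the leading normalized eigenvalues quoted above.

```python
import numpy as np

def normalized_eigenvalues(singular_values):
    """Fraction of total variance carried by each structure (fractions sum to 1)."""
    variance = np.asarray(singular_values, dtype=float) ** 2
    return variance / variance.sum()

# Leading normalized eigenvalues quoted in this example
leading = np.array([0.233, 0.151, 0.139])
cumulative = np.cumsum(leading)   # running total of explained variance
```

The running total reaches 0.523 after the third structure, which is the "over 50%" figure mentioned in the text.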
* * * * * * * * * *
Example: Perform a singular value decomposition analysis of mean sea level pressure anomaly data in the North Atlantic for the months of December, January, and February from 1950 to 2004.
Locate Dataset and Variable
Compute Monthly Anomalies
Select Temporal and Spatial Domains
The time range entered will select only December, January, and February values for each year.
Compute Singular Value Decomposition
{Y cosd} [X Y] [T] svd
The svd function computes the singular value decomposition of the mean sea level pressure dataset, weighted by the cosine of the latitude.
Find Eigenvalue of 1st Structure
The first normalized eigenvalue is 0.402, the second is 0.278, and the third is 0.100. Normalized eigenvalues represent the fraction of variance explained by the structure associated with that eigenvalue. In this example, we will only be concerned with the first eigenvalue, which explains 40.2% of the total variance.
Return to Dataset Page
This will remove the normalized eigenvalues variable selection and return you to the SVD page.
View 1st Structure
This is an image of the first structure, which explains 40.2% of the total variance present
in the original data. The large positive values centered around 45° N and the
large negative values centered around 65° N are indicative of two regions whose
mean sea level pressures are generally inversely related. This system is a well-known
low-frequency atmospheric circulation pattern called the North Atlantic Oscillation (NAO). The
NAO is characterized by large-scale MSLP variability associated with a subtropical high /
polar low system over the North Atlantic. During a positive NAO, the subtropical high is
stronger than usual and the polar low is deeper than usual. The increased pressure gradient
causes stronger winter storms to cross over the Atlantic. During a negative NAO, the
subtropical high and polar low are both weaker than usual, resulting in fewer / less severe
storms crossing the Atlantic.
* * * * * * * * * *
Example: Correlate an SVD time series of mean sea level pressure anomalies with an SVD time series of SST anomalies in the North Atlantic for the months of December, January, and February.
Select Dataset, Variable, and Domains
*NOTE: Datasets used in the example are similar to those used in the previous two examples.
SOURCES .NOAA .NCEP-NCAR .CDAS-1 .MONTHLY .Intrinsic .MSL .pressure
yearly-anomalies
Y (10N) (70N) RANGEEDGES
T (Dec-Feb 1950-2004) VALUES
X (5W) (80W) RANGEEDGES
Compute Singular Value Decomposition
{Y cosd} [X Y] [T] svd
The svd function computes the singular value decomposition of the mean sea level pressure dataset, weighted by the cosine of the latitude.
Select Time Series Variable and 1st Eigenvector
You have selected the first eigenvector and its associated time series.
Add the 1st Structure SVD Time Series of Reconstructed SST Anomaly Data
SOURCES .NOAA .NCDC .ERSST .version2 .SST yearly-anomalies
X (5W) (80W) RANGEEDGES
Y (10N) (70N) RANGEEDGES
T (Dec-Feb 1870-2004) VALUES
{Y cosd}[X Y][T]svd
.Ts
ev (1) VALUE
The above commands add the SST anomaly data to the interface. The singular value decomposition of this data has already been performed, and the 1st eigenvector has been selected.
Correlate Datasets
[T] correlate
The above command correlates the two sets of data. The correlation coefficient is located under the Expert Mode text box in bold: 0.249616.
We can conclude there is a slight correlation between the leading MSLP anomaly and SST anomaly time series in the North Atlantic.
The correlation coefficient is not very high because the SVD analyses of the MSLP and SST datasets are independent of each other: the variability captured by the 1st SST anomaly structure, for example, may be spread across multiple MSLP anomaly structures.
There is no guarantee that the maximum amount of association between two variables will be found in two distinct principal component analysis time series.
However, a physical relationship between these two datasets, and specifically between these two structures, is well established. Atmospheric anomalies do cause SST anomalies, and vice versa. In this example, changes in MSLP sometimes cause an anomalous atmospheric cyclonic circulation
centered around 40° W and 30° N. The cyclone weakens the normal northerly winds off the west coast of Africa. As a result, coastal upwelling is reduced and positive SST anomalies occur. Scroll up the page to the first EOF structure in the first example. Notice the
large negative values off the coast of West Africa. This SST variability is associated with variations in MSLP that produce the anomalous low.
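The correlation step in this example can be illustrated with a short Python sketch. It computes a Pearson correlation over the time dimension, and it assumes that only the period common to both series is used, since the two records cover different spans. The series below are synthetic stand-ins with one value per year, not the actual MSLP and SST time series, and the function name `correlate_in_time` is hypothetical.

```python
import numpy as np

def correlate_in_time(series_a, series_b):
    """Pearson correlation coefficient of two equally long time series."""
    a = np.asarray(series_a, dtype=float)
    b = np.asarray(series_b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Hypothetical time axes (years) and leading time series for the two fields
rng = np.random.default_rng(2)
years_mslp = np.arange(1950, 2005)
years_sst = np.arange(1870, 2005)
shared_signal = rng.normal(size=years_sst.size)
ts_sst = shared_signal + rng.normal(size=years_sst.size)
ts_mslp = 0.5 * shared_signal[-years_mslp.size:] + rng.normal(size=years_mslp.size)

# Only the overlapping period can be correlated
common, idx_mslp, idx_sst = np.intersect1d(years_mslp, years_sst, return_indices=True)
r = correlate_in_time(ts_mslp[idx_mslp], ts_sst[idx_sst])
```

Because each series comes from a separate decomposition, a modest value of `r` does not rule out a stronger link spread over several structures.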
* * * * * * * * * *
Unrotated empirical orthogonal functions (EOFs) are often very useful for describing natural modes of variability in a data field, due to their spatial and temporal orthogonality, ability to extract the maximum variance from a field, and relative
simplicity.
Yet, unrotated empirical orthogonal functions generally do a poor job of isolating individual modes of variation.
This weakness is largely due to four inherent characteristics of unrotated EOFs: domain shape dependence, subdomain instability, sensitivity to sampling, and an inaccurate portrayal of the physical relationships embedded within the input data (Richman 1986).
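The spatial and temporal orthogonality mentioned above can be checked numerically. The following Python sketch uses a random, hypothetical field and NumPy's SVD. It is meant only to show what the property looks like, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
field = rng.normal(size=(30, 50))            # hypothetical: 30 grid points x 50 times
field -= field.mean(axis=1, keepdims=True)   # anomalies: zero time mean per grid point

eofs, s, pcs = np.linalg.svd(field, full_matrices=False)

spatial_gram = eofs.T @ eofs    # inner products between spatial patterns
temporal_gram = pcs @ pcs.T     # inner products between time series
```

Both products come out as identity matrices to within rounding error: every unrotated pattern is orthogonal to every other pattern, and every time series is uncorrelated with the others. This constraint is part of why a single physical mode can end up split across several structures.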
* * * * * * * * * *
Example: Perform a varimax rotation of an SVD analysis of East Pacific sea surface temperatures.
Locate Dataset and Variable
Select Temporal and Spatial Domains
Compute Singular Value Decomposition
{Y cosd} [X Y] [T] svd
The svd function computes the singular value decomposition of the SST dataset, weighted by the cosine of the latitude.
Five new variables appear under the Datasets and Variables subheading: normalized eigenvalues, structures, singular values, time series, and weights.
View Structures
The first structure is representative of the El Niño-Southern Oscillation (ENSO).
Recall that the first structure is the pattern that explains the most variability in the original set of data.
The relatively large positive values located immediately off the west coast of South America correspond to the variability in SSTs caused by upwelling during La Niña years and the lack of upwelling during El Niño years.
Notice that these values extend westward in a narrow line and, as a result, do not cover much surface area in the Pacific.
However, ENSO generally affects a greater area than depicted by this first structure. One explanation is that part of the ENSO pattern might be contained in another structure, or in multiple structures.
Return to Dataset Page
This will remove the structures variable selection and return you to the SVD page.
Perform Varimax Rotation
3 varimax
The varimax function above performs a varimax rotation using the first three eigenvectors. Changing the number before the varimax command will change the number of eigenvectors
to be entered into the function. Seven new variables appear under the Datasets and Variables subheading: varimax rotation, communalities, energy, rotated structures, singular values, time series, and weights.
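To make the rotation step less of a black box, here is a hedged Python sketch of a widely used iterative, SVD-based varimax algorithm. It is not the Data Library's implementation and does not reproduce the seven output variables listed above. The function `varimax`, the random `field`, and the choice to scale the first three structures by their singular values before rotating are assumptions made for illustration.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (points x k) loading matrix toward simple structure.

    Returns the rotated loadings and the orthogonal k x k rotation matrix.
    """
    n_points, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Proportional to the gradient of the varimax criterion
        # with respect to the rotated loadings
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_points
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                  # orthogonal factor of that matrix
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break                          # no meaningful improvement left
        objective = new_objective
    return loadings @ rotation, rotation

# Hypothetical field: 40 grid points x 60 times, as anomalies
rng = np.random.default_rng(4)
field = rng.normal(size=(40, 60))
field -= field.mean(axis=1, keepdims=True)
u, s, vt = np.linalg.svd(field, full_matrices=False)

k = 3                                      # rotate the first three structures
loadings = u[:, :k] * s[:k]                # assumption: scale by singular values
rotated, rotation = varimax(loadings)
```

The rotation is orthogonal, so the total variance carried by the three retained structures is unchanged. It is only redistributed among them, which is how pieces of one physical pattern can be gathered into a single rotated structure.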
Select Rotated Structures Variable
View Structures
Notice that the colorscale is not centered around 0°. To enhance the interpretability of the image, the colormap can be adjusted so that
the scale is centered around 0°.
Return to Dataset Page
Generate Colormap
startcolormap
-1.5 1.5 RANGE
white DarkViolet DarkViolet
-1.5 VALUE
cyan
0 VALUE
white 0 bandmax
yellow orange
0.5 VALUE
red
1.5 VALUE
firebrick endcolormap
The colorscale is depicted at the bottom of the dataset page. Values less than -1.5° are assigned the color DarkViolet and values greater than 1.5° are assigned the color firebrick.
Values of 0° are white. Missing values are also white. For more information on colorscales, see the Data Library Tutorial.
View Structures
By rotating the first three eigenvectors via the varimax method, the resulting structure is more representative of the
physical pattern (ENSO) than the unrotated EOF structure illustrated earlier in the example. Pieces of the ENSO pattern contained in the multiple unrotated principal components have been incorporated
into one rotated component. The negative values now extend farther north and south, as
well as to the west. Many times, rotating the EOFs / PCs will result in a solution that better explains the underlying physical patterns in the input data.