Analog models were used to estimate daily mean wind speed and daily wind gusts in Spain. Three datasets were used: the daily 1000 hPa geopotential height field over the North Atlantic at 12:00 UTC (Z1000) from the ERA-40 reanalysis, observed daily mean wind speed (MWS), and observed daily wind gust speed (WGU) in Spain. Principal component analysis (PCA) was used to reduce the dimensionality of the large-scale atmospheric pattern before the analog method was applied. The analog method searches the historic geopotential height record, expressed as principal component (PC) scores, for the subset of large-scale atmospheric patterns most similar to the large-scale atmospheric pattern given as input.

Any analog model needs a weighting function that measures how similar a given situation is to past situations. Here, two Euclidean metrics are defined for use in the ANPAF (Analog Pattern Finder) analog model (Figure 1): one takes into account the full set of PCs, and the other considers a truncated set of PCs. The search for analog patterns consists of finding the time t that minimizes these distances in PCA space. The result is a measure of the RMS distance between the historic PC scores and the PC scores of the input atmospheric pattern (i.e., the forecast). d1 is the measure using all PCs, and d2 is the measure using only the retained PCs, which discards the part of the variability not captured by the leading patterns and assumed to be noise. λj is the eigenvalue of the j-th PC, which measures the variance it explains, and is included to weight the variability of the different retained PCs.
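
The two metrics can be illustrated with a minimal Python sketch. It is not the ANPAF implementation: the function names, the argument names, and in particular the way the eigenvalues enter (multiplying each squared difference, normalized by the sum of the weights) are assumptions made for illustration, since the exact formulas for d1 and d2 are not reproduced in this summary.

```python
import math


def pc_distance(historic, target, eigenvalues=None, n_retained=None):
    """RMS-type distance between two PC score vectors.

    n_retained=None compares all PCs (in the spirit of d1); an integer
    compares only the leading n_retained PCs (in the spirit of d2).
    eigenvalues, if given, weight each squared difference. How the
    eigenvalues enter the metric is an assumption of this sketch.
    """
    n = len(target) if n_retained is None else n_retained
    weights = [1.0] * n if eigenvalues is None else list(eigenvalues[:n])
    weighted_sq = sum(
        w * (h - t) ** 2
        for w, h, t in zip(weights, historic[:n], target[:n])
    )
    return math.sqrt(weighted_sq / sum(weights))


def best_analog_time(history, target, **kwargs):
    """Index t of the historic day whose PC scores minimize the distance."""
    return min(
        range(len(history)),
        key=lambda t: pc_distance(history[t], target, **kwargs),
    )


# Hypothetical PC scores: three historic days, three PCs each.
history = [[1.0, 0.0, 5.0], [0.9, 0.1, -5.0], [3.0, 3.0, 0.0]]
target = [1.0, 0.0, -5.0]

t_full = best_analog_time(history, target)                    # all PCs
t_truncated = best_analog_time(history, target, n_retained=2)  # leading 2 PCs
```

In this made-up example the two metrics pick different days: with all PCs the third score dominates the comparison, while truncating to two PCs ignores it, which is the intended effect when the trailing PCs are treated as noise.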

Once the similarity scores have been obtained, the dates of the selected analogs give the associated observed wind fields, and averaging these fields yields the estimated wind field over Spain. A sensitivity study was performed to choose how the analogs are selected. Two strategies were tested: a distance threshold and a fixed number of analogs. A large distance threshold admits so many analogs that the estimate approaches climatology (for a distance threshold of 50, more than 3000 analogs are found). Conversely, a distance threshold can produce missing results when no sufficiently similar analogs are found. To avoid this, the second strategy, fixing the number of analogs, was used. The best results were obtained with between 5 and 10 analogs (Figure 2).
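
The two selection strategies and the averaging step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy distances, and the two-station wind fields are all hypothetical.

```python
def select_by_threshold(distances, threshold):
    """Indices of all historic days whose distance is within the threshold."""
    return [t for t, d in enumerate(distances) if d <= threshold]


def select_k_nearest(distances, k):
    """Indices of the k historic days with the smallest distance."""
    return sorted(range(len(distances)), key=lambda t: distances[t])[:k]


def average_fields(fields, indices):
    """Mean wind field over the selected analog days.

    Returns None when no analogs were selected, which is the
    missing-result case of the threshold strategy.
    """
    if not indices:
        return None
    n_sites = len(fields[indices[0]])
    return [
        sum(fields[t][s] for t in indices) / len(indices)
        for s in range(n_sites)
    ]


# Hypothetical distances of four historic days to the input pattern,
# and the wind speed observed at two stations on each of those days.
distances = [0.5, 2.0, 0.1, 9.0]
fields = [[4.0, 6.0], [10.0, 12.0], [2.0, 4.0], [20.0, 30.0]]

strict = average_fields(fields, select_by_threshold(distances, 0.05))
loose = average_fields(fields, select_by_threshold(distances, 100.0))
fixed = average_fields(fields, select_k_nearest(distances, 2))
```

With the strict threshold no analog qualifies and the estimate is missing; with the loose threshold every day qualifies and the estimate is simply the mean of the whole record; fixing the number of analogs always returns an estimate built from the closest days.
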
Relevancy to our research:
This method may be used in our post-processing method to estimate the PCs most similar to our input forecasts.