Eadaoin’s time at EGU 2019

On Monday I attended the session ‘Numerical weather prediction, data assimilation and ensemble forecasting’. Among other interesting talks was ‘Fraction of large forecast errors in global NWP’ by Thomas Haiden, which discussed the errors in upper-level variables at ECMWF and the large-scale errors. He found that the fraction of large errors is important for communicating forecast improvements, and that it is highly correlated with the average error. Even if the skill of an NWP model improves from day 5 to day 4 over a few years, it is still important to look at the distribution of errors and how that changes with model updates.
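To make the idea concrete for myself, here is a minimal sketch of how a fraction of large errors could be computed next to a mean error. The function name, the threshold of 2.0 and the numbers are my own made-up illustration, not the definition used in the talk.

```python
import numpy as np

def large_error_fraction(forecast, observed, threshold):
    """Fraction of cases whose absolute error exceeds `threshold` (hypothetical helper)."""
    errors = np.abs(np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float))
    return float(np.mean(errors > threshold))

# Made-up forecast/observation pairs, for illustration only
forecast = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
observed = np.array([1.5, 2.0, 2.0, 4.5, 4.0])

mean_abs_error = float(np.mean(np.abs(forecast - observed)))
fraction = large_error_fraction(forecast, observed, threshold=2.0)
```

A single bad forecast dominates both numbers here, but the fraction tells you directly how often such a case occurs.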

I also attended the session ‘Big data and machine learning in geosciences’, where there was an interesting talk, ‘Machine learning methods for predicting energy demand/production based on hydro-meteorological input’ by Konrad Bogner, which compared a number of machine learning methods for predicting energy demand and supply in the near future. These methods can also be used to estimate predictive uncertainties in the energy demand and supply chain. Methods included multivariate linear regression (MLR), multivariate adaptive regression splines (MARS), quantile regression (QR), quantile regression neural networks (QRNN), quantile random forests (QRF) and deep learning quantile regression (DLQR). The demand model includes variables such as daily measurements of temperature, precipitation, global radiation and wind speed, as well as details about weekdays and holidays.
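Several of these methods are quantile-based. Quantile regression models are generally fitted or scored with the pinball (quantile) loss, so a small sketch of that loss may help. The function name and the numbers are my own illustration and are not taken from the talk.

```python
import numpy as np

def pinball_loss(observed, predicted_quantile, tau):
    """Mean pinball loss of a predicted tau-quantile (0 < tau < 1)."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted_quantile, dtype=float)
    # Under-prediction is penalised by tau, over-prediction by (1 - tau)
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

# Made-up demand values and a constant 90% quantile forecast
observed = np.array([10.0, 12.0, 8.0])
predicted = np.array([11.0, 11.0, 11.0])
loss = pinball_loss(observed, predicted, tau=0.9)
```

For a high quantile such as 0.9, an observation above the predicted quantile costs much more than one below it.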

On Thursday I went to a session that was very relevant to my research, ‘Advances in statistical post-processing for deterministic and ensemble forecasts’. The talk ‘Statistical post-processing of dual-resolution ensemble forecasts’ by Sándor Baran looked at the skill of post-processing a dual-resolution ensemble whose members have horizontal resolutions of 18 km and 45 km. The most popular post-processing methods include Bayesian model averaging (BMA) and ensemble model output statistics (EMOS). The results showed that the improvement from post-processing dual-resolution ensembles is not as pronounced as with a single-resolution ensemble.

The next talk, by Nina Schuhen, was ‘Rapid Adjustment of Forecast Trajectories: Improving short-term forecast skill through statistical post-processing’, which was about applying post-processing techniques to older forecast runs before the next forecast is complete. The adjustment (RAFT) is based on recent observations and can be applied as soon as those observations are available. Results showed a better forecast from RAFT applied to earlier forecast runs than from newer forecast runs.

The ‘Energy meteorology’ session was on Friday. The first talk was ‘Forecasting of solar irradiance at Reunion Island using numerical weather prediction models’ by Frederik Kurzrock, who found that spatial averaging usually improves GHI forecasts.

In the session ‘Forecasting the weather and aviation meteorology’ I heard an interesting talk, ‘IMPROVER: A probabilistic, multi-model post-processing system for meteorological forecasts’ by Benjamin Ayliffe, about an open-source, probability-based post-processing system from the Met Office. It is built from multiple modular steps, including verification, thresholding to create probabilities, neighbourhood processing that is aware of land and topography, and time-lagging of individual models, among others.

An interesting piece of information I picked up during the week was that ERA5 wind speeds tend to be too low, possibly due to a roughness parameter issue.

While at EGU I presented my poster, ‘Multivariate spatial post-processing for renewable energy forecasts’. It attracted positive interest, including from energy traders in industry. I also got some good feedback recommending that I extend the lead time of my forecasts.


Journal club: 03-04-2019

Probabilistic and deterministic results of the ANPAF analog model for Spanish wind field estimations (2012).

Here analog models were used to obtain daily mean wind speed and wind gust estimations in Spain. Three datasets were used: daily 1000hPa geopotential height field over the North Atlantic at 12:00 UTC (Z1000) from ERA40 reanalysis, observational daily mean wind speed (MWS) and observational daily gust wind speeds (WGU) in Spain. Principal component analysis is used to reduce the dimensionality of the large-scale atmospheric pattern before the analog method is applied. The analog method is based on finding a PC subset of large-scale atmospheric patterns in the historic geopotential height that are most similar to a large-scale atmospheric pattern used as input.

Figure 1. Illustration of the ANPAF analog method

In any analog model a weighting function is needed that measures the similarity of a situation to past situations. Here, two Euclidean metrics are defined for use in the ANPAF (Analog Pattern Finder) analog model (Figure 1): one that takes into account the full set of PCs and another that considers a truncated set of PCs. The search for analog patterns is based on finding a time t that minimizes these distances in the PCA space. The result is a measure of the RMS distance between the historic PC scores and the input PC scores (the input atmospheric pattern, i.e. the forecast). d1 is the measure using all PCs and d2 is the measure using only the retained PCs, which excludes the part of the variability that is not contained in the leading patterns and is assumed to be noise. λj is the eigenvalue of the jth PC, which measures the variance it explains and is included to weight the variability of the different retained PCs.
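The distance calculation can be sketched roughly as follows. The paper's exact formulas are not reproduced in these notes, so the form below (an eigenvalue-weighted RMS difference of normalised PC scores) is my assumption, and the function name and example values are hypothetical.

```python
import numpy as np

def analog_distances(historic_scores, input_scores, eigenvalues, n_retained=None):
    """Eigenvalue-weighted RMS distance between each historic day and the input.

    historic_scores: (n_times, n_pcs) normalised PC scores of past days
    input_scores:    (n_pcs,) normalised PC scores of the input pattern
    eigenvalues:     (n_pcs,) variance of each PC, used as weights (assumed form)
    n_retained:      None uses all PCs (d1-like); K uses the first K PCs (d2-like)
    """
    historic_scores = np.asarray(historic_scores, dtype=float)
    input_scores = np.asarray(input_scores, dtype=float)
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    k = historic_scores.shape[1] if n_retained is None else n_retained
    diff = historic_scores[:, :k] - input_scores[:k]
    return np.sqrt(np.mean(eigenvalues[:k] * diff**2, axis=1))

# Made-up scores for three historic days and three PCs
historic = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 2.0], [3.0, 4.0, 0.0]])
target = np.array([0.0, 0.0, 0.0])
lam = np.array([4.0, 1.0, 0.25])

d1 = analog_distances(historic, target, lam)                 # all PCs
d2 = analog_distances(historic, target, lam, n_retained=2)   # truncated set
best_day = int(np.argmin(d2))                                # time t minimising the distance
```

In this toy case the third PC only changes the distance of the second day, which is the kind of contribution d2 discards as noise.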

Figure 2. RMSE values of the Z1000 estimations versus analog number applying different distances, d1 (blue continuous line) and d2 (red dashed line)

Once the similarity scores have been obtained, their corresponding dates give the associated wind fields, and averaging these analogs gives an estimated wind field over Spain. A sensitivity study is performed to choose the best analogs. Two strategies are used: a distance threshold and a fixed number of analogs. Large distance thresholds can correspond to climatological estimations (for a distance threshold of 50, more than 3000 analogs are found). In some cases a distance threshold gives missing results because no sufficiently similar analogs are found. To avoid this, the second strategy of fixing the number of analogs is used. Between 5 and 10 analogs was found to be the best number to use (Figure 2).
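The two selection strategies can be compared in a few lines. This is only a sketch under my own assumptions: the function name, the plain unweighted average and the toy numbers are illustrative and not the paper's implementation.

```python
import numpy as np

def estimate_from_analogs(distances, wind_fields, n_analogs=None, max_distance=None):
    """Average the wind fields of the selected analog days.

    Give exactly one of n_analogs (fixed number) or max_distance (threshold).
    Returns None when a distance threshold leaves no analogs at all.
    """
    distances = np.asarray(distances, dtype=float)
    wind_fields = np.asarray(wind_fields, dtype=float)
    if (n_analogs is None) == (max_distance is None):
        raise ValueError("give exactly one of n_analogs or max_distance")
    if n_analogs is not None:
        chosen = np.argsort(distances)[:n_analogs]           # closest days first
    else:
        chosen = np.flatnonzero(distances <= max_distance)
        if chosen.size == 0:
            return None                                      # the 'missing result' case
    return wind_fields[chosen].mean(axis=0)

# Made-up distances for four historic days and wind speeds at two stations
distances = np.array([0.5, 3.0, 1.0, 8.0])
wind_fields = np.array([[4.0, 6.0], [10.0, 12.0], [6.0, 8.0], [20.0, 20.0]])

fixed = estimate_from_analogs(distances, wind_fields, n_analogs=2)
strict = estimate_from_analogs(distances, wind_fields, max_distance=0.1)
loose = estimate_from_analogs(distances, wind_fields, max_distance=100.0)
```

The strict threshold returns nothing, while the very loose one simply averages every day, which is the climatological estimate mentioned above.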

Relevancy to our research:

This method may be used in our estimation of the PCs most similar to our input forecasts in our post-processing method.


Journal club: 29-11-2018

Improved very short-term spatio-temporal wind forecasting using atmospheric regimes (2018).

Here a regime-switching vector autoregressive (VAR) method for very short-term wind speed forecasting (1-6 hours ahead) at multiple locations with regimes based on large-scale meteorological phenomena is presented.

Principal component analysis is first performed on the surface wind, sea-level pressure and 500 hPa geopotential height fields from the MERRA-2 reanalysis dataset. Self-organising maps followed by k-means clustering are then used to group the data into atmospheric modes. Three atmospheric modes are found to be optimal for the case study of 6 years of measurements from 23 weather stations in the UK. Mode 1 is associated with anticyclonic circulation and moderate wind speed conditions, mode 2 is associated with low wind speeds and calm conditions over the UK, and mode 3 is linked with cyclonic circulation patterns and relatively high wind speed conditions (Figure 1). Relatively small changes in large-scale atmospheric circulation may lead to different surface wind fields over the UK, which is important for wind energy applications.
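A simplified version of this pipeline can be sketched with NumPy alone. Note that I have left out the self-organising map step and gone straight from PCA to k-means, so this is not the authors' exact procedure. The data are synthetic and all function names are my own.

```python
import numpy as np

def pca_scores(fields, n_components):
    """Project anomalies (time x grid point) onto the leading PCs via SVD."""
    anomalies = fields - fields.mean(axis=0)
    u, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def kmeans(points, n_clusters, n_iter=50):
    """Plain k-means (Lloyd's algorithm) with farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < n_clusters:
        nearest = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(nearest)])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):          # keep the old centroid if a cluster empties
                centroids[k] = points[labels == k].mean(axis=0)
    return labels, centroids

# Synthetic "fields": 3 regimes x 40 days, 10 grid points each (made-up data)
rng = np.random.default_rng(0)
fields = np.concatenate(
    [rng.normal(loc=m, scale=0.3, size=(40, 10)) for m in (-5.0, 0.0, 5.0)]
)
scores = pca_scores(fields, n_components=2)
modes, centroids = kmeans(scores, n_clusters=3)
```

Each day ends up with a mode label, which is what the regime-switching models further down condition on.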

Figure 1. Visualisation of sea‐level pressure field (SLP), geopotential height at 500 hPa (Z500), and wind speed (WS) in units of ms−1 for the 3 atmospheric mode centroids.

VAR is used to capture advantages in using lagged measurements for spatially dispersed sites.

A range of VAR models is tested:

VAR_d – time of day is included as dummy variables as wind speed exhibits diurnal seasonality.

VAR_d_m – atmospheric mode dummies are also included.

CVAR_d – model parameters may themselves be dependent on atmospheric mode resulting in a conditional VAR model.
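To see what the simplest of these looks like in practice, here is a minimal least-squares sketch of a first-order VAR with hour-of-day dummies, in the spirit of VAR_d. The lag order of 1, the function name and the simulated data are my assumptions and are not taken from the paper.

```python
import numpy as np

def fit_var_d(series, hours, n_hours=24):
    """Least-squares fit of y_t = A y_{t-1} + c[hour_t] + e_t.

    series: (n_times, n_sites) wind speeds; hours: (n_times,) hour of day.
    Returns A with shape (n_sites, n_sites) and c with shape (n_hours, n_sites).
    """
    series = np.asarray(series, dtype=float)
    hours = np.asarray(hours)
    n_sites = series.shape[1]
    lagged = series[:-1]                       # y_{t-1} for every target time
    dummies = np.eye(n_hours)[hours[1:]]       # one-hot hour of the target time
    design = np.hstack([lagged, dummies])
    coef, *_ = np.linalg.lstsq(design, series[1:], rcond=None)
    return coef[:n_sites].T, coef[n_sites:]

# Simulate two sites with a known coefficient matrix and a diurnal cycle (made-up)
rng = np.random.default_rng(1)
true_a = np.array([[0.6, 0.2], [0.1, 0.5]])
hours = np.arange(5000) % 24
diurnal = np.sin(2 * np.pi * np.arange(24) / 24)[:, None] * np.array([1.0, 0.5])
series = np.zeros((5000, 2))
for t in range(1, 5000):
    series[t] = true_a @ series[t - 1] + diurnal[hours[t]] + rng.normal(scale=0.3, size=2)

est_a, est_c = fit_var_d(series, hours)
```

Under the same assumptions, VAR_d_m would add one-hot atmospheric mode columns to the design matrix, and CVAR_d could be approximated by fitting a separate model to the time steps of each mode.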

The RMSE for a 1-hour-ahead forecast is reduced by 0.3% – 4.1%, and for a 6-hour-ahead forecast the improvement is about 3.1% compared to the most competitive benchmark. The improvement depends on the mode, with the largest errors associated with mode 3 (cyclonic conditions).

Relevancy to our research:

The authors suggest that, to run this method operationally, the atmospheric mode could be determined from forecasts produced by NWP. They also note that the work has only been applied to wind speed forecasting and that further work is required to quantify the benefits for wind power forecasting. “Defining atmospheric modes on numerical weather predictions in order to forecast the future mode, for example, could enhance both very short-term and day-ahead wind and wind power forecasts.”


Post-processing Techniques for Renewable Energy Forecasts

Here is a review of the most common statistical post-processing methods used in renewable energy forecasting. The main methods discussed here are: machine learning, model output statistics, Kalman filters, regression models, historical analogs and a process of post-processing depending on weather typing.