Deep Learning for Forecasting Solar Energetic Proton Events

In this work, we implemented short-term and long-term forecasting models to predict the integral flux of solar protons at 1 AU using a Bi-directional Long Short-Term Memory (Bi-LSTM) neural network. LSTM networks are a type of Recurrent Neural Network (RNN) used in deep learning to capture contextual information through a loop that carries information from one time step to the next. They control this flow by learning when to remember and when to forget, through the weights of their gates, in particular the forget gate. In a Bi-directional LSTM network, the input sequence is processed in both directions so that both past and future context is preserved, which gave better results in our case than the regular LSTM model.
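The gating and the two-direction pass described above can be illustrated with a minimal single-unit sketch. It is only an illustration under simplifying assumptions (scalar input, scalar state, hand-set weights). The names `lstm_step`, `run_lstm`, `run_bilstm`, and `toy_weights` are hypothetical and are not taken from the model used in this work.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def toy_weights(value=0.5):
    # Hypothetical hand-set weights; a real network learns these.
    keys = ["wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg"]
    return {k: value for k in keys}

def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell (scalar input and state)."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate state
    c = f * c_prev + i * g   # keep part of the old memory, add new content
    h = o * math.tanh(c)     # expose part of the memory as the output
    return h, c

def run_lstm(seq, w):
    """Run the cell over a sequence, returning the output at every step."""
    h, c = 0.0, 0.0
    outputs = []
    for x in seq:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs

def run_bilstm(seq, w_fwd, w_bwd):
    """Pair a forward pass with a backward pass, aligned per time step."""
    fwd = run_lstm(seq, w_fwd)
    bwd = run_lstm(seq[::-1], w_bwd)[::-1]
    return list(zip(fwd, bwd))

states = run_bilstm([0.1, 0.2, 0.1], toy_weights(), toy_weights())
```

Each element of `states` holds the forward output, which has seen only earlier steps, and the backward output, which has seen only later steps.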

We used 9 features with daily cadence: the sunspot number, obtained from the World Data Center for the production, preservation, and dissemination of the international sunspot number; the solar radio flux density, the interplanetary magnetic field (IMF), the solar wind (SW) speed, and the proton integral flux in 3 energy channels (>10 MeV, >30 MeV, and >60 MeV), obtained from the OMNI database; and the long- and short-wavelength bands of the X-ray flux, obtained from the GOES database. These features were chosen because the dynamics of solar activity influence the proton flux as the protons travel through the inner heliosphere. Using correlation analysis, we selected the 6 features most strongly correlated with the proton flux for each energy channel.
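A correlation-based selection of this kind can be sketched as follows. The text does not state which correlation coefficient was used or how ties and signs were handled. The sketch therefore assumes the Pearson coefficient ranked by absolute value, and the function names and toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant series carries no correlation information
    return cov / (sx * sy)

def top_k_features(features, target, k=6):
    """Return the k feature names with the largest |correlation| to target.

    `features` maps a feature name to its time series (assumed layout).
    """
    scores = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data only; not real solar measurements.
toy_target = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_features = {
    "a": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated
    "b": [5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
    "c": [1.0, 3.0, 2.0, 5.0, 4.0],   # partially correlated
    "d": [1.0, 1.0, 1.0, 1.0, 1.0],   # constant
}
selected = top_k_features(toy_features, toy_target, k=3)
```

Ranking by absolute value keeps strongly anti-correlated inputs, which can be as informative to the network as positively correlated ones.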

The data were split chronologically into 75% (from 1976 to 2008) for training the model and 25% (from 2008 to 2019) for validating its performance. A multivariate, multi-step Bi-LSTM model based on the Multiple Output Strategy was implemented to forecast the integral proton flux over the following 6, 12, and 24 hours in the short-term mode, and over the following 3, 5, and 7 days in the long-term mode. The network consists of two Bi-LSTM layers and one Dense layer, with 32 cells each, and was trained with a batch size of 512 for 70 epochs. The input horizon is 730 steps (~24 months) for daily data and 720 steps (~30 days) for hourly data.
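The chronological split and the windowing needed for a multiple-output forecast can be sketched as below. This is a sketch under stated assumptions rather than the preprocessing used in this work. It assumes the rows are feature vectors already ordered in time, and the names `chronological_split` and `make_windows` are hypothetical.

```python
def chronological_split(rows, train_fraction=0.75):
    """Split time-ordered rows without shuffling: earliest part for training."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def make_windows(rows, target_index, n_in, n_out):
    """Build (input window, multi-step target) pairs.

    Each input is n_in consecutive rows with all features. Each target is
    the next n_out values of one column, so the whole forecast horizon is
    predicted at once (Multiple Output Strategy).
    """
    inputs, targets = [], []
    for start in range(len(rows) - n_in - n_out + 1):
        end = start + n_in
        inputs.append(rows[start:end])
        targets.append([r[target_index] for r in rows[end:end + n_out]])
    return inputs, targets

# Toy series with two columns; column 1 plays the role of the proton flux.
toy_rows = [[t, 10 * t] for t in range(10)]
train_rows, val_rows = chronological_split(toy_rows)
X, y = make_windows(toy_rows, target_index=1, n_in=4, n_out=3)
```

With the settings quoted above, `n_in` would be 730 and `n_out` would be 3, 5, or 7 for the daily data.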

We show below an example of the model training and forecasting performance for the 3-day forecast of the proton flux (PF) at >10 MeV. To summarise, we implemented models for short-term and long-term forecasting of the integral proton flux in 3 energy channels, based on data from the previous 4 solar cycles and using 7 input features that reflect the state of solar activity. The mean squared error (MSE) of the prediction for the PF>30 MeV channel is generally the highest in both the long-term and short-term forecasting, while the MSE for the PF>10 MeV channel is the lowest.

For the long-term forecasting, the MSE increases at larger future horizons, as expected, except for the PF>30 MeV and PF>60 MeV channels. The MSE for the PF>10 MeV channel agrees to 3 digits between the 5-day and 7-day forecasts. For the short-term forecasting, the MSE for the PF>10 MeV channel agrees to 3 digits between the 12-hr and 24-hr forecasts, and the same applies to the PF>30 MeV channel. The MSE for the PF>60 MeV channel agrees to 3 digits across all 3 forecasting windows, which means that changing the future horizon has very little impact on the model performance. The model still needs further fine-tuning, and its performance can potentially be greatly improved. This work is being prepared for submission to a refereed journal.
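One way to compare errors across forecast steps is sketched below. This is an assumption about how such a comparison could be made, not the evaluation code of this work. "Agrees to 3 digits" is interpreted here as equality after rounding to 3 decimal places, and the function names and toy numbers are hypothetical.

```python
def mse_per_step(y_true, y_pred):
    """Mean squared error at each forecast step, averaged over samples.

    Both arguments are lists of samples; each sample is a list with one
    value per forecast step.
    """
    n_samples = len(y_true)
    n_steps = len(y_true[0])
    return [
        sum((y_true[i][s] - y_pred[i][s]) ** 2 for i in range(n_samples)) / n_samples
        for s in range(n_steps)
    ]

def agree_to_digits(a, b, digits=3):
    """True if two error values match after rounding to `digits` decimals."""
    return round(a, digits) == round(b, digits)

# Toy values only; not results from the model.
toy_true = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
toy_pred = [[1.0, 2.0, 5.0], [2.0, 4.0, 4.0]]
step_errors = mse_per_step(toy_true, toy_pred)
```

For these toy values the error grows with the forecast step, which is the behaviour expected when the horizon is extended.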

Performance of the Bi-LSTM model for forecasting SEP fluxes at 1 AU. Panels 1 and 2 show the mean squared error and mean absolute error metrics during the model training, demonstrating convergence. Panel 3 shows a comparison between the observed and forecasted proton fluxes. Panel 4 shows the same comparison for a short segment of the validation dataset.