Forecasting Solar Energetic Proton Events with a Bi-directional Long Short-Term Memory Neural Network

Solar energetic particles (SEPs) are mainly protons and originate at the Sun during solar flares or at coronal shock waves. Forecasting the SEP flux is critical for several operational sectors, such as communication and navigation systems, space exploration missions, and aviation, because the hazardous radiation can endanger the health of astronauts, aircrew, and passengers, and can damage the delicate electronic components of satellites, space stations, and ground power stations. Predicting the SEP flux may therefore help mitigate the impact of this serious transient space weather phenomenon on the near-Earth space environment. Numerous SEP prediction models are being developed using a variety of approaches, including empirical, probabilistic, physics-based, and AI-based models.

In recent work (Nedal et al. 2023), we used a bi-directional long short-term memory (BiLSTM) neural network architecture to train SEP forecasting models for three standard integral GOES channels (>10 MeV, >30 MeV, >60 MeV) and three forecast windows (1, 2, and 3 days ahead), based on daily data from the OMNIWeb database covering 1976 to 2019 (Figure 1). Because SEP variability is modulated by the solar cycle, we selected input parameters that capture both the short-term and the long-term variability of solar activity: the F10.7 index, the sunspot number, the logarithm of the X-ray flux, the solar wind speed, and the average strength of the interplanetary magnetic field. The data were split into training, validation, and testing sets (shown in different colors in Figure 1). The results were validated on an out-of-sample testing set and benchmarked against other types of models, which our model outperformed (Figure 2).
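To illustrate how daily series can be framed for 1-, 2-, and 3-day-ahead forecasting, the following is a minimal sketch of sliding-window sample construction. It is not taken from the study: the function name `make_windows`, the `lookback` length, and the toy data are assumptions made for illustration, and the actual preprocessing in Nedal et al. (2023) may differ.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    target: Sequence[float],
    lookback: int,
    horizon: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Pair each block of `lookback` consecutive days of inputs with the
    target value `horizon` days after the last day of that block.

    `lookback` is a hypothetical parameter; the study's value is not given here.
    """
    if len(features) != len(target):
        raise ValueError("features and target must cover the same days")
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be positive")

    X: List[List[List[float]]] = []
    y: List[float] = []
    last_start = len(target) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback  # exclusive end of the input block
        X.append([list(row) for row in features[start:end]])
        y.append(target[end - 1 + horizon])
    return X, y


# Toy data: 10 days, 5 input features per day (values are placeholders).
toy_features = [[float(day)] * 5 for day in range(10)]
toy_flux = [float(day) for day in range(10)]

# One set of samples per forecast window (1, 2, and 3 days ahead).
samples = {h: make_windows(toy_features, toy_flux, lookback=3, horizon=h)
           for h in (1, 2, 3)}
```

Each entry of `samples` holds input blocks of shape (lookback, number of features) and the matching targets, which is the layout recurrent layers such as a BiLSTM typically expect.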

Figure 1. Data splitting for all input features, showing the training, validation, and testing sets. Daily data from 1976-12-25 00:00 to 2019-07-30 00:00. The gray shading labels the solar cycles from SC21 to SC24.

The models performed well. The correlation coefficient between predictions and observations exceeded 0.9 (R > 0.9) for all points of the validation set across the forecasting windows and the three energy channels, and exceeded 0.7 (R > 0.7) for the observation points with fluxes > 10 pfu. An example for E > 10 MeV is shown in Figure 3. As expected, the correlation between the modeled and observed data declined as the forecast horizon increased.
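The two kinds of correlation quoted above (all points, and only the points at elevated flux) can be sketched as follows. This is an illustrative sketch rather than the study's evaluation code: it assumes R is the Pearson correlation coefficient, the function names are hypothetical, the inclusive comparison against the 10 pfu threshold is a choice made here, and whether the study correlated raw or logarithmic fluxes is not specified in this text.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def pearson_r_above(
    observed: Sequence[float],
    predicted: Sequence[float],
    threshold: float = 10.0,  # pfu; the comparison below is inclusive (assumption)
) -> float:
    """Correlation restricted to days whose observed flux is >= threshold."""
    pairs = [(o, p) for o, p in zip(observed, predicted) if o >= threshold]
    if len(pairs) < 2:
        raise ValueError("fewer than two observations at or above threshold")
    obs, pred = zip(*pairs)
    return pearson_r(obs, pred)
```

Restricting the sample to high-flux days removes the quiet-time background, which usually dominates the full set, so the second value is the more demanding test of event-time skill.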

Figure 2. Benchmarking of 10 models, showing the Huber loss for the validation and test sets.
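For readers unfamiliar with the metric in Figure 2, the Huber loss is quadratic for small residuals and linear for large ones, which makes it less sensitive to outliers than the mean squared error. Below is a minimal sketch of the standard definition; the function name and the default `delta = 1.0` are assumptions, since the value used in the study is not stated here.

```python
from typing import Sequence


def huber_loss(
    observed: Sequence[float],
    predicted: Sequence[float],
    delta: float = 1.0,  # transition point between quadratic and linear (assumed)
) -> float:
    """Mean Huber loss over paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty series of equal length")
    total = 0.0
    for o, p in zip(observed, predicted):
        residual = abs(o - p)
        if residual <= delta:
            total += 0.5 * residual * residual        # quadratic branch
        else:
            total += delta * (residual - 0.5 * delta)  # linear branch
    return total / len(observed)


# Residuals of 0.5 (quadratic branch) and 3.0 (linear branch):
# (0.125 + 2.5) / 2 = 1.3125
example = huber_loss([0.0, 0.0], [0.5, 3.0])
```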

Overall, our results demonstrate the potential of BiLSTM neural networks for forecasting SEP integral fluxes. Depending on the resolution of the input data, the model can provide long-term as well as short-term predictions, which can be used to anticipate the behavior of the near-Earth space environment. Such predictions are important for space weather forecasting, which is essential for protecting satellites, spacecraft, and astronauts from the adverse effects of solar storms.

Figure 3. Correlation between the model predictions and observations for the 1-day, 2-day, and 3-day ahead forecasts for particles with energies >10 MeV. The panels in the left column show all points of the validation set; those in the right column show the observation points with daily mean flux ≥ 10 pfu.