One of the objectives of the MOSAIICS project is to extend the existing applications of Deep Learning in heliophysics by developing an innovative and powerful methodology for characterizing multi-frequency/multi-wavelength imaging observations of CMEs and coronal shock waves. We will achieve this by combining signal processing, morphological image segmentation for object tracking and feature labeling, and novel CNN models. We will develop and apply Deep Learning models that implement the recent concept of instance segmentation, a combination of supervised object detection (feature classification) and semantic segmentation (separating the different features in an image). This is an advanced application of CNN models, used for supervised object tracking (for example, Santos et al. 2019). We will model our application after CNN models used extensively in medical imaging for classification and tracking of cells, which have been quite successful (Sadanandan et al. 2017). Most such models and their training data are freely available. In addition, many of these pre-trained general-purpose models can be adapted to specific applications via transfer learning, reducing the training effort.
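To make the distinction between semantic and instance segmentation concrete, the following minimal sketch splits a semantic mask (one class id per pixel) into separate object instances with bounding boxes. It is only an illustration of the two kinds of output, not the CNN approach itself: a trained instance segmentation model would predict these instances directly from the image. The function name `semantic_to_instances`, the class ids, and the toy mask are hypothetical.

```python
from collections import deque

import numpy as np


def semantic_to_instances(semantic_mask):
    """Split a semantic mask (class id per pixel, 0 = background) into instances.

    Returns an instance map (unique id per connected object, 0 = background)
    and a list of dicts holding the instance id, class id, and bounding box
    as (min_row, min_col, max_row, max_col).
    """
    rows, cols = semantic_mask.shape
    instance_map = np.zeros((rows, cols), dtype=int)
    objects = []
    next_id = 1
    for r in range(rows):
        for c in range(cols):
            if semantic_mask[r, c] == 0 or instance_map[r, c] != 0:
                continue
            class_id = int(semantic_mask[r, c])
            # Breadth-first flood fill over 4-connected pixels of the same class.
            queue = deque([(r, c)])
            instance_map[r, c] = next_id
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and semantic_mask[ny, nx] == class_id
                            and instance_map[ny, nx] == 0):
                        instance_map[ny, nx] = next_id
                        queue.append((ny, nx))
            ys, xs = zip(*pixels)
            objects.append({
                "instance_id": next_id,
                "class_id": class_id,
                "bbox": (min(ys), min(xs), max(ys), max(xs)),
            })
            next_id += 1
    return instance_map, objects


# Toy frame with hypothetical labels: class 1 = "CME", class 2 = "shock".
toy_mask = np.array([
    [1, 1, 0, 0, 2],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
])
toy_instances, toy_objects = semantic_to_instances(toy_mask)
```

In the toy mask, the semantic labels contain two classes, while the instance output contains three objects, because the two disconnected class-1 regions are reported separately.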
We will naturally extend our previous work on solar eruption characterization and tracking. First, we will curate datasets from multi-wavelength EUV (AIA) and multi-frequency radio imaging data (LOFAR, MWA). Datasets will be ordered according to complexity, feature richness, and research goal. Then, we will adapt and hone the discrete wavelet algorithm to enhance these datasets. We will adapt the YAFTA algorithm to radio imaging and improve its performance. The algorithm will be applied to automatically label the enhanced EUV and radio data. The advantages of a morphological algorithm such as YAFTA are that 1) it automatically labels features with no need for training, and 2) it incorporates time-dependent information, producing better labels. This labeling will provide scientific results in itself, such as the morphology, extent, and kinematics of CMEs and shocks. These feature datasets will also be released to the public for future machine learning efforts.
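As a rough illustration of wavelet-based enhancement, the sketch below applies a single-level 2D Haar transform, amplifies the detail coefficients, and reconstructs the image. The Haar wavelet, the single decomposition level, and the `detail_gain` parameter are assumptions chosen for brevity; the project's actual discrete wavelet algorithm is not specified here and may differ substantially.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2D Haar transform of an image with even height and width."""
    a = image[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = image[0::2, 1::2]  # top-right
    c = image[1::2, 0::2]  # bottom-left
    d = image[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0
    col_diff = (a - b + c - d) / 2.0   # responds to changes across columns
    row_diff = (a + b - c - d) / 2.0   # responds to changes across rows
    diag_diff = (a - b - c + d) / 2.0  # responds to diagonal structure
    return approx, (col_diff, row_diff, diag_diff)


def haar_idwt2(approx, details):
    """Exact inverse of haar_dwt2."""
    col_diff, row_diff, diag_diff = details
    rows, cols = approx.shape
    image = np.empty((2 * rows, 2 * cols), dtype=float)
    image[0::2, 0::2] = (approx + col_diff + row_diff + diag_diff) / 2.0
    image[0::2, 1::2] = (approx - col_diff + row_diff - diag_diff) / 2.0
    image[1::2, 0::2] = (approx + col_diff - row_diff - diag_diff) / 2.0
    image[1::2, 1::2] = (approx - col_diff - row_diff + diag_diff) / 2.0
    return image


def enhance(image, detail_gain=2.0):
    """Boost fine-scale structure by scaling the detail bands (hypothetical gain)."""
    approx, details = haar_dwt2(np.asarray(image, dtype=float))
    boosted = tuple(detail_gain * band for band in details)
    return haar_idwt2(approx, boosted)


# Synthetic frame with a sharp front between columns 2 and 3.
demo_frame = np.zeros((8, 8))
demo_frame[:, 3:] = 1.0
demo_enhanced = enhance(demo_frame, detail_gain=2.0)
```

With `detail_gain=1.0` the frame is reconstructed unchanged, so the gain is the only parameter that alters the data. A multi-level or undecimated transform would be a natural refinement for real EUV and radio images.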
Next, we will develop Deep Learning model architectures based on existing successful CNN models used for other scientific applications. The models will be initially trained and validated on benchmark datasets, such as the popular CIFAR-10/100 and PASCAL VOC. We will then train, validate, and test these models using the wavelet-processed, YAFTA-labeled AIA and LOFAR/MWA data. We will perform hyperparameter tuning and iterative learning to improve their performance. The tested models will be used in detailed multi-wavelength studies tracking the evolution of CMEs and coronal shocks. From the characterized datasets we will extract physical information about the eruptions and use it in SEP modeling. This work will be performed by me and a postdoc, with help from a PhD student. We will train, validate, and test the models on our workstations with modern GPU cards.
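The train, validate, and test stage can be sketched as follows, under two assumptions that the text above does not state: frames are grouped by eruption event so that no event contributes to more than one split, and segmentation quality is scored with intersection over union (IoU). The function names, split fractions, and synthetic event ids are hypothetical.

```python
import numpy as np


def split_by_event(event_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign whole events to train/val/test so one event never straddles splits.

    Returns a dict mapping split name to the frame indices in that split.
    """
    event_ids = np.asarray(event_ids)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.unique(event_ids))
    n_train = int(round(fractions[0] * len(shuffled)))
    n_val = int(round(fractions[1] * len(shuffled)))
    groups = {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }
    return {name: np.flatnonzero(np.isin(event_ids, ids))
            for name, ids in groups.items()}


def mask_iou(predicted, reference):
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    union = np.logical_or(predicted, reference).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(predicted, reference).sum() / union)


# Ten synthetic events with five frames each.
demo_event_ids = np.repeat(np.arange(10), 5)
demo_splits = split_by_event(demo_event_ids)

# Score a toy prediction against a toy reference label.
demo_iou = mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]])
```

Splitting by event rather than by frame avoids scoring a model on frames that are nearly identical to ones it was trained on. The validation split would then drive hyperparameter tuning, with the test split held back for the final evaluation.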
The MOSAIICS project is funded under contract KP-06-DV-8/18.12.2019 to the Institute of Astronomy and NAO, BAS, under the National Scientific Program “VIHREN” of the Bulgarian National Science Fund.