
Understanding the performance of neural network models for short-term predictions applied to geomagnetic indices

Swedish Institute of Space Physics

Peter Wintoft and Magnus Wik


Proc. R. Soc. Lond. A 1911 85 44-50; DOI: 10.1098/rspa.1911.0019. Published 14 March 1911



30 model settings: 3D MHD models; kinetic models; specification models.



4 storms: Aug 2001 (-40 nT), Oct 2003 (-353 nT), Aug 2005 (-131 nT), Dec 2006 (-139 nT)

[Figure: comparison of MHD, kinetic, and specification models.]

Rastätter, L., et al., Geospace environment modeling 2008–2009 challenge: Dst index, Space Weather, 11, 187–205, doi:10.1002/swe.20036, 2013.


Ayala Solares, J. R., Wei, H.-L., Boynton, R. J., Walker, S. N. & Billings, S. A., 'Modelling and prediction of global magnetic disturbance in near-Earth space: A case study for Kp index using NARX models', Space Weather, 2016.

Wintoft, P., Wik, M., Matzka, J. & Shprits, Y., 'Forecasting Kp from solar wind data: input parameter study using 3-hour averages and 3-hour range values', J. Space Weather Space Clim., 7, A29, 2017.

Neural networks




Data …

• Data gaps (especially hard for time series).
• Division of datasets used for training, validation, and testing (a minimal chronological-split sketch follows below):
  – Training set => parameter estimation
  – Validation set => hyper-parameter search
  – Test set => final estimate of performance
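A minimal sketch of such a chronological split (not from the original slides), assuming the solar wind and Kp data sit in a time-indexed pandas DataFrame with hypothetical column names bz, n, v, kp; rows with gaps are simply dropped here, although in practice gap handling needs more care:

# Minimal sketch: chronological train/validation/test split of a geomagnetic
# time series.  Column names (bz, n, v, kp) are hypothetical placeholders.
import numpy as np
import pandas as pd

def chronological_split(df, train_frac=0.6, val_frac=0.2):
    """Split a time-indexed DataFrame into consecutive train/validation/test
    blocks so that the test period lies entirely after the data used for
    parameter estimation and hyper-parameter search."""
    df = df.dropna().sort_index()          # crude gap handling: drop missing rows
    n = len(df)
    i_train = int(train_frac * n)
    i_val = int((train_frac + val_frac) * n)
    return df.iloc[:i_train], df.iloc[i_train:i_val], df.iloc[i_val:]

# Synthetic 3-hour data standing in for real solar wind and Kp values.
t = pd.date_range("2001-01-01", periods=2000, freq="3H")
rng = np.random.default_rng(0)
df = pd.DataFrame({"bz": rng.normal(size=2000),
                   "n": 5 + rng.random(2000) * 10,
                   "v": 400 + 100 * rng.random(2000),
                   "kp": rng.random(2000) * 9}, index=t)
train, val, test = chronological_split(df)
print(len(train), len(val), len(test))   # parameter estimation / hyper-parameter search / final test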

A lot of other considerations:

• Learning algorithm
• Input parameters
• Transformation of inputs
• Network architecture


“Multilayer feedforward networks are universal approximators” (Hornik et al., 1989).
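As a small illustration of this property (not part of the original slides), the sketch below fits a single-hidden-layer tanh network to a smooth nonlinear function; the choice of scikit-learn's MLPRegressor and all hyper-parameters are illustrative only:

# Illustration only: a one-hidden-layer feedforward network approximating
# a smooth nonlinear function, in the spirit of Hornik et al. (1989).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(x[:, 0]) + 0.5 * np.tanh(2 * x[:, 0])   # target function

net = MLPRegressor(hidden_layer_sizes=(20,), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(x, y)

x_test = np.linspace(-3, 3, 7).reshape(-1, 1)
print(np.c_[np.sin(x_test[:, 0]) + 0.5 * np.tanh(2 * x_test[:, 0]),
            net.predict(x_test)])                  # true value vs. network output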


Dynamic NNs

• Time delays
• Recurrent connections
• More advanced: LSTM (a minimal cell sketch follows after the references below)

Takens, F., Detecting strange attractors in turbulence, Springer, 1981.
Schäfer, A. M. and Zimmermann, H. G., Recurrent neural networks are universal approximators, Int. J. Neural Syst., 17(4), 253–263, 2007.

Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R. and Schmidhuber, J., LSTM: A Search Space Odyssey, IEEE Transactions on Neural Networks and Learning Systems, 28, 2222–2232, 2017.
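For reference, the standard LSTM cell update (the baseline variant analysed by Greff et al.) can be written in a few lines of numpy; the weights below are random placeholders, not a trained forecasting model:

# Sketch of a single LSTM cell step (standard form, without peephole connections).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One time step.  W: (4H, D) input weights, U: (4H, H) recurrent weights,
    b: (4H,) biases, stacked in the order [input, forget, candidate, output]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    g = np.tanh(z[2 * H:3 * H])  # candidate cell state
    o = sigmoid(z[3 * H:4 * H])  # output gate
    c = f * c_prev + i * g       # new cell state (long-term memory)
    h = o * np.tanh(c)           # new hidden state
    return h, c

D, H = 3, 8                      # e.g. inputs Bz, n, V; 8 hidden units (illustrative)
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(10, D)):   # run over a short input sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h)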

3-hour averages

Kp(t+3) = f(Bz(t), Bz(t-3), …, n(t), n(t-3), …, V(t), V(t-3), …)
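A minimal sketch of assembling such lagged 3-hour inputs into a design matrix; the DataFrame columns (bz, n, v, kp), the number of lags, and the single-step lead time are assumptions for illustration, not the exact configuration used in the study:

# Sketch: lagged 3-hour-average inputs for Kp(t+3) = f(Bz, n, V at t, t-3, t-6, ...).
import numpy as np
import pandas as pd

def make_lagged_inputs(df, inputs=("bz", "n", "v"), target="kp", n_lags=4):
    """Build X with columns bz_t-0h, bz_t-3h, ... and y = Kp one 3-hour step ahead."""
    cols = {}
    for name in inputs:
        for k in range(n_lags):
            cols[f"{name}_t-{3 * k}h"] = df[name].shift(k)   # value 3k hours back
    X = pd.DataFrame(cols, index=df.index)
    y = df[target].shift(-1)                                 # Kp(t+3): one step ahead
    valid = X.notna().all(axis=1) & y.notna()
    return X[valid], y[valid]

# Tiny synthetic example (placeholder for real solar wind / Kp data).
t = pd.date_range("2003-10-28", periods=100, freq="3H")
rng = np.random.default_rng(1)
df = pd.DataFrame({"bz": rng.normal(size=100), "n": 5 + rng.random(100),
                   "v": 450 + 50 * rng.normal(size=100),
                   "kp": 3 + rng.random(100)}, index=t)
X, y = make_lagged_inputs(df)
print(X.columns.tolist()[:4], X.shape, y.shape)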


Derivation of Kp

• Sensitivity to sub-3-hour variations.
• High-pass filtered storm dynamics.

Mayaud, P. N., Derivation, Meaning, and Use of Geomagnetic Indices, AGU Geophysical Monograph 22, 1980.


Performance in low-density state space
Points with Kp > 6 from the training set

[Figure: ensemble of networks vs. observed Kp. Panels show the solar wind input, the observed Kp, the range across the NN ensemble, and the median of the predictions, tracing a storm from the test set. An ensemble sketch follows after the reference below.]

Wintoft P, Wik M, Matzka J, Shprits Y. 2017. Forecasting Kp from solar wind data: input parameter study using 3-hour averages and 3-hour range values. J. Space Weather Space Clim. 7: A29
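A sketch of the ensemble idea behind this figure: several networks trained with different random initialisations predict Kp, and the median and range across the ensemble are reported. The use of scikit-learn's MLPRegressor and the hyper-parameters are assumptions for illustration, not the setup of Wintoft et al. (2017):

# Sketch: ensemble of networks -> median prediction and prediction range.
import numpy as np
from sklearn.neural_network import MLPRegressor

def ensemble_predict(X_train, y_train, X_test, n_members=10):
    """Train n_members networks with different random initialisations and
    return the per-sample median, minimum, and maximum of their predictions."""
    preds = []
    for seed in range(n_members):
        net = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                           max_iter=2000, random_state=seed)
        net.fit(X_train, y_train)
        preds.append(net.predict(X_test))
    preds = np.vstack(preds)                 # shape (n_members, n_samples)
    return np.median(preds, axis=0), preds.min(axis=0), preds.max(axis=0)

# Toy data standing in for lagged solar wind inputs and observed Kp.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=300)
median, low, high = ensemble_predict(X[:250], y[:250], X[250:])
print(median[:3], low[:3], high[:3])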

Increasing Kp prediction lead time?


http://lund.irf.se/forecast/kp2017/

Will be included in the ESA SSA ESC-G service.


B, Bz, n, V

Lundstedt, H., Gleisner, H. and Wintoft, P., Operational forecasts of the geomagnetic Dst index, Geophysical Research Letters, 29, 34-1–34-4, 2002.

B, By, Bz, n, V, DOY, UT

Semiannual variation of Dst

Cliver, E. W., Kamide, Y., Ling, A. G. and Yokoyama, N., Semiannual variation of the geomagnetic Dst index: Evidence for a dominant nonstorm component, Journal of Geophysical Research, 106, 21,297–21,304, 2001.


Range of Dst from NN model

• Outputs from the final hidden layer are limited to [-1, +1].
• Sum of |weights| + bias => possible range of Dst (sketched below).

[-650, 190] nT
75%: [-500, 140] nT
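The asymmetric range on this slide follows from the final hidden layer being bounded to [-1, +1]: with a linear output unit, the reachable Dst values are the output bias plus or minus the sum of absolute output weights. Below is a minimal sketch with made-up weights chosen so the numbers match the slide; they are not the actual network weights:

# Sketch: possible output range of a linear Dst output unit whose inputs
# (final hidden layer, tanh activations) are bounded to [-1, +1].
import numpy as np

def output_range(w_out, b_out):
    """Return (min, max) Dst reachable when every hidden output is in [-1, 1]."""
    reach = np.sum(np.abs(w_out))
    return b_out - reach, b_out + reach

# Made-up output weights and bias, chosen only to reproduce the slide's numbers.
w_out = np.array([-120.0, 95.0, -80.0, 60.0, -45.0, 20.0])
b_out = -230.0
print(output_range(w_out, b_out))   # (-650.0, 190.0)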


[Figure: storm main phase vs. recovery phase.]

Borovsky, J. E. and Birn, J., The solar wind electric field does not control the dayside reconnection rate, Journal of Geophysical Research: Space Physics, 119, 751–760, 2014.


Echer, E., Gonzalez, W. D., Tsurutani, B. T. & Gonzalez, A. L. C. (2008), 'Interplanetary conditions causing intense geomagnetic storms (Dst ≤ -100 nT) during solar cycle 23', Journal of Geophysical Research.

Support Vector Machines (SVM):

• The kernel trick maps the inputs into a high-dimensional feature space (e.g. the RBF kernel corresponds to an infinite-dimensional space). A minimal SVR sketch follows after the example reference below.



Example: Ji, E.-Y., Moon, Y.-J., Park, J., Lee, J.-Y. and Lee, D.-H., Comparison of neural network and support vector machine methods for Kp forecasting, Journal of Geophysical Research, 118, 5109–5117, 2013.
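For comparison with the NN approach, a minimal sketch of kernel (RBF) support vector regression applied to the same kind of lagged solar-wind inputs; scikit-learn's SVR and these hyper-parameters are illustrative, not those of Ji et al. (2013):

# Sketch: RBF-kernel support vector regression as an alternative to a NN for
# Kp forecasting; hyper-parameters are placeholders, not tuned values.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 9))                 # e.g. lagged Bz, n, V (3 lags each)
y = np.clip(3 + X[:, 0] - 0.5 * X[:, 3] + 0.2 * rng.normal(size=500), 0, 9)

model = make_pipeline(StandardScaler(),
                      SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma="scale"))
model.fit(X[:400], y[:400])
print(model.predict(X[400:405]))              # predicted Kp for a few held-out samples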

Nonlinear Auto-Regressive Moving Average with eXogenous inputs (NARMAX):



A "library" of candidate terms (e.g. polynomials or other functions of the lagged inputs and outputs) is ranked and combined, including time lags.



This reveals a model structure that is more amenable to interpretation than an SVM or NN (a toy term-selection sketch follows after the example reference below).



Example: Boynton, R. J., Balikhin, M. A., Billings, S. A., Sharma, A. S. and Amariutei, O. A., Data derived NARMAX Dst model, Annales Geophysicae, 29, 965–971, 2011.
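A simplified sketch of the NARMAX idea described above: build a library of lagged polynomial terms and greedily keep the ones that reduce the residual most. This is a toy forward selection, not the full FROLS/ERR algorithm used in the NARMAX literature, and the system below is synthetic:

# Toy sketch of NARMAX-style structure selection: build a library of lagged
# polynomial terms and greedily pick those that most reduce the residual sum
# of squares.  Simplified forward selection, not the full FROLS/ERR algorithm.
import itertools
import numpy as np

def term_library(u, y, max_lag=2):
    """Candidate terms: lagged input/output values and their pairwise products."""
    base = {}
    for k in range(1, max_lag + 1):
        base[f"y(t-{k})"] = np.roll(y, k)
        base[f"u(t-{k})"] = np.roll(u, k)
    names, cols = list(base), list(base.values())
    for (na, a), (nb, b) in itertools.combinations_with_replacement(base.items(), 2):
        names.append(f"{na}*{nb}")
        cols.append(a * b)
    X = np.column_stack(cols)[max_lag:]   # drop rows corrupted by np.roll wrap-around
    return names, X, y[max_lag:]

def greedy_select(names, X, y, n_terms=3):
    """Add one term at a time, always the one giving the smallest least-squares residual."""
    chosen = []
    for _ in range(n_terms):
        errs = []
        for j in range(X.shape[1]):
            if j in chosen:
                errs.append(np.inf)
                continue
            cols = X[:, chosen + [j]]
            beta, *_ = np.linalg.lstsq(cols, y, rcond=None)
            errs.append(np.sum((y - cols @ beta) ** 2))
        chosen.append(int(np.argmin(errs)))
    return [names[j] for j in chosen]

# Toy nonlinear system with known structure; the selection should recover its terms.
rng = np.random.default_rng(3)
u = rng.normal(size=400)
y = np.zeros(400)
for t in range(2, 400):
    y[t] = 0.5 * y[t - 1] + 0.8 * u[t - 1] - 0.3 * u[t - 1] * u[t - 2] + 0.05 * rng.normal()
print(greedy_select(*term_library(u, y)))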

Summary





• The selection of training, validation, and test data is important but can also be challenging:
  – Data gaps (especially hard for time series).
  – Division of data based on the output distribution, but what about multidimensional inputs?
• Both NN and SVM will be limited by the range of the training data:
  – Use them in their valid regimes.
  – Is extrapolation improved by handling known non-linearities?
  – Should simpler models be used at the extremes?

  – How do we determine what counts as extreme?

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 637302, and from the ESA SSA Space Weather ESC contract No 4000113185/15/D/MRP.
