
Understanding the performance of neural network models for short-term predictions applied to geomagnetic indices

Swedish Institute of Space Physics

Peter Wintoft and Magnus Wik


Proc. R. Soc. Lond. A 1911 85 44-50; DOI: 10.1098/rspa.1911.0019. Published 14 March 1911



30 model settings: 3D MHD models; kinetic models; specification models.



4 storms: Aug 2001 (-40 nT), Oct 2003 (-353 nT), Aug 2005 (-131 nT), Dec 2006 (-139 nT)

[Figure: comparison of MHD, kinetic, and specification models.]

Rastätter, L., et al., Geospace environment modeling 2008–2009 challenge: Dst index, Space Weather, 11, 187–205, doi:10.1002/swe.20036, 2013.


Ayala Solares, J. R., Wei, H.-L., Boynton, R. J., Walker, S. N. & Billings, S. A., 'Modelling and prediction of global magnetic disturbance in near-Earth space: A case study for Kp index using NARX models', Space Weather, 2016.

Wintoft, P., Wik, M., Matzka, J. & Shprits, Y., 'Forecasting Kp from solar wind data: input parameter study using 3-hour averages and 3-hour range values', J. Space Weather Space Clim., 7, A29, 2017.

Neural networks




Data …

• Data gaps (especially hard for time series).
• Division of datasets used for training, validation, and testing (a minimal chronological-split sketch follows below):
  – Training set => parameter estimation
  – Validation set => hyper-parameter search
  – Test set => final estimate of performance
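A minimal sketch of such a chronological split (not from the original slides), assuming the solar wind and Kp data sit in a time-indexed pandas DataFrame with hypothetical column names bz, n, v, kp; rows with gaps are simply dropped here, although in practice gap handling needs more care:

# Minimal sketch: chronological train/validation/test split of a geomagnetic
# time series.  Column names (bz, n, v, kp) are hypothetical placeholders.
import numpy as np
import pandas as pd

def chronological_split(df, train_frac=0.6, val_frac=0.2):
    """Split a time-indexed DataFrame into consecutive train/validation/test
    blocks so that the test period lies entirely after the data used for
    parameter estimation and hyper-parameter search."""
    df = df.dropna().sort_index()          # crude gap handling: drop missing rows
    n = len(df)
    i_train = int(train_frac * n)
    i_val = int((train_frac + val_frac) * n)
    return df.iloc[:i_train], df.iloc[i_train:i_val], df.iloc[i_val:]

# Synthetic 3-hour data standing in for real solar wind and Kp values.
t = pd.date_range("2001-01-01", periods=2000, freq="3H")
rng = np.random.default_rng(0)
df = pd.DataFrame({"bz": rng.normal(size=2000),
                   "n": 5 + rng.random(2000) * 10,
                   "v": 400 + 100 * rng.random(2000),
                   "kp": rng.random(2000) * 9}, index=t)
train, val, test = chronological_split(df)
print(len(train), len(val), len(test))   # parameter estimation / hyper-parameter search / final test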

A lot of other considerations:

• Learning algorithm
• Input parameters
• Transformation of inputs
• Network architecture


“Multilayer feedforward networks are universal approximators” (Hornik et al., 1989).
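As a small illustration of this property (not part of the original slides), the sketch below fits a single-hidden-layer tanh network to a smooth nonlinear function; the choice of scikit-learn's MLPRegressor and all hyper-parameters are illustrative only:

# Illustration only: a one-hidden-layer feedforward network approximating
# a smooth nonlinear function, in the spirit of Hornik et al. (1989).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(x[:, 0]) + 0.5 * np.tanh(2 * x[:, 0])   # target function

net = MLPRegressor(hidden_layer_sizes=(20,), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(x, y)

x_test = np.linspace(-3, 3, 7).reshape(-1, 1)
print(np.c_[np.sin(x_test[:, 0]) + 0.5 * np.tanh(2 * x_test[:, 0]),
            net.predict(x_test)])                  # true value vs. network output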


Dynamic NNs

• Time delays
• Recurrent connections
• More advanced: LSTM (a minimal cell sketch follows after the references below)

Takens, F., Detecting strange attractors in turbulence, Springer, 1981.
Schäfer, A. M. and Zimmermann, H. G., Recurrent neural networks are universal approximators, Int. J. Neural Syst., 17(4), 253–263, 2007.

Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R. and Schmidhuber, J., LSTM: A Search Space Odyssey, IEEE Transactions on Neural Networks and Learning Systems, 28, 2222–2232, 2017.
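For reference, the standard LSTM cell update (the baseline variant analysed by Greff et al.) can be written in a few lines of numpy; the weights below are random placeholders, not a trained forecasting model:

# Sketch of a single LSTM cell step (standard form, without peephole connections).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One time step.  W: (4H, D) input weights, U: (4H, H) recurrent weights,
    b: (4H,) biases, stacked in the order [input, forget, candidate, output]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    g = np.tanh(z[2 * H:3 * H])  # candidate cell state
    o = sigmoid(z[3 * H:4 * H])  # output gate
    c = f * c_prev + i * g       # new cell state (long-term memory)
    h = o * np.tanh(c)           # new hidden state
    return h, c

D, H = 3, 8                      # e.g. inputs Bz, n, V; 8 hidden units (illustrative)
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(10, D)):   # run over a short input sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h)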

3-hour averages

Kp(t+3) = f(Bz(t), Bz(t-3), …, n(t), n(t-3), …, V(t), V(t-3), …)
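A minimal sketch of assembling such lagged 3-hour inputs into a design matrix; the DataFrame columns (bz, n, v, kp), the number of lags, and the single-step lead time are assumptions for illustration, not the exact configuration used in the study:

# Sketch: lagged 3-hour-average inputs for Kp(t+3) = f(Bz, n, V at t, t-3, t-6, ...).
import numpy as np
import pandas as pd

def make_lagged_inputs(df, inputs=("bz", "n", "v"), target="kp", n_lags=4):
    """Build X with columns bz_t-0h, bz_t-3h, ... and y = Kp one 3-hour step ahead."""
    cols = {}
    for name in inputs:
        for k in range(n_lags):
            cols[f"{name}_t-{3 * k}h"] = df[name].shift(k)   # value 3k hours back
    X = pd.DataFrame(cols, index=df.index)
    y = df[target].shift(-1)                                 # Kp(t+3): one step ahead
    valid = X.notna().all(axis=1) & y.notna()
    return X[valid], y[valid]

# Tiny synthetic example (placeholder for real solar wind / Kp data).
t = pd.date_range("2003-10-28", periods=100, freq="3H")
rng = np.random.default_rng(1)
df = pd.DataFrame({"bz": rng.normal(size=100), "n": 5 + rng.random(100),
                   "v": 450 + 50 * rng.normal(size=100),
                   "kp": 3 + rng.random(100)}, index=t)
X, y = make_lagged_inputs(df)
print(X.columns.tolist()[:4], X.shape, y.shape)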


Derivation of Kp

• Sensitivity to sub-3-hour variations.
• High-pass filtered storm dynamics.

Mayaud, P. N., Derivation, Meaning, and Use of Geomagnetic Indices, AGU Geophysical Monograph 22, 1980.


Performance in low-density state space
Points with Kp > 6 from the training set

[Figure: ensemble of networks vs. observed Kp. Panels show the solar wind input, the observed Kp, the range across the NN ensemble, and the median of the predictions, tracing a storm from the test set. An ensemble sketch follows after the reference below.]

Wintoft P, Wik M, Matzka J, Shprits Y. 2017. Forecasting Kp from solar wind data: input parameter study using 3-hour averages and 3-hour range values. J. Space Weather Space Clim. 7: A29
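A sketch of the ensemble idea behind this figure: several networks trained with different random initialisations predict Kp, and the median and range across the ensemble are reported. The use of scikit-learn's MLPRegressor and the hyper-parameters are assumptions for illustration, not the setup of Wintoft et al. (2017):

# Sketch: ensemble of networks -> median prediction and prediction range.
import numpy as np
from sklearn.neural_network import MLPRegressor

def ensemble_predict(X_train, y_train, X_test, n_members=10):
    """Train n_members networks with different random initialisations and
    return the per-sample median, minimum, and maximum of their predictions."""
    preds = []
    for seed in range(n_members):
        net = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                           max_iter=2000, random_state=seed)
        net.fit(X_train, y_train)
        preds.append(net.predict(X_test))
    preds = np.vstack(preds)                 # shape (n_members, n_samples)
    return np.median(preds, axis=0), preds.min(axis=0), preds.max(axis=0)

# Toy data standing in for lagged solar wind inputs and observed Kp.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=300)
median, low, high = ensemble_predict(X[:250], y[:250], X[250:])
print(median[:3], low[:3], high[:3])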

Increasing Kp prediction lead time?


http://lund.irf.se/forecast/kp2017/

Will be included in the ESA SSA ESC-G service.


B, Bz, n, V

Lundstedt, H., Gleisner, H. and Wintoft, P., Operational forecasts of the geomagnetic Dst index, Geophysical Research Letters, 29, 34-1–34-4, 2002.

B, By, Bz, n, V, DOY, UT

Semiannual variation of Dst

Cliver, E. W., Kamide, Y., Ling, A. G. and Yokoyama, N., Semiannual variation of the geomagnetic Dst index: Evidence for a dominant nonstorm component, Journal of Geophysical Research, 106, 21,297–21,304, 2001.


Range of Dst from NN model

• Outputs from the final hidden layer are limited to [-1, +1].
• Sum of |weights| + bias => possible range of Dst (sketched below).

[-650, 190] nT
75%: [-500, 140] nT
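The asymmetric range on this slide follows from the final hidden layer being bounded to [-1, +1]: with a linear output unit, the reachable Dst values are the output bias plus or minus the sum of absolute output weights. Below is a minimal sketch with made-up weights chosen so the numbers match the slide; they are not the actual network weights:

# Sketch: possible output range of a linear Dst output unit whose inputs
# (final hidden layer, tanh activations) are bounded to [-1, +1].
import numpy as np

def output_range(w_out, b_out):
    """Return (min, max) Dst reachable when every hidden output is in [-1, 1]."""
    reach = np.sum(np.abs(w_out))
    return b_out - reach, b_out + reach

# Made-up output weights and bias, chosen only to reproduce the slide's numbers.
w_out = np.array([-120.0, 95.0, -80.0, 60.0, -45.0, 20.0])
b_out = -230.0
print(output_range(w_out, b_out))   # (-650.0, 190.0)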


[Figure: storm main phase vs. recovery phase.]

Borovsky, J. E. and Birn, J., The solar wind electric field does not control the dayside reconnection rate, Journal of Geophysical Research: Space Physics, 119, 751–760, 2014.


Echer, E., Gonzalez, W. D., Tsurutani, B. T. & Gonzalez, A. L. C. (2008), 'Interplanetary conditions causing intense geomagnetic storms (Dst ≤ -100 nT) during solar cycle 23', Journal of Geophysical Research.

Support Vector Machines (SVM):

• The kernel trick maps the inputs into a high-dimensional feature space (e.g. the RBF kernel corresponds to an infinite-dimensional space). A minimal SVR sketch follows after the example reference below.



Example: Ji, E.-Y., Moon, Y.-J., Park, J., Lee, J.-Y. and Lee, D.-H., Comparison of neural network and support vector machine methods for Kp forecasting, Journal of Geophysical Research, 118, 5109–5117, 2013.
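For comparison with the NN approach, a minimal sketch of kernel (RBF) support vector regression applied to the same kind of lagged solar-wind inputs; scikit-learn's SVR and these hyper-parameters are illustrative, not those of Ji et al. (2013):

# Sketch: RBF-kernel support vector regression as an alternative to a NN for
# Kp forecasting; hyper-parameters are placeholders, not tuned values.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 9))                 # e.g. lagged Bz, n, V (3 lags each)
y = np.clip(3 + X[:, 0] - 0.5 * X[:, 3] + 0.2 * rng.normal(size=500), 0, 9)

model = make_pipeline(StandardScaler(),
                      SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma="scale"))
model.fit(X[:400], y[:400])
print(model.predict(X[400:405]))              # predicted Kp for a few held-out samples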

Nonlinear Auto-Regressive Moving Average with eXogenous inputs (NARMAX):



A "library" of candidate terms (e.g. polynomials or other functions of the lagged inputs and outputs) is ranked and combined, including time lags.



This reveals a model structure that is more amenable to interpretation than an SVM or NN (a toy term-selection sketch follows after the example reference below).



Example: Boynton, R. J., Balikhin, M. A., Billings, S. A., Sharma, A. S. and Amariutei, O. A., Data derived NARMAX Dst model, Annales Geophysicae, 29, 965–971, 2011.
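A simplified sketch of the NARMAX idea described above: build a library of lagged polynomial terms and greedily keep the ones that reduce the residual most. This is a toy forward selection, not the full FROLS/ERR algorithm used in the NARMAX literature, and the system below is synthetic:

# Toy sketch of NARMAX-style structure selection: build a library of lagged
# polynomial terms and greedily pick those that most reduce the residual sum
# of squares.  Simplified forward selection, not the full FROLS/ERR algorithm.
import itertools
import numpy as np

def term_library(u, y, max_lag=2):
    """Candidate terms: lagged input/output values and their pairwise products."""
    base = {}
    for k in range(1, max_lag + 1):
        base[f"y(t-{k})"] = np.roll(y, k)
        base[f"u(t-{k})"] = np.roll(u, k)
    names, cols = list(base), list(base.values())
    for (na, a), (nb, b) in itertools.combinations_with_replacement(base.items(), 2):
        names.append(f"{na}*{nb}")
        cols.append(a * b)
    X = np.column_stack(cols)[max_lag:]   # drop rows corrupted by np.roll wrap-around
    return names, X, y[max_lag:]

def greedy_select(names, X, y, n_terms=3):
    """Add one term at a time, always the one giving the smallest least-squares residual."""
    chosen = []
    for _ in range(n_terms):
        errs = []
        for j in range(X.shape[1]):
            if j in chosen:
                errs.append(np.inf)
                continue
            cols = X[:, chosen + [j]]
            beta, *_ = np.linalg.lstsq(cols, y, rcond=None)
            errs.append(np.sum((y - cols @ beta) ** 2))
        chosen.append(int(np.argmin(errs)))
    return [names[j] for j in chosen]

# Toy nonlinear system with known structure; the selection should recover its terms.
rng = np.random.default_rng(3)
u = rng.normal(size=400)
y = np.zeros(400)
for t in range(2, 400):
    y[t] = 0.5 * y[t - 1] + 0.8 * u[t - 1] - 0.3 * u[t - 1] * u[t - 2] + 0.05 * rng.normal()
print(greedy_select(*term_library(u, y)))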

Summary





• The selection of training, validation, and test data is important but can also be challenging:
  – Data gaps (especially hard for time series).
  – Division of data based on the output distribution, but what about multidimensional inputs?
• Both NN and SVM will be limited by the range of the training data:
  – Use them in their valid regimes.
  – Is extrapolation improved by handling known non-linearities?
  – Should simpler models be used at the extremes?

  – How do we determine what counts as extreme?

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 637302, and from the ESA SSA Space Weather ESC contract No 4000113185/15/D/MRP.
