Gradient Explosion Free Algorithm for Training Recurrent Neural Networks
- Subject (keywords): Recurrent Neural Networks, Dynamical systems, Gradient explosion, Optimal control
- Subject (other): Mathematics
- Description (URI): https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002662312
- Administrative information: faculty
- Indexing: KCI-indexed
- Publisher: Korean Society for Industrial and Applied Mathematics (한국산업응용수학회)
- Year of publication: 2020
- URI: http://www.dcollection.net/handler/ewha/000000176398
- Language: English
Abstract
Exploding gradients are a widely known problem in training recurrent neural networks. The explosion problem has often been handled by cutting off the gradient norm at some fixed value. However, this strategy, commonly referred to as norm clipping, is an ad hoc approach to attenuating the explosion. In this research, we view the problem from a different perspective, discrete-time optimal control with an infinite horizon, to reach a better understanding of the problem. Through this perspective, we characterize the region in which gradient explosion occurs. Based on this analysis, we introduce a gradient-explosion-free algorithm that keeps the training process away from that region. Numerical tests show that this algorithm is at least three times faster than the clipping strategy.
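For reference, the norm-clipping baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the standard technique (rescaling gradients so their global L2 norm stays below a threshold), not the paper's proposed algorithm; the function name and threshold are illustrative choices.

```python
import numpy as np

def clip_gradient_norm(grads, max_norm):
    """Rescale a list of gradient arrays so that their global L2 norm
    does not exceed max_norm (the standard norm-clipping baseline)."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# An "exploded" gradient of norm 500 is rescaled to the threshold norm 1.0
clipped = clip_gradient_norm([np.array([300.0, 400.0])], max_norm=1.0)
print(np.linalg.norm(clipped[0]))
```

The fixed threshold `max_norm` is exactly the ad hoc cutoff the abstract refers to: it caps the update magnitude without explaining why or where explosions arise, which motivates the paper's optimal-control analysis.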