Understanding the Vanishing and Exploding Gradients Problem in Recurrent Neural Networks & Some Techniques to Mitigate Them

Zoumana Keita
4 min read · Mar 26, 2021

This article is a comprehensive overview of the vanishing and exploding gradients problem and some techniques to mitigate it for a better model.

Introduction

A Recurrent Neural Network is made up of memory cells unrolled through time, where the output of the previous time step is used as input to the next time step, just like in a regular feed-forward neural network where the output of the previous layer is fed as input to the next layer. In a recurrent neural network, the number of layers corresponds to the number of time steps in the dataset, which means these networks can be very deep. Unrolling a recurrent layer over as many time steps as there are time periods in the data is what allows the network to learn from the past. So, how is it possible to train such a network and get good performance while mitigating the problems of exploding and vanishing gradients?
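To make this concrete, here is a minimal NumPy sketch (not from the original article) that mimics what happens during backpropagation through time: the gradient is repeatedly multiplied by the recurrent weight matrix, so it shrinks toward zero when the weights are small and blows up when they are large. The hidden size, number of time steps, and weight scales are arbitrary illustration values, and the derivative of the activation function is ignored for simplicity.

```python
import numpy as np

np.random.seed(0)
hidden_size = 4
time_steps = 50

def gradient_norm_through_time(scale):
    """Propagate a gradient backwards through `time_steps` steps for a
    recurrent weight matrix scaled by `scale`, and return its norm."""
    W_h = scale * np.random.randn(hidden_size, hidden_size)
    grad = np.ones(hidden_size)      # gradient at the last time step
    for _ in range(time_steps):
        grad = W_h.T @ grad          # repeated multiplication in BPTT
    return np.linalg.norm(grad)

print("small weights ->", gradient_norm_through_time(0.1))  # shrinks toward 0 (vanishing)
print("large weights ->", gradient_norm_through_time(1.5))  # grows enormously (exploding)
```

The deeper the unrolled network (the more time steps), the more multiplications the gradient goes through, which is exactly why very long sequences make the problem worse.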

In this article, we will cover the following topics:

  • The vanishing and exploding gradients problem in RNNs.
  • Some solutions to mitigate the vanishing and exploding gradients problem.

Vanishing and exploding gradients problem in RNNs
