Adaptive scaling of the learning rate by second order automatic differentiation - Université Paul Sabatier - Toulouse III
Preprint, Working Paper. Year: 2022

Adaptive scaling of the learning rate by second order automatic differentiation

Abstract

In the context of the optimization of deep neural networks, we propose to rescale the learning rate using a new automatic differentiation technique. If (1C, 1M) denotes, respectively, the computational time and memory footprint of the gradient method, the new technique increases the overall cost to either (1.5C, 2M) or (2C, 1M). This rescaling has a natural interpretation: it lets the practitioner choose between exploration of the parameter set and convergence of the algorithm. The rescaling is adaptive: it depends on the data and on the direction of descent. It is tested using the simple strategy of exponential decay, a method with comprehensive hyperparameters that requires no tuning. Compared to standard algorithms with optimized hyperparameters, this algorithm exhibits similar convergence rates and is also empirically shown to be more stable than standard methods.
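To make the idea of a learning rate that depends on the direction of descent more concrete, here is a minimal sketch in plain Python. It is not the authors' technique and it does not reproduce the (1.5C, 2M) or (2C, 1M) costs quoted above. It assumes a classical substitute: second-order forward-mode automatic differentiation with truncated Taylor expansions, used to obtain the first and second derivatives of the loss along a direction u, followed by the step that minimizes the resulting quadratic model. The names `Jet2`, `directional_derivatives`, `rescaled_step`, `toy_loss` and the `fallback` parameter are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Jet2:
    """Truncated Taylor expansion c0 + c1*t + c2*t**2 (higher powers dropped)."""
    c0: float
    c1: float = 0.0
    c2: float = 0.0

    @staticmethod
    def lift(x):
        # Promote plain numbers to constant expansions.
        return x if isinstance(x, Jet2) else Jet2(float(x))

    def __add__(self, other):
        o = Jet2.lift(other)
        return Jet2(self.c0 + o.c0, self.c1 + o.c1, self.c2 + o.c2)

    __radd__ = __add__

    def __neg__(self):
        return Jet2(-self.c0, -self.c1, -self.c2)

    def __sub__(self, other):
        return self + (-Jet2.lift(other))

    def __rsub__(self, other):
        return Jet2.lift(other) + (-self)

    def __mul__(self, other):
        # Product of two expansions, truncated after the t**2 term.
        o = Jet2.lift(other)
        return Jet2(
            self.c0 * o.c0,
            self.c0 * o.c1 + self.c1 * o.c0,
            self.c0 * o.c2 + self.c1 * o.c1 + self.c2 * o.c0,
        )

    __rmul__ = __mul__


def directional_derivatives(loss, theta, u):
    """Return the first and second derivatives of t -> loss(theta + t*u) at t = 0."""
    jets = [Jet2(p, d, 0.0) for p, d in zip(theta, u)]
    out = Jet2.lift(loss(jets))
    # The t**2 coefficient of a Taylor expansion is half the second derivative.
    return out.c1, 2.0 * out.c2


def rescaled_step(loss, theta, u, fallback=0.1):
    """Step size minimizing the second-order model of the loss along u.

    Assumption of this sketch: when the curvature along u is not positive,
    the quadratic model has no minimizer, so a fixed fallback step is used.
    """
    first, second = directional_derivatives(loss, theta, u)
    if second <= 0.0:
        return fallback
    return -first / second


def toy_loss(p):
    # Hypothetical quadratic loss standing in for a network's training loss.
    x, y = p
    return 3 * x * x + 0.5 * y * y + x * y


theta = [1.0, 2.0]
grad = [8.0, 3.0]        # gradient of toy_loss at theta, computed by hand
u = [-g for g in grad]   # steepest-descent direction
step = rescaled_step(toy_loss, theta, u)
new_theta = [p + step * d for p, d in zip(theta, u)]
```

On this toy problem the directional derivative is -73 and the curvature along u is 441, so the step is 73/441 (about 0.1655) and the loss decreases after the update. The step is recomputed for each direction and each loss, which is the sense in which such a rescaling is adaptive; a practical optimizer would typically damp or cap this value rather than use it directly.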
Main file
Adaptive scaling of the learning rate by second order automatic differentiation.pdf (60.32 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03748574, version 1 (09-08-2022)
hal-03748574, version 2 (25-10-2022)

Identifiers

  • HAL Id: hal-03748574, version 1

Cite

Alban Gossard, Frédéric de Gournay. Adaptive scaling of the learning rate by second order automatic differentiation. 2022. ⟨hal-03748574v1⟩
102 views
43 downloads
