Research reports
Years: 2025 2024 2023 2022 2021 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000 1999 1998 1997 1996 1995 1994 1993 1992 1991
UnICORNN: A recurrent model for learning very long time dependencies
by T. K. Rusch and S. Mishra
(Report number 2021-10)
Abstract
The design of recurrent neural networks (RNNs) to accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture which is based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. The resulting RNN is fast, invertible (in time), memory efficient and we derive rigorous bounds on the hidden state gradients to prove the mitigation of the exploding and vanishing gradient problem. A suite of experiments are presented to demonstrate that the proposed RNN provides state of the art performance on a variety of learning tasks with (very) long-time dependencies.
Keywords: Recurrent neural network, Hamiltonian system, Gradient stability, Long-term dependencies
BibTeX@Techreport{RM21_952, author = {T. K. Rusch and S. Mishra}, title = {UnICORNN: A recurrent model for learning very long time dependencies}, institution = {Seminar for Applied Mathematics, ETH Z{\"u}rich}, number = {2021-10}, address = {Switzerland}, url = {https://www.sam.math.ethz.ch/sam_reports/reports_final/reports2021/2021-10.pdf }, year = {2021} }
Disclaimer
© Copyright for documents on this server remains with the authors.
Copies of these documents made by electronic or mechanical means including
information storage and retrieval systems, may only be employed for
personal use. The administrators respectfully request that authors
inform them when any paper is published to avoid copyright infringement.
Note that unauthorised copying of copyright material is illegal and may
lead to prosecution. Neither the administrators nor the Seminar for
Applied Mathematics (SAM) accept any liability in this respect.
The most recent version of a SAM report may differ in formatting and style
from published journal version. Do reference the published version if
possible (see SAM
Publications).