Research reports

Higher-order Quasi-Monte Carlo Training of Deep Neural Networks

by M. Longo and S. Mishra and T. K. Rusch and Ch. Schwab

(Report number 2020-57)

Abstract
We present a novel algorithmic approach and an error analysis leveraging Quasi-Monte Carlo (QMC) points for training deep neural network (DNN) surrogates of holomorphic Data-to-Observable (DtO) maps in engineering design. Our analysis reveals higher-order consistent, deterministic choices of training points in the input parameter space for both deep and shallow Neural Networks with holomorphic activation functions such as tanh. We prove that higher order QMC training points facilitate higher-order decay (in terms of the number of training samples) of the underlying generalization error, with consistency error bounds that are free from the curse of dimensionality in terms of the number of input parameters, provided that DNN weights in hidden layers satisfy certain summability conditions. We present numerical experiments for DtO maps from elliptic and parabolic PDEs with uncertain inputs that confirm the theoretical analysis.
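
As a rough illustration of what deterministic, low-discrepancy training points look like, the sketch below generates a plain Halton sequence in pure Python and evaluates a smooth map on it. This is not the higher-order QMC construction analysed in the report; the Halton sequence is swapped in only because it is short to write down. The names `van_der_corput`, `halton_points` and `target_map` are hypothetical, and `target_map` is an assumed stand-in for a DtO map.

```python
import math


def van_der_corput(index, base):
    """Radical inverse of a positive integer `index` in `base`; lies in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton_points(n_points, primes=(2, 3)):
    """First `n_points` Halton points (indices 1..n_points), one coordinate per prime."""
    return [[van_der_corput(i, p) for p in primes] for i in range(1, n_points + 1)]


def target_map(y):
    """Hypothetical smooth map on [0, 1)^2, standing in for a DtO map."""
    return math.tanh(y[0] + 0.5 * y[1])


# Deterministic training set: the same points are produced on every run.
training_inputs = halton_points(8)
training_outputs = [target_map(y) for y in training_inputs]

for y, value in zip(training_inputs, training_outputs):
    print(f"y = ({y[0]:.4f}, {y[1]:.4f})  ->  {value:.4f}")
```

Because the points are deterministic, repeated runs train on identical data. Any network fitted to `training_inputs` and `training_outputs` would then be assessed on separate test points.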

Keywords: deep learning, higher-order QMC, generalization error, deep neural networks, scientific computing

BibTeX

@Techreport{LMRS20_930,
  author = {M. Longo and S. Mishra and T. K. Rusch and Ch. Schwab},
  title = {Higher-order Quasi-Monte Carlo Training of Deep Neural Networks},
  institution = {Seminar for Applied Mathematics, ETH Z{\"u}rich},
  number = {2020-57},
  address = {Switzerland},
  url = {https://www.sam.math.ethz.ch/sam_reports/reports_final/reports2020/2020-57.pdf},
  year = {2020}
}

Download

Revision 1: April 2021 (PDF) (latest version)
First published: September 2020 (PDF)

Disclaimer
© Copyright for documents on this server remains with the authors. Copies of these documents made by electronic or mechanical means, including information storage and retrieval systems, may only be employed for personal use. The administrators respectfully request that authors inform them when any paper is published to avoid copyright infringement. Note that unauthorised copying of copyright material is illegal and may lead to prosecution. Neither the administrators nor the Seminar for Applied Mathematics (SAM) accept any liability in this respect. The most recent version of a SAM report may differ in formatting and style from the published journal version. Do reference the published version if possible (see SAM Publications).