Research reports

Constructive Deep ReLU Neural Network Approximation

by L. Herrmann, J. A. A. Opschoor and Ch. Schwab

(Report number 2021-04)

Abstract
We propose an efficient, deterministic algorithm for constructing exponentially convergent deep neural network (DNN) approximations of multivariate, analytic maps \(f:[-1,1]^K\to \mathbb{R}\). We address in particular networks with the rectified linear unit (ReLU) activation function. Similar results and proofs apply for many other popular activation functions. The algorithm is based on collocating \(f\) in deterministic families of grid points with small Lebesgue constants, and on a-priori (i.e., "offline") emulation of a spectral basis with DNNs to prescribed fidelity. Assuming availability of \(N\) function values of a possibly corrupted, numerical approximation \(\breve{f}\) of \(f\) in \([-1,1]^K\) and a bound on \(\|f-\breve{f}\|_{L^\infty([-1,1]^K)}\), we provide an explicit, computational construction of a ReLU DNN which attains accuracy \(\varepsilon\) (depending on \(N\) and \(\|f-\breve{f}\|_{L^\infty([-1,1]^K)}\)) uniformly with respect to the inputs. For analytic maps \(f: [-1,1]^K\to \mathbb{R}\), we prove exponential convergence of expression and generalization errors of the constructed ReLU DNNs. Specifically, for every target accuracy \(\varepsilon\in (0,1)\), there exists \(N\), depending also on \(f\), such that the error of the construction algorithm with \(N\) evaluations of \(\breve{f}\) as input in the norm \(L^\infty([-1,1]^K;\mathbb{R})\) is smaller than \(\varepsilon\) up to an additive data-corruption bound \(\|f-\breve{f}\|_{L^\infty([-1,1]^K)}\) multiplied by a factor growing slowly with \(1/\varepsilon\), and the number of non-zero DNN weights grows polylogarithmically with respect to \(1/\varepsilon\). The algorithmic construction of the ReLU DNNs which realize the approximations is explicit and deterministic in terms of the function values of \(\breve{f}\) in tensorized Clenshaw--Curtis grids in \([-1,1]^K\). We illustrate the proposed methodology by a constructive algorithm for (offline) computations of posterior expectations in Bayesian PDE inversion.
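The sketch below is not the paper's construction; it only illustrates, in one dimension, the collocation step the abstract refers to: sampling a function at Clenshaw--Curtis (Chebyshev extreme) points and forming the Chebyshev interpolant, whose sup-norm error decays exponentially for analytic \(f\). The function names (`clenshaw_curtis_points`, `cheb_coeffs`, `cheb_eval`) and the test function are hypothetical choices for this example; the report's algorithm works with the tensorized multivariate grids and additionally emulates the spectral basis with ReLU DNNs, which is not attempted here.

```python
# Minimal illustrative sketch (assumptions noted above): 1D Chebyshev collocation
# on Clenshaw--Curtis points. Not the report's DNN construction.
import numpy as np

def clenshaw_curtis_points(n):
    """Chebyshev extreme points x_j = cos(j*pi/n), j = 0..n, in [-1, 1]."""
    return np.cos(np.pi * np.arange(n + 1) / n)

def cheb_coeffs(fvals):
    """Chebyshev coefficients of the degree-n interpolant from values at the
    Clenshaw--Curtis points, computed via an FFT of the even extension."""
    n = len(fvals) - 1
    ext = np.concatenate([fvals, fvals[-2:0:-1]])   # even extension, length 2n
    c = np.real(np.fft.fft(ext)) / n
    c = c[: n + 1]
    c[0] /= 2.0
    c[-1] /= 2.0
    return c

def cheb_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * T_k(x) by the Clenshaw recurrence."""
    b1 = np.zeros_like(x)
    b2 = np.zeros_like(x)
    for ck in coeffs[:0:-1]:
        b1, b2 = 2.0 * x * b1 - b2 + ck, b1
    return x * b1 - b2 + coeffs[0]

if __name__ == "__main__":
    f = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # analytic on [-1, 1]
    xs = np.linspace(-1.0, 1.0, 2001)
    for n in (8, 16, 32, 64):
        pts = clenshaw_curtis_points(n)
        coeffs = cheb_coeffs(f(pts))
        err = np.max(np.abs(cheb_eval(coeffs, xs) - f(xs)))
        print(f"n = {n:3d}  sup-error = {err:.2e}")   # decays exponentially in n
```

Running the sketch prints the sup-norm interpolation error for increasing grid sizes, showing the exponential decay that the constructed ReLU DNNs inherit (up to the emulation tolerance and the data-corruption bound) in the report's analysis.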

Keywords: Deep ReLU neural networks, exponential convergence, neural network construction, generalization error

BibTeX
@Techreport{HOS21_946,
  author = {L. Herrmann and J. A. A. Opschoor and Ch. Schwab},
  title = {Constructive Deep ReLU Neural Network Approximation},
  institution = {Seminar for Applied Mathematics, ETH Z{\"u}rich},
  number = {2021-04},
  address = {Switzerland},
  url = {https://www.sam.math.ethz.ch/sam_reports/reports_final/reports2021/2021-04.pdf},
  year = {2021}
}

Disclaimer
© Copyright for documents on this server remains with the authors. Copies of these documents made by electronic or mechanical means, including information storage and retrieval systems, may only be employed for personal use. The administrators respectfully request that authors inform them when any paper is published to avoid copyright infringement. Note that unauthorised copying of copyright material is illegal and may lead to prosecution. Neither the administrators nor the Seminar for Applied Mathematics (SAM) accept any liability in this respect. The most recent version of a SAM report may differ in formatting and style from the published journal version. Do reference the published version if possible (see SAM Publications).