Research reports
Intrinsic fault tolerance of multi level Monte Carlo methods
by S. Pauli and P. Arbenz and Ch. Schwab
(Report number 2012-24)
Monte Carlo (MC) and Multi-Level Monte Carlo (MLMC) methods applied to solvers for partial differential equations with random input data are proved to exhibit intrinsic failure resilience. Sufficient conditions are provided for non-recoverable loss of a random fraction of MC samples not to fatally damage the asymptotic accuracy vs. work of an MC simulation. Specifically, the convergence behavior of MLMC methods on massively parallel hardware with runtime faults is analyzed mathematically and investigated computationally. Our mathematical model assumes node failures that occur uncorrelated with MC sampling, general sample failure statistics on the different levels, and the absence of checkpointing; that is, we assume irrecoverable sample failures with complete loss of data. Modifications of the MLMC method with enhanced resilience are proposed. The theoretical results are obtained under general statistical models of CPU failure at runtime. Particular attention is paid to node failures with so-called Weibull failure models. We discuss the resilience of massively parallel stochastic Finite Volume computational fluid dynamics simulations.
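The failure model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy quantity `level_approx`, the helper names, the per-level runtimes, and the Weibull scale and shape values are all assumptions chosen for illustration, and each sample is assumed to start on a fresh node. Samples are lost independently of their values and are simply dropped, with no checkpointing or recomputation; each level's mean is taken over the surviving samples only.

```python
import math
import random


def level_approx(z, level):
    """Toy level-`level` approximation of z**2 (hypothetical); bias is 2**-level."""
    return z * z + 2.0 ** (-level) * (1.0 + z)


def weibull_failure_prob(runtime, scale, shape):
    """P(failure before `runtime`) for a Weibull-distributed time to failure."""
    return 1.0 - math.exp(-((runtime / scale) ** shape))


def mlmc_with_failures(samples_per_level, fail_probs, rng):
    """MLMC estimate in which each sample is irrecoverably lost with the
    failure probability of its level; lost samples are never recomputed."""
    estimate = 0.0
    survivors = []
    for level, (n, p_fail) in enumerate(zip(samples_per_level, fail_probs)):
        total, kept = 0.0, 0
        for _ in range(n):
            z = rng.gauss(0.0, 1.0)      # random input of this sample
            if rng.random() < p_fail:    # failure is independent of z
                continue                 # complete loss of the sample
            fine = level_approx(z, level)
            coarse = level_approx(z, level - 1) if level > 0 else 0.0
            total += fine - coarse       # level correction Q_l - Q_{l-1}
            kept += 1
        if kept == 0:
            raise RuntimeError(f"all samples lost on level {level}")
        estimate += total / kept         # average over survivors only
        survivors.append(kept)
    return estimate, survivors


rng = random.Random(0)
runtimes = [4.0 ** level for level in range(4)]  # assumed cost growth per level
fail_probs = [weibull_failure_prob(t, scale=500.0, shape=0.7) for t in runtimes]
estimate, survivors = mlmc_with_failures([4000, 2000, 1000, 500], fail_probs, rng)
print(estimate, survivors)
```

In this toy setting the estimator targets E[Q_3] = 1 + 2^-3 regardless of the failure rates, because failures are independent of the sampled values. Losing samples only reduces the number of survivors per level and so increases the variance. The estimator breaks down only if an entire level is lost, which the sketch reports as an error.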
BibTeX
@Techreport{PAS12_467,
  author = {S. Pauli and P. Arbenz and Ch. Schwab},
  title = {Intrinsic fault tolerance of multi level Monte Carlo methods},
  institution = {Seminar for Applied Mathematics, ETH Z{\"u}rich},
  number = {2012-24},
  address = {Switzerland},
  url = { },
  year = {2012}
}