Bayesian surrogate training on multiple data sources : a hybrid modeling strategy

dc.contributor.author: Reiser, Philipp
dc.contributor.author: Bürkner, Paul-Christian
dc.contributor.author: Guthke, Anneli
dc.date.accessioned: 2026-05-30T09:54:40Z
dc.date.issued: 2026
dc.date.updated: 2026-05-22T01:27:13Z
dc.description.abstract: Surrogate models are often used as computationally efficient approximations to complex simulation models, enabling tasks such as solving inverse problems, sensitivity analysis, and probabilistic forward predictions, which would otherwise be computationally infeasible. During training, surrogate parameters are fitted such that the surrogate reproduces the simulation model’s outputs as closely as possible. However, the simulation model itself is merely a simplification of the real-world system, often missing relevant processes or suffering from misspecifications, e.g., in inputs or boundary conditions. Hints about these might be captured in real-world measurement data, and yet we typically ignore those hints during surrogate building. In this paper, we propose two novel probabilistic approaches to integrate simulation data and real-world measurement data during surrogate training. The first method trains separate surrogate models for each data source and combines their predictive distributions, while the second incorporates both data sources by training a single surrogate. Both hybrid modeling approaches employ a novel weighting strategy for combining heterogeneous data sources during surrogate training, which operates independently of the chosen surrogate family. We show the conceptual differences and benefits of the two approaches through both synthetic and real-world case studies. The results demonstrate the potential of these methods to improve predictive accuracy and predictive coverage, and to diagnose problems in the underlying simulation model. These insights can improve system understanding and future model development. [en]
dc.description.sponsorship: Projekt DEAL
dc.description.sponsorship: Deutsche Forschungsgemeinschaft
dc.identifier.issn: 1573-1375
dc.identifier.issn: 0960-3174
dc.identifier.uri: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-186280 [de]
dc.identifier.uri: https://elib.uni-stuttgart.de/handle/11682/18628
dc.identifier.uri: https://doi.org/10.18419/opus-18609
dc.language.iso: en
dc.relation.uri: doi:10.1007/s11222-026-10906-9
dc.rights: CC BY
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject.ddc: 620
dc.title: Bayesian surrogate training on multiple data sources : a hybrid modeling strategy [en]
dc.type: article
dc.type.version: publishedVersion
ubs.fakultaet: Fakultäts- und hochschulübergreifende Einrichtungen
ubs.fakultaet: Fakultätsübergreifend / Sonstige Einrichtung
ubs.institut: Stuttgarter Zentrum für Simulationswissenschaften (SC SimTech)
ubs.institut: Fakultätsübergreifend / Sonstige Einrichtung
ubs.publikation.noppn: yes [de]
ubs.publikation.seiten: 25
ubs.publikation.source: Statistics and computing 36 (2026), No. 145
ubs.publikation.typ: Zeitschriftenartikel

Files

Original bundle

Name: 11222_2026_Article_10906.pdf
Size: 4.1 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.3 KB
Format: Item-specific license agreed upon to submission