Universität Stuttgart
Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1
Search Results
Item Open Access
A surrogate-assisted Bayesian framework for uncertainty-aware validation benchmarks
(Stuttgart : Eigenverlag des Instituts für Wasser- und Umweltsystemmodellierung der Universität Stuttgart, 2023) Mohammadi, Farid; Flemisch, Bernd (apl. Prof. Dr. rer. nat.)
Over the last century, computational modeling in geoscience, especially in porous media research, has witnessed tremendous improvement. After decades of development, the state-of-the-art simulators can now solve coupled partial differential equations governing the complex subsurface multiphase flow system within a practically large spatial and temporal domain. Given the importance of computational modeling, quality assessment of these models in light of the purpose of a given simulation is of paramount importance to engineering designers and managers, public officials, and those affected by the decisions based on the predictions. Users and developers of computational simulations deal with a challenging question: How should confidence in modeling and simulation be critically assessed? Validation is one of the primary methods for building and quantifying confidence in modeling and simulation. It investigates the degree to which a model accurately represents reality from the perspective of the intended application of the model. Usually, this comparison between model outputs and experimental data consists of plotting the model results against data on the same axes to provide a visual assessment of agreement or lack thereof. While comparisons between model and data are at the heart of any validation procedure, there are several concerns with such naive comparisons. First, these comparisons tend to provide qualitative rather than quantitative assessments and are clearly insufficient as a basis for making decisions regarding model validity. Second, naive comparisons often disregard or only partly account for existing uncertainties in the experimental observations or the model input parameters.
Third, such comparisons cannot reveal whether the model is appropriate for the intended purposes, as they mainly focus on the agreement in the observable quantities. These pitfalls give rise to the need for an uncertainty-aware framework that includes a validation metric. This metric shall provide a measure for comparison of the system response quantities of an experiment with the ones from a computational model while accounting for uncertainties in both in a rigorous way. To address this need, we developed a statistical framework incorporating a probabilistic modeling technique using a fully Bayesian approach. The dissertation aims to help modelers perform uncertainty-aware model validation benchmarks. A two-stage Bayesian multi-model framework is discussed for modeling tasks where a set of models is at hand. To make this framework applicable for computationally demanding models, it is extended to a surrogate-assisted framework, keeping the computational costs at a reasonable level. Moreover, correction factors were introduced to compensate for the surrogate error in the Bayesian hypothesis testing and Bayesian model selection, as using surrogate representations instead of the full-fidelity computational models introduces additional errors to the validation metrics. In this dissertation, I show how the Bayesian formalism can be realized by employing the concept of polynomial chaos expansion to achieve more accurate surrogates with a sparse representation and account for the uncertainty in the surrogate’s predictions. I also highlight how such surrogate models could be constructed with as few simulations as the computational budget allows. To this end, sequential adaptive sampling strategies are discussed, in which one attempts to augment the initial design iteratively. By doing so, informative regions in the parameter space are adequately explored.
These regions are more likely to provide valuable information on the behavior of the original model responses. Using a sequential sampling strategy avoids the waste of computational resources, as opposed to the so-called one-shot designs. A series of benchmark studies are conducted to investigate the predictive capabilities of different sparsity and sequential adaptive sampling methods. Moreover, I introduce BayesValidRox, an open-source, object-oriented Python package that provides an automated workflow for surrogate-based sensitivity analysis, Bayesian calibration, and validation of computational models with a modular structure. The uncertainty-aware validation framework was applied to a range of cases in the field of subsurface hydro-system modeling, mainly to flow and transport in porous media, such as flow simulation models in fractured porous media, coupling free flow and porous medium flow, and microbially induced calcite precipitation. However, this validation framework can be transferred to other disciplines in which models are used for prediction.
Item Open Access
Physics-informed neural networks for learning dynamic, distributed and uncertain systems
(Stuttgart : Eigenverlag des Instituts für Wasser- und Umweltsystemmodellierung der Universität Stuttgart, 2023) Praditia, Timothy; Nowak, Wolfgang (Prof. Dr.-Ing.)
Scientific models play an important role in many technical inventions to facilitate daily human activities. We use them to assist us in simple decision making such as deciding what type of clothing we should wear using the weather forecast model, and also in complex problems such as assessing the environmental impact of industrial wastes. Existing scientific models, however, are imperfect due to our limited understanding of complex physical systems.
Due to the rapid growth in computing power in recent years, there has been an increasing interest in applying data-driven modeling to improve upon current models and to fill in the missing scientific knowledge. Traditionally, these data-driven models require a significant amount of observation data, which is often challenging to obtain, especially from a natural system. To address this issue, prior physical knowledge has been included in the model design, resulting in so-called hybrid models. Although the idea of infusing physics with data seems sound, current state-of-the-art models have not found the ideal combination of both aspects, and the application to real-world data has been lacking. To bridge this gap, three research questions are formulated:
1. How can prior physical knowledge be adopted to design a consistent and reliable hybrid model for dynamic systems?
2. How can prior physical and numerical knowledge be adopted to design a consistent and reliable hybrid model for dynamic and spatially distributed systems?
3. How can the hybrid model learn about its own total (predictive) uncertainty in a computationally effective manner, so that it is appropriate for real-world applications or could facilitate scientific hypothesis testing?
The overall goal is, with these questions answered, to contribute to more consistent approaches for scientific inquiry through hybrid models. The first contribution of this thesis addresses the first research question by proposing a modeling framework for a dynamic system, in the form of a Thermochemical Energy Storage device. A Nonlinear Autoregressive Network with Exogenous Input (NARX) model is trained recurrently with multiple time lags to capture the temporal dependency and the long-term dynamics of the system. During training, the model is penalized when it violates established physical laws, such as mass and energy conservation.
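The penalty idea in the preceding sentence can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name, the single-storage mass balance, and the `weight` parameter are assumptions chosen for illustration only. The loss adds a mean squared mass-balance residual to the usual data misfit, so a prediction that fits the data but violates conservation is penalized.

```python
import numpy as np

def physics_regularized_loss(pred_mass, true_mass, inflow, outflow, dt, weight=1.0):
    """Data misfit plus a penalty on mass-balance violations (hypothetical setup).

    pred_mass, true_mass: stored mass at each time step
    inflow, outflow:      mass flow rates at each time step
    dt:                   time step length
    weight:               relative weight of the physics penalty (assumed hyperparameter)
    """
    pred_mass = np.asarray(pred_mass, dtype=float)
    true_mass = np.asarray(true_mass, dtype=float)
    inflow = np.asarray(inflow, dtype=float)
    outflow = np.asarray(outflow, dtype=float)

    # Ordinary data term: mean squared error against observations.
    data_loss = np.mean((pred_mass - true_mass) ** 2)

    # Discrete mass balance: m[t+1] - m[t] should equal (inflow - outflow) * dt.
    residual = np.diff(pred_mass) - (inflow[:-1] - outflow[:-1]) * dt
    physics_loss = np.mean(residual ** 2)

    return data_loss + weight * physics_loss


# A prediction that matches the data and conserves mass has zero loss.
consistent = physics_regularized_loss([0, 1, 2, 3], [0, 1, 2, 3], [1, 1, 1, 1], [0, 0, 0, 0], dt=1.0)
# A prediction that breaks the balance is penalized by both terms.
violating = physics_regularized_loss([0, 1, 3, 3], [0, 1, 2, 3], [1, 1, 1, 1], [0, 0, 0, 0], dt=1.0)
```

In a neural network setting the same residual would be computed on the network outputs inside the training loop; the NumPy version only shows how the two terms combine.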
As a result, the model produces accurate and physically plausible predictions compared to models that are trained without physical regularization. The second research question is addressed by the second contribution of this thesis, by designing a hybrid model that complements the Finite Volume Method (FVM) with the learning ability of Artificial Neural Networks (ANNs). The resulting model enables the learning of unknown closure/constitutive relationships in various advection-diffusion equations. This thesis shows that the proposed model outperforms state-of-the-art deep learning models by several orders of magnitude in accuracy, and it possesses excellent generalization ability. Finally, the third contribution addresses the third research question, by investigating the performance of assorted uncertainty quantification methods on the hybrid model. As a demonstration, laboratory measurement data of a groundwater contaminant transport process is employed to train the model. Since the available training data is extremely scarce and noisy, uncertainty quantification methods are essential to produce a robust and trustworthy model. It is shown that a gradient-based Markov Chain Monte Carlo (MCMC) algorithm, namely the Barker proposal, is the most suitable to quantify the uncertainty of the proposed model. Additionally, the hybrid model outperforms a calibrated physical model and provides appropriate predictive uncertainty to sufficiently explain the noisy measurement data. With these contributions, this thesis proposes a robust hybrid modeling framework that is suitable for filling in missing scientific knowledge and lays the groundwork for a wider variety of complex real-world applications.
Ultimately, the hope is for this work to inspire future studies that contribute to the continuous and mutual improvements of both scientific knowledge discovery and scientific model robustness.
Item Open Access
Development and parameter estimation of conceptual snow-melt models using MODIS snow-cover distribution
(Stuttgart : Eigenverlag des Instituts für Wasser- und Umweltsystemmodellierung der Universität Stuttgart, 2023) Gyawali, Dhiraj Raj; Bárdossy, András (Prof. Dr. rer. nat. Dr.-Ing.)
Because snow-related processes in snow-dominated regimes vary strongly in space and time, reliably representing the spatial distribution of seasonal snow has remained a critical challenge for monitoring the seasonal evolution of snow, and subsequently for hydrological estimation, in mountainous regions around the world. This issue, coupled with the crucial relevance to climate change, is further exacerbated by data scarcity in these regions. To address this issue, this thesis presents a novel standalone calibration technique employing the pixel-wise binary ('snow', 'no snow') information from MODIS snow-cover images to calibrate independent conceptual snow-melt models, thereby estimating model parameters from individual or sets of MODIS images. This methodology exploits the snow-cover distribution information in freely available remote sensing images to reliably simulate snow processes in data-scarce regions. Switzerland and Baden-Württemberg were selected as study snow regimes, with the former representing partly longer duration snow and the latter associated with a shorter duration. Different extensions of parsimonious conceptual snow-melt models were developed and used to simulate the snow-cover distribution, with all models showcasing an adept and robust simulation.
The selection of binary snow-cover information as calibration variable permits relatively complex snow-melt modules to be calibrated with more robustness because of reduced uncertainty associated with the calibration data. This work further identifies and recommends different simulation thresholds for defining the calibration data (NDSI thresholds), selecting the images for calibration (cloud cover thresholds), and reclassifying the snow water equivalent (SWE) outputs to snow-cover information (SWE thresholds). Furthermore, validation of the MODIS-based snow-melt model calibration and the simulated melt outputs was carried out using a modified hydrological model (modified HBV variant) without the snow-routine. This hydrological performance was contrasted with the standard HBV model calibrated solely on discharge. The melt output provided as standalone input to the modified HBV was observed to improve discharge prediction. Compared with the discharge-calibrated standard HBV, a reduction in uncertainty in terms of model performance was observed along with reduced parameter compensation. The increase in model performance is deemed to occur for 'the right reason', as the snow processes are adeptly represented by process-informed parameters. The estimation of the parameters solely from MODIS information not only eliminates the reliance on a single calibration variable, 'discharge', whose availability is already limited at higher altitudes, but also preserves the spatial heterogeneity at a more regional level. This methodology holds a crucial relevance for discharge simulation in areas with episodic days of snow, where the snow processes can be calibrated quickly on images without having to calibrate the entire hydrological model.
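As a rough illustration of calibrating against binary snow-cover information, the sketch below simulates SWE for one pixel, reclassifies it with an SWE threshold, and scores the agreement with observed snow/no-snow flags. The degree-day melt formulation, the parameter names (`ddf`, `t_crit`, `swe_threshold`), and their default values are assumptions for illustration; the abstract does not specify the models or thresholds used in the thesis.

```python
import numpy as np

def degree_day_swe(precip, temp, ddf=3.0, t_crit=0.0):
    """Simulate snow water equivalent (mm) for one pixel with a degree-day model.

    precip: daily precipitation in mm; temp: daily mean temperature in deg C
    ddf:    degree-day factor in mm per deg C per day (assumed value)
    t_crit: threshold temperature separating snowfall from melt (assumed value)
    """
    swe = np.zeros(len(precip))
    storage = 0.0
    for day, (p, t) in enumerate(zip(precip, temp)):
        if t <= t_crit:
            storage += p  # precipitation accumulates as snow
        else:
            storage -= min(storage, ddf * (t - t_crit))  # melt cannot exceed storage
        swe[day] = storage
    return swe

def snow_cover_agreement(swe, observed_snow, swe_threshold=1.0):
    """Fraction of days on which simulated and observed snow/no-snow flags match."""
    simulated_snow = np.asarray(swe) > swe_threshold
    return float(np.mean(simulated_snow == np.asarray(observed_snow, dtype=bool)))


swe = degree_day_swe(precip=[5, 5, 0, 0], temp=[-2, -1, 1, 3])
score = snow_cover_agreement(swe, observed_snow=[True, True, True, False])
```

A calibration would search over `ddf`, `t_crit`, and similar parameters to maximize this agreement across pixels and images, using only cloud-free observations.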
The study shows that using freely available snow/no-snow information to estimate the parameters of snow-melt models, with a modest and globally available input data demand, facilitates a simple, spatially flexible approach to calibrating snow-cover distribution in mountainous areas with reasonably accurate precipitation and temperature data, especially in data-scarce regions.
Item Open Access
Stochastic model comparison and refinement strategies for gas migration in the subsurface
(Stuttgart : Eigenverlag des Instituts für Wasser- und Umweltsystemmodellierung der Universität Stuttgart, 2023) Banerjee, Ishani; Nowak, Wolfgang (Prof. Dr.-Ing.)
Gas migration in the subsurface, a multiphase flow in a porous-medium system, is a problem of environmental concern and is also relevant for subsurface gas storage in the context of the energy transition. It is essential to know and understand the flow paths of these gases in the subsurface for efficient monitoring, remediation or storage operations. On the one hand, laboratory gas-injection experiments help gain insights into the involved processes of these systems. On the other hand, numerical models help test the mechanisms observed and inferred from the experiments and then make useful predictions for real-world engineering applications. Both continuum and stochastic modelling techniques are used to simulate multiphase flow in porous media. In this thesis, I use a stochastic discrete growth model: the macroscopic Invasion Percolation (IP) model. IP models have the advantages of simplicity and low computational cost over complex continuum models. Local pore-scale changes dominate the flow processes of gas in water-saturated porous media. IP models are especially favourable for these multi-scale systems because simulating them with continuum models can be computationally extremely demanding.
Although IP models offer a computationally inexpensive way to simulate multiphase flow in porous media, only very few studies have compared IP model results to actual laboratory experimental image data. One reason might be that IP models lack a notion of experimental time and only have an integer counter for simulation steps that imply a time order. The few existing experiment-to-model comparison studies have used perceptual similarity or spatial moments as comparison measures. On the one hand, perceptual comparison between the model and experimental images is tedious and non-objective. On the other hand, comparing spatial moments of the model and experimental images can lead to misleading results because of the loss of information from the data. In this thesis, an objective and quantitative comparison method is developed and tested that overcomes the limitations of these traditional approaches. The first step involves volume-based time-matching between real-time experimental data and IP-model outputs. This is followed by using the (Diffused) Jaccard coefficient to evaluate the quality of the fit. The fit between the images from the models and experiments can be checked across various scales by varying the extent of blurring in the images. Numerical model predictions for sparsely known systems (like the gas flow systems) suffer from high conceptual uncertainties. In the literature, numerous versions of IP models, differing in their underlying hypotheses, have been used for simulating gas flow in porous media. Besides, the gas-injection experiments belong to continuous, transitional, or discontinuous gas flow regimes, depending on the gas flow rate and the porous medium's nature. The literature suggests that IP models are well suited for the discontinuous gas flow regime; other flow regimes have not been explored.
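The image comparison measure mentioned above, a Jaccard coefficient evaluated on blurred images, can be sketched as follows. The exact definition used in the thesis is not given in the abstract, so this sketch assumes a Gaussian blur followed by the generalized Jaccard index (sum of pixel-wise minima over sum of pixel-wise maxima); the function names and the `sigma` parameter are hypothetical. With `sigma=0` and binary images it reduces to the classical Jaccard coefficient.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur using only NumPy (zero padding at the edges)."""
    image = np.asarray(image, dtype=float)
    if sigma <= 0:
        return image
    radius = int(3 * sigma) + 1
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # Blur along columns, then rows. Each image side must be at least as long
    # as the kernel, otherwise mode="same" would change the array shape.
    blurred = np.apply_along_axis(np.convolve, 0, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 1, blurred, kernel, mode="same")

def diffused_jaccard(model_image, experiment_image, sigma=0.0):
    """Generalized Jaccard index of two images after blurring both with sigma."""
    a = gaussian_blur(model_image, sigma)
    b = gaussian_blur(experiment_image, sigma)
    union = np.maximum(a, b).sum()
    if union == 0:
        return 1.0  # two empty images are treated as identical
    return float(np.minimum(a, b).sum() / union)


# Two 4x4 gas clusters on a 16x16 grid, shifted by two pixels.
model = np.zeros((16, 16))
model[4:8, 4:8] = 1
experiment = np.zeros((16, 16))
experiment[4:8, 6:10] = 1

sharp_score = diffused_jaccard(model, experiment, sigma=0.0)
blurred_score = diffused_jaccard(model, experiment, sigma=2.0)
```

Increasing `sigma` makes the score less sensitive to small positional offsets, which is one way to read "checking the fit across various scales".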
Using the above-mentioned method, in this thesis, four macroscopic IP model versions are compared against data from nine gas-injection experiments in transitional and continuous gas flow regimes. This model inter-comparison helps assess the potential of these models in these unexplored regimes and identify the sources of model conceptual uncertainties. Alternatively, with a focus on parameter uncertainty, Bayesian Model Selection is a standard statistical procedure for systematically and objectively comparing different model hypotheses by computing the Bayesian Model Evidence (BME) against test data. BME is the likelihood of a model producing the observed data, given the prior distribution of its parameters. Computing BME can be challenging: exact analytical solutions require strong assumptions; mathematical approximations (information criteria) are often strongly biased; assumption-free numerical methods (like Monte Carlo) are computationally impossible for large data sets. In this thesis, a BME-computation method is developed to use BME as a ranking criterion for such infeasible scenarios: the Method of Forced Probabilities for extensive data sets and Markov-Chain models. In this method, the direction of evaluation is swapped: instead of comparing thousands of model runs on random model realizations with the observed data, the model is forced to reproduce the data in each time step, and the individual probabilities of the model following these exact transitions are recorded. This is a fast, accurate and exact method for calculating BME for IP models which exhibit the Markov chain property and for complete "atomic" data. The analysis results obtained using the methods and tools developed in this thesis help identify the strengths and weaknesses of the investigated IP model concepts. This further aids model development and refinement efforts for predicting gas migration in the subsurface. Also, the gained insights foster improved experimental methods.
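For reference, the verbal definition of BME given above corresponds to the standard marginal likelihood, and the Markov chain property is what allows the data likelihood to be evaluated transition by transition. The notation below is an assumption for illustration and is not taken from the thesis: $M$ is the model, $\theta$ its parameters, and $D = (d_0, \dots, d_T)$ the complete observed sequence of states.

```latex
% Bayesian Model Evidence: likelihood of the data averaged over the parameter prior
\[
  \mathrm{BME} = p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, \mathrm{d}\theta
\]

% For a Markov chain model and a complete data sequence, the likelihood factorizes
% into the probabilities of the individual observed transitions
\[
  p(D \mid \theta, M) = p(d_0 \mid \theta, M) \prod_{t=1}^{T} p(d_t \mid d_{t-1}, \theta, M)
\]
```

Under this reading, forcing the model to follow the data amounts to evaluating each factor $p(d_t \mid d_{t-1}, \theta, M)$ directly, with no need to estimate the likelihood by sampling random realizations.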
These tools and methods are not limited to gas flow systems in porous media but can be extended to any system involving raster outputs.
Item Open Access
Coupled free-flow-porous media flow processes including drop formation
(Stuttgart : Eigenverlag des Instituts für Wasser- und Umweltsystemmodellierung der Universität Stuttgart, 2023) Veyskarami, Maziar; Helmig, Rainer (Prof. Dr.-Ing.)
The behavior of a coupled free-flow-porous medium system is determined by the interface between the two domains. Formation of droplets at the interface governs transport processes in the whole system by enormously affecting the exchange of mass, momentum, and energy between the free flow and the porous medium. A droplet that forms at the interface might grow or shrink due to the flow from the porous medium and evaporation from its surface into the free flow. It also might be detached from the interface by the free flow. An example of such phenomena in nature is the formation of sweat droplets on the skin by perspiration and the resulting cooling effect through their evaporation into the surrounding air. Water management in fuel cells, cooling systems, and inkjet printing are just a few technical applications in which droplet formation at the interface between a free flow and a porous medium appears. In this work, we developed a novel model to describe the formation, growth and detachment as well as evaporation of droplets at the interface of a coupled free-flow-porous medium system. Pore-network modeling is used as a tool to capture pore-scale phenomena occurring in porous media. New coupling concepts between the free flow and the porous medium are developed, which include storing mass, momentum and energy in the droplet. The formation and growth of a droplet are described and a new approach is developed to include the impact of the growing droplet on the free-flow field. A description of the forces acting in the system is given, and the droplet detachment is predicted accordingly.
A clear description of the droplet evaporation is provided and the impact of free-flow and porous medium properties on the droplet evaporation has been analyzed.
Item Open Access
Spatial extent of precipitation extremes in hydrology
(Stuttgart : Eigenverlag des Instituts für Wasser- und Umweltsystemmodellierung der Universität Stuttgart, 2023) El Hachem, Abbas; Bárdossy, András (Prof. Dr. rer. nat. Dr.-Ing.)
Precipitation extremes are a space-time phenomenon that influences many engineering design decisions. The occurrence of precipitation extremes is, however, rare, and their values can deviate notably from "normal" observations. For design purposes, an estimate of areal rainfall depth for a corresponding return period is needed. Traditionally, point rainfall extreme value statistics are transferred to areal statistics using the concept of area reduction factors. These are, in general, based on simple assumptions without considering the effects of climate change. Area Depth Duration Frequency (ADDF) curves are a mathematical function relating the area of a location to the depth and frequency of a rainfall event for a certain temporal duration and return period. The calculation of the ADDF curves is, however, not straightforward, as, in contrast to point precipitation, areal precipitation is not measured but must be estimated. This work considers precipitation as a spatial phenomenon, without relying purely on point statistics, and aims to assess areal precipitation extremes for the present and future time periods along with their expected change with climate change.