Recent Submissions
Survey and evaluation of explainable AI methods for VQA models that learn from disagreement
(2024) Feng, Xiwen
Traditional Visual Question Answering (VQA) systems focus on identifying a single "correct" answer for each image-question pair, neglecting the diversity of human interpretations. This study investigates the phenomenon of disagreement in VQA, defined as variation in the answers provided by annotators, and proposes a binary classification model for predicting answer agreement. Using the VQA-MHUG dataset, visual features extracted with CLIP are combined with textual features from DistilBERT to train a Random Forest classifier. The model achieves a classification accuracy of 72.81%, showing robust performance in identifying full agreement, but it faces challenges in handling cases of partial agreement. In addition, an XAI framework based on Local Interpretable Model-Agnostic Explanations (LIME) is used to interpret the model's predictions and to provide insight into the importance of image regions and question components. This work underscores the complexity of handling VQA disagreement and highlights the need for further advances in feature integration and explanation methods to better capture human-centered variation in visual and textual understanding.
Multi-modal graphormer for action recognition in egocentric videos
(2024) Bickici, Deniz
The recognition of actions in egocentric videos is an increasingly important topic due to the continuous rise of wearable augmented reality devices. Most current methods focus on single-modality approaches, such as using RGB images with vision models. However, these approaches often lack relational information, which can be critical for understanding action scenes. Modern approaches incorporate multi-modal data, such as audio or gaze, but often leverage all modalities to predict the action classes directly. This work takes a different approach: instead of predicting actions as a whole, we split the task into sub-tasks by separately predicting verbs and nouns. Our method selectively employs modalities in the contexts where they are most effective. To this end, we propose a hierarchical multi-modal action recognition model that combines diverse visual modalities, including hand-object interactions, gaze data, scene semantics, motion dynamics, and RGB images. The model incorporates transformer-based graph and vision models to integrate visual and relational information. This design allows the model to capture the distinct contribution of each modality to identifying the correct action. While the proposed model did not achieve state-of-the-art accuracy, experimental results demonstrate its effectiveness in integrating multi-modal information hierarchically and its potential for improving action recognition. This research highlights the promise of graph-based architectures in multi-modal learning and lays a foundation for more holistic modality integration and more efficient action recognition systems.
Distinguishing the flow of airborne microorganisms along with environmental conditions and their influence on historic heritage buildings
(2025) de-Lemos-Medina, Leonora; Bermúdez-Marín, Aurelio; Jaikel-Víquez, Daniela; Camacho-Cambronero, N.; Segura-Vargas, A.; Gómez-Arrieta, Alfredo; Mora-Quirós, Y.; Ureña-Alvarado, K.; Rautenberg, L.; Fonseca-Alfaro, I.; Calderón-Mesén, Paula; Sandoval-Gutiérrez, María Isabel; Redondo-Solano, Mauricio; Herrera-Sancho, Oskar Andrey
The National Theater of Costa Rica is a national symbol of our country and an important architectural landmark. It was built in 1897 and is decorated with large-format paintings made by Italian artists. One of these painters was Paolo Serra, who was in charge of decorating three chambers, including the Management’s Office (MO). Yearly volumetric air sampling was conducted in seven rooms of the theater, showing high levels of circulating fungal spores in the MO. Thus, further analysis was carried out in this room. First, the environmental conditions (wind velocity, temperature, relative humidity, and particles) were monitored. Then, the flow of air and particles and the probability of particle concentration were simulated for different natural-ventilation scenarios using a Discrete Phase Model. This allowed us to identify areas where damage and microorganisms were probably more prevalent. The artworks in this chamber were analyzed to determine the types of damage present: buckling, cracks, cuts, craquelure, holes, scratches, color gaps with exposed fabric or wall, insect debris, flaking, humidity stains, interventions from old restorations, opaque stains, whitish stains, and total loss of the original image. As a complement, microbiological sampling was carried out at 34 sites of interest, resulting in the isolation and identification of five bacterial isolates and thirteen fungi. We identified the artist’s main colour palette: madder lake, lead white, calcium carbonate, yellow ocher, natural ultramarine, and natural barite. We conclude that artworks under non-controlled environmental conditions may present a significant degree of deterioration, biodeterioration, and chromatic alteration.
Control of soft robotic McKibben actuators using fluid-driven membrane valves
(2025) Hofmann, Julius Paul
Investigations on the shear load‐bearing behavior of functionally graded concrete beams with mineral hollow spheres
(2025) Strahm, Benedikt; Haufe, Carl Niklas; Blandini, Lucio
The technology of functionally graded concrete (FGC) enables a significant reduction in the mass of concrete components while maintaining all structural and functional requirements through a targeted design of the interior of components. The present work deals with the principle of meso‐gradation, where spherical mineral hollow bodies are placed in the structure to create cavities where the level of principal stresses of the load‐bearing component is low. Within the scope of this work, the shear resistance of FGC components is investigated experimentally, analytically, and numerically on full‐scale beams. As a result, an analytical design approach for the shear resistance of voided slabs is validated, demonstrating the applicability of the approach for structural elements. The load‐bearing behavior observed in the experiments was in good agreement with the numerical simulations. This allows the applied numerical model to be used for FGC components with other hollow body diameters or varying concrete covers, reducing the necessity for costly large‐scale experiments. The findings highlight the potential benefits of using FGC in terms of reducing mass while increasing the recycling rate and therefore minimizing the environmental footprint of concrete structures. The analytical design model and experimental results provide useful guidance for the design and construction of such elements in structural engineering applications.
Exploring mesoionic imine‐carbodiimide (MII‐CDI) adducts : 1,3 H‐shift, N(I) compounds and guanidinate‐type ligands
(2025) Mahata, Alok; Rudolf, Richard; Walter, Robert R. M.; Neuman, Nicolás. I.; Sarkar, Biprajit
In this study, we report our recent findings on the synthesis and reactivity of a novel 1,2,3‐triazolin‐5‐imine‐type mesoionic imine‐carbodiimide (MII‐CDI) adduct. Unlike reported NHC‐CDI adducts, formed by the reactions of N‐heterocyclic carbene (NHC) with CDI, these zwitterionic compounds undergo a spontaneous 1,3‐hydrogen shift (1,3‐H shift), resulting in guanidine‐type compounds. The mechanism of this 1,3‐H shift has been investigated through quantum chemical calculations. The MII‐CDI adduct serves as a valuable synthon for the synthesis of mesoionic carbene‐acyclic diamino carbene (MIC‐ADC)‐based nitreone (N(I)) compounds. We have conducted a detailed investigation into the electronic properties, chemical reactivity, and electrochemical behavior of these nitreone (N(I)) compounds. Additionally, the potential of these MII‐CDI adducts as guanidinate ligands is explored. Our investigations here display the distinct reactivities of MII in contrast to their N‐heterocyclic imine (NHI) congeners.
A comparison of bead‐spring and site‐binding models for weak polyelectrolytes
(2025) Burth, Loris; Beyer, David; Holm, Christian
Understanding the ionization behavior of weak polyelectrolytes in aqueous solutions with added salt is crucial for designing advanced materials. Predicting the ionization states of weak polyelectrolytes is challenging due to the interplay between long‐range Coulomb interactions, conformational flexibility, and chemical equilibria. Bead‐spring models with explicit ion treatment provide accurate results but are computationally expensive. In contrast, Ising‐like site‐binding models are computationally efficient but neglect conformational flexibility and use an implicit salt description. To assess the validity of these approximations, a site‐binding model is compared with bead‐spring models that include implicit and explicit ion treatments. The results show that under strong electrostatic coupling, explicit ion treatment is critical for accurately modeling ionization behavior. Both the site‐binding and implicit bead‐spring models overestimate monomer correlations in this regime, leading to significant deviations from the explicit bead‐spring model. Under weak coupling, typical of aqueous environments with monovalent salts, all models give reasonable ionization curves that differ only slightly. The implicit bead‐spring model shows slightly stronger suppression of ionization, while the site‐binding model aligns more closely with the explicit bead‐spring model due to compensating errors in ion treatment and flexibility. In conclusion, while all models perform well under weak coupling, explicit ion treatment is essential for accurately predicting ionization under strong coupling.
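To make the idea of an Ising‐like site‐binding model concrete, the following is a minimal sketch and not the model used in the article. It assumes a short chain of weak-acid sites, a nearest-neighbour repulsion `eps` (in units of kT) between ionized neighbours in place of the screened electrostatics a real model would use, and exact enumeration of all states. The function name and all parameter values are hypothetical.

```python
import itertools
import math


def mean_ionization(ph, pka, n_sites=8, eps=0.0):
    """Average degree of ionization of a toy site-binding chain.

    Each site is either neutral (0) or ionized (1). An isolated ionized
    site carries the statistical weight 10**(ph - pka); each pair of
    ionized nearest neighbours is penalised by exp(-eps).
    Both the chain length and the nearest-neighbour coupling are
    illustrative assumptions.
    """
    site_weight = 10.0 ** (ph - pka)
    partition_sum = 0.0
    ionized_sum = 0.0
    # Exact enumeration of all 2**n_sites ionization states.
    for state in itertools.product((0, 1), repeat=n_sites):
        n_ionized = sum(state)
        n_pairs = sum(a * b for a, b in zip(state, state[1:]))
        weight = site_weight ** n_ionized * math.exp(-eps * n_pairs)
        partition_sum += weight
        ionized_sum += n_ionized * weight
    return ionized_sum / (partition_sum * n_sites)


# Without coupling the chain follows the ideal titration curve;
# repulsion between ionized neighbours suppresses ionization.
ideal = mean_ionization(ph=4.0, pka=4.0, eps=0.0)
coupled = mean_ionization(ph=4.0, pka=4.0, eps=2.0)
```

With `eps=0` the sites are independent and the result reduces to the ideal (Henderson-Hasselbalch) curve, so half the sites are ionized at pH = pKa; any positive `eps` lowers that value, which is the qualitative suppression of ionization the abstract refers to.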
Bacterial minicell‐based biohybrid sub‐micron swimmers for targeted cargo delivery
(2025) Baltaci, Saadet Fatma; Akolpoglu, Mukrime Birgul; Kalita, Irina; Sourjik, Victor; Sitti, Metin
Bacterial biohybrid microrobots possess significant potential for targeted cargo delivery and minimally invasive therapy. However, many challenges, such as biocompatibility, stability, and effective cargo loading, remain. Bacterial membrane vesicles, also referred to as minicells, offer a promising alternative for creating sub‐micron scale biohybrid swimmers (minicell biohybrids) due to their active metabolism, non‐dividing nature, robust structure, and high cargo‐carrying capacity. Here, a biohybrid system that utilizes motile minicells, ≈400 nm in diameter, generated by aberrant cell division of engineered Escherichia coli (E. coli), is reported for the first time. Purified to over 99% from their parental bacterial cells, the minicells are functionalized with magnetic nanoparticles (MNPs) to enable external magnetic control. Minicell biohybrids are capable of swimming at an average speed of up to 13.3 µm s⁻¹ and of being steered under a uniform magnetic field of 26 mT. Furthermore, they exhibit a high drug loading capacity (2.8 µg mL⁻¹) while maintaining their motility, and they show pH‐sensitive release of the anticancer drug doxorubicin hydrochloride (DOX) under acidic conditions. Additionally, drug‐loaded minicell biohybrids notably reduce the viability of SK‐BR‐3 breast cancer cells in vitro. This study introduces minicell biohybrids and establishes their potential as magnetically guided, drug‐loaded biohybrid systems for targeted therapies in future medical applications.
Advanced imaging‐based metrology for precise deformation monitoring : railway bridge case study
(2025) Hartlieb, Simon; Zeller, Amelie; Haist, Tobias; Reichardt, André; Tarín, Cristina; Reichelt, Stephan
In this article, two advanced imaging‐based metrology methods, the multipoint method and the tele‐wide‐angle method, are introduced to the field of structural health monitoring. Both provide the means to significantly improve either the measurement uncertainty or the field of view compared to classical imaging‐based methods. The multipoint method utilizes a computer‐generated hologram to replicate a single object point to a predefined spot pattern in the image. Spatial averaging of the spot positions improves the measurement uncertainty. The second method, called tele‐wide‐angle, uses a diffraction grating to considerably enlarge the field of view of a tele objective lens. Both methods are investigated regarding the achievable measurement uncertainty at distances between 34 and 50 m. The standard deviations of the error range between 0.027 and 0.034 mm for the multipoint method and 0.008 and 0.02 mm for the tele‐wide‐angle method. In the second part of the article, both measurement systems are employed in a field study, measuring the deformation of a railway bar arch bridge. An inductive displacement transducer and several accelerometers are installed to validate the measured displacements and dynamics.
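The benefit of spatially averaging replicated spot positions can be illustrated with a small simulation. This is a sketch under stated assumptions, not the article's evaluation: each spot position is assumed to carry independent Gaussian noise with the same standard deviation, and the spot count (25) and noise level (0.1) are hypothetical values chosen only for illustration.

```python
import random
import statistics


def averaged_position_std(n_spots, sigma, trials=2000, seed=0):
    """Standard deviation of the mean of n_spots noisy position readings.

    Assumes independent, zero-mean Gaussian noise of standard deviation
    sigma on every spot (an idealisation; real spots may be correlated).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        spots = [rng.gauss(0.0, sigma) for _ in range(n_spots)]
        means.append(statistics.fmean(spots))
    return statistics.pstdev(means)


# One spot versus the average of a hypothetical 25-spot pattern.
single = averaged_position_std(n_spots=1, sigma=0.1)
averaged = averaged_position_std(n_spots=25, sigma=0.1)
```

Under these assumptions the uncertainty of the averaged position scales as sigma divided by the square root of the number of spots, so the 25-spot average is roughly five times less noisy than a single spot.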
Simulation and control of a constrained-based walking mechanism
(2022) Eckstein, Simon