Please use this identifier to cite or link to this resource: http://dx.doi.org/10.18419/opus-14139
Full metadata record
DC Element | Value | Language
dc.contributor.author | Richter, Vanessa | -
dc.date.accessioned | 2024-03-27T15:47:42Z | -
dc.date.available | 2024-03-27T15:47:42Z | -
dc.date.issued | 2023 | de
dc.identifier.other | 1885243707 | -
dc.identifier.uri | http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-141587 | de
dc.identifier.uri | http://elib.uni-stuttgart.de/handle/11682/14158 | -
dc.identifier.uri | http://dx.doi.org/10.18419/opus-14139 | -
dc.description.abstract | Utilizing computer vision and speech signal processing to assess neurological and psychiatric conditions has the potential to help detect diseases or monitor their progression earlier and more accurately. However, retrieving the required information from speech and facial modalities presents the challenge of finding features that generalize across studies with high sensitivity and specificity. A major task in finding such features is dealing with overfitting to data biases in small samples and with redundancy in the analysis of high-dimensional feature sets. It is also critical to ensure the interpretability of these methods, since the results of health screening tools must be explainable to clinicians and patients. In this thesis, we present a transparent feature selection pipeline that specifically addresses demographic biases and feature redundancy. Our method provides interpretable insights by quantifying feature contributions to classification results using Shapley values. More specifically, we assessed age trends across the entire healthy control cohort and corrected the feature values based on the determined age coefficients. Sex-specific z-scoring was used to account for differences between males and females. To address feature redundancy, we used hierarchical clustering to group features into sensible domain-specific clusters, such as voice quality, jaw movement, or mouth symmetry. These clusters, together with feature effect sizes, were used in the classification step to select only the most salient features as input to the classifier. Finally, Shapley values were calculated to explain model decisions and evaluate the contribution of individual features. We used datasets on neurological (bulbar pre-symptomatic and bulbar symptomatic ALS) and mental (depression and schizophrenia) diseases as well as a healthy control dataset. The data was collected in a real-world scenario in which participants engaged with a virtual agent that guided them through a set of tasks. We applied the presented feature selection method, including Shapley-based analyses, to these datasets. Our analysis provides valuable insights into feature contributions across binary and multiclass classification experiments and reveals shared characteristics across disorders. | en
dc.language.iso | en | de
dc.rights | info:eu-repo/semantics/openAccess | de
dc.subject.ddc | 004 | de
dc.subject.ddc | 400 | de
dc.title | Creating a robust and effective feature selection pipeline in the clinical setting : how to leverage information from multiple modalities to identify features that are health condition-sensitive and -specific? | en
dc.type | masterThesis | de
ubs.fakultaet | Informatik, Elektrotechnik und Informationstechnik | de
ubs.institut | Institut für Maschinelle Sprachverarbeitung | de
ubs.publikation.seiten | 77 | de
ubs.publikation.typ | Thesis (Master) | de
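
The demographic correction described in the abstract (an age trend estimated on healthy controls, followed by sex-specific z-scoring) can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes a linear age trend fitted on pooled controls, correction to the mean control age, and z-scoring against same-sex control statistics; the abstract confirms none of these details. The record layout (`age`, `sex`, `value`) and all function names are hypothetical.

```python
from statistics import mean, stdev


def fit_age_slope(controls):
    """Least-squares slope of the feature on age, fitted on healthy controls only."""
    ages = [r["age"] for r in controls]
    values = [r["value"] for r in controls]
    age_mean, value_mean = mean(ages), mean(values)
    sxx = sum((a - age_mean) ** 2 for a in ages)
    sxy = sum((a - age_mean) * (v - value_mean) for a, v in zip(ages, values))
    return sxy / sxx


def age_correct(record, slope, ref_age):
    """Remove the fitted linear age trend, projecting the value to ref_age."""
    return record["value"] - slope * (record["age"] - ref_age)


def normalize(records, controls):
    """Age-correct every record, then z-score it against same-sex controls."""
    slope = fit_age_slope(controls)
    ref_age = mean(r["age"] for r in controls)  # assumption: correct to mean control age

    # Per-sex reference statistics, computed on age-corrected control values.
    stats = {}
    for sex in {r["sex"] for r in controls}:
        ref = [age_correct(r, slope, ref_age) for r in controls if r["sex"] == sex]
        stats[sex] = (mean(ref), stdev(ref))

    z_scores = []
    for r in records:
        mu, sd = stats[r["sex"]]
        z_scores.append((age_correct(r, slope, ref_age) - mu) / sd)
    return z_scores


# Toy data (invented for illustration): the feature rises with age and is
# offset by 10 units between the sexes.
controls = [
    {"age": 20, "sex": "F", "value": 9.0},
    {"age": 40, "sex": "F", "value": 22.0},
    {"age": 60, "sex": "F", "value": 29.0},
    {"age": 20, "sex": "M", "value": 19.0},
    {"age": 40, "sex": "M", "value": 32.0},
    {"age": 60, "sex": "M", "value": 39.0},
]
patients = [
    {"age": 60, "sex": "F", "value": 33.0},
    {"age": 20, "sex": "M", "value": 23.0},
]
print(normalize(patients, controls))
```

In this toy data the two patients differ in raw value, age, and sex, yet receive the same z-score after correction, which is the effect the abstract aims for: deviations are measured relative to what is expected for a person's demographic group rather than on the raw scale.
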
Appears in Collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in This Item:
File | Description | Size | Format
Thesis_Vanessa_Richter_CL.pdf | | 8.53 MB | Adobe PDF


All resources in this repository are protected by copyright.