Please use this identifier to cite or link to this resource: http://dx.doi.org/10.18419/opus-9314
Author(s): Bettadapura Raghavendra, Shreyas
Title: Relevance of the two adjusting screws in data analytics: data quality and optimization of algorithms
Date of publication: 2017
Document type: Thesis (Master)
Pages: 94
URI: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-93311
http://elib.uni-stuttgart.de/handle/11682/9331
http://dx.doi.org/10.18419/opus-9314
Abstract: In the context of learning from data, the factors affecting the performance of a learning algorithm have traditionally been studied either from the perspective of data preprocessing or through empirical work. We attempt to provide a middle ground with an approach that enables a systematic analysis of the interaction between the quality of the training data and the configuration applied to the learning algorithm. This is achieved through two concepts: a Data Quality Profile, which depicts quality indicators for the dataset, and a Classification Configuration Profile, which depicts the configuration parameters applied to the learning algorithm. Both profiles can distinctly view and equally represent the variations in their properties, which allows for a systematic study. We demonstrate this through a prototypical implementation, considering the data quality indicators of missing values, label imbalance, and high cardinality, and evaluating it against the CART Decision Tree algorithm, configurable by its splitting criteria, early stopping criteria, and training data preprocessing operations. We observed a relationship between decreasing quality of the training data and deterioration in the performance of the algorithm. The flexibility of the approach allows for easy extension to other algorithms and to implementations of further quality indicators.
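
To make the three quality indicators named in the abstract concrete, the following is a minimal sketch of how such indicators could be computed for a small tabular dataset. It is not taken from the thesis: the function name `data_quality_profile`, the dictionary keys, and the specific definitions (fraction of missing feature cells, majority-to-minority class ratio, distinct values per feature) are assumptions chosen for illustration only.

```python
from collections import Counter


def data_quality_profile(rows, label_key):
    """Compute illustrative quality indicators for a list of dict records.

    Hypothetical helper, not the thesis implementation.
    A value of None marks a missing entry.
    """
    features = [k for k in rows[0] if k != label_key]

    # Missing values: fraction of feature cells that are None.
    total_cells = len(rows) * len(features)
    missing = sum(1 for r in rows for k in features if r[k] is None)

    # Label imbalance: majority class count divided by minority class count.
    counts = Counter(r[label_key] for r in rows)
    imbalance = max(counts.values()) / min(counts.values())

    # Cardinality: number of distinct non-missing values per feature.
    cardinality = {
        k: len({r[k] for r in rows if r[k] is not None}) for k in features
    }

    return {
        "missing_ratio": missing / total_cells,
        "imbalance_ratio": imbalance,
        "cardinality": cardinality,
    }


# Made-up sample records, for illustration only.
sample = [
    {"city": "Stuttgart", "age": 30, "label": "yes"},
    {"city": "Berlin", "age": None, "label": "no"},
    {"city": "Ulm", "age": 30, "label": "no"},
    {"city": None, "age": 41, "label": "no"},
]
profile = data_quality_profile(sample, "label")
```

A profile of this kind could then be recorded alongside the configuration of the learning algorithm (for example splitting criterion and stopping rules) so that both can be varied and compared systematically.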
Appears in collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in this resource:
File                    Description  Size     Format
ShreyasThesisFinal.pdf               1.25 MB  Adobe PDF


All resources in this repository are protected by copyright.