Please use this identifier to cite or link to this item: http://dx.doi.org/10.18419/opus-9314
Authors: Bettadapura Raghavendra, Shreyas
Title: Relevance of the two adjusting screws in data analytics: data quality and optimization of algorithms
Issue Date: 2017
Publication type: Master's thesis
Pages: 94
URI: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-93311
http://elib.uni-stuttgart.de/handle/11682/9331
http://dx.doi.org/10.18419/opus-9314
Abstract: In the context of learning from data, the factors affecting the performance of a learning algorithm have traditionally been studied either from the perspective of data preprocessing or through empirical work. We attempt to provide a middle ground with an approach that enables a systematic analysis of the interaction between the quality of the training data and the configuration applied to the learning algorithm. This is achieved through two concepts: a Data Quality Profile, which depicts quality indicators for the dataset, and a Classification Configuration Profile, which depicts the configuration parameters applied to the learning algorithm. Both profiles share the characteristic of distinctly showing, and equally representing, the variations in their properties, which allows for a systematic study. We demonstrate this through a prototypical implementation that considers the data quality indicators of missing values, label imbalance, and high cardinality, and evaluates them against the CART decision tree algorithm, configurable by its splitting criteria, early stopping criteria, and training data preprocessing operations. We observed a relationship between decreasing quality of the training data and deterioration in the performance of the algorithm. The flexibility of the approach allows for easy extension to other algorithms and to implementations of further quality indicators.
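As a rough illustration of the three quality indicators named in the abstract, the following minimal sketch computes a toy profile over a list of records. It is not taken from the thesis: the function name `data_quality_profile`, the dictionary keys, and the exact indicator definitions (share of missing feature cells, share of the majority label, distinct values per feature) are assumptions chosen for illustration only.

```python
from collections import Counter


def data_quality_profile(rows, label_key, missing=None):
    """Compute three illustrative quality indicators for a list of dict records.

    Hypothetical helper: the definitions below are assumptions, not the
    thesis's own Data Quality Profile.
    """
    n = len(rows)
    features = [k for k in rows[0] if k != label_key]

    # Missing values: fraction of feature cells equal to the missing marker.
    total_cells = n * len(features)
    missing_cells = sum(1 for r in rows for k in features if r[k] is missing)

    # Label imbalance: share of the most frequent label
    # (1 / number_of_classes would be perfectly balanced).
    label_counts = Counter(r[label_key] for r in rows)
    majority_share = max(label_counts.values()) / n

    # Cardinality: number of distinct non-missing values per feature.
    cardinality = {
        k: len({r[k] for r in rows if r[k] is not missing}) for k in features
    }

    return {
        "missing_ratio": missing_cells / total_cells,
        "majority_label_share": majority_share,
        "cardinality": cardinality,
    }


# Toy records, invented for this example.
rows = [
    {"city": "Stuttgart", "age": 31, "label": "yes"},
    {"city": "Berlin", "age": None, "label": "no"},
    {"city": "Munich", "age": 45, "label": "no"},
    {"city": None, "age": 31, "label": "no"},
]
profile = data_quality_profile(rows, label_key="label")
print(profile)
```

A profile of this kind could then be compared across progressively degraded copies of a training set while the decision tree configuration is held fixed, which is the sort of systematic comparison the abstract describes.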
Appears in Collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in This Item:
File                     Description   Size      Format
ShreyasThesisFinal.pdf                 1.25 MB   Adobe PDF


Items in OPUS are protected by copyright, with all rights reserved, unless otherwise indicated.