Bitte benutzen Sie diese Kennung, um auf die Ressource zu verweisen:
http://dx.doi.org/10.18419/opus-9889
Autor(en): | Villanueva Zacarías, Alejandro Gabriel |
Titel: | Exploring classification algorithms and data feature selection for domain specific industrial text data |
Erscheinungsdatum: | 2016 |
Dokumentart: | Abschlussarbeit (Master) |
Seiten: | 103 |
URI: | http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-99062 http://elib.uni-stuttgart.de/handle/11682/9906 http://dx.doi.org/10.18419/opus-9889 |
Zusammenfassung: | Unstructured text data represents a valuable source of information that nonetheless remains sub utilised due to the lack of efficient methods to manipulate it and extract insights from it. One example of such deficiencies is the lack of suitable classification solutions that address the particular nature of domain-specific industrial text data. In this thesis we explore the factors that impact the performance of classification algorithms, as well as the properties of domain-specific industrial text data, to propose a framework that guides the design of text classification solutions that can achieve an optimal trade-off between accuracy and processing time. Our research model investigates the effect that the availability of data features has on the observed performance of a classification algorithm. To explain this relationship, we build a series of prototypical Naïve Bayes algorithm configurations out of existing components and test them on two role datasets from a quality process of an automotive company. A key finding is that properly designed feature selection techniques can play a major role in achieving optimal performance both in terms of accuracy and processing time by providing the right amount of meaningful features. We test our results for statistical significance, proceed to suggest an optimal solution for our application scenario and conclude by describing the nature of the variable relationships contained in our research model. |
Enthalten in den Sammlungen: | 05 Fakultät Informatik, Elektrotechnik und Informationstechnik |
Dateien zu dieser Ressource:
Datei | Beschreibung | Größe | Format | |
---|---|---|---|---|
AGVZ Master thesis.pdf | 3,74 MB | Adobe PDF | Öffnen/Anzeigen |
Alle Ressourcen in diesem Repositorium sind urheberrechtlich geschützt.