Measurement of the quality of structured and unstructured data accumulating in the product life cycle in a data quality dashboard

dc.contributor.author: Chellathurai Saroja, Shalini
dc.date.accessioned: 2017-10-26T13:14:55Z
dc.date.available: 2017-10-26T13:14:55Z
dc.date.issued: 2017 [de]
dc.description.abstract: This thesis gives an overview of existing data quality metrics for structured and unstructured data and of existing dashboards for measuring the quality of such data. Open research questions on interpreting data quality are discussed. Three metrics were selected and implemented as REST-based web services: percentage of null values, percentage of duplicate values, and percentage of non-domain values. In addition, a web application was developed that supports (1) uploading the data file whose quality is to be assessed, in either of two standard formats, JSON or CSV, and (2) flexible integration of further data quality metrics through an interface. To illustrate this interface, the metric percentage of spelling mistakes, provided by the thesis supervisor, is integrated into the web application. Data quality is reported as a percentage from 0 to 100 and is color-coded, both for the whole dataset and for each column. Donut chart or pie chart visualizations are implemented for the chosen metrics. The web application and the metrics were evaluated on example datasets of data accumulating in the product life cycle, provided by the supervisor. Finally, the dashboard is compared with existing data quality dashboards and the results are tabulated. [en]
dc.identifier.other: 496361716
dc.identifier.uri: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-93288 [de]
dc.identifier.uri: http://elib.uni-stuttgart.de/handle/11682/9328
dc.identifier.uri: http://dx.doi.org/10.18419/opus-9311
dc.language.iso: en [de]
dc.rights: info:eu-repo/semantics/openAccess [de]
dc.subject.ddc: 004 [de]
dc.title: Measurement of the quality of structured and unstructured data accumulating in the product life cycle in a data quality dashboard [en]
dc.title.alternative: Messen der Qualität von strukturierten und unstrukturierten Daten, die im Produktlebenszyklus anfallen in einem Datenqualitätsdashboard [de]
dc.type: masterThesis [de]
ubs.fakultaet: Informatik, Elektrotechnik und Informationstechnik [de]
ubs.institut: Institut für Parallele und Verteilte Systeme [de]
ubs.publikation.seiten: xvi, 61 [de]
ubs.publikation.typ: Abschlussarbeit (Master) [de]
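
The three metrics named in the abstract (percentage of null values, of duplicate values, and of non-domain values) can be illustrated per column with a minimal Python sketch. This is not the thesis implementation: the function names, the set of tokens treated as null, and the choice of dividing by the total number of rows are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Assumption: these tokens count as "null"; the thesis may define nulls differently.
NULL_TOKENS = {"", "null", "none", "na", "n/a"}


def is_null(value):
    """Return True for missing cells or cells holding a null-like token."""
    return value is None or value.strip().lower() in NULL_TOKENS


def pct(part, total):
    """Percentage in the range 0 to 100; an empty column yields 0."""
    return 100.0 * part / total if total else 0.0


def null_percentage(values):
    return pct(sum(1 for v in values if is_null(v)), len(values))


def duplicate_percentage(values):
    # Counts every repeat of a non-null value beyond its first occurrence.
    counts = Counter(v for v in values if not is_null(v))
    return pct(sum(c - 1 for c in counts.values()), len(values))


def non_domain_percentage(values, domain):
    # Non-null values that are not members of the allowed domain.
    outside = sum(1 for v in values if not is_null(v) and v not in domain)
    return pct(outside, len(values))


def column_report(csv_text, domains):
    """Compute the metrics for each column of a CSV string.

    `domains` maps a column name to its set of allowed values; columns
    without an entry get None for the non-domain metric.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = rows[0].keys() if rows else []
    report = {}
    for col in columns:
        values = [row[col] for row in rows]
        report[col] = {
            "nulls": null_percentage(values),
            "duplicates": duplicate_percentage(values),
            "non_domain": (
                non_domain_percentage(values, domains[col])
                if col in domains
                else None
            ),
        }
    return report


# Hypothetical sample data, not taken from the thesis datasets.
SAMPLE = "part_id,status\nP1,open\nP2,closed\nP1,\nP4,unknown\n"

if __name__ == "__main__":
    for column, metrics in column_report(SAMPLE, {"status": {"open", "closed"}}).items():
        print(column, metrics)
```

Each function returns a defect percentage for one column; how these values are combined into the overall quality score and its color coding is not specified in this record and is left out of the sketch.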

Files

Original bundle

Name: 0990-0004.pdf
Size: 2.34 MB
Format: Adobe Portable Document Format
