A Systematic Approach to extend the common Software Testing Types with modules specific to the field of Machine Learning

Revanna, Sweekar

Files

Primary Final_Thesis.pdf (5.41 MB)

Date

2019

Authors

Revanna, Sweekar

Abstract

In this data-driven age, many machine learning (ML) and predictive analytics software applications are developed that use data to extract knowledge and provide insights to customers. Software testing plays an important role in assuring the quality of a software application. Hence, there is a need to combine these two distinct domains and develop a systematic approach to detecting errors in ML applications by applying the principles of software testing. Recent publications emphasize the necessity of testing ML models, identify the aspects to test in the ML domain, and suggest possible tests. However, these publications do not sufficiently consider the extent and rigor of software testing principles as specified in the ISO/IEC/IEEE 29119 series. Therefore, this thesis focuses on the applicability of common software testing types, techniques, and methods to the field of ML. To do so, we determine defects and errors that affect model quality through expert interviews, a literature review, and the occurrence frequency of defects in a discussion group on the Kaggle website. Unit testing, data quality checks, and model evaluation metrics were common quality assurance (QA) practices followed by the interviewed data scientists. As its main contribution, this thesis presents a set of automated tests, based on the ISO/IEC/IEEE 29119 series of software testing standards, that check data and models for ML-related defects. The automated tests were evaluated against publicly available data and ML model code for the competition "Titanic: Machine Learning from Disaster" on the Kaggle website. The evaluation revealed defects hidden in the data and the model, demonstrating the benefit of our proposed extension of software testing principles to the ML domain. Furthermore, feedback from seven data scientists indicated that these additional tests are easily understandable and usable.

URI

http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-106823
http://elib.uni-stuttgart.de/handle/11682/10682
http://dx.doi.org/10.18419/opus-10665

Collections

05 Fakultät Informatik, Elektrotechnik und Informationstechnik
