Bimodal taint analysis for detecting unusual parameter-sink flows

Chow, Yiu Wai

Please use this identifier to cite or link to this resource: http://dx.doi.org/10.18419/opus-12626

Author(s):	Chow, Yiu Wai
Title:	Bimodal taint analysis for detecting unusual parameter-sink flows
Date of publication:	2022
Document type:	Thesis (Master's)
Pages:	vi, 44
URI:	http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-126450 http://elib.uni-stuttgart.de/handle/11682/12645 http://dx.doi.org/10.18419/opus-12626
Abstract:	Finding vulnerabilities is a crucial activity, and automated techniques for this purpose are in high demand. For example, the Node Package Manager (npm) offers a massive number of software packages, which are installed and used by millions of developers each day. Because of the dense network of dependencies between npm packages, a vulnerability in an individual package can easily affect a wide range of software. Taint analysis is a powerful tool for detecting such vulnerabilities. However, it is challenging to define clearly what makes a flow problematic. One possible way to identify problematic flows is to incorporate natural language information, such as coding conventions and informal knowledge, into the analysis. For example, a user would probably not be surprised that a parameter named cmd of a function named execCommand is open to command injection. Thus, this flow is likely unproblematic, as the user will not pass untrusted data to cmd. In contrast, a user might not expect a parameter named value of a function named staticSetConfig to be vulnerable to command injection. Thus, this flow is likely problematic: the user might pass untrusted data to value, because the natural language information in the parameter and function names suggests a different security context. To effectively exploit this implicit information in code, we introduce a bimodal taint analysis tool, Fluffy. The first modality is code: Fluffy uses a mining analysis implemented in CodeQL to find examples of flows from parameters to vulnerable sinks. The second modality is natural language: Fluffy uses a machine learning model that, based on a corpus of such examples, learns to distinguish unexpected flows from expected flows using natural language information. We instantiate four neural models, which offer different trade-offs between the manual effort required and the accuracy of the predictions. In our evaluation, Fluffy achieves an F1 score of 0.85 or more on four common vulnerability types. In addition, Fluffy flags eleven previously unknown vulnerabilities in real-life projects, of which six are confirmed.
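The distinction between expected and unexpected flows can be illustrated with a minimal sketch. It is not Fluffy's implementation: the abstract states that Fluffy learns this distinction with neural models trained on flows mined with CodeQL, whereas the sketch below substitutes a hand-written keyword list purely for illustration. The names ParameterSinkFlow, HINT_WORDS, splitIdentifier, and isUnexpectedFlow, as well as the list of hint words, are hypothetical and do not come from the thesis.

```typescript
// Toy stand-in for the natural language modality. Assumption: a fixed
// keyword list replaces the learned neural model described in the abstract.

// One mined flow: a parameter of a function reaches a sink of some kind.
interface ParameterSinkFlow {
  functionName: string;
  parameterName: string;
  sink: string; // e.g. "command-injection"
}

// Hypothetical vocabulary: words suggesting a name is "about" a given sink.
const HINT_WORDS: Record<string, string[]> = {
  "command-injection": ["cmd", "command", "exec", "shell", "spawn"],
};

// Split camelCase or snake_case identifiers into lowercase words,
// e.g. "staticSetConfig" becomes ["static", "set", "config"].
function splitIdentifier(name: string): string[] {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[^A-Za-z0-9]+/)
    .filter((word) => word.length > 0)
    .map((word) => word.toLowerCase());
}

// A flow counts as unexpected when neither the function name nor the
// parameter name contains a word that hints at the sink it reaches.
function isUnexpectedFlow(flow: ParameterSinkFlow): boolean {
  const words = splitIdentifier(flow.functionName).concat(
    splitIdentifier(flow.parameterName)
  );
  const hints = HINT_WORDS[flow.sink] || [];
  return !words.some((word) => hints.indexOf(word) !== -1);
}

// The two examples from the abstract.
const execCommandFlow: ParameterSinkFlow = {
  functionName: "execCommand",
  parameterName: "cmd",
  sink: "command-injection",
};
const setConfigFlow: ParameterSinkFlow = {
  functionName: "staticSetConfig",
  parameterName: "value",
  sink: "command-injection",
};

const execCommandUnexpected = isUnexpectedFlow(execCommandFlow); // false
const setConfigUnexpected = isUnexpectedFlow(setConfigFlow); // true
```

A keyword list like this breaks down quickly on names it has never seen, which is presumably one reason the thesis learns the distinction from a corpus of examples instead.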
Appears in collections:	05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in this resource:

File	Description	Size	Format
Chow_master_thesis.pdf		2.46 MB	Adobe PDF

All resources in this repository are protected by copyright.

Universität Stuttgart

OPUS - Online Publikationen der Universität Stuttgart