Please use this identifier to cite or link to this item: http://dx.doi.org/10.18419/opus-11998
Full metadata record
DC Field: Value [Language]
dc.contributor.author: Nguyen, Son Tung
dc.date.accessioned: 2022-02-24T11:20:20Z
dc.date.available: 2022-02-24T11:20:20Z
dc.date.issued: 2020 [de]
dc.identifier.other: 1795010509
dc.identifier.uri: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-120159 [de]
dc.identifier.uri: http://elib.uni-stuttgart.de/handle/11682/12015
dc.identifier.uri: http://dx.doi.org/10.18419/opus-11998
dc.description.abstract: This thesis investigates two methods for learning a state representation from image observations alone for task and motion planning (TAMP) problems. Our first method uses a multimodal learning formulation to optimize an autoencoder jointly on a regular image-reconstruction objective and on a natural language processing (NLP) task. This yields a discrete, spatially meaningful latent representation that enables effective autonomous planning for sequential decision-making problems using only visual sensory data. We integrate our method into a full planning framework and verify its feasibility on the classic blocks-world domain [26]. Our experiments show that using auxiliary linguistic data leads to better representations and thus improves planning capability. However, since the representation is not interpretable, learning an accurate action model is extremely challenging, rendering the method still inapplicable to TAMP problems. To address the need for an explainable representation, we therefore present a self-supervised learning method that learns scene graphs representing objects (“red box”) and their spatial relationships (“yellow cylinder on red box”). Such a scene-graph representation provides spatial relations in the form of symbolic logical predicates, eliminating the need to pre-define these symbolic rules. Finally, we unify the proposed representation with a non-linear optimization method for robot motion planning and verify its feasibility on the classic blocks-world domain. Our proposed framework successfully finds the sequence of actions and enables the robot to execute feasible motion plans that realize the given tasks. [en]
dc.language.iso: en [de]
dc.rights: info:eu-repo/semantics/openAccess [de]
dc.subject.ddc: 004 [de]
dc.title: Representation learning of scene images for task and motion planning [en]
dc.type: masterThesis [de]
ubs.fakultaet: Informatik, Elektrotechnik und Informationstechnik [de]
ubs.institut: Institut für Parallele und Verteilte Systeme [de]
ubs.publikation.seiten: 58 [de]
ubs.publikation.typ: Abschlussarbeit (Master) [de]
Appears in collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in this item:
File: Nguyen_Son_Tung_master_thesis.pdf
Size: 4,37 MB
Format: Adobe PDF


All items in this repository are protected by copyright.