Please use this identifier to cite or link to this item: http://dx.doi.org/10.18419/opus-11998
Full metadata record
DC Field: Value [Language]
dc.contributor.author: Nguyen, Son Tung
dc.date.accessioned: 2022-02-24T11:20:20Z
dc.date.available: 2022-02-24T11:20:20Z
dc.date.issued: 2020 [de]
dc.identifier.other: 1795010509
dc.identifier.uri: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-120159 [de]
dc.identifier.uri: http://elib.uni-stuttgart.de/handle/11682/12015
dc.identifier.uri: http://dx.doi.org/10.18419/opus-11998
dc.description.abstract: This thesis investigates two methods for learning a state representation from image observations alone for task and motion planning (TAMP) problems. Our first method uses a multimodal learning formulation to optimize an autoencoder jointly on a regular image-reconstruction objective and on a natural language processing (NLP) task. This yields a discrete, spatially meaningful latent representation that enables effective autonomous planning for sequential decision-making problems using only visual sensory data. We integrate our method into a full planning framework and verify its feasibility on the classic blocks-world domain [26]. Our experiments show that using auxiliary linguistic data leads to better representations and thus improves planning capability. However, since the representation is not interpretable, learning an accurate action model is extremely challenging, rendering the method still inapplicable to TAMP problems. To address the need for an explainable representation, we therefore present a self-supervised learning method that learns scene graphs representing objects (“red box”) and their spatial relationships (“yellow cylinder on red box”). Such a scene-graph representation provides spatial relations in the form of symbolic logical predicates, eliminating the need to pre-define these symbolic rules. Finally, we unify the proposed representation with a non-linear optimization method for robot motion planning and verify its feasibility on the classic blocks-world domain. Our proposed framework successfully finds the sequence of actions and enables the robot to execute feasible motion plans that realize the given tasks. [en]
dc.language.iso: en [de]
dc.rights: info:eu-repo/semantics/openAccess [de]
dc.subject.ddc: 004 [de]
dc.title: Representation learning of scene images for task and motion planning [en]
dc.type: masterThesis [de]
ubs.fakultaet: Informatik, Elektrotechnik und Informationstechnik [de]
ubs.institut: Institut für Parallele und Verteilte Systeme [de]
ubs.publikation.seiten: 58 [de]
ubs.publikation.typ: Abschlussarbeit (Master) [de]
Appears in collections: 05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Files in this item:
File: Nguyen_Son_Tung_master_thesis.pdf
Size: 4,37 MB
Format: Adobe PDF


All items in this repository are protected by copyright.