Hierarchical inverse reinforcement learning from motion capture data

Prakash, Rohit

Please use this identifier to cite or link to this resource: http://dx.doi.org/10.18419/opus-11962

Author(s):	Prakash, Rohit
Title:	Hierarchical inverse reinforcement learning from motion capture data
Date of publication:	2019
Document type:	Master's thesis
Pages:	50
URI:	http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-119798 http://elib.uni-stuttgart.de/handle/11682/11979 http://dx.doi.org/10.18419/opus-11962
Abstract:	Human motion generally consists of multiple low-level tasks that are performed in a defined order or in parallel to achieve a high-level task. For example, making pizza dough consists of several low-level tasks such as measuring water, adding yeast, and measuring flour, and these activities must be performed in a definite order to make the dough. Getting a system to imitate such a sequence of activities in a long-horizon task with a single global reward function is difficult. The process becomes easier with a hierarchical state representation of the task and local rewards learned for the levels of the hierarchy. In this thesis, we learn to imitate an everyday human activity, 'setting the table for one person'. The work adopts a framework called Hierarchical Inverse Reinforcement Learning (HIRL), a model for learning sub-task structure from demonstrations. With this framework, the activity is decomposed into multiple lower-level tasks that are performed in sequence using learned policies. Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) is used to learn local rewards for the sub-tasks. Together, the hierarchical state space representation and the local reward functions let the model encode the high-level task objective from human demonstrations of full-body motion performing that task. In cross-validation tests, the model achieves an average success rate of 84% for the middle levels and 83% for the top level. For visualization, the model is simulated in a 2D representation that takes the current environment state as input and runs until the task is complete.
Appears in collections:	05 Fakultät Informatik, Elektrotechnik und Informationstechnik
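
The abstract names MaxEnt-IRL as the method for learning a local reward per sub-task. As a rough illustration only, the sketch below runs MaxEnt-IRL on a toy problem: a 5-state chain with deterministic left/right moves, one-hot state features, and demonstrations that walk from state 0 to state 4. The chain, the features, the horizon, and all names (`soft_policy`, `expected_counts`, `maxent_irl`) are assumptions made for this example and are not taken from the thesis, whose state space and features are not described in this record.

```python
import numpy as np

# Toy setup (hypothetical): 5-state chain, actions 0 = left, 1 = right.
N_STATES, N_ACTIONS, HORIZON = 5, 2, 5

def step(s, a):
    """Deterministic chain dynamics with walls at both ends."""
    return max(s - 1, 0) if a == 0 else min(s + 1, N_STATES - 1)

# Transition tensor P[s, a, s'].
P = np.zeros((N_STATES, N_ACTIONS, N_STATES))
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        P[s, a, step(s, a)] = 1.0

def soft_policy(reward):
    """Finite-horizon soft value iteration; returns pi[t, s, a]."""
    v = np.zeros(N_STATES)
    pi = np.zeros((HORIZON, N_STATES, N_ACTIONS))
    for t in reversed(range(HORIZON)):
        q = reward[:, None] + P @ v                  # shape (S, A)
        m = q.max(axis=1, keepdims=True)             # for a stable log-sum-exp
        v = (m + np.log(np.exp(q - m).sum(axis=1, keepdims=True)))[:, 0]
        pi[t] = np.exp(q - v[:, None])               # rows sum to 1
    return pi

def expected_counts(pi, start):
    """Expected state visitation counts over HORIZON time steps."""
    d = start.copy()
    total = d.copy()
    for t in range(HORIZON - 1):
        d = np.einsum("s,sa,sax->x", d, pi[t], P)    # push distribution forward
        total += d
    return total

def maxent_irl(demos, lr=0.1, iters=300):
    """Gradient ascent on the MaxEnt demo likelihood; reward = features @ theta."""
    features = np.eye(N_STATES)                      # one-hot state features
    emp = np.mean([features[traj].sum(axis=0) for traj in demos], axis=0)
    start = np.bincount([traj[0] for traj in demos], minlength=N_STATES) / len(demos)
    theta = np.zeros(N_STATES)
    for _ in range(iters):
        reward = features @ theta
        # Gradient = empirical feature counts - expected feature counts.
        theta += lr * (emp - expected_counts(soft_policy(reward), start))
    return theta

demos = [[0, 1, 2, 3, 4]] * 3                        # each demo walks left to right
theta = maxent_irl(demos)
print(np.round(theta, 2))
```

In a hierarchical setting such as the one the abstract describes, one would presumably fit a separate reward of this kind for each sub-task from the demonstration segments assigned to it; how the thesis segments demonstrations and defines features is not stated here.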

Files in this resource:

File	Description	Size	Format
19-prakash-MSc.pdf		4.98 MB	Adobe PDF

All resources in this repository are protected by copyright.

Universität Stuttgart

OPUS - Online Publikationen der Universität Stuttgart