Please use this identifier to cite or link to this item:
http://dx.doi.org/10.18419/opus-14252
Authors: Mobasher, Anas
Title: A novel NeRF-based approach for extracting single objects and generating synthetic training data for grasp prediction models
Issue Date: 2023
Document Type: Master's thesis
Pages: 62
URI: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-142718
     http://elib.uni-stuttgart.de/handle/11682/14271
     http://dx.doi.org/10.18419/opus-14252
Abstract: One of the main challenges in robotic manipulation is grasping previously unseen objects without prior knowledge. State-of-the-art methods rely on dedicated machine learning models that are trained on RGB-Depth (RGB-D) images and annotated labels to predict grasp poses in unstructured environments for a wide range of previously unseen objects. Collecting a diverse, labeled dataset, however, is time-consuming and costly. To overcome these challenges, we propose using Neural Radiance Fields (NeRF) to generate RGB-D images and combining them with a cutting-edge automatic-labeling approach to create training data for grasp prediction networks. The main contribution of this thesis is a novel method for obtaining individual NeRFs for objects of interest and backgrounds. The method requires two input scenes: a complete scene containing an object of interest and the same scene without the object. Its steps are: training a NeRF on the background scene, aligning it with the object scene, combining it with a second NeRF that is trained on the object scene, and jointly optimizing both NeRFs with a depth regularization term added to the standard NeRF loss. By applying this approach to various datasets, it is possible to build a library of trained object and background NeRFs. Arbitrary combinations of these NeRFs can then be used to generate novel scenes and render synthetic images for training detection networks. In a comprehensive ablation study, we employ our approach to create four distinct datasets, apply an automatic labeling pipeline to them, and use them to train corresponding grasp prediction networks. The results validate the viability of NeRF-generated data for training detection models, achieving performance nearly on par with real data. Furthermore, our approach offers promising potential for scalability by facilitating the generation of novel data. Overall, this research advances the field of robotic manipulation by demonstrating the potential of NeRF-generated synthetic data and novel scenes for training robust grasp prediction models for real-world applications.
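The abstract describes compositing a background NeRF and an object NeRF along shared rays and jointly optimizing both with a depth regularization term added to the photometric NeRF loss. The following is a minimal, hypothetical PyTorch sketch of that joint optimization step, not the thesis implementation: the `TinyNeRF` class, the density-weighted compositing in `composite_render`, the `lambda_depth` weight, and all tensor shapes are illustrative assumptions.

```python
# Hypothetical sketch: two NeRFs (background + object) composited along each
# ray and trained jointly with a photometric loss plus a depth term.
# All names and hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn


class TinyNeRF(nn.Module):
    """Minimal MLP mapping a 3D point to (density, RGB)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),            # density + RGB
        )

    def forward(self, x):
        out = self.net(x)
        sigma = torch.relu(out[..., :1])     # non-negative density
        rgb = torch.sigmoid(out[..., 1:])    # colors in [0, 1]
        return sigma, rgb


def composite_render(nerfs, rays_o, rays_d, t_vals):
    """Volume-render several NeRFs at shared samples by summing densities."""
    pts = rays_o[:, None, :] + rays_d[:, None, :] * t_vals[None, :, None]  # (R, S, 3)
    sigma_sum, rgb_blend = 0.0, 0.0
    for nerf in nerfs:
        sigma, rgb = nerf(pts)
        sigma_sum = sigma_sum + sigma
        rgb_blend = rgb_blend + sigma * rgb
    rgb_blend = rgb_blend / (sigma_sum + 1e-8)                 # density-weighted color
    delta = t_vals[1] - t_vals[0]
    alpha = 1.0 - torch.exp(-sigma_sum.squeeze(-1) * delta)    # (R, S)
    trans = torch.cumprod(torch.cat(
        [torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=1), dim=1)[:, :-1]
    weights = alpha * trans                                    # (R, S)
    rgb = (weights[..., None] * rgb_blend).sum(dim=1)          # rendered color (R, 3)
    depth = (weights * t_vals[None, :]).sum(dim=1)             # expected depth (R,)
    return rgb, depth


# Joint optimization of both fields with an added depth-regularization term.
bg_nerf, obj_nerf = TinyNeRF(), TinyNeRF()
optim = torch.optim.Adam(
    list(bg_nerf.parameters()) + list(obj_nerf.parameters()), lr=5e-4)
lambda_depth = 0.1                                   # assumed weighting factor

rays_o = torch.zeros(1024, 3)                        # dummy batch of rays
rays_d = nn.functional.normalize(torch.randn(1024, 3), dim=-1)
t_vals = torch.linspace(0.1, 4.0, 64)
gt_rgb = torch.rand(1024, 3)                         # ground-truth pixel colors
gt_depth = torch.rand(1024) * 4.0                    # e.g. from an RGB-D sensor

pred_rgb, pred_depth = composite_render([bg_nerf, obj_nerf], rays_o, rays_d, t_vals)
loss = ((pred_rgb - gt_rgb) ** 2).mean() \
       + lambda_depth * ((pred_depth - gt_depth) ** 2).mean()
optim.zero_grad()
loss.backward()
optim.step()
```

In this sketch the two radiance fields are combined by summing their densities and density-weighting their colors at each sample, which is one common way to render multiple NeRFs into a single image; the thesis may composite and regularize the fields differently.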
Appears in Collections: 13 Zentrale Universitätseinrichtungen
Files in This Item:
File | Description | Size | Format
---|---|---|---
Mobasher_Anas.pdf | | 30.1 MB | Adobe PDF