06 Fakultät Luft- und Raumfahrttechnik und Geodäsie
Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/7
Item Open Access Forming a hybrid intelligence system by combining Active Learning and paid crowdsourcing for semantic 3D point cloud segmentation (2023) Kölle, Michael; Sörgel, Uwe (Prof. Dr.-Ing.)
While tremendous advances have been achieved in recent years in the development of supervised Machine Learning (ML) systems such as Convolutional Neural Networks (CNNs), the most decisive factor for their performance is still the quality of the labeled training data from which the system is supposed to learn. This is why we advocate focusing more on methods to obtain such data, which we expect to be more sustainable than establishing ever new classifiers in the rapidly evolving ML field. In the geospatial domain, however, the generation of training data for ML systems is still rather neglected in research, and experts typically end up occupied with such tedious labeling tasks. In our design of a system for the semantic interpretation of Airborne Laser Scanning (ALS) point clouds, we break with this convention and completely lift labeling obligations from experts. At the same time, human annotation is restricted to only those samples that actually justify manual inspection. This is accomplished by means of a hybrid intelligence system in which the machine, represented by an ML model, actively and iteratively works together with the human component through Active Learning (AL), which acts as a pointer to exactly those most decisive samples. Instead of having an expert label these samples, we propose to outsource this task to a large group of non-specialists, the crowd. But since it is rather unlikely that enough volunteers would participate in such crowdsourcing campaigns due to the tedious nature of labeling, we argue for attracting workers through monetary incentives, i.e., we employ paid crowdsourcing. Relying on such platforms, we typically have access to a vast pool of prospective workers, which guarantees prompt completion of jobs. Thus, crowdworkers become human processing units that behave similarly to the electronic processing units of this hybrid intelligence system performing the tasks of the machine part. With respect to the latter, we not only evaluate whether an AL-based pipeline works for the semantic segmentation of ALS point clouds, but also shed light on the question of why it works. As crucial components of our pipeline, we test and enhance different AL sampling strategies in conjunction with both a conventional feature-driven classifier and a data-driven CNN classification module. In this regard, we aim to select AL points in such a manner that samples are not only informative for the machine, but also feasible for non-experts to interpret. These theoretical formulations are verified by various experiments in which we replace the frequently assumed but highly unrealistic error-free oracle with the simulated imperfect oracles we are inevitably confronted with when working with humans. Furthermore, we find that the need for labeled data, which is already reduced through AL to a small fraction (typically ≪1 % of Passive Learning training points), can be minimized even further when we reuse information from a given source domain for the semantic enrichment of a specific target domain, i.e., we utilize AL as a means for Domain Adaptation.
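As a rough sketch of such an AL query step: given the class probabilities predicted by the current model for all unlabeled points, an entropy-based strategy forwards the most uncertain ones for annotation. The snippet below is only an illustrative example under that assumption; all function and variable names are hypothetical and it is not the thesis implementation, which additionally considers whether a sample can be interpreted by non-experts.

```python
import numpy as np

def entropy_sampling(class_probs: np.ndarray, n_query: int) -> np.ndarray:
    """Return the indices of the n_query most uncertain samples.

    class_probs: (n_samples, n_classes) class probabilities predicted by the
                 current classifier for the unlabeled point cloud.
    """
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(class_probs * np.log(class_probs + eps), axis=1)
    return np.argsort(entropy)[-n_query:]  # highest entropy = most informative

# toy usage: 1000 unlabeled points, 5 classes
rng = np.random.default_rng(0)
probs = rng.dirichlet(alpha=np.ones(5), size=1000)
query_idx = entropy_sampling(probs, n_query=20)
print("points forwarded for labeling:", query_idx[:5], "...")
```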
As for the human component of our hybrid intelligence system, the special challenge we face is monetarily motivated workers with a wide variety of educational and cultural backgrounds, as well as widely differing mindsets regarding the quality they are willing to deliver. Consequently, we are confronted with great inhomogeneity in the quality of the results received. Thus, when designing respective campaigns, special attention to quality control is required in order to automatically reject submissions of low quality and to refine accepted contributions in the sense of the Wisdom of the Crowds principle. We further explore ways to support the crowd in labeling by experimenting with different data modalities (discretized point cloud vs. continuous textured 3D mesh surface), and also aim to shift the motivation from a purely extrinsic nature (i.e., payment) to a more intrinsic one, which we intend to trigger through gamification. Eventually, by casting these different concepts into the so-called CATEGORISE framework, we constitute the aspired hybrid intelligence system and employ it for the semantic enrichment of ALS point clouds of different characteristics, enabled through learning from the (paid) crowd.
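To make the quality-control idea concrete, one simple way to refine redundant crowd answers in the sense of the Wisdom of the Crowds is a majority vote with an acceptance threshold. The sketch below is a hedged toy example, not the CATEGORISE implementation; the class names, the threshold and the function names are assumptions.

```python
from collections import Counter

def aggregate_crowd_labels(labels_per_point, min_agreement=0.6):
    """Majority-vote aggregation of redundant crowd answers per queried point.

    labels_per_point: list of lists; each inner list holds the class labels
                      given by the individual crowdworkers for one point.
    Returns (consensus_label, agreement_ratio, accepted) per point.
    """
    results = []
    for votes in labels_per_point:
        label, n = Counter(votes).most_common(1)[0]
        ratio = n / len(votes)
        results.append((label, ratio, ratio >= min_agreement))
    return results

# toy usage: three points, each labeled by five workers
crowd_answers = [
    ["tree", "tree", "tree", "roof", "tree"],
    ["roof", "facade", "roof", "roof", "roof"],
    ["car", "road", "car", "road", "shrub"],   # low agreement -> re-queue
]
for i, (label, ratio, ok) in enumerate(aggregate_crowd_labels(crowd_answers)):
    print(i, label, f"{ratio:.0%}", "accepted" if ok else "re-queue")
```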
Item Open Access Design and development of a calibration solution feasible for series production of cameras for video-based driver-assistant systems (2022) Nekouei Shahraki, Mehrdad; Haala, Norbert (apl. Prof. Dr.)
In this study, we reviewed the current techniques and methods in photogrammetry - especially close-range photogrammetry - and focused on camera calibration. We reviewed the evolving field of video-based driver-assistant systems, their requirements and their applications. Specifically for fisheye cameras and a general omnidirectional projection, we extended an existing camera calibration model to address our needs and functionality requirements. These extensions enable us to use the camera calibration model in real-time embedded mobile systems with low processing power. We also introduced the free-function model as a flexible and advantageous model for camera distortion modelling. This is a new approach for modelling the overall image distortion together with the local lens distortions that are estimated using a standard model during the calibration process. Using the free-function model on different lens designs, one can achieve good calibration accuracies by modelling very local lens distortions, benefiting from the flexibility of this model. We introduced optimization strategies for recalculation and image rectification. These optimizations are also used to minimize the required processing power and device memory. This brings many advantages to a variety of computational platforms such as FPGAs, x86 and ARM processors, and makes it possible to benefit from a variety of parallel-processing techniques. The model can be used at runtime and is an ideal calibration model for a variety of machine-vision solutions. We also discussed several important requirements for accurate camera calibration that we later used in the hardware test stand design phase. We designed and developed two different test stands in order to realize the specifications and geometrical features of multiple-view test-field-based camera calibration, referred to as bundle-block calibration. One of their special geometrical characteristics is the uniform point distribution, which corresponds to uniform motion. Such a point distribution is beneficial when using calibration models such as the free-function model, which enable us to model local lens distortion with good accuracy and quality all over the image. A key feature of this test stand is the capability of performing camera/sensor alignment testing, which is essential for checking the geometrical alignment of the internal mechanical elements of each camera. Using automated machines and algorithms in test stand calibration increased the stability and accuracy of the calibration and thus ensured the quality and speed of camera calibration. As an accuracy and flexibility evaluation step, we tested the free-function calibration model on real-world data using a stereo camera with added large local distortions, taking images from the front of a vehicle under conditions similar to those of real-world use-cases. By performing the camera calibration, we compared the calibration results and accuracy parameters of the free-function model to a conventional calibration model. Using these calibration results, we generated a set of disparity maps and compared their density and availability, especially in the areas where the local distortion was present. We used this test to compare the capabilities of the proposed model to conventional ones in real-world situations where large optical distortions can be present that cannot easily be modelled with conventional calibration models. The higher modelling capability and accuracy of the free-function model will generally benefit those functions that use the information of the disparity map or the derived 3D information as part of their input data, potentially leading to better functionality or even availability when local distortions are present in the image. There are many more use-cases in photogrammetry and computer vision where a higher calibration accuracy is beneficial on hardware such as low-cost optics, where optical distortions sometimes occur that cannot easily be modelled with classical models. These use-cases could all benefit from the flexibility and modelling accuracy of the free-function model.
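The core idea of such a flexible, locally defined distortion model can be illustrated with a per-pixel remap table expanded from a coarse grid of correction vectors, a common way to make rectification cheap at runtime. The sketch below is a simplified stand-in under that assumption; the grid values, sizes and names are hypothetical and it does not reproduce the free-function formulation of the thesis.

```python
import numpy as np

def build_remap_table(coarse_dx, coarse_dy, out_shape):
    """Expand a coarse grid of distortion corrections (in pixels) into a
    dense per-pixel remap table by bilinear interpolation.

    coarse_dx, coarse_dy: (gh, gw) correction grids estimated during
                          calibration (hypothetical values here).
    out_shape:            (H, W) of the image to rectify.
    """
    H, W = out_shape
    gh, gw = coarse_dx.shape
    # position of every output pixel inside the coarse grid
    gy = np.linspace(0, gh - 1, H)[:, None]
    gx = np.linspace(0, gw - 1, W)[None, :]
    y0, x0 = np.floor(gy).astype(int), np.floor(gx).astype(int)
    y1, x1 = np.clip(y0 + 1, 0, gh - 1), np.clip(x0 + 1, 0, gw - 1)
    wy, wx = gy - y0, gx - x0

    def bilerp(g):
        return ((1 - wy) * (1 - wx) * g[y0, x0] + (1 - wy) * wx * g[y0, x1]
                + wy * (1 - wx) * g[y1, x0] + wy * wx * g[y1, x1])

    yy, xx = np.mgrid[0:H, 0:W]
    # source coordinates from which each rectified pixel is fetched
    return yy + bilerp(coarse_dy), xx + bilerp(coarse_dx)

# toy usage: an 8x8 correction grid applied to a 480x640 image
rng = np.random.default_rng(1)
map_y, map_x = build_remap_table(rng.normal(0, 0.5, (8, 8)),
                                 rng.normal(0, 0.5, (8, 8)), (480, 640))
print(map_y.shape, map_x.shape)
```

Precomputing such a remap table once per device is one way to keep the per-frame cost low on embedded hardware, in the spirit of the optimization strategies mentioned above.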
Item Open Access New methods for 3D reconstructions using high resolution satellite data (2021) Gong, Ke; Fritsch, Dieter (Prof. Dr.-Ing. habil. Prof. h.c.)

Item Open Access Making historical gyroscopes alive - 2D and 3D preservations by sensor fusion and open data access (2021) Fritsch, Dieter; Wagner, Jörg F.; Ceranski, Beate; Simon, Sven; Niklaus, Maria; Zhan, Kun; Mammadov, Gasim
The preservation of cultural heritage assets of all kinds is an important task for modern civilizations. This also includes tools and instruments that have been used in the previous decades and centuries. Along with the industrial revolution 200 years ago, mechanical and electrical technologies emerged, together with optical instruments. In the meantime, it is not only museums that showcase these developments, but also companies, universities, and private institutions. Gyroscopes are fascinating instruments with a history dating back 200 years. When J.G.F. Bohnenberger presented his machine to his students in 1810 at the University of Tuebingen, Germany, nobody could have foreseen that this fascinating development would be used for complex orientation and positioning. At the University of Stuttgart, Germany, a collection of 160 exhibits is available and in transition towards its sustainable future. Here, the systems are digitized in 2D, 2.5D, and 3D and are made available to a worldwide community using open access platforms. The technologies being used are computed tomography, computer vision, endoscopy, and photogrammetry. We present a novel workflow for combining voxel representations and colored point clouds to create digital twins of the physical objects with 0.1 mm precision. This has not yet been investigated and is therefore pioneering work. Advantages and disadvantages are discussed, and suggested work for the near future is outlined in this new and challenging field of tech heritage digitization.
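One building block of such a voxel/point-cloud fusion is transferring the photogrammetric colors onto the CT-derived surface once both datasets are registered. The following sketch is only a minimal illustration of that step, assuming co-registered clouds in millimetres; the tolerance and all names are hypothetical, and it is not the published workflow.

```python
import numpy as np
from scipy.spatial import cKDTree

def colorize_ct_surface(ct_points, photo_points, photo_colors, max_dist=0.5):
    """Transfer RGB colors from a photogrammetric point cloud to CT surface
    points via nearest-neighbour lookup (clouds assumed co-registered,
    coordinates in mm). Points without a neighbour within max_dist stay grey.
    """
    tree = cKDTree(photo_points)
    dist, idx = tree.query(ct_points, k=1)
    colors = np.full((len(ct_points), 3), 128, dtype=np.uint8)  # default grey
    hit = dist <= max_dist
    colors[hit] = photo_colors[idx[hit]]
    return colors

# toy usage with random data standing in for real scans
rng = np.random.default_rng(2)
ct_pts = rng.uniform(0, 100, (1000, 3))
photo_pts = rng.uniform(0, 100, (5000, 3))
photo_rgb = rng.integers(0, 256, (5000, 3), dtype=np.uint8)
print(colorize_ct_surface(ct_pts, photo_pts, photo_rgb)[:3])
```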
Item Open Access CRBeDaSet: a benchmark dataset for high accuracy close range 3D object reconstruction (2023) Gabara, Grzegorz; Sawicki, Piotr
This paper presents CRBeDaSet - a new benchmark dataset designed for evaluating close-range, image-based 3D modeling and reconstruction techniques - and the first empirical experiences of its use. The test object is a medium-sized building; diverse textures characterize the surfaces of its elevations. The dataset contains: the geodetic spatial control network (12 stabilized ground points determined using iterative multi-observation parametric adjustment) and the photogrammetric network (32 artificial signalized and 18 defined natural control points), both measured using a Leica TS30 total station; 36 terrestrial, mainly convergent photos acquired from elevated camera standpoints with a non-metric digital single-lens reflex Nikon D5100 camera (ground sample distance approx. 3 mm); the complex results of the bundle block adjustment with simultaneous camera calibration performed in the Pictran software package; and the colored point clouds (ca. 250 million points) from terrestrial laser scanning acquired using the Leica ScanStation C10 and post-processed in the Leica Cyclone™ SCAN software (ver. 2022.1.1), which were denoised, filtered, and classified using the LoD3 standard (ca. 62 million points). Existing datasets and benchmarks are also described and evaluated in the paper. The proposed photogrammetric dataset was experimentally tested in the open-source application GRAPHOS and the commercial suites ContextCapture, Metashape, PhotoScan, Pix4Dmapper, and RealityCapture. As a first experience of its evaluation, the difficulties and errors that occurred in the software used during digital processing of the dataset are shown and discussed. The proposed CRBeDaSet benchmark allows high-accuracy ("mm" range) photogrammetric 3D object reconstruction at close range, based on multi-view uncalibrated imagery, dense image matching techniques, and generated dense point clouds.

Item Open Access Evaluation of Phase One scan station for analogue aerial image digitisation (2021) Schulz, Joachim; Cramer, Michael; Herbst, Theresa
Historical aerial photographs represent a special cultural asset, preserving information about land cover and land use change in the twentieth century with a high spatial and temporal resolution. A current topic is the digitisation of historical images to make them accessible to a wider range of users and to preserve them from age deterioration. For a photogrammetric evaluation, high geometric stability and accuracy during the digitisation process are required. In this work, the resolving power and geometric quality of a Phase One iXM-MV150F high-performance camera were investigated, which is used at the Landesamt für Geoinformation und Landentwicklung Baden-Württemberg in the project ‘Digitaler Luftbildatlas Baden-Württemberg’ for the digitisation of historical aerial photographs. The resolving power of the system was empirically measured and analysed. The required modulation transfer function was determined using Siemens stars. With this method, the significant influence of the focus setting and of deviations from the plane-parallel alignment could be determined. Using a digitised aerial survey of the Vaihingen/Enz test field, the impact of the above-mentioned effects and the influence of the geometry of the scanning camera on the quality of the derived data products were shown in comparison to a photogrammetric scanner. The comparison showed that dedicated photogrammetric scanners still achieve a higher accuracy, even if a high-quality optical system is used for the digitising stand with the document camera. Further investigations are justified to improve the accuracy and stability of digitising aerial images with a document camera.
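For readers unfamiliar with Siemens-star analysis: the contrast modulation M = (Imax - Imin) / (Imax + Imin) can be sampled on concentric circles around the star centre, where the local spatial frequency is the number of line pairs divided by the circumference. The sketch below illustrates this simplified procedure on a synthetic, blurred star; it is an assumption-laden toy example (threshold, sampling and names are hypothetical) and not the evaluation code used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def modulation_along_circle(img, center, radius, n_samples=720):
    """Contrast modulation M = (Imax - Imin) / (Imax + Imin) sampled on a
    circle of the given radius around the Siemens star centre."""
    theta = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)
    ys = np.clip(np.round(center[0] + radius * np.sin(theta)).astype(int), 0, img.shape[0] - 1)
    xs = np.clip(np.round(center[1] + radius * np.cos(theta)).astype(int), 0, img.shape[1] - 1)
    profile = img[ys, xs].astype(float)
    return (profile.max() - profile.min()) / (profile.max() + profile.min() + 1e-9)

def resolving_power(img, center, n_spokes, radii, threshold=0.1):
    """Highest spatial frequency (smallest radius) at which the modulation
    still reaches the threshold, in line pairs per pixel."""
    for r in sorted(radii):  # small radius = high local frequency
        if modulation_along_circle(img, center, r) >= threshold:
            return n_spokes / (2 * np.pi * r)
    return None

# toy usage: synthetic Siemens star with 36 spoke pairs, blurred to mimic optics
h, w, spokes = 512, 512, 36
yy, xx = np.mgrid[0:h, 0:w]
ang = np.arctan2(yy - h / 2, xx - w / 2)
star = gaussian_filter((np.cos(spokes * ang) > 0).astype(float) * 255, sigma=2.0)
print("lp/px:", resolving_power(star, (h / 2, w / 2), spokes, radii=range(5, 200, 5)))
```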
Item Open Access On the information transfer between imagery, point clouds, and meshes for multi-modal semantics utilizing geospatial data (2022) Laupheimer, Dominik; Haala, Norbert (apl. Prof. Dr.-Ing.)
The semantic segmentation of the huge amount of acquired 3D data has become an important task in recent years. Images and Point Clouds (PCs) are fundamental data representations, particularly in urban mapping applications. Textured meshes integrate both representations by wiring the PC and texturing the reconstructed surface elements with high-resolution imagery. Meshes are adaptive to the underlying mapped geometry due to their graph structure composed of non-uniform and non-regular entities. Hence, the mesh is a memory-efficient, realistic-looking 3D map of the real world. For these reasons, we primarily opt for semantic segmentation of meshes, which is still a widely overlooked topic in photogrammetry and remote sensing. In particular, we head for multi-modal semantics utilizing supervised learning. However, publicly available annotated geospatial mesh data was rare at the beginning of the thesis. Therefore, annotating mesh data had to be done beforehand. To kill two birds with one stone, we aim for a multi-modal fusion that enables both multi-modal enhancement of entity descriptors and semi-automatic data annotation leveraging publicly available annotations of non-mesh data. We propose a novel holistic geometry-driven association mechanism that explicitly integrates entities of the modalities imagery, PC, and mesh. The established entity relationships between pixels, points, and faces enable the sharing of information across the modalities in a two-fold manner: (i) feature transfer (measured or engineered) and (ii) label transfer (predicted or annotated). The implementation follows a tile-wise strategy to facilitate scalability to large-scale data sets. At the same time, it enables parallel, distributed processing, reducing processing time. We demonstrate the effectiveness of the proposed method on the International Society for Photogrammetry and Remote Sensing (ISPRS) benchmark data sets Vaihingen 3D and Hessigheim 3D. Taken together, the proposed entity linking and subsequent information transfer inject great flexibility into the semantic segmentation of geospatial data. Imagery, PCs, and meshes can be semantically segmented with classifiers trained on any of these modalities, utilizing features derived from any of these modalities. In particular, we can semantically segment a modality by training a classifier on the same modality (direct approach) or by transferring predictions from other modalities (indirect approach). Hence, any established well-performing modality-specific classifier can be used for semantic segmentation of these modalities - regardless of whether it follows an end-to-end learning or feature-driven scheme. We perform an extensive ablation study on the impact of multi-modal handcrafted features for automatic 3D scene interpretation - both for the direct and the indirect approach. We discuss and analyze various Ground Truth (GT) generation methods. The semi-automatic labeling leveraging the entity linking achieves consistent annotation across modalities and reduces the manual labeling effort to a single representation. Please note that the multiple epochs of the Hessigheim data, consisting of manually annotated PCs and semi-automatically annotated meshes, are a result of this thesis and are provided to the community as part of the Hessigheim 3D benchmark. To further reduce the labeling effort to a few instances on a single modality, we combine the proposed information transfer with active learning. We recruit non-experts for the tedious labeling task and analyze their annotation quality. Subsequently, we compare the resulting classifier performances to conventional passive learning using expert annotation. In particular, we investigate the impact of visualizing the mesh instead of the PC on the annotation quality achieved by non-experts. In summary, we accentuate the mesh and its utility for multi-modal fusion, GT generation, multi-modal semantics, and visualization purposes.
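A drastically simplified version of such a geometry-driven association links every point to its nearest mesh face (represented here by the face centroid) and then transfers labels across that link. The sketch below only illustrates the principle; the mechanism in the thesis is holistic and also handles pixels, and all names and tolerances here are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def link_points_to_faces(points, face_centroids, max_dist=1.0):
    """Associate every 3D point with its nearest mesh face (centroid proxy).
    Returns one face index per point, or -1 if no face is close enough."""
    tree = cKDTree(face_centroids)
    dist, idx = tree.query(points, k=1)
    idx = idx.astype(int)
    idx[dist > max_dist] = -1
    return idx

def transfer_labels(face_labels, point_to_face, unlabeled=-1):
    """Label transfer mesh -> point cloud via the established links."""
    labels = np.full(len(point_to_face), unlabeled, dtype=int)
    linked = point_to_face >= 0
    labels[linked] = face_labels[point_to_face[linked]]
    return labels

# toy usage with random stand-in geometry and four classes
rng = np.random.default_rng(3)
pts = rng.uniform(0, 10, (2000, 3))
centroids = rng.uniform(0, 10, (500, 3))
face_lbl = rng.integers(0, 4, 500)
pt_lbl = transfer_labels(face_lbl, link_points_to_faces(pts, centroids))
print(np.bincount(pt_lbl[pt_lbl >= 0]))
```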
Item Open Access Radargrammetric DSM generation by semi-global matching and evaluation of penalty functions (2022) Wang, Jinghui; Gong, Ke; Balz, Timo; Haala, Norbert; Sörgel, Uwe; Zhang, Lu; Liao, Mingsheng
Radargrammetry is a useful approach to generate Digital Surface Models (DSMs) and an alternative to InSAR techniques, which are subject to temporal or atmospheric decorrelation. Stereo image matching in radargrammetry refers to the process of determining homologous points in two images. The performance of image matching influences the final quality of the DSM used for spatio-temporal analysis of landscapes and terrain. In SAR image matching, local matching methods are commonly used but usually produce sparse and inaccurate homologous points, adding ambiguity to the final products; global or semi-global matching methods are seldom applied, even though they can yield more accurate and denser homologous points. To fill this gap, we propose a hierarchical semi-global matching (SGM) pipeline to reconstruct DSMs in forested and mountainous regions using stereo TerraSAR-X images. In addition, three penalty functions were implemented in the pipeline and evaluated for effectiveness. To compare the accuracy and efficiency of our SGM dense matching method with a local matching method, the normalized cross-correlation (NCC) local matching method was also applied to generate DSMs from the same test data. The accuracy of the radargrammetric DSMs was validated against an airborne photogrammetric reference DSM and compared with the accuracy of NASA's 30 m SRTM DEM. The results show that the SGM pipeline produces DSMs whose height accuracy and computing efficiency exceed those of the SRTM DEM and the NCC-derived DSMs. The penalty function adopting the Canny edge detector yields a higher vertical precision than the other two evaluated penalty functions. SGM is a powerful and efficient tool to produce high-quality DSMs from stereo spaceborne SAR images.
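The role of the penalty functions can be seen in the standard SGM cost aggregation along one scanline, where P1 penalizes disparity changes of one step and P2 penalizes larger jumps; making P2 edge-adaptive (e.g., from a Canny map) allows depth discontinuities at object borders. The snippet below is a generic single-path sketch of this recursion, not the authors' pipeline; the penalty values and the way edges scale P2 are assumptions.

```python
import numpy as np

def aggregate_scanline(cost, p1=5.0, p2_base=60.0, edges=None):
    """Semi-global matching cost aggregation along one scanline direction.

    cost:  (W, D) matching cost for one image row (W pixels, D disparities).
    p1:    penalty for disparity changes of +/- 1 between neighbouring pixels.
    p2:    penalty for larger jumps; lowered at intensity edges (e.g. from a
           Canny map) so depth discontinuities are allowed at object borders.
    """
    W, D = cost.shape
    agg = np.empty_like(cost)
    agg[0] = cost[0]
    for x in range(1, W):
        p2 = p2_base * (0.25 if (edges is not None and edges[x]) else 1.0)
        prev = agg[x - 1]
        best_prev = prev.min()
        candidates = np.stack([
            prev,                                            # same disparity
            np.concatenate(([np.inf], prev[:-1])) + p1,      # disparity - 1
            np.concatenate((prev[1:], [np.inf])) + p1,       # disparity + 1
            np.full(D, best_prev + p2),                      # any larger jump
        ])
        agg[x] = cost[x] + candidates.min(axis=0) - best_prev  # keep values bounded
    return agg

# toy usage: random costs for an 80-pixel scanline and 16 disparities
rng = np.random.default_rng(4)
costs = rng.uniform(0, 1, (80, 16))
edge_map = rng.random(80) < 0.1          # stand-in for a Canny edge response
disparities = aggregate_scanline(costs, edges=edge_map).argmin(axis=1)
print(disparities[:10])
```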
Item Open Access On the analysis and patterns of persistent scatterer interferometry results for satellite-based deformation monitoring (2023) Schneider, Philipp J.; Sörgel, Uwe (Prof. Dr.-Ing.)
The remote sensing method Persistent Scatterer Interferometry (PSI) has developed over the last two decades into a tool for monitoring deformations of the Earth's surface. It can be applied to large areas and scenes such as earthquakes, landslides or sinkholes, but the increasing availability of high-resolution Synthetic Aperture Radar (SAR) data also enables the monitoring of small scenes such as individual buildings. PSI is recognized and appreciated in the remote sensing community, and its benefit has been proven in countless applications. The PSI principle is based on the evaluation of time series of coherent SAR satellite images and considers the relative phase change over time for individual pixels. As a result of this interferometric evaluation, time series are obtained that capture the line-of-sight (LOS) movement of a scatterer over time with millimetre accuracy. The PSI method is especially suited for urban areas because of the high density of good radar backscatterers in these locations. For high-resolution SAR data, such as those acquired by the TerraSAR-X mission, millions of such so-called Persistent Scatterer (PS) points and their deformation time series can be obtained. The presentation, evaluation, and interpretation of such data is still a challenge. The research presented here contributes to the question of how the joint analysis of many PS points and their time series can be used to infer the underlying causes of the deformation. The investigation of such a field of time series helps in understanding temporal and spatial patterns in movements. A distinction is made between the analysis of large-scale areas and the consideration of points on individual buildings. For wide-area deformations, such as those caused by underground constructions, mining activities or groundwater flow, adapted methods from meteorology as well as interpolation and decomposition procedures for different observation geometries are presented and discussed. For the monitoring of individual structures, such as single buildings, methods were developed that combine SAR data with geo-data from other sources, such as Airborne Laser Scanning (ALS) data and crowd-sourced building outlines. It can be shown that by grouping PS points that have correlated motion patterns, a building can be segmented into its statically independently moving elements. To achieve such a clustering in a robust way, so that it can be applied to different data sources, a non-linear dimension reduction based on a hybrid distance metric is introduced. The results of such a clustering can then be integrated into detailed 3D models, such as those available for Building Information Modeling (BIM)-based construction processes, and thus offer the possibility of continuous and efficient structural monitoring of a building. Often, PSI results have to be communicated for interpretation to experts from non-SAR fields such as civil and geo-engineering, which can be challenging without specialized software. For this purpose, exemplary web portals are presented here that allow PSI results to be displayed interactively. Such platforms address the specific complexity of PSI data so that informed decisions can be made. The utility of an ensemble evaluation of many PSI time series can be demonstrated, as it proves beneficial for wide-area processes. Motion patterns become identifiable, and their spatial propagation can be analysed. When considering PS points on single buildings, a grouping of points based on their deformation patterns leads to redundant measurements and segments a structure into its independently moving parts. This segmentation can then be integrated into existing 3D building models and industry standards, signifying an important advancement towards automated and city-wide risk assessment of buildings. Web-based analysis platforms, specifically tailored to SAR data, serve as a decision support system (DSS) and aid in sharing the findings with non-SAR experts.
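The grouping idea can be illustrated with a hybrid distance between PS points that mixes spatial proximity with the dissimilarity of their LOS time series, followed by a standard clustering. The sketch below is a simplified stand-in: it skips the non-linear dimension reduction used in the thesis, and the weights, the metric and all names are assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

def hybrid_distance_matrix(coords, series, w_space=0.5, w_corr=0.5):
    """Hybrid distance between PS points: weighted sum of normalized spatial
    distance and time-series dissimilarity (1 - Pearson correlation)."""
    d_space = squareform(pdist(coords))
    d_space /= d_space.max()
    corr = np.corrcoef(series)            # (N, N) correlation of the LOS series
    d_corr = (1.0 - corr) / 2.0            # map [-1, 1] -> [1, 0]
    return w_space * d_space + w_corr * d_corr

# toy usage: 60 PS points on a building, 40-epoch LOS displacement series
rng = np.random.default_rng(5)
coords = rng.uniform(0, 30, (60, 2))       # local metric coordinates
trend = np.linspace(0, 5, 40)
series = np.where(rng.random((60, 1)) < 0.5, trend, -trend) + rng.normal(0, 0.3, (60, 40))
dist = hybrid_distance_matrix(coords, series)
labels = fcluster(linkage(squareform(dist, checks=False), method="average"),
                  t=2, criterion="maxclust")
print(np.bincount(labels)[1:])             # points per motion cluster
```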
Item Open Access Integration of geometric computer vision, endoscopy and computed tomography for 3D modeling of gyroscopic instruments (2022) Zhan, Kun; Fritsch, Dieter (Prof. Dr.-Ing. habil. Prof. h. c.)
3D digitization is of vital importance for the safekeeping and promotion of cultural heritage assets in modern civilizations. For the public, cultural heritage generally means old buildings, ancient statues or unearthed relics. However, the objects to be digitized also include tools and instruments that have been widely applied in past decades, even though they have since been replaced by more advanced technologies. We call these technical instruments and artifacts Tech Heritage (TH). Gyroscopes are one group of such fascinating instruments, with a history dating back 200 years. The main characteristics of gyroscopes with regard to 3D digitization are (1) a highly complex structure, (2) a composition of different materials, and (3) the importance of not only the surfaces but also the internal structures. Together, these features mean that no single methodology can meet the demands of their 3D digitization. To fulfill the requirements of gyroscopes in our research, photogrammetry, endoscopy and Computed Tomography (CT) are introduced for a complete 3D digitization. With colored point clouds or textured meshes as results, photogrammetry is mainly used for the global surface reconstruction of the object. For cavities, holes or other parts that a regular camera can hardly access, endoscopy is applied for a local 3D reconstruction as a supplement. As the internal structures are also important, X-ray computed tomography is utilized for volumetric 3D digitization. The data from these three 3D sensors are then integrated into a complete 3D model. Additionally, the registration method should adapt to the data characteristics such as geometry, point cloud density, etc. In this thesis, the 3D reconstructions with each method as well as the data fusion are investigated.
1. Firstly, we study the stability and reliability of camera calibration before 3D reconstruction with photogrammetry and endoscopy. As the standard pre-calibration solution, Zhang's method suffers from instability due to correlations between the calibration parameters. To reduce this effect, the image configuration should be chosen carefully, with adequate oblique angles, distance differences and roll angles for a convergent image block. In our research, a quantitative analysis is implemented by a statistical approach that uses large bundles of images and computes calibrations from randomly chosen image subsets. In addition, the recovered expected values of the parameters are utilized as ground truth to scrutinize the individual influencing factors of the imaging configuration.
2. Secondly, the 3D reconstruction processes are investigated with practical implementations. For the endoscopic 3D reconstruction, the data acquisition process is the first challenge, resulting from image blur, which may be caused by hand shake, as well as from the small overlap. An imaging assistance setup and a mixed image-and-video strategy are the methods adopted in our research as the solution. With accurate calibration information and the improved image quality and configuration, we optimize the entire process by refining the Structure-from-Motion (SfM) method. As for CT 3D reconstruction, a stack of X-ray images carrying attenuation information is collected from different perspectives of the object. All reconstructed slices are integrated into a uniform 3D coordinate system to construct the complete 3D volumetric representation.
3. Thirdly, data registration methods are proposed with regard to different data characteristics. To register two 3D datasets with little overlap, such as photogrammetric and endoscopic point clouds, a Gauss-Helmert model with manually picked control points is applied for transformation estimation with precision assessments. To take advantage of research on pair-wise point cloud registration, CT point cloud conversion and surface extraction are implemented from the volumetric CT data. The registration of CT and photogrammetry data can be divided into two cases, depending on the completeness of the CT surface representation. If the surface material is completely captured in the CT data, we can directly project the color information from the photogrammetric images onto the CT surface after both datasets have been transformed into the same coordinate system. In this way, we combine the high precision of the CT data with the rich texture information. Where low-density surface material causes an incomplete representation of the CT surface, the transformation is estimated via primitive-based virtual control points derived from both surfaces. With the determined transformation, the photogrammetric model can then be integrated with the CT model into a complete 3D representation.
4. Finally, in terms of 3D model representation, point clouds require a large data volume if precision is demanded and offer limited interaction possibilities. Therefore, the point clouds need to be vectorized into Constructive Solid Geometry (CSG) models to enable easier human-computer interaction. This can be done precisely via careful manual work, via a Random Sample Consensus (RANSAC)-based geometric fitting process, or even with a deep learning strategy using an end-to-end trained framework. The vectorized 3D model can be applied in AR/VR-related applications to make full use of the 3D digitization work.
For the first time, three totally different sensors are studied for a fused 3D reconstruction in this research. Within this workflow, the practical application of endoscopy is fully investigated.
The integration methods are adaptively designed according to the characteristics of each sensor as well as of the reconstructed object. This provides more possibilities and ideas for the digitization of different types of cultural heritage.
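As an illustration of the transformation-estimation step, a least-squares 3D similarity transform from corresponding control points can be computed with the Umeyama/Procrustes approach. This is only a simplified substitute for the Gauss-Helmert adjustment described above (it treats one point set as error-free and omits the rigorous precision assessment); all names and the toy data are hypothetical.

```python
import numpy as np

def estimate_similarity_transform(src, dst):
    """Least-squares 3D similarity transform (scale, rotation, translation)
    from corresponding control points, Umeyama/Procrustes style.
    Returns (s, R, t) with dst ~ s * R @ src + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:                 # avoid reflections
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

# toy usage: recover a known transform from 6 noisy control points
rng = np.random.default_rng(6)
src = rng.uniform(0, 1, (6, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
dst = 2.0 * src @ R_true.T + np.array([1.0, -2.0, 0.5]) + rng.normal(0, 1e-3, (6, 3))
s, R, t = estimate_similarity_transform(src, dst)
residuals = dst - (s * src @ R.T + t)
print(round(s, 4), np.abs(residuals).max())
```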