05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6

Search Results

Now showing 1 - 10 of 14
  • Item (Open Access)
    Stochastic neural networks : components, analysis, limitations
    (2022) Neugebauer, Florian; Polian, Ilia (Prof. Dr.)
    Stochastic computing (SC) promises an area- and power-efficient alternative to conventional binary implementations of many important arithmetic functions. SC achieves this by employing a stream-based number format called stochastic numbers (SNs), which enables bit-sequential computations, in contrast to conventional binary computations that are performed on entire words at once. An SN encodes a value probabilistically with equal weight for every bit in the stream. This encoding results in approximate computations, causing a trade-off between power consumption, area and computation accuracy. The prime example of efficient computation in SC is multiplication, which can be performed with only a single gate. SC is therefore an attractive alternative to conventional binary implementations in applications that contain a large number of basic arithmetic operations and are able to tolerate the approximate nature of SC. The most widely considered class of applications in this regard is neural networks (NNs), with convolutional neural networks (CNNs) as the prime target for SC. In recent years, steady advances have been made in the implementation of SC-based CNNs (SCNNs). At the same time, however, a number of challenges have been identified: SCNNs need to handle large amounts of data, which have to be converted from conventional binary format into SNs. This conversion is hardware-intensive and takes up a significant portion of a stochastic circuit's area, especially if the SNs have to be generated independently of each other. Furthermore, some commonly used functions in CNNs, such as max-pooling, have no exact corresponding SC implementation, which reduces the accuracy of SCNNs.
The first part of this work proposes solutions to these challenges by introducing new stochastic components: a new stochastic number generator (SNG) that is able to generate a large number of SNs at the same time, and a stochastic maximum circuit that enables an accurate implementation of max-pooling operations in SCNNs. In addition, the first part of this work presents a detailed investigation of the behaviour of an SCNN and its components under timing errors. The error tolerance of SC is often quoted as one of its advantages, stemming from the fact that any single bit of an SN contributes only very little to its value. In contrast, bits in conventional binary formats have different weights and can contribute as much as 50% of a number's value. SC is therefore a candidate for extreme low-power systems, as it could potentially tolerate the timing errors that appear in such environments. While the error tolerance of SC image processing systems has been demonstrated before, a detailed investigation of SCNNs in this regard has been missing so far. It will be shown that SC is not error tolerant in general, but rather that SC components behave differently even if they implement the same function, and that the error tolerance of an SC system further depends on the error model. In the second part of this work, a theoretical analysis of the accuracy and limitations of SC systems is presented. An existing framework to analyse and manage the accuracy of combinational stochastic circuits is extended to cover sequential circuits. This framework enables a designer to predict the effect of small design changes on the accuracy of a circuit and to determine important parameters such as SN length without extensive simulations. It will further be shown that the functions that can be implemented in SC are limited.
Due to the probabilistic nature of SC, some arithmetic functions suffer from a small bias when implemented as a stochastic circuit, including the max-pooling function in SCNNs.
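The single-gate multiplication mentioned in the abstract above can be illustrated with a small software sketch. It is not taken from the thesis: it assumes the common unipolar encoding (the value of an SN is the fraction of 1-bits) and two independently generated streams, and the helper names `to_stochastic`, `from_stochastic` and `sc_multiply` are hypothetical.

```python
import random


def to_stochastic(value, length, rng):
    """Encode a value in [0, 1] as a unipolar stochastic number (assumed encoding).

    Each bit is 1 with probability `value`, so every bit carries equal weight.
    """
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_stochastic(bits):
    """Decode a unipolar stochastic number: the fraction of 1-bits."""
    return sum(bits) / len(bits)


def sc_multiply(a_bits, b_bits):
    """Multiply two independent unipolar SNs with a bitwise AND (one gate per bit)."""
    return [a & b for a, b in zip(a_bits, b_bits)]


rng = random.Random(0)   # fixed seed so the sketch is reproducible
length = 4096            # SN length: longer streams give lower approximation error

a = to_stochastic(0.5, length, rng)
b = to_stochastic(0.25, length, rng)
product = from_stochastic(sc_multiply(a, b))  # approximates 0.5 * 0.25

# Flipping a single bit changes the encoded value by only 1/length.
flipped = a.copy()
flipped[0] ^= 1
single_bit_effect = abs(from_stochastic(flipped) - from_stochastic(a))
```

The result is only approximate, and the approximation improves with stream length, which is the trade-off between accuracy and cost that the abstract describes. The AND gate computes a product only if the two streams are independent, which is why independent SN generation matters.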
  • Item (Open Access)
    Scatter and beam hardening correction for high-resolution CT in near real-time based on a fast Monte Carlo photon transport model
    (2022) Alsaffar, Ammar; Simon, Sven (Prof. Dr.-Ing.)
    Computed tomography (CT) is a powerful non-destructive testing (NDT) technique. It provides insight into the interior of the scanned object and is widely used in industrial and medical applications. However, the technique suffers from severe quality-degrading artifacts. Among these, scatter and beam hardening (BH) cause severe quality degradation of the reconstructed CT images. Scatter results from a change in the direction, or in the direction and the energy, of the photons penetrating the object, while beam hardening results from the polychromatic nature of the X-ray source. When photons of different energies penetrate the object, low-energy photons are more easily absorbed than high-energy photons. This hardens the X-ray beam, which causes a non-linear relation between the propagation path length and the attenuation of the beam. These effects are the major source of the cupping and streak artifacts that strongly degrade the quality of CT images. The presence of cupping and streak artifacts reduces the contrast and the contrast-to-noise ratio of the image and distorts the grey values. As a consequence, important analyses of CT results are affected; e.g., the detectability of voids and cracks is reduced by the loss of contrast, and dimensional measurement is impaired. Monte Carlo (MC) simulation is considered the most accurate approach for scatter estimation. However, the existing MC estimators are computationally expensive, especially for the high-resolution flat-panel CT considered here. In this work, a multi-GPU photon forward projection model and an iterative scatter correction algorithm were implemented. The Monte Carlo model has been highly accelerated and extensively verified using several experimental and simulated examples.
The implemented model describes the physics within the 1 keV to 1 MeV range using multiple controllable key parameters. Based on this model, scatter computation for a single projection can be completed within a few seconds under well-defined model parameters. Smoothing and interpolation are performed on the estimated scatter to accelerate the scatter calculation without significantly compromising accuracy compared to measured, nearly scatter-free projection images. By combining the scatter estimation with filtered backprojection (FBP), scatter correction is performed effectively in an iterative manner. In order to evaluate the proposed MC model, extensive experiments have been conducted on simulated data and on real-world high-resolution flat-panel CT. Compared to state-of-the-art MC simulators, the proposed MC model achieved a 15× acceleration on a single GPU over the GPU implementation of the Penelope simulator (MCGPU) utilizing several acceleration techniques, and a 202× speed-up on a multi-GPU system compared to the multi-threaded state-of-the-art EGSnrc MC simulator. Furthermore, it is shown that for high-resolution images, scatter correction with sufficient accuracy is accomplished within one to three iterations using FBP and the proposed fast MC photon transport model. Moreover, a fast and accurate BH correction method that requires no prior knowledge of the materials and corrects first- and higher-order BH artifacts has been implemented. In the first step, a wide sweep of the material is performed based on an experimentally measured look-up table to obtain the closest estimate of the material. Then the non-linearity effect of the BH is corrected by adding the difference between the estimated monochromatic and the polychromatic simulated projections of the segmented image.
The estimated monochromatic projection is simulated by selecting the energy from the polychromatic spectrum that produces the lowest mean square error (MSE) with respect to the BH-corrupted projection from the scanner. The polychromatic projection, in turn, is accurately estimated using the least-squares estimation (LSE) method by minimizing the difference between the experimental projection and a linear combination of simulated polychromatic projections using different spectra with different filtration. As a result, an accurate non-linearity correction term is derived that leads to an accurate BH correction result. To evaluate the proposed BH correction method, extensive experiments have been conducted on real-world CT data. Compared to the state-of-the-art empirical BH correction method, the experiments show that the proposed method can strongly reduce the BH artifacts without prior knowledge of the materials. In summary, the lack of fast and computationally efficient methods to correct the major artifacts in CT images, i.e., scatter and beam hardening, has motivated this work, in which efficient and fast algorithms have been implemented to correct these artifacts. The correction of these artifacts has led to better visualization of the CT images, a higher contrast-to-noise ratio, and improved contrast. Supported by multiple experimental examples, it is shown that the images scatter-corrected with the proposed method, obtained within a reasonable time, resemble the nearly artifact-free reference images acquired experimentally. In addition, applying the proposed BH correction method after the scatter correction completely removes the remaining cupping and streak artifacts that were degrading the scatter-corrected images and improves their contrast-to-noise ratio (CNR).
Moreover, the correction quality of the CT images has been assessed using the software Volume Graphics VGSTUDIO MAX. Better surface determination can be derived from the artifact-corrected images. In addition, enhancing the contrast by correcting these artifacts results in improved detectability of voids and cracks in several concrete examples. This supports the efficiency of the artifact correction methods implemented in this work.
  • Item (Open Access)
    Benchmarking the performance of portfolio optimization with QAOA
    (2022) Brandhofer, Sebastian; Braun, Daniel; Dehn, Vanessa; Hellstern, Gerhard; Hüls, Matthias; Ji, Yanjun; Polian, Ilia; Bhatia, Amandeep Singh; Wellens, Thomas
    We present a detailed study of portfolio optimization using different versions of the quantum approximate optimization algorithm (QAOA). For a given list of assets, the portfolio optimization problem is formulated as a quadratic binary optimization problem with a constraint on the number of assets contained in the portfolio. QAOA has been suggested as a possible candidate for solving this problem (and similar combinatorial optimization problems) more efficiently than classical computers in the case of a sufficiently large number of assets. However, the practical implementation of this algorithm requires careful consideration of several technical issues, not all of which are discussed in the present literature. The present article intends to fill this gap and thereby provides the reader with a useful guide for applying QAOA to the portfolio optimization problem (and similar problems). In particular, we discuss several possible choices of the variational form and different classical algorithms for finding the corresponding optimized parameters. With a view to applying QAOA on error-prone NISQ hardware, we also analyse the influence of statistical sampling errors (due to a finite number of shots) and of gate and readout errors (due to imperfect quantum hardware). Finally, we define a criterion for distinguishing between ‘easy’ and ‘hard’ instances of the portfolio optimization problem.
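The problem formulation in the abstract above, a quadratic objective over binary asset choices with a fixed number of selected assets, can be made concrete with a tiny classical brute-force sketch. This is not QAOA and not the authors' code. The mean-variance cost (risk aversion times portfolio variance, minus expected return), the toy numbers, and the names `portfolio_cost` and `best_portfolio` are assumptions made for illustration only.

```python
from itertools import combinations


def portfolio_cost(selection, returns, covariance, risk_aversion):
    """Quadratic cost of a portfolio given as a tuple of selected asset indices.

    Assumed mean-variance form: risk_aversion * x^T C x - mu^T x,
    where x is the binary indicator vector of `selection`.
    """
    risk = sum(covariance[i][j] for i in selection for j in selection)
    reward = sum(returns[i] for i in selection)
    return risk_aversion * risk - reward


def best_portfolio(returns, covariance, budget, risk_aversion=0.5):
    """Enumerate every portfolio with exactly `budget` assets and return the cheapest.

    Enumerating combinations enforces the cardinality constraint by construction.
    """
    candidates = combinations(range(len(returns)), budget)
    return min(
        candidates,
        key=lambda s: portfolio_cost(s, returns, covariance, risk_aversion),
    )


# Hypothetical toy instance with four assets (numbers are illustrative only).
returns = [0.10, 0.20, 0.15, 0.05]
covariance = [
    [0.10, 0.02, 0.01, 0.00],
    [0.02, 0.30, 0.03, 0.01],
    [0.01, 0.03, 0.12, 0.02],
    [0.00, 0.01, 0.02, 0.05],
]

best = best_portfolio(returns, covariance, budget=2)
best_cost = portfolio_cost(best, returns, covariance, risk_aversion=0.5)
```

Exhaustive search like this is only feasible for a handful of assets, since the number of candidate portfolios grows combinatorially. It can still serve as a ground-truth reference when checking the output of a heuristic or quantum solver on small instances.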
  • Item (Open Access)
    A GPU-accelerated light-field super-resolution framework based on mixed noise model and weighted regularization
    (2022) Tran, Trung-Hieu; Sun, Kaicong; Simon, Sven
    Light-field (LF) super-resolution (SR) plays an essential role in alleviating the current technological challenge of acquiring a 4D LF, which combines both high-density angular and spatial information. Due to the algorithmic complexity and data-intensive nature of LF images, LFSR demands significant computational effort and results in long CPU processing times. This paper presents a GPU-accelerated computational framework for reconstructing high-resolution (HR) LF images under a mixed Gaussian-impulse noise condition. The main focus is on developing a high-performance approach with respect to both processing speed and reconstruction quality. From a statistical perspective, we derive a joint ℓ1-ℓ2 data fidelity term for penalizing the HR reconstruction error, taking into account the mixed noise situation. For regularization, we employ the weighted non-local total variation approach, which allows us to effectively realize an LF image prior through a proper weighting scheme. We show that the alternating direction method of multipliers (ADMM) can be used to reduce the computational complexity and results in high-performance parallel computation on the GPU platform. Extensive experiments are conducted on both a synthetic 4D LF dataset and a natural image dataset to validate the robustness of the proposed SR model and to evaluate the performance of the accelerated optimizer. The experimental results show that our approach achieves better reconstruction quality under severe mixed-noise conditions than state-of-the-art approaches. In addition, the proposed approach overcomes the limitation of previous work in handling large-scale SR tasks. While fitting within a single off-the-shelf GPU, the proposed accelerator provides an average speedup of 2.46× and 1.57× for ×2 and ×3 SR tasks, respectively. In addition, a speedup of 77× is achieved compared to CPU execution.
  • Item (Open Access)
    Stress-aware periodic test of interconnects
    (2022) Sadeghi-Kohan, Somayeh; Hellebrand, Sybille; Wunderlich, Hans-Joachim
    Safety-critical systems have to follow extremely high dependability requirements as specified in the standards for automotive, air, and space applications. The required high fault coverage at runtime is usually obtained by a combination of concurrent error detection or correction and periodic tests within rather short time intervals. The concurrent scheme ensures the integrity of computed results while the periodic test has to identify potential aging problems and to prevent any fault accumulation which may invalidate the concurrent error detection mechanism. Such periodic built-in self-test (BIST) schemes are already commercialized for memories and for random logic. The paper at hand extends this approach to interconnect structures. A BIST scheme is presented which targets interconnect defects before they will actually affect the system functionality at nominal speed. A BIST schedule is developed which significantly reduces aging caused by electromigration during the lifetime application of the periodic test.
  • Item (Open Access)
    Physics inspired compact modelling of BiFeO3 based memristors
    (2022) Yarragolla, Sahitya; Du, Nan; Hemke, Torben; Zhao, Xianyue; Chen, Ziang; Polian, Ilia; Mussenbrock, Thomas
    With the advent of the Internet of Things, nanoelectronic devices or memristors have been the subject of significant interest for use as new hardware security primitives. Among the several available memristors, BiFeO3 (BFO)-based electroforming-free memristors have attracted considerable attention due to their excellent properties, such as long retention time, self-rectification, intrinsic stochasticity, and fast switching. They have been actively investigated for use in physical unclonable function (PUF) key storage modules, artificial synapses in neural networks, nonvolatile resistive switches, and reconfigurable logic applications. In this work, we present a physics-inspired 1D compact model of a BFO memristor to understand its implementation for such applications (mainly PUFs) and perform circuit simulations. The resistive switching based on electric field-driven vacancy migration and the intrinsic stochastic behaviour of the BFO memristor are modelled using the cloud-in-a-cell scheme. The experimental current–voltage characteristics of the BFO memristor are successfully reproduced. The response of the BFO memristor to changes in electrical properties, environmental properties (such as temperature) and stress is analyzed and found to be consistent with experimental results.
  • Item (Open Access)
    Dependable reconfigurable scan networks
    (2022) Lylina, Natalia; Wunderlich, Hans-Joachim (Prof.)
    The dependability of modern devices is enhanced by integrating an extensive number of extra-functional instruments. These are needed to facilitate cost-efficient bring-up, debug, test, diagnosis, and adaptivity in the field and might include, e.g., sensors, aging monitors, and logic and memory Built-In Self-Test (BIST) registers. Reconfigurable Scan Networks (RSNs) provide a flexible way to access such instruments as well as the device's registers throughout the lifetime, starting from post-silicon validation (PSV) through manufacturing test and finally during in-field operation. At the same time, the dependability properties of the system can be affected by an improper RSN integration. This doctoral project overcomes these problems and establishes a methodology to integrate dependable RSNs for a given system considering the most relevant dependability aspects, such as robustness, testability, and security compliance of RSNs.
  • Item (Open Access)
    Whiplash simulation: how muscle modelling and movement interact
    (2022) Millard, Matthew; Siebert, Tobias; Stutzig, Norman; Fehr, Jörg
    Whiplash injury and associated disorders are costly to society and individuals. Accurate simulations of neck movement during car accidents are needed to assess the risk of whiplash injury. Existing simulations indicate that Hill-type muscle models are too compliant, and as a result, predict more neck movement than is observed during in-vivo experiments. Simulating head and neck movement is challenging because many of the neck muscles operate on the descending limb of the force-length curve, a region that Hill-type models inaccurately capture. Hill-type muscle models have negative stiffness on the descending limb of the force-length curve and so develop less force the more they are lengthened. Biological muscle, in contrast, can develop large transient forces during active lengthening and sustain large forces when aggressively lengthened. Recently, a muscle model has been developed that mimics the active impedance of muscle in the short range and can capture the large forces generated during extreme lengthening. In this work, we will compare the accuracy of simulated neck movements, using both a Hill-type model and the model of Millard et al., to the in-vivo neck movement. If successful, the improved accuracy of our simulations will make it possible to predict and help prevent neck injury.
  • Item (Open Access)
    Steered fiber orientation : correlating orientation and residual tensile strength parameters of SFRC
    (2022) Medeghini, Filippo; Guhathakurta, Jajnabalkya; Tiberti, Giuseppe; Simon, Sven; Plizzari, Giovanni A.; Mark, Peter
    Adding steel fibers to concrete improves the post-cracking tensile strength of the composite material due to fibers bridging the cracks. The residual performance of the material is influenced by fiber type, content and orientation with respect to the crack plane. The latter is a main issue in fiber-reinforced concrete elements, since it significantly influences the structural behavior. The aim of this research is to develop a tailor-made composite material and casting method to orient fibers in order to optimize the performance of the material for structural applications. To this end, a mechanized concreting device that induces such a preferred fiber orientation is designed and fabricated. It uses vibration and a series of narrow channels to guide and orient fibers. A composite with oriented fibers is produced using a hybrid system of macro and micro fibers and high-performance concrete. From the same concrete batch, specimens are cast both with and without the fiber orientation device, obtaining different levels of fiber orientation. Three-point bending tests are performed to measure the residual tensile strength capacities and compare them with those of standard specimens cast according to EN 14651. Elements with favorable fiber orientation show a significant increase in residual tensile strength with respect to the standard beams. Finally, computed tomography and an electromagnetic induction method are employed to better assess the orientation and distribution of fibers in the beams. Their results are in good agreement and make it possible to link the residual tensile strength parameters with fiber orientation.
  • Item (Open Access)
    Automatic methods for protection of cryptographic hardware against fault attacks
    (2022) Gay, Maël; Polian, Ilia (Prof. Dr. rer. nat. habil.)
    For several years, the number of electronic devices in use has been rising strongly, especially in the field of embedded systems. From automotive applications or smartphones to smaller, area- and power-restricted embedded systems, such as Internet of Things (IoT) devices or smart cards, the wide availability of these systems induces a need for data protection. The implementation of hardware cryptographic primitives on Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs) aims to fulfil the security requirements, while providing faster and lower-power encryption than software-based solutions on microprocessors, especially in the case of constrained resources. However, cryptographic solutions can be attacked, even if the encryption scheme is proven secure. One possible way to do so is through physical attacks, such as Side-Channel Analysis (SCA), for example by analysing the power consumption of a device, or fault injection attacks, which disturb the computation in a way that allows an attacker to recover the secret key. As such, it is of the utmost importance to implement cryptographic algorithms in a way that minimises the risk of physical attacks, as well as to implement counter-measures to prevent them, for instance Error Correcting Codes (ECC). Moreover, the evaluation of the aforementioned cryptographic hardware and counter-measures is not generally done automatically, but rather empirically. This results in a need for the automation of both counter-measure generation and the checking of physical hardware against attacks. This thesis focuses on the automation of both aspects. Firstly, Error Detecting Code (EDC) and ECC counter-measures are presented. Their goal is to stop faults from disturbing the encryption process.
A discussion of the differences between natural faults (i.e., those induced by natural factors such as ageing or cosmic rays) and malicious faults is given in a subsequent chapter, as well as an analysis of the limitations of the evaluation of ECC. This is followed by the presentation of new architectures based on a new class of robust EDC, aimed at preventing multiple faults. They are scalable by construction, and as such it is possible to automatically choose an appropriate EDC implementation with regard to the constraints of the protected hardware. The architectures ensure the detection of faults injected by a strong adversary (who has the ability to inject faults precisely on a temporal and spatial level), as well as the correction of low-multiplicity faults. The structure of the implementation, an inner-outer code based construction, and more specifically an efficient decoding method are further detailed, as well as some additional tweaks. Finally, the implementation is validated against physical fault injection on a SAKURA-G FPGA platform, and the results further reinforce the need for such architectures. The second part of the thesis considers attack scenarios, and more precisely fault attacks. The automatic evaluation of hardware implementations of cryptographic primitives is the main focus. In this regard, this thesis considers a particular type of fault attack, hardware-based Algebraic Fault Attacks (AFAs). AFAs are at the border between mathematical cryptanalysis and physical fault injection attacks. They combine information from fault-disturbed encryptions with some cipher description in order to build an attack and recover the secret key. This work considers the hardware implementations of different ciphers as the source of algebraic information. To this end, a framework for the automated creation of AFAs has been developed in collaboration with the chair of computer architecture of the University of Freiburg.
The framework takes the description of the cipher, in Hardware Description Language (HDL) or at gate level, as well as a defined fault model as inputs and, through a series of steps, builds an attack in order to recover the secret key. The detailed steps are presented in this thesis. The automatic generation of attack scenarios for a considered cipher allows for an evaluation of any cipher implementation, including any potential changes or optimisations made, against different attack scenarios. The framework itself was tested on a variety of different Substitution and Permutation Networks (SPNs), and on some counter-measures. Physical realisations of fault attacks are also considered, based on an implementation on the SAKURA-G FPGA platform, as well as software simulations of an idealised fault model. The constructed attacks were successful and the results are discussed, as well as the implications of multiple fault injections for solving. Finally, some counter-measures are considered, in order to validate or invalidate their effectiveness against AFAs.