05 Fakultät Informatik, Elektrotechnik und Informationstechnik

Permanent URI for this collection: https://elib.uni-stuttgart.de/handle/11682/6

Search Results

Now showing 1 - 10 of 38
  • Item (Open Access)
    Rigorous compilation for near-term quantum computers
    (2024) Brandhofer, Sebastian; Polian, Ilia (Prof.)
    Quantum computing promises an exponential speedup for computational problems in material sciences, cryptography and drug design that are infeasible to resolve with traditional classical systems. As quantum computing technology matures, larger and more complex quantum states can be prepared on a quantum computer, enabling the resolution of larger problem instances, e.g. breaking larger cryptographic keys or modelling larger molecules accurately for the exploration of novel drugs. Near-term quantum computers, however, are characterized by large error rates, a relatively low number of qubits and a low connectivity between qubits. These characteristics impose strict requirements on the structure of quantum computations that must be incorporated by compilation methods targeting near-term quantum computers in order to ensure compatibility and yield highly accurate results. Rigorous compilation methods have been explored for addressing these requirements, as they explore the solution space exactly and thus yield a quantum computation that is optimal with respect to the incorporated requirements. However, previous rigorous compilation methods demonstrate limited applicability and typically focus on one aspect of the imposed requirements, i.e. reducing the duration or the number of swap gates in a quantum computation. In this work, opportunities for improving near-term quantum computations through compilation are explored first. These compilation opportunities are included in rigorous compilation methods to investigate each aspect of the imposed requirements, i.e. the number of qubits, connectivity of qubits, duration and incurred errors. The developed rigorous compilation methods are then evaluated with respect to their ability to enable quantum computations that are otherwise not accessible with near-term quantum technology. Experimental results demonstrate the ability of the developed rigorous compilation methods to extend the computational reach of near-term quantum computers by generating quantum computations with reduced requirements on the number and connectivity of qubits as well as by reducing the duration and incurred errors of performed quantum computations. Furthermore, the developed rigorous compilation methods extend their applicability to quantum circuit partitioning, qubit reuse and the translation between quantum computations generated for distinct quantum technologies. Specifically, a developed rigorous compilation method that exploits the structure of a quantum computation to reuse qubits at runtime yielded a reduction of up to 5x in the required number of qubits and of up to 33% in the result error. The developed quantum circuit partitioning method optimally distributes a quantum computation to separate partitions, reducing the required number of qubits by 40% and the cost of partitioning by 41% on average. Furthermore, a rigorous compilation method was developed for quantum computers based on neutral atoms that combines swap gate insertions and topology changes to reduce the impact of limited qubit connectivity on the quantum computation duration by up to 58% and on the result fidelity by up to 29%. Finally, the developed quantum circuit adaptation method enables translation between distinct quantum technologies while considering heterogeneous computational primitives with distinct characteristics, reducing the idle time of qubits by up to 87% and improving the result fidelity by up to 40%.
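As a purely illustrative companion to the qubit-reuse idea mentioned in the abstract above (not the rigorous compilation method developed in the thesis), the following Python sketch uses Qiskit's mid-circuit measurement and reset so that two toy sub-computations share one physical qubit; the circuit contents are hypothetical.

```python
# Illustrative sketch only: qubit reuse via mid-circuit measurement and reset.
# Assumes Qiskit is installed; the circuits below are toy examples, not the
# rigorous (exact) compilation method developed in the thesis.
from qiskit import QuantumCircuit

# Two independent single-qubit experiments that a naive mapping would place
# on two physical qubits.
naive = QuantumCircuit(2, 2)
naive.h(0)
naive.measure(0, 0)
naive.h(1)
naive.measure(1, 1)

# With qubit reuse, the second experiment runs on the same physical qubit:
# measure, reset, and continue, halving the qubit requirement.
reused = QuantumCircuit(1, 2)
reused.h(0)
reused.measure(0, 0)   # finish the first sub-computation
reused.reset(0)        # return the qubit to |0> so it can be reused
reused.h(0)
reused.measure(0, 1)   # second sub-computation reuses the same hardware qubit

print(naive.num_qubits, reused.num_qubits)  # 2 vs. 1 physical qubits
```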
  • Item (Open Access)
    Locking-enabled security analysis of cryptographic circuits
    (2024) Upadhyaya, Devanshi; Gay, Maël; Polian, Ilia
    Hardware implementations of cryptographic primitives require protection against physical attacks and supply chain threats. This raises the question of secure composability of different attack countermeasures, i.e., whether protecting a circuit against one threat can make it more vulnerable against a different threat. In this article, we study the consequences of applying logic locking, a popular design-for-trust solution against intellectual property piracy and overproduction, to cryptographic circuits. We show that the ability to unlock the circuit incorrectly gives the adversary powerful new attack options. We introduce LEDFA (locking-enabled differential fault analysis) and demonstrate for several ciphers and families of locking schemes that fault attacks become possible (or consistently easier) for incorrectly unlocked circuits. In several cases, logic locking has made circuit implementations prone to classical algebraic attacks with no fault injection needed at all. We refer to this “zero-fault” version of LEDFA by the term LEDA, investigate its success factors in depth and propose a countermeasure to protect logic-locked implementations against LEDA. We also perform test vector leakage assessment (TVLA) of incorrectly unlocked AES implementations to show the effects of logic locking regarding side-channel leakage. Our results indicate that logic locking is not safe to use in cryptographic circuits, making them less rather than more secure.
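To make the key-gate mechanism behind the abstract above concrete, here is a toy Python sketch of XOR-based logic locking on a single internal wire; it is not the LEDFA/LEDA attack itself, and the locked function is a made-up example.

```python
# Toy illustration of XOR-based logic locking (not the LEDFA/LEDA attacks
# themselves): a key gate XORs an internal wire with a key bit, so an
# incorrect key deterministically flips that wire and corrupts the output.

def original_circuit(a: int, b: int, c: int) -> int:
    """Reference (unlocked) combinational function."""
    return (a & b) ^ c

def locked_circuit(a: int, b: int, c: int, key: int) -> int:
    """Same function with one XOR key gate on the internal wire a&b.
    The correct key is 0; key=1 inverts the wire (wrong-key behaviour)."""
    wire = (a & b) ^ key   # key gate inserted by logic locking
    return wire ^ c

# With the wrong key, outputs differ from the original circuit; this
# controllable, key-dependent corruption is the kind of extra leverage an
# incorrectly unlocked circuit can hand to an adversary.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert locked_circuit(a, b, c, key=0) == original_circuit(a, b, c)
            if locked_circuit(a, b, c, key=1) != original_circuit(a, b, c):
                print(f"wrong key corrupts input {(a, b, c)}")
```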
  • Item (Open Access)
    Development of an infrastructure for creating a behavioral model of hardware of measurable parameters in dependency of executed software
    (2021) Schwachhofer, Denis
    System-Level Test (SLT) is gaining traction not only in industry but, more recently, also in academia. It is used to detect manufacturing defects not caught by previous test steps. The idea behind SLT is to embed the Design Under Test (DUT) in an environment and run software on it that corresponds to its end-user application. But even though SLT has been used increasingly in manufacturing for a decade, there are still many open challenges to solve. For example, there is no coverage metric for SLT. Also, tests are not generated automatically but composed manually using existing operating systems and programs. This master's thesis introduces the foundation for the AutoGen project, which will tackle the aforementioned challenges in the future. This foundation contains a platform for experiments and a workflow to generate Systems-on-Chip (SoCs). A case study is conducted to show how on-chip sensors can be used in SLT applications to replace missing detailed technology information. For the case study, a “power devil” application has been developed that aims to keep the temperature of the Field Programmable Gate Array (FPGA) it runs on within a target range. The study shows how software and its parameters influence the extra-functional behavior of hardware.
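A minimal sketch of the "power devil" concept described above, assuming a crude first-order thermal model and a simulated sensor in place of the actual FPGA on-chip sensors and workload used in the thesis:

```python
# Minimal sketch of the "power devil" idea: keep a die temperature inside a
# target band by toggling a synthetic load (bang-bang / hysteresis control).
# The thermal model and sensor below are simulated stand-ins, not the actual
# FPGA on-chip sensors or workload from the thesis.

TARGET_LOW, TARGET_HIGH = 55.0, 60.0   # target temperature band in deg C
AMBIENT = 25.0

def simulate(steps: int = 200) -> None:
    temp = AMBIENT
    load_on = False
    for t in range(steps):
        # toggle the heat-generating load based on the measured temperature
        if temp < TARGET_LOW:
            load_on = True
        elif temp > TARGET_HIGH:
            load_on = False
        # crude first-order thermal model: heating when loaded, cooling otherwise
        heat = 2.0 if load_on else 0.0
        temp += heat - 0.03 * (temp - AMBIENT)
        if t % 20 == 0:
            print(f"t={t:3d}  load={'on ' if load_on else 'off'}  temp={temp:5.1f} C")

if __name__ == "__main__":
    simulate()
```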
  • Item (Open Access)
    Benchmarking the performance of portfolio optimization with QAOA
    (2022) Brandhofer, Sebastian; Braun, Daniel; Dehn, Vanessa; Hellstern, Gerhard; Hüls, Matthias; Ji, Yanjun; Polian, Ilia; Bhatia, Amandeep Singh; Wellens, Thomas
    We present a detailed study of portfolio optimization using different versions of the quantum approximate optimization algorithm (QAOA). For a given list of assets, the portfolio optimization problem is formulated as a quadratic binary optimization problem with a constraint on the number of assets contained in the portfolio. QAOA has been suggested as a possible candidate for solving this problem (and similar combinatorial optimization problems) more efficiently than classical computers in the case of a sufficiently large number of assets. However, the practical implementation of this algorithm requires a careful consideration of several technical issues, not all of which are discussed in the present literature. The present article intends to fill this gap and thereby provides the reader with a useful guide for applying QAOA to the portfolio optimization problem (and similar problems). In particular, we discuss several possible choices of the variational form and of different classical algorithms for finding the corresponding optimized parameters. With a view towards the application of QAOA on error-prone NISQ hardware, we also analyse the influence of statistical sampling errors (due to a finite number of shots) and of gate and readout errors (due to imperfect quantum hardware). Finally, we define a criterion for distinguishing between ‘easy’ and ‘hard’ instances of the portfolio optimization problem.
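For readers unfamiliar with the formulation, the following sketch builds the standard penalty-based QUBO for budget-constrained portfolio optimization that QAOA then minimizes; the return and covariance data are invented for illustration and do not correspond to the paper's benchmark instances.

```python
# Sketch of the standard QUBO formulation of portfolio optimization that QAOA
# minimizes: maximize mu^T x - q * x^T Sigma x subject to sum(x) = B, with the
# budget constraint turned into a quadratic penalty. The data below are made
# up for illustration; the paper's benchmark instances differ.
import numpy as np

mu = np.array([0.08, 0.12, 0.10, 0.07])          # expected returns (hypothetical)
sigma = np.array([[0.10, 0.02, 0.01, 0.00],      # covariance matrix (hypothetical)
                  [0.02, 0.12, 0.03, 0.01],
                  [0.01, 0.03, 0.09, 0.02],
                  [0.00, 0.01, 0.02, 0.08]])
q, budget, penalty = 0.5, 2, 10.0                # risk aversion, #assets, penalty weight

def cost(x: np.ndarray) -> float:
    """QUBO objective: risk - return + penalty for violating the budget."""
    return (q * x @ sigma @ x - mu @ x
            + penalty * (x.sum() - budget) ** 2)

# Brute-force reference solution over all 2^n bitstrings (only feasible for
# tiny n); QAOA samples approximate minimizers of the same cost function.
n = len(mu)
best = min((cost(np.array(bits)), bits) for bits in np.ndindex(*(2,) * n))
print("optimal selection:", best[1], "cost:", round(float(best[0]), 4))
```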
  • Item (Open Access)
    A muscle model for injury simulation
    (2023) Millard, Matthew; Kempter, Fabian; Fehr, Jörg; Stutzig, Norman; Siebert, Tobias
    Car accidents frequently cause neck injuries that are painful, expensive, and difficult to simulate. The movements that lead to neck injury include phases in which the neck muscles are actively lengthened. Actively lengthened muscle can develop large forces that greatly exceed the maximum isometric force. Although Hill-type models are often used to simulate human movement, they have no mechanism to develop large tensions during active lengthening. When used to simulate neck injury, a Hill model will underestimate the risk of injury to the muscles but may overestimate the risk of injury to the structures that the muscles protect. We have developed a musculotendon model that includes the viscoelasticity of attached crossbridges and has an active titin element. In this work we compare the proposed model to a Hill model by simulating the experiments of Leonard et al. [1], which feature extreme active lengthening.
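The following sketch, using common textbook curve shapes as assumptions, shows a basic Hill-type contractile element and why its eccentric force saturates; it is not the authors' crossbridge-plus-titin model.

```python
# Minimal Hill-type contractile-element sketch (not the authors' crossbridge +
# titin model): active force = a * f_L(l) * f_V(v) * F_max. The curve shapes
# below are common textbook approximations chosen for illustration only.
import math

F_MAX = 1000.0   # maximum isometric force in N (assumed value)

def f_length(l_norm: float) -> float:
    """Bell-shaped active force-length curve, peak at optimal fibre length."""
    return math.exp(-((l_norm - 1.0) ** 2) / 0.2)

def f_velocity(v_norm: float) -> float:
    """Force-velocity curve: Hill hyperbola when shortening (v<0), a bounded
    plateau when lengthening (v>0); this bound is why a Hill model cannot
    reproduce the very large forces seen in extreme active lengthening."""
    if v_norm < 0:                              # concentric (shortening)
        return (1.0 + v_norm) / (1.0 - 3.0 * v_norm)
    return 1.4 - 0.4 / (1.0 + 5.0 * v_norm)     # eccentric, saturates near 1.4

def active_force(activation: float, l_norm: float, v_norm: float) -> float:
    return activation * f_length(l_norm) * f_velocity(v_norm) * F_MAX

# Even at high lengthening speed this model caps out near 1.4 * F_max, whereas
# the abstract notes that actively lengthened muscle can greatly exceed F_max.
print(active_force(1.0, 1.05, 2.0))
```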
  • Item (Open Access)
    Stochastic neural networks: components, analysis, limitations
    (2022) Neugebauer, Florian; Polian, Ilia (Prof. Dr.)
    Stochastic computing (SC) promises an area- and power-efficient alternative to conventional binary implementations of many important arithmetic functions. SC achieves this by employing a stream-based number format called stochastic numbers (SNs), which enables bit-sequential computations, in contrast to conventional binary computations that are performed on entire words at once. An SN encodes a value probabilistically with equal weight for every bit in the stream. This encoding results in approximate computations, causing a trade-off between power consumption, area and computation accuracy. The prime example of efficient computation in SC is multiplication, which can be performed with only a single gate. SC is therefore an attractive alternative to conventional binary implementations in applications that contain a large number of basic arithmetic operations and are able to tolerate the approximate nature of SC. The most widely considered class of applications in this regard is neural networks (NNs), with convolutional neural networks (CNNs) as the prime target for SC. In recent years, steady advances have been made in the implementation of SC-based CNNs (SCNNs). At the same time, however, a number of challenges have been identified as well: SCNNs need to handle large amounts of data, which has to be converted from conventional binary format into SNs. This conversion is hardware-intensive and takes up a significant portion of a stochastic circuit's area, especially if the SNs have to be generated independently of each other. Furthermore, some commonly used functions in CNNs, such as max-pooling, have no exact corresponding SC implementation, which reduces the accuracy of SCNNs. The first part of this work proposes solutions to these challenges by introducing new stochastic components: a new stochastic number generator (SNG) that is able to generate a large number of SNs at the same time, and a stochastic maximum circuit that enables an accurate implementation of max-pooling operations in SCNNs. In addition, the first part of this work presents a detailed investigation of the behaviour of an SCNN and its components under timing errors. The error tolerance of SC is often quoted as one of its advantages, stemming from the fact that any single bit of an SN contributes only very little to its value. In contrast, bits in conventional binary formats have different weights and can contribute as much as 50% of a number's value. SC is therefore a candidate for extreme low-power systems, as it could potentially tolerate the timing errors that appear in such environments. While the error tolerance of SC image processing systems has been demonstrated before, a detailed investigation of SCNNs in this regard has been missing so far. It will be shown that SC is not error tolerant in general, but rather that SC components behave differently even if they implement the same function, and that the error tolerance of an SC system further depends on the error model. In the second part of this work, a theoretical analysis of the accuracy and limitations of SC systems is presented. An existing framework to analyse and manage the accuracy of combinational stochastic circuits is extended to cover sequential circuits. This framework enables a designer to predict the effect of small design changes on the accuracy of a circuit and to determine important parameters such as SN length without extensive simulations. It will further be shown that the functions that can be implemented in SC are limited: due to the probabilistic nature of SC, some arithmetic functions, including the max-pooling function in SCNNs, suffer from a small bias when implemented as a stochastic circuit.
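As an illustration of the SC basics referenced above, here is a short sketch of unipolar stochastic-number encoding and single-AND-gate multiplication; it uses plain pseudo-random bitstreams rather than the SNG design proposed in the thesis.

```python
# Sketch of stochastic-computing basics from the abstract: a unipolar
# stochastic number encodes a value in [0,1] as the probability of a 1 in a
# bitstream, and multiplication of independent streams reduces to a bitwise
# AND. Plain pseudo-random streams are used here, not the SNG from the thesis.
import random

def to_sn(value: float, length: int) -> list[int]:
    """Encode a value in [0,1] as a stochastic bitstream of the given length."""
    return [1 if random.random() < value else 0 for _ in range(length)]

def from_sn(stream: list[int]) -> float:
    """Decode a unipolar stochastic number: estimated value = fraction of 1s."""
    return sum(stream) / len(stream)

def sc_multiply(a: list[int], b: list[int]) -> list[int]:
    """Multiplication of two independent unipolar SNs is a single AND gate."""
    return [x & y for x, y in zip(a, b)]

random.seed(0)
for length in (64, 1024, 16384):
    a, b = to_sn(0.75, length), to_sn(0.5, length)
    prod = from_sn(sc_multiply(a, b))
    # longer streams trade latency/energy for accuracy (expected product 0.375)
    print(f"N={length:6d}  estimated 0.75*0.5 = {prod:.4f}")
```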
  • Item (Open Access)
    Scatter and beam hardening correction for high-resolution CT in near real-time based on a fast Monte Carlo photon transport model
    (2022) Alsaffar, Ammar; Simon, Sven (Prof. Dr.-Ing.)
    Computed tomography (CT) is a powerful non-destructive testing (NDT) technique. It provides insight into the interior of the scanned object and is widely used for industrial and medical applications. However, this technique suffers from severe quality-degrading artifacts. Among these, scatter and beam hardening (BH) cause severe quality degradation of the reconstructed CT images. Scatter results from a change in the direction, or in both the direction and the energy, of photons penetrating the object, while beam hardening results from the polychromatic nature of the X-ray source. When photons of different energies penetrate the object, low-energy photons are more easily absorbed than high-energy photons. This results in a hardening of the X-ray beam, which causes a non-linear relation between the propagation path length and the attenuation of the beam. These effects are the major source of the cupping and streak artifacts that severely degrade the quality of computed tomography imaging. The presence of cupping and streak artifacts reduces the contrast and the contrast-to-noise ratio of the image and causes distortion of the grey values. As a consequence, important analyses of the results of the computed tomography technique are affected; e.g., the detectability of voids and cracks is reduced by the loss of contrast, and dimensional measurements are impaired. Monte Carlo (MC) simulation is considered the most accurate approach for scatter estimation. However, the existing MC estimators are computationally expensive, especially for the considered high-resolution flat-panel CT. In this work, a multi-GPU photon forward projection model and an iterative scatter correction algorithm were implemented. The Monte Carlo model has been highly accelerated and extensively verified using several experimental and simulated examples. The implemented model describes the physics within the 1 keV to 1 MeV range using multiple controllable key parameters. Based on this model, scatter computation for a single projection can be completed within a few seconds under well-defined model parameters. Smoothing and interpolation are performed on the estimated scatter to accelerate the scatter calculation without compromising accuracy too much compared to measured, near scatter-free projection images. Combining the scatter estimation with filtered backprojection (FBP), scatter correction is performed effectively in an iterative manner. In order to evaluate the proposed MC model, extensive experiments have been conducted on simulated data and on a real-world high-resolution flat-panel CT. Compared to state-of-the-art MC simulators, the proposed MC model achieved a 15× acceleration on a single GPU in comparison to the GPU implementation of the Penelope simulator (MCGPU), utilizing several acceleration techniques, and a 202× speed-up on a multi-GPU system compared to the multi-threaded state-of-the-art EGSnrc MC simulator. Furthermore, it is shown that for high-resolution images, scatter correction with sufficient accuracy is accomplished within one to three iterations using FBP and the proposed fast MC photon transport model. Moreover, a fast and accurate BH correction method that requires no prior knowledge of the materials and corrects first- and higher-order BH artifacts has been implemented. In the first step, a wide sweep of the material is performed based on an experimentally measured look-up table to obtain the closest estimate of the material. Then the non-linearity effect of the BH is corrected by adding the difference between the estimated monochromatic and the polychromatic simulated projections of the segmented image. The estimated monochromatic projection is simulated by selecting the energy from the polychromatic spectrum which produces the lowest mean square error (MSE) with respect to the BH-corrupted projection from the scanner. The polychromatic projection, in turn, is accurately estimated using the least squares estimation (LSE) method by minimizing the difference between the experimental projection and a linear combination of simulated polychromatic projections using different spectra of different filtration. As a result, an accurate non-linearity correction term is derived that leads to an accurate BH correction result. To evaluate the proposed BH correction method, extensive experiments have been conducted on real-world CT data. Compared to the state-of-the-art empirical BH correction method, the experiments show that the proposed method can greatly reduce the BH artifacts without prior knowledge of the materials. In summary, the lack of fast and computationally efficient methods to correct the major artifacts in CT images, i.e., scatter and beam hardening, has motivated this work, in which efficient and fast algorithms have been implemented to correct these artifacts. The correction of these artifacts has led to better visualization of the CT images, a higher contrast-to-noise ratio, and improved contrast. Supported by multiple experimental examples, it is shown that the scatter-corrected images obtained with the proposed method resemble, within a reasonable time, the near artifact-free reference images acquired experimentally. On the other hand, the application of the proposed BH correction method after the correction of the scatter artifacts results in the complete removal of the remaining cupping and streak artifacts that were degrading the scatter-corrected images and improves the contrast-to-noise ratio (CNR) of the scatter-corrected images. Moreover, assessments of the correction quality of the CT images have been performed using the software Volume Graphics VGSTUDIO MAX. Better surface determination can be derived from the artifact-corrected images. In addition, enhancing the contrast by correcting these artifacts results in an improved detectability of voids and cracks in several concrete examples. This confirms the effectiveness of the artifact correction methods implemented in this work.
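A structural sketch of the iterative scatter-correction loop described above follows, with trivial placeholder functions standing in for the GPU Monte Carlo scatter model and the FBP reconstruction, which are not reproduced here.

```python
# Structural sketch of iterative scatter correction: estimate scatter from the
# current reconstruction, subtract it from the measured projections, and
# re-reconstruct with FBP. `mc_scatter_estimate` and `fbp_reconstruct` are
# trivial placeholders, not the GPU Monte Carlo model or FBP from the thesis.
import numpy as np

def fbp_reconstruct(projections: np.ndarray) -> np.ndarray:
    """Placeholder for filtered backprojection (here: pass-through)."""
    return projections.copy()

def mc_scatter_estimate(volume: np.ndarray) -> np.ndarray:
    """Placeholder for the fast MC photon-transport scatter estimate
    (here: a smooth, low-frequency fraction of the signal)."""
    return 0.2 * volume.mean() * np.ones_like(volume)

def scatter_correct(measured: np.ndarray, iterations: int = 3) -> np.ndarray:
    """Iterative scatter correction; one to three iterations sufficed in the thesis."""
    volume = fbp_reconstruct(measured)
    for _ in range(iterations):
        scatter = mc_scatter_estimate(volume)            # scatter for current estimate
        corrected_projections = measured - scatter       # remove estimated scatter
        volume = fbp_reconstruct(corrected_projections)  # update reconstruction
    return volume

measured = np.full((4, 4), 1.0) + 0.25   # toy "projection" with additive scatter
print(scatter_correct(measured).mean())
```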
  • Item (Open Access)
    Nontraditional design of dynamic logics using FDSOI for ultra-efficient computing
    (2023) Kumar, Shubham; Chatterjee, Swetaki; Dabhi, Chetan Kumar; Chauhan, Yogesh Singh; Amrouch, Hussam
  • Item (Open Access)
    A GPU-accelerated light-field super-resolution framework based on mixed noise model and weighted regularization
    (2022) Tran, Trung-Hieu; Sun, Kaicong; Simon, Sven
    Light-field (LF) super-resolution (SR) plays an essential role in alleviating the current technology challenge in the acquisition of a 4D LF, which assembles both high-density angular and spatial information. Due to the algorithmic complexity and data-intensive nature of LF images, LFSR demands a significant computational effort and results in long CPU processing times. This paper presents a GPU-accelerated computational framework for reconstructing high-resolution (HR) LF images under a mixed Gaussian-impulse noise condition. The main focus is on developing a high-performance approach considering both processing speed and reconstruction quality. From a statistical perspective, we derive a joint ℓ1-ℓ2 data fidelity term for penalizing the HR reconstruction error, taking into account the mixed noise situation. For regularization, we employ the weighted non-local total variation approach, which allows us to effectively realize an LF image prior through a proper weighting scheme. We show that the alternating direction method of multipliers (ADMM) algorithm can be used to reduce the computational complexity and results in high-performance parallel computation on the GPU platform. An extensive experiment is conducted on both a synthetic 4D LF dataset and a natural image dataset to validate the proposed SR model’s robustness and evaluate the accelerated optimizer’s performance. The experimental results show that our approach achieves better reconstruction quality under severe mixed-noise conditions compared to the state-of-the-art approaches. In addition, the proposed approach overcomes the limitation of previous work in handling large-scale SR tasks. While fitting within a single off-the-shelf GPU, the proposed accelerator provides an average speedup of 2.46× and 1.57× for ×2 and ×3 SR tasks, respectively. In addition, a speedup of 77× is achieved compared to CPU execution.
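To illustrate the ADMM splitting referenced above, the following sketch solves a toy ℓ1-regularized least-squares problem with the classic x-update / soft-threshold / dual-update steps; the paper's actual joint ℓ1-ℓ2 fidelity plus weighted non-local TV model is considerably more involved.

```python
# Sketch of the ADMM splitting idea on a toy LASSO problem
#   min_x 0.5*||A x - b||_2^2 + lam*||x||_1
# rather than the paper's full LF super-resolution model; it shows the
# x-update / soft-threshold z-update / dual-update structure that maps well
# onto GPU-parallel implementations.
import numpy as np

def soft_threshold(v: np.ndarray, k: float) -> np.ndarray:
    """Proximal operator of the l1 norm (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=100):
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    AtA_rhoI = A.T @ A + rho * np.eye(n)   # cached factor for every x-update
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(AtA_rhoI, Atb + rho * (z - u))  # quadratic subproblem
        z = soft_threshold(x + u, lam / rho)                # l1 proximal step
        u = u + x - z                                       # scaled dual update
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = np.zeros(10); x_true[[1, 4]] = [2.0, -1.5]          # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(30)
print(np.round(admm_lasso(A, b), 2))
```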
  • Item (Open Access)
    Modeling and investigating total ionizing dose impact on FeFET
    (2023) Sayed, Munazza; Ni, Kai; Amrouch, Hussam