Universität Stuttgart
Permanent URI for this community: https://elib.uni-stuttgart.de/handle/11682/1
69 results
Search Results
Item Open Access
Modeling of a multi-core microblaze system at RTL and TLM abstraction levels in systemC (2013)
Eissa, Karim
Transaction Level Modeling (TLM) has recently become a popular approach for modeling contemporary Systems-on-Chip (SoCs) at a higher abstraction level than Register Transfer Level (RTL). In this thesis, a multi-core system based on the Xilinx MicroBlaze microprocessor is modeled at the RTL and TLM abstraction levels in SystemC. Both models have cycle-accurate timing and are verified against the reference VHDL model using a mixed-language VHDL/SystemC simulation with ModelSim. Finally, performance measurements are carried out to evaluate the simulation speedup at the transaction level. The MicroBlaze processor model is based on a MicroBlaze Instruction Set Simulator (ISS) from SoCLib. A wrapper is therefore implemented to provide communication interfaces between the processor and the rest of the system, and to control the timing of the ISS operation so that the models are cycle accurate. Furthermore, a local memory module based on Block Random Access Memories (BRAMs) is modeled in order to simulate a complete system consisting of a processor and a local memory.

Item Open Access
Accelerated computation using runtime partial reconfiguration (2013)
Nayak, Naresh Ganesh
Runtime reconfigurable architectures, which integrate a hard processor core with a reconfigurable fabric on a single device, make it possible to accelerate a computation by means of hardware accelerators implemented in the reconfigurable fabric. Runtime partial reconfiguration provides the flexibility to change these hardware accelerators dynamically in order to adapt the computing capacity of the system. This thesis evaluates design paradigms that exploit partial reconfiguration to implement compute-intensive applications on such runtime reconfigurable architectures.
For this purpose, image processing applications are implemented on the Zynq-7000, a System on a Chip (SoC) from Xilinx Inc. that integrates an ARM Cortex-A9 with a reconfigurable fabric. The thesis studies different image processing applications to select suitable candidates that benefit from being implemented on this class of reconfigurable architectures using runtime partial reconfiguration. Different Intellectual Property (IP) cores for executing basic image operations are generated using high-level synthesis. A software-based scheduler, executed in the Linux environment running on the ARM core, implements the image processing application by loading the appropriate IP cores into the reconfigurable fabric. The implementation is evaluated to measure the application speedup, the resource savings, the power savings and the delay caused by partial reconfiguration. The results suggest that using partial reconfiguration to implement an application saves FPGA resources. The extent of the resource savings depends on the granularity of the operations into which the application is decomposed. The thesis also establishes that runtime partial reconfiguration can be used to accelerate computations in reconfigurable architectures with a processor core, such as the Zynq-7000 platform. The achieved computational speedup depends on factors such as the number of hardware accelerators used for the computation and the reconfiguration schedule. The thesis also highlights the power savings that may be achieved by executing computations in the reconfigurable fabric instead of the processor core.

Item Open Access
A process insight repository supporting process optimization (2012)
Vetlugin, Andrey
Existing solutions for the analysis and optimization of manufacturing processes, such as online analytical processing or statistical calculations, have shortcomings that limit continuous process improvement.
In particular, they lack means of storing and integrating the results of analysis. As a result, valuable information that could be used for process optimization is used only once and then discarded. The goal of the Advanced Manufacturing Analytics (AdMA) research project is to design an integrated platform for data-driven analysis and optimization of manufacturing processes using analytical techniques, especially data mining, in order to carry out continuous improvement of production. Achieving this goal relies on integrating the data related to the manufacturing processes, especially from Manufacturing Execution Systems (MES), with other operating data, e.g. from Enterprise Resource Planning (ERP) systems. This work is based on the AdMA platform described in [1] and the Deep Business Process Optimization platform described in [2]. It focuses on the conceptual development of the Process Insight Repository, which is a part of the AdMA platform. The Process Insight Repository is aimed at storing manufacturing process data and the insights associated with it. As part of the AdMA platform, it is geared toward storing the insights obtained by applying data mining techniques to manufacturing process data, so that the newly extracted knowledge can be stored along with the process data itself. Chapter 2 describes the conceptual schema of the Process Insight Repository. The conceptual schema defines what data must be stored in the Process Insight Repository and how the different parts of this data are interconnected. Chapter 3 reviews technologies that can be used to implement the Process Insight Repository. This includes technologies for storing manufacturing process data, free-form knowledge and data-mining-related data. Chapter 4 describes the details of the prototype implementation of the Process Insight Repository.
The result of this work is the conceptual schema of the Process Insight Repository and a prototype implementation as a proof of concept.

Item Open Access
Entwicklung analysebasierter Optimierungsmuster zur Verbesserung von Fertigungsprozessen [Development of analysis-based optimization patterns for improving manufacturing processes] (2012)
Dapperheld, Moritz
Production workflows in industrial companies must, among other things, be cost- and time-efficient, transparent and flexible. Precisely calibrated manufacturing processes are therefore a foundation of a company's success. Improving existing manufacturing processes is a critical approach to achieving this. A large number of optimization concepts exist in the production domain that have already proven themselves through successful application in practice. However, the optimization often considers only the areas selected for improvement, without any interaction with the related information flows. The Advanced Manufacturing Analytics (AdMA) research project provides an approach for analyzing and optimizing manufacturing processes by drawing on a combination of execution data and data from operational systems. The optimization is carried out on the basis of optimization patterns. The goal of this thesis is to evaluate existing optimization methods with regard to their use as optimization patterns. The approaches are consolidated in a framework. To this end, best practices from the production context, workflow-driven approaches and dynamic procedures are considered. The evaluation shows possible applications for approaches from all three areas, but also the criteria that make an implementation costly or impossible. A concept for implementing the proactive optimization approach is developed. The pattern adapts the attributes of process instances by generating a recommendation for action. The adaptation is based on building and evaluating decision trees. Following the concept, the prototype implementation is described.

Item Open Access
Development of procedures and evaluation strategies for novel field-effect transistor sensors (2012)
Parker, Michael Lee
In order to evaluate new types of sensors based on field-effect transistor technology, a cost-effective measurement and control system is developed. Because some new types of transistor-based sensors are particularly prone to drift and noise, the measurement system is built around evaluating the effect of a biasing technique known as switched biasing, which has been shown to reduce drift under certain configurations. The result is an implementation of software and hardware that is able to control a transistor with switched biasing, explore drift-reducing switched biasing configurations, and measure the transistor's performance with relatively high precision. Pre-filtering of the measured data, coupled with fast actuation of an analog-to-digital converter, is implemented on an FPGA in the form of a rate-adjustable CIC decimation filter, which increases the signal-to-noise ratio and reduces the required data-transfer rate. The measurement system is controlled internally by a microcontroller and is connected through a USB interface to a higher-level system, such as a computer running MATLAB, which allows multiple measurement systems to be operated in parallel. Systematic errors related to limitations of the measurement hardware, such as offset, temperature and drift, are evaluated and compensated for through calibration.

Item Open Access
The GRACE event calendar (2012)
Vishwakarma, Bramha Dutt
The GRACE mission is a joint venture of NASA and GFZ. The mission was launched to provide estimates of global high-resolution models of the Earth’s gravity field with unprecedented accuracy.
The study of the time variability of the Earth’s gravity field is very helpful in climate and Earth sciences. Much work has been done to demonstrate the effect of many natural phenomena on gravity. Gravity estimates from GRACE are used to estimate mass redistribution at continental scale, so hydrology, seismology and glaciology are potential areas where GRACE can be useful. This research work focuses on identifying hydrological events such as floods and droughts, seismic events such as earthquakes and volcanic activity, and glacier melting in the GRACE time series. The work includes the development of a strategy for analysing these events, keeping in mind their behaviour and the limitations of GRACE in spatial resolution and sensitivity. Further, we produce an event calendar for such events, stating whether the gravity changes they cause are visible to GRACE. Calendars are generated for hydrological events, with floods and droughts treated separately, and also for earthquakes. For the remaining phenomena we have not generated calendars, since these events are very few in number. This work is a qualitative analysis: it establishes whether or not the GRACE signal captures these events. Hydrological events are detected by searching for outliers in the GRACE-observed time series. Large floods such as the 2009 Amazon floods can be seen when the whole catchment is taken, but small floods affecting a smaller region, such as the Sao Paulo flood, are not visible in the catchment time series, so a time series has to be generated for a selected area. Factors such as the duration of floods and droughts are very important when we want to observe them with GRACE. The visibility of earthquakes depends on the range-rate amplitude and on the quality of ΔC20; we discuss these aspects while analysing earthquakes that occurred in the last decade using GRACE.
We give possible explanations for the events that are not visible, and those that are visible have helped in developing a methodology for the analysis of a particular event. The volcanic activity in Caldera and Bolivia is pushing the earth upward, so some signal can be expected, but the spatial extent of these areas is small; with the caldera area being larger than that of Bolivia, only the caldera showed a trend. We also carried out a trend analysis for two Asian glaciers and a part of Greenland to observe the melting of these ice masses. The work finally produces a series of events that we were able to observe with GRACE, together with a methodology suitable for the analysis of an event.

Item Open Access
Hybrid parallel computing beyond MPI & OpenMP - introducing PGAS & StarSs (2011)
Sethi, Muhammad Wahaj
High-performance architectures are becoming more and more complex over time. These large-scale, heterogeneous architectures and multi-core systems are difficult to program. New programming models are required to make parallelism easier to express while keeping developer productivity high. Partitioned Global Address Space (PGAS) languages such as UPC appeared in order to improve developer productivity on distributed memory systems. UPC provides a simpler, shared-memory-like model with user control over data layout. However, it is the developer’s responsibility to take care of data locality by using appropriate data layouts. The SMPSs/StarSs programming model tries to simplify parallel programming on multi-core architectures. It offers task-level parallelism, where the dependencies among tasks are determined at run time. In addition, the runtime takes care of data locality while scheduling tasks. This provides a twofold improvement in productivity: first, it saves developer time through automatic dependency detection instead of hard-coded dependencies; second, it saves cache optimization time, because the runtime takes care of data locality.
The purpose of this thesis is to combine a PGAS programming model, UPC, across different nodes with the shared-memory, task-based parallelization model StarSs in order to take advantage of multi-core systems, and to contrast this approach with the legacy combination of MPI and OpenMP. Both performance and programmability are considered in the evaluation. The combination UPC + SMPSs results in approximately the same execution time as MPI and OpenMP. The current lack of features such as multi-dimensional data distribution or virtual topologies in UPC makes the hybrid UPC + SMPSs/StarSs programming model less programmable than MPI + OpenMP for the application studied in this thesis.

Item Open Access
Memory-efficient lossless video compression using temporal extended JPEG-LS and on-line compression (2011)
Chanda, Debasish
The use of temporal predictors in lossless video coders plays a significant role in compression gain, but comes at the cost of a significant memory requirement, since this approach requires storing at least one frame in a buffer for residue calculation. This work proposes an improvement to the standard JPEG-LS-based lossless video coding algorithm that requires a very small amount of memory compared to the regular approach while keeping the computational complexity low. To obtain higher compression, a combination of spatial and temporal predictor models is used, where the appropriate mode is selected adaptively on a per-pixel basis. Using only one reference frame, the context-based temporal coder performs its mode selection and prediction error calculation with already reconstructed pixels. This eliminates the overhead of transmitting the coding mode to the decoder. The storage space needed for the single reference frame is further reduced by applying on-line lossy compression to that frame. Relevant pixels from the stored reference frame are obtained by partial on-the-fly decompression.
The combination of temporally extended context-based prediction and on-line compression achieves a significant gain in compression ratio compared to standard frame-by-frame JPEG-LS video coding while keeping the memory requirement low, making it usable as a lightweight lossless video coder for embedded systems.

Item Open Access
Development of an error detection and recovery technique for a SPARC V8 processor in FPGA technology (2011)
Boktor, Andrew
Field-Programmable Gate Arrays (FPGAs) have found widespread use in many application areas, including safety- and mission-critical systems. More and more manufacturers are choosing to implement designs on FPGAs. However, SRAM-based FPGAs have proven to be much more prone to Single Event Upsets (SEUs) than traditional Application-Specific Integrated Circuit (ASIC) designs. Moreover, SEUs affect FPGAs more severely than ASICs. Techniques that provide fault tolerance for SRAM-based FPGAs are therefore essential to maintain their advantages over other technologies. This thesis presents a fault-tolerance technique for pipeline architectures in FPGA technology. It provides fault tolerance against SEUs in the design and is able to detect faults in the FPGA configuration. It also proposes an additional mechanism that detects all SEUs independent of their location. Pipeline operation can be resumed with known techniques of partial reconfiguration. Both designs occupy a much smaller area than known techniques such as TMR in combination with scrubbing. They introduce no additional time penalty in fault-free operation. Fault injection and simulation were used to validate the design and calculate the fault coverage.

Item Open Access
Analysis of cache usability on modern real-time systems (2013)
Almheidat, Ahmad Nuraldin Faleh
Cache memories are used in microprocessors to close the speed gap between the processor and the main memory.
Caches can minimize the memory access time by keeping a copy of the highly demanded data closer to the processor. As a result, the overall program execution time is reduced. In safety-critical real-time systems, a worst-case analysis is required, and therefore the cache memories play an essential role in the estimation of the application's worst-case execution time. A simulation tool for the cache structure was developed to provide estimated measurements for both cache predictability and the worst-case memory access time based on the used architectural model. This may help to draw some conclusions about the actual cache operation. The simulation supports several modern uni-core and multi-core architectures, including some used in real-time systems. It also allows configuring different cache structures and hierarchies. The cache architecture, configuration and memory accesses from a simulated running application are specified by the user via an input file. The simulation provides a list of traces for every access. The cache predictability can be formulated as hit and miss rates. At the same time, the traces can be used to estimate total memory access time.
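
The simulation flow described in this abstract (a user-specified cache configuration and list of memory accesses go in; a per-access trace, hit and miss rates, and an estimated total access time come out) can be illustrated with a minimal sketch. This is not the tool developed in the thesis: the set-associative organization, the LRU replacement policy, the names `CacheSim` and `simulate`, and the latencies of 1 and 100 cycles are all assumptions chosen purely for illustration.

```python
from collections import OrderedDict


class CacheSim:
    """Hypothetical single-level, set-associative cache with LRU replacement."""

    def __init__(self, num_sets, ways, line_size):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One ordered dict per set; key order tracks recency (oldest first).
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        block = address // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = None
        return False


def simulate(addresses, cache, hit_cycles=1, miss_cycles=100):
    """Replay the accesses; latencies are assumed values, not measured ones."""
    trace = [(addr, cache.access(addr)) for addr in addresses]
    hits = sum(1 for _, hit in trace if hit)
    misses = len(trace) - hits
    total = len(trace)
    return {
        "trace": trace,
        "hit_rate": hits / total if total else 0.0,
        "miss_rate": misses / total if total else 0.0,
        "total_cycles": hits * hit_cycles + misses * miss_cycles,
    }


if __name__ == "__main__":
    # Direct-mapped cache: 2 sets, 1 way, 16-byte lines.
    result = simulate([0, 4, 32, 0, 16], CacheSim(num_sets=2, ways=1, line_size=16))
    for addr, hit in result["trace"]:
        print(f"addr={addr:#06x} {'hit' if hit else 'miss'}")
    print(result["hit_rate"], result["total_cycles"])
```

With the direct-mapped configuration in the usage example, addresses 0 and 32 map to the same set and evict each other, so the sequence produces one hit and four misses under these assumptions. A worst-case analysis would need bounds that hold for every possible execution, rather than the single replayed trace this sketch measures.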