Graph partitioning and scheduling for distributed dataflow computation

dc.contributor.authorLaich, Larissa
dc.date.accessioned2017-10-23T14:45:56Z
dc.date.available2017-10-23T14:45:56Z
dc.date.issued2017de
dc.description.abstractDuring the last years, the amount of data which can be represented and processed as graph structured data has massively increased. To process these large data sets, graph processing systems have been developed which distribute and partition a graph among multiple machines. Due to an increase in processing power and data collection, machine learning and especially neural networks have become very popular. Consequently, machine learning systems like TensorFlow have emerged. Machine learning models can be represented as dataflow graphs and often take days to train as the dataflow graph is executed thousands of times. Graph partitioning determines how the graph is divided and which node is placed on which device. Scheduling decides which node should be computed next during execution. Smart partitioning and scheduling can drastically reduce the total execution time. Most existing solutions do not consider several important constraints like memory limitations, device or colocation constraints which can be directly derived from the machine learning library TensorFlow. This thesis presents and evaluates different partitioning and scheduling strategies meeting the constraints required for a realistic environment. One of these developed partitioning algorithms is based on a heuristic function considering execution time, memory and traffic and tries to map time-critical nodes on fast devices. This partitioning algorithm performed very well in combination with a scheduling strategy which schedules the executable node first whose upwards path takes longest to compute. On the evaluated graphs extracted from TensorFlow, this strategy was up to 75 % and at least 45% better in terms of graph execution time than the slightly adapted popular HEFT algorithm which is a common benchmark. In combination with the aforementioned scheduling strategy a partitioning which aims to assign the critical path nodes to the fastest device showed equally promising results.en
dc.identifier.other495580511
dc.identifier.urihttp://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-92797de
dc.identifier.urihttp://elib.uni-stuttgart.de/handle/11682/9279
dc.identifier.urihttp://dx.doi.org/10.18419/opus-9262
dc.language.isoende
dc.rightsinfo:eu-repo/semantics/openAccessde
dc.subject.ddc004de
dc.titleGraph partitioning and scheduling for distributed dataflow computationen
dc.typebachelorThesisde
ubs.fakultaetInformatik, Elektrotechnik und Informationstechnikde
ubs.institutInstitut für Parallele und Verteilte Systemede
ubs.publikation.seiten69de
ubs.publikation.typAbschlussarbeit (Bachelor)de

Files

Original bundle

Now showing 1 - 1 of 1
Thumbnail Image
Name:
Graph Partitioning and Scheduling for Distributed Dataflow Computation.pdf
Size:
8.12 MB
Format:
Adobe Portable Document Format
Description:

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
3.39 KB
Format:
Item-specific license agreed upon to submission
Description: