Relational regression methods to speed up Monte-Carlo planning

dc.contributor.author: Böpple, Teresa
dc.date.accessioned: 2022-04-05T08:22:02Z
dc.date.available: 2022-04-05T08:22:02Z
dc.date.issued: 2017 [de]
dc.description.abstract: Monte-Carlo Tree Search is a planning algorithm that tries to find the best next action by running random simulations and using them to estimate the return. Its main advantages are that it requires very little domain knowledge, is easy to implement, and is applicable to many problems. This thesis shows how Monte-Carlo planning can be sped up by applying relational regression. A relational domain is given whose states consist of facts. To use regression on relational states, features have to be created that map a state to a vector that regression can work with. To this end, training data are created that contain states, possible actions, and an estimated return for each state-action pair. These data are generated by a standard Monte-Carlo planner. All facts that occur in any state of the training data are written into a factlist. With the help of that factlist, features are created. Every row of a feature vector stands for a fact: if the fact occurs in a state, the feature vector of that state contains a 1 in the corresponding row; otherwise it contains a 0. Other features also consider combinations of facts or actions. With the help of regression, these features are mapped to a real value that corresponds to the expected return of the state or state-action pair. This mapping is evaluated on a test dataset. The result is that, for a sufficiently large and accurate training dataset, the return calculated by regression is very close to the one calculated by the planner on the test data. Because of these promising results, the regression is integrated into a Monte-Carlo planner. For a well-chosen set of training data that contains a wide range of both terminal and dead-end states, the planner improves in several ways. First, on average the modified planner needs fewer steps to reach a terminal state than the original planner. Second, the modified planner reaches a terminal state more often than the original planner, because it gets into a dead-end state less often. This achieves the goal of the work and demonstrates that the planning process is sped up. [en]
dc.identifier.other: 1799350932
dc.identifier.uri: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-120738 [de]
dc.identifier.uri: http://elib.uni-stuttgart.de/handle/11682/12073
dc.identifier.uri: http://dx.doi.org/10.18419/opus-12056
dc.language.iso: en [de]
dc.rights: info:eu-repo/semantics/openAccess [de]
dc.subject.ddc: 004 [de]
dc.title: Relational regression methods to speed up Monte-Carlo planning [en]
dc.type: bachelorThesis [de]
ubs.fakultaet: Informatik, Elektrotechnik und Informationstechnik [de]
ubs.institut: Institut für Parallele und Verteilte Systeme [de]
ubs.publikation.seiten: 47 [de]
ubs.publikation.typ: Abschlussarbeit (Bachelor) [de]
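
The feature construction described in the abstract (one binary entry per fact in the factlist) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the encoding of a state as a set of fact strings, the blocks-world-style facts, the return values, and the gradient-descent least-squares fit, which merely stands in for whatever regression method the thesis actually uses. The combined fact and action features mentioned in the abstract are not covered.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Hypothetical encoding: a relational state is a set of ground facts (strings).
State = FrozenSet[str]


def build_factlist(states: Sequence[State]) -> List[str]:
    """Collect every fact that occurs in any training state, in a fixed order."""
    return sorted(set().union(*states))


def featurize(state: State, factlist: Sequence[str]) -> List[int]:
    """Binary vector: entry i is 1 if factlist[i] occurs in the state, else 0."""
    return [1 if fact in state else 0 for fact in factlist]


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float],
               lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Least-squares fit by batch gradient descent (illustrative regressor only)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Estimated return for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Made-up training data: three states with made-up estimated returns.
train_states = [
    frozenset({"on(a,b)", "clear(a)"}),
    frozenset({"on(b,a)", "clear(b)"}),
    frozenset({"clear(a)", "clear(b)"}),
]
train_returns = [1.0, 0.0, 0.5]

factlist = build_factlist(train_states)
X = [featurize(s, factlist) for s in train_states]
w, b = fit_linear(X, train_returns)
```

In a planner, `predict(featurize(state, factlist), w, b)` would supply the return estimate in place of additional random simulations; facts that never appeared in the training data are simply ignored by `featurize`.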

Files

Original bundle

Name: ausarbeitung.pdf
Size: 790.98 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.3 KB
Format: Item-specific license agreed upon to submission