Generating random knowledge graphs from rules

dc.contributor.authorGlaser, Gabriel Timon
dc.date.accessioned2025-01-07T14:04:58Z
dc.date.available2025-01-07T14:04:58Z
dc.date.issued2024de
dc.description.abstractA knowledge graph is a datastructure that is capable of storing knowledge. Besides that, there are several methods that use knowledge graphs to derive more information. These methods need to be validated with example knowledge graphs. However, real data might not be available or not contain desired properties. Thus, there are use cases that benefit from the generation of synthetic knowledge graphs. To define a synthetic knowledge graph, there is the need of a characterization that expresses how the synthetic data should look. In this thesis, I use Horn clauses for this characterization because of their good balance of expressiveness and complexity, their use in the field of rule mining, and their base role in the logical language Datalog. As clauses are usually not represented perfectly in real data, the goal of this thesis is to generate a knowledge graph that does not perfectly fulfil given Horn clauses, but in a desired degree of fulfillment. During the thesis, I developed and implemented two modifiable algorithms to generate knowledge graphs. On the one hand, I adapted the general hill climbing technique to generate knowledge graphs. On the other hand, I implemented a greedy algorithm which orders a given set of Horn clauses using logical subsumptions between their bodies, and then add edges to fulfil one Horn clause after the other, in the computed order. Both algorithms aim to fulfil the goal of this thesis by generating synthetic knowledge graphs according to given Horn clauses, each with a degree of fulfilment. The degree of fulfilment of any Horn clauses is characterized by body support, the number of times the premise of the Horn clauses is fulfilled, and support, the number of times the premise and conclusion of the Horn clauses are fulfilled. Additionally, there is the confidence which is the fraction of support and body support, i.e., the percentage of cases the Horn clause is fulfilled. All code is published such that anyone can try it. During the evaluation, random sets of Horn clauses were produced and the implementations generated corresponding knowledge graphs. Generated knowledge graphs were compared by considering the difference between the expected and the actual degree of fulfilment for each Horn clause. The result is that generation variant hill climbing with the initial state set to the result of the greedy algorithm with rule order based on subsumption yields the best results. Also, the difficulty of generating a good knowledge graph increases along with the overlapping degree of the input set of Horn clauses. Note that the overlapping degree reflects how many relation names occur in how many Horn clauses of the set. Lastly, the state-of-the-art mining tool AMIE found many Horn clauses in the generated graphs which were not intended by the set given to the generator.en
dc.identifier.other1916500862
dc.identifier.urihttp://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-154867de
dc.identifier.urihttp://elib.uni-stuttgart.de/handle/11682/15486
dc.identifier.urihttps://doi.org/10.18419/opus-15467
dc.language.isoende
dc.rightsinfo:eu-repo/semantics/openAccessde
dc.subject.ddc004de
dc.titleGenerating random knowledge graphs from rulesen
dc.typebachelorThesisde
ubs.fakultaetInformatik, Elektrotechnik und Informationstechnikde
ubs.institutInstitut für Künstliche Intelligenzde
ubs.publikation.seiten58de
ubs.publikation.typAbschlussarbeit (Bachelor)de

Files

Original bundle

Now showing 1 - 1 of 1
Thumbnail Image
Name:
GabrielGlaser_BSc_Generating Random Knowledge Graphs from Rules.pdf
Size:
828.88 KB
Format:
Adobe Portable Document Format
Description:

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
3.3 KB
Format:
Item-specific license agreed upon to submission
Description: