Rule Learning in Knowledge Graphs
Primary Supervisor
Wang, Kewen
Other Supervisors
Wang, Zhe
Abstract
With recent advancements in knowledge extraction and knowledge management systems, an enormous number of knowledge bases have been constructed, such as YAGO and Wikidata. These automatically built knowledge bases, which contain millions of entities and their relations, are stored in graph-based schemas and are thus usually referred to as knowledge graphs (KGs). Since KGs are built from the limited data available, they are far from complete. Nevertheless, learning frequent patterns in the form of logical rules from these incomplete KGs has two main advantages. First, by applying the learned rules, we can infer new facts and so complete the KGs. Second, the rules are stand-alone knowledge that expresses valuable insight about the data. However, learning rules from KGs in real-world scenarios poses several challenges. First, due to the vast size of real-world KGs, developing a rule learning method is challenging. In fact, existing methods are not scalable for learning first-order rules, even though various optimisation strategies are used, such as sampling and language bias (i.e., restrictions on the form of rules). Second, applying the learned rules to a vast KG and inferring new facts is another difficult issue. Learned rules usually contain a lot of noise, and adding new facts can cause inconsistencies in the KG. Third, it is useful but non-trivial to extend an existing rule learning method to the case of stream KGs. Fourth, in many data repositories, the facts are augmented with time stamps. In this case, we face a stream of data (KGs). Considering time as a new dimension of the data poses some challenges to the rule learning process. It would be useful to construct a time-sensitive model from the stream of data and apply the obtained model to stream KGs. Last, the density of information in a KG varies. Although a KG is vast, it contains only a limited amount of information for some relations. Consequently, that part of the KG is sparse.
Learning a set of accurate and informative rules for the sparse part of a KG is challenging due to the lack of sufficient training data. In this thesis, we investigate these research problems and present our methods for rule learning in various scenarios. We have first developed a new approach, named Rule Learning via Learning Representation (RLvLR), for learning rules from KGs by using embedding techniques from representation learning together with a new sampling method. RLvLR learns first-order rules from vast KGs by exploring the embedding space. Thanks to a novel sampling method, it can handle some large KGs that existing rule learners cannot handle efficiently. To improve the performance of RLvLR on sparse data, we propose a transfer learning method for rule learning, Transfer Rule Learner (TRL). Based on a similarity characterised by the embedding representation, our method is able to select the most relevant KGs and rules to transfer from a pool of KGs whose rules have already been obtained. We have also adapted RLvLR to handle stream KGs instead of static KGs, and developed a system called StreamLearner for learning rules from stream KGs. These proposed methods can only learn so-called closed path rules, which form a proper subset of Horn rules. Thus, we have also developed a transfer rule learner (T-LPAD) that learns the structure of logic programs with annotated disjunctions. T-LPAD employs transfer learning to explore the space of rule structures more efficiently. Various experiments have been conducted to test and validate the proposed methods. Our experimental results show that our methods outperform state-of-the-art methods in many ways.
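As an illustration of what applying a learned rule to complete a KG involves, the following is a minimal sketch and not the RLvLR implementation. It assumes a closed path rule of length two in the commonly used form r1(x, z) ∧ r2(z, y) → head(x, y), evaluated with a plain support count and standard confidence. The toy facts, the relation names and the helper functions `body_pairs` and `evaluate_rule` are all hypothetical.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, relation, object) triples.
# All entities and relations here are made up for illustration.
FACTS = {
    ("anna", "livesIn", "brisbane"),
    ("ben", "livesIn", "sydney"),
    ("carl", "livesIn", "perth"),
    ("brisbane", "cityOf", "australia"),
    ("sydney", "cityOf", "australia"),
    ("perth", "cityOf", "australia"),
    ("anna", "residentOf", "australia"),
    ("ben", "residentOf", "australia"),
}


def body_pairs(facts, r1, r2):
    """Return all (x, y) such that r1(x, z) and r2(z, y) hold for some z."""
    r2_objects = defaultdict(set)
    for s, r, o in facts:
        if r == r2:
            r2_objects[s].add(o)
    return {(x, y) for x, r, z in facts if r == r1 for y in r2_objects[z]}


def evaluate_rule(facts, r1, r2, head):
    """Score the assumed rule r1(x, z) AND r2(z, y) -> head(x, y).

    support:    number of body pairs whose head fact is already in the KG
    confidence: support divided by the number of body pairs
    new_facts:  head facts the rule predicts that the KG does not contain
    """
    pairs = body_pairs(facts, r1, r2)
    support = sum((x, head, y) in facts for x, y in pairs)
    confidence = support / len(pairs) if pairs else 0.0
    new_facts = {(x, head, y) for x, y in pairs} - facts
    return support, confidence, new_facts


support, confidence, new_facts = evaluate_rule(
    FACTS, "livesIn", "cityOf", "residentOf"
)
print(support, round(confidence, 2), sorted(new_facts))
# prints: 2 0.67 [('carl', 'residentOf', 'australia')]
```

In this toy KG the rule is satisfied by two of its three body instantiations, and the third yields one candidate fact that is missing from the graph. A confidence below 1 also shows why inferred facts need to be filtered before being added: on an incomplete KG, a missing head fact may be either a gap in the data or a wrong prediction.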
Thesis Type
Thesis (PhD Doctorate)
Degree Program
Doctor of Philosophy (PhD)
School
School of Info & Comm Tech
Rights Statement
The author owns the copyright in this thesis, unless stated otherwise.
Subject
Knowledge graphs (KGs)
Learned rules
Stand-alone knowledge
Rule learning
Learning representation