Scalable Subgraph Mining: Methods and Applications
Author(s)
Primary Supervisor
Nguyen, Quoc Viet Hung
Jo, Jun Hyung
Other Supervisors
Rozenberg, Liat
Abstract
Graphs are a universal language for modeling complex data. Subgraph mining, a fundamental task in graph analysis, aims to discover meaningful and interesting patterns in large-scale graph data. It has applications across diverse domains, including social network analysis, social science, and bioinformatics. Traditional subgraph mining methods commonly follow a two-step process of candidate generation and pattern verification. However, the computational cost of these steps becomes prohibitive on large and complex graph datasets, leading to impractical runtimes. The primary objective of this thesis is to propose scalable and effective methods for subgraph mining, with a specific focus on three widely applicable tasks: anomalous subgraph mining, frequent subgraph mining in multiple graphs, and frequent subgraph mining in a single graph. To achieve this goal, we propose approaches that transform the original graph into a lower-dimensional representation that captures the complex higher-order relationships within the data graph. Consequently, the search space in both the candidate generation and pattern verification steps is substantially reduced, because these steps are performed in the numerical space of the new representations rather than in the combinatorial graph space, which grows exponentially. Because learning a single universal representation for all three types of patterns is inherently challenging, we employ a divide-and-conquer strategy: we break the overall objective into three smaller, attainable research goals, each targeting an effective model for learning representations that facilitate the mining of one specific type of subgraph pattern.
Thorough experiments show that our proposed algorithms substantially outperform existing techniques and scale to larger patterns that previous solutions were unable to handle. The findings from these experiments have practical applications across diverse domains.
Thesis Type
Thesis (PhD Doctorate)
Degree Program
Doctor of Philosophy (PhD)
School
School of Info & Comm Tech
Rights Statement
The author owns the copyright in this thesis, unless stated otherwise.
Subject
subgraph mining
graph representation learning
subgraph patterns