Scalable Robust Graph Embedding with Spark

File version

Version of Record (VoR)

Author(s)
Duong, Chi Thang
Hoang, Trung Dung
Yin, Hongzhi
Weidlich, Matthias
Nguyen, Quoc Viet Hung
Aberer, Karl
Date
2022
Location

Sydney, Australia

Abstract

Graph embedding aims at learning a vector-based representation of vertices that incorporates the structure of the graph. This representation then enables inference of graph properties. Existing graph embedding techniques, however, do not scale well to large graphs. While several techniques to scale graph embedding using compute clusters have been proposed, they require continuous communication between the compute nodes and cannot handle node failure. We therefore propose a framework for scalable and robust graph embedding based on the MapReduce model, which can distribute any existing embedding technique. Our method splits a graph into subgraphs to learn their embeddings in isolation and subsequently reconciles the embedding spaces derived for the subgraphs. We realize this idea through a novel distributed graph decomposition algorithm. In addition, we show how to implement our framework in Spark to enable efficient learning of effective embeddings. Experimental results illustrate that our approach scales well, while largely maintaining the embedding quality.
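
The split, embed-in-isolation, and reconcile flow described in the abstract can be illustrated with a minimal plain-Python sketch. This is not the authors' algorithm and does not use Spark: this record does not describe the paper's decomposition or reconciliation steps, so everything here is an assumption made for illustration. The helper names (`split_with_anchors`, `toy_embed`, `reconcile`, `embed_graph`) are hypothetical, the "embedding" is a toy stand-in for a real technique, and reconciliation is a simple translation estimated from anchor vertices shared by all subgraphs.

```python
from statistics import fmean

def split_with_anchors(vertices, k, anchors):
    """Round-robin the non-anchor vertices into k parts; every part also
    receives the shared anchor vertices (an assumption of this sketch)."""
    rest = [v for v in vertices if v not in anchors]
    return [sorted(set(rest[i::k]) | set(anchors)) for i in range(k)]

def toy_embed(part, part_id):
    """Hypothetical stand-in for any embedding technique run on one subgraph.
    Each independent run lands in its own coordinate frame, which is
    simulated here by a per-part offset."""
    offset = 10.0 * part_id
    return {v: (float(v) + offset, 2.0 * v - offset) for v in part}

def reconcile(reference, other, anchors):
    """Translate `other` into the frame of `reference`, using the mean
    per-dimension difference observed on the shared anchors."""
    dims = range(len(next(iter(reference.values()))))
    shift = [fmean(reference[a][d] - other[a][d] for a in anchors) for d in dims]
    return {v: tuple(x + s for x, s in zip(vec, shift)) for v, vec in other.items()}

def embed_graph(vertices, k, anchors):
    parts = split_with_anchors(vertices, k, anchors)
    # "Map" step: each subgraph is embedded without talking to the others.
    local = [toy_embed(part, i) for i, part in enumerate(parts)]
    # "Reduce" step: align every local space to the first one, then merge.
    merged = dict(local[0])
    for emb in local[1:]:
        merged.update(reconcile(local[0], emb, anchors))
    return merged

embedding = embed_graph(list(range(12)), k=3, anchors=[0, 1])
```

Because each subgraph is processed without communication, a failed worker in such a scheme would only need to recompute its own part. A realistic reconciliation would likely need more than a translation (for example, a rotation as well), which this sketch leaves out.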

Conference Title

Proceedings of the VLDB Endowment

Volume

15

Issue

4

Rights Statement

© 2021 Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment. This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of this license. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org.

Subject

Data management and data science

Citation

Duong, CT; Hoang, TD; Yin, H; Weidlich, M; Nguyen, QVH; Aberer, K, Scalable Robust Graph Embedding with Spark, Proceedings of the VLDB Endowment, 2022, 15 (4), pp. 914-922