GLORIA

GEOMAR Library Ocean Research Information Access

Filter
  • Xiao, Xiaokui  (5)
  • 1
    Online Resource
    Association for Computing Machinery (ACM) ; 2022
    In: ACM SIGMOD Record, Association for Computing Machinery (ACM), Vol. 51, No. 1 (2022-05-31), p. 42-49
    Abstract: Given a graph G where each node is associated with a set of attributes, attributed network embedding (ANE) maps each node v ∈ G to a compact vector X_v, which can be used in downstream machine learning tasks in a variety of applications. Existing ANE solutions do not scale to massive graphs due to prohibitive computation costs or generation of low-quality embeddings. This paper proposes PANE, an effective and scalable approach to ANE computation for massive graphs in a single server that achieves state-of-the-art result quality on multiple benchmark datasets for two common prediction tasks: link prediction and node classification. Under the hood, PANE takes inspiration from well-established data management techniques to scale up ANE in a single server. Specifically, it exploits a carefully formulated problem based on a novel random walk model, a highly efficient solver, and non-trivial parallelization by utilizing modern multi-core CPUs. Extensive experiments demonstrate that PANE consistently outperforms all existing methods in terms of result quality, while being orders of magnitude faster.
    Type of Medium: Online Resource
    ISSN: 0163-5808
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2022
    ZDB-ID: 243829-X
    ZDB-ID: 2051432-3
  • 2
    Online Resource
    Association for Computing Machinery (ACM) ; 2020
    In: Proceedings of the VLDB Endowment, Association for Computing Machinery (ACM), Vol. 13, No. 5 (2020-01), p. 670-683
    Abstract: Given an input graph G and a node v ∈ G, homogeneous network embedding (HNE) maps the graph structure in the vicinity of v to a compact, fixed-dimensional feature vector. This paper focuses on HNE for massive graphs, e.g., with billions of edges. On this scale, most existing approaches fail, as they incur either prohibitively high costs, or severely compromised result utility. Our proposed solution, called Node-Reweighted PageRank (NRP), is based on a classic idea of deriving embedding vectors from pairwise personalized PageRank (PPR) values. Our contributions are twofold: first, we design a simple and efficient baseline HNE method based on PPR that is capable of handling billion-edge graphs on commodity hardware; second and more importantly, we identify an inherent drawback of vanilla PPR, and address it in our main proposal NRP. Specifically, PPR was designed for a very different purpose, i.e., ranking nodes in G based on their relative importance from a source node's perspective. In contrast, HNE aims to build node embeddings considering the whole graph. Consequently, node embeddings derived directly from PPR are of suboptimal utility. The proposed NRP approach overcomes the above deficiency through an effective and efficient node reweighting algorithm, which augments PPR values with node degree information, and iteratively adjusts embedding vectors accordingly. Overall, NRP takes O(m log n) time and O(m) space to compute all node embeddings for a graph with m edges and n nodes. Our extensive experiments that compare NRP against 18 existing solutions over 7 real graphs demonstrate that NRP achieves higher result utility than all the solutions for link prediction, graph reconstruction and node classification, while being up to orders of magnitude faster. In particular, on a billion-edge Twitter graph, NRP terminates within 4 hours, using a single CPU core.
    Type of Medium: Online Resource
    ISSN: 2150-8097
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2020
    ZDB-ID: 2478691-3
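
    The abstract of this record refers to the classic idea of deriving embedding vectors from pairwise personalized PageRank (PPR) values. The following is a minimal sketch of that general idea only, not the NRP algorithm and not the authors' code: it computes a dense PPR matrix by power iteration on a tiny graph and factorizes it with a truncated SVD. The function names `ppr_matrix` and `factorize`, the restart probability `alpha=0.15`, and the use of SVD are all assumptions made for illustration; a dense computation like this would not scale to the graph sizes the abstract discusses.

```python
import numpy as np


def ppr_matrix(adj, alpha=0.15, iters=200):
    """Approximate the pairwise PPR matrix by power iteration.

    Row s holds the PPR values of all nodes with respect to source s.
    alpha is the restart probability (an assumed default, not from the paper).
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Row-normalize into a transition matrix; rows of dangling nodes stay zero.
    P = np.divide(adj, out_deg, out=np.zeros_like(adj), where=out_deg > 0)
    identity = np.eye(n)
    Pi = identity.copy()
    for _ in range(iters):
        # With probability alpha restart at the source, otherwise take one step.
        Pi = alpha * identity + (1.0 - alpha) * (Pi @ P)
    return Pi


def factorize(Pi, k):
    """Return embeddings X, Y (n x k each) such that X @ Y.T is a rank-k
    approximation of Pi. Hypothetical helper using a plain SVD."""
    U, s, Vt = np.linalg.svd(Pi)
    root = np.sqrt(s[:k])
    return U[:, :k] * root, Vt[:k].T * root


# Usage on a 3-node path graph 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
Pi = ppr_matrix(path)
X, Y = factorize(Pi, k=2)
```

    In this sketch each node gets a forward vector (a row of `X`) and a backward vector (a row of `Y`), and their inner product approximates the PPR value between the two nodes.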
  • 3
    Online Resource
    Association for Computing Machinery (ACM) ; 2020
    In: Proceedings of the VLDB Endowment, Association for Computing Machinery (ACM), Vol. 14, No. 1 (2020-09), p. 37-49
    Abstract: Given a graph G where each node is associated with a set of attributes, attributed network embedding (ANE) maps each node v ∈ G to a compact vector X_v, which can be used in downstream machine learning tasks. Ideally, X_v should capture node v's affinity to each attribute, which considers not only v's own attribute associations, but also those of its connected nodes along edges in G. It is challenging to obtain high-utility embeddings that enable accurate predictions; scaling effective ANE computation to massive graphs with millions of nodes pushes the difficulty of the problem to a whole new level. Existing solutions largely fail on such graphs, leading to prohibitive costs, low-quality embeddings, or both. This paper proposes PANE, an effective and scalable approach to ANE computation for massive graphs that achieves state-of-the-art result quality on multiple benchmark datasets, measured by the accuracy of three common prediction tasks: attribute inference, link prediction, and node classification. In particular, for the large MAG data with over 59 million nodes, 0.98 billion edges, and 2000 attributes, PANE is the only known viable solution that obtains effective embeddings on a single server, within 12 hours. PANE obtains high scalability and effectiveness through three main algorithmic designs. First, it formulates the learning objective based on a novel random walk model for attributed networks. The resulting optimization task is still challenging on large graphs. Second, PANE includes a highly efficient solver for the above optimization problem, whose key module is a carefully designed initialization of the embeddings, which drastically reduces the number of iterations required to converge. Finally, PANE utilizes multi-core CPUs through non-trivial parallelization of the above solver, which achieves scalability while retaining the high quality of the resulting embeddings. Extensive experiments, comparing 10 existing approaches on 8 real datasets, demonstrate that PANE consistently outperforms all existing methods in terms of result quality, while being orders of magnitude faster.
    Type of Medium: Online Resource
    ISSN: 2150-8097
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2020
    ZDB-ID: 2478691-3
  • 4
    Online Resource
    Springer Science and Business Media LLC ; 2023
    In: The VLDB Journal, Springer Science and Business Media LLC
    Type of Medium: Online Resource
    ISSN: 1066-8888 , 0949-877X
    Language: English
    Publisher: Springer Science and Business Media LLC
    Publication Date: 2023
    ZDB-ID: 1463009-6
  • 5
    Online Resource
    Springer Science and Business Media LLC ; 2020
    In: The VLDB Journal, Springer Science and Business Media LLC, Vol. 29, No. 5 (2020-09), p. 973-998
    Type of Medium: Online Resource
    ISSN: 1066-8888 , 0949-877X
    Language: English
    Publisher: Springer Science and Business Media LLC
    Publication Date: 2020
    ZDB-ID: 1463009-6