Abstract
Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images. Applications require LDA to handle both large datasets and large numbers of topics. Although distributed CPU systems have been used, GPU-based systems have emerged as a promising alternative because of the high computational power and memory bandwidth of GPUs. However, existing GPU-based LDA systems cannot support large numbers of topics because they use algorithms on dense data structures whose time and space complexity is linear in the number of topics.
In this paper, we propose SaberLDA, a GPU-based LDA system that implements a sparsity-aware algorithm to achieve sublinear time complexity and scale well to large numbers of topics. To address the challenges introduced by sparsity, we propose a novel data layout, a new warp-based sampling kernel, and an efficient sparse count matrix updating algorithm, which together improve locality, make efficient use of GPU warps, and reduce memory consumption. Experiments show that SaberLDA can learn up to 10,000 topics from datasets containing billions of tokens, almost two orders of magnitude more topics than previous GPU-based systems support. With a single GPU card, SaberLDA learns 10,000 topics from a dataset of billions of tokens in a few hours, a task that was previously achievable only with clusters of tens of machines.
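The core idea behind the sublinear complexity claim can be illustrated with a minimal CPU-side sketch (this is not SaberLDA's GPU kernel, just an assumed toy model): in LDA, a single document's topic counts are sparse, since a document touches far fewer topics than the total number K. A sparsity-aware sampler therefore iterates only over the nonzero counts, paying a cost proportional to the number of distinct topics in the document rather than to K, as a dense sampler would.

```python
import random
from collections import Counter

K = 10_000                                   # total number of topics
doc_topics = Counter({3: 5, 42: 2, 917: 1})  # sparse per-document counts n_dk

def sample_topic_sparse(counts, alpha=0.1):
    """Sample a topic from p(k) proportional to (n_dk + alpha), splitting the
    probability mass into a sparse part (nonzero counts) and a uniform
    smoothing part, so the loop touches only the nonzero entries."""
    sparse_mass = sum(counts.values())       # mass of the count-dependent part
    smooth_mass = alpha * K                  # mass of the uniform alpha part
    u = random.uniform(0, sparse_mass + smooth_mass)
    if u < sparse_mass:
        # Walk only the nonzero entries: O(#nonzeros), not O(K).
        for k, n in counts.items():
            u -= n
            if u < 0:
                return k
    # Otherwise the draw fell into the uniform smoothing term.
    return random.randrange(K)

topic = sample_topic_sparse(doc_topics)
assert 0 <= topic < K
```

Dense samplers must build and scan a length-K probability vector per token, which is why their cost and memory scale linearly with K; the split above is one standard way sparsity-aware LDA algorithms avoid that.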