
SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs

Published: 4 April 2017

Abstract

Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images, and applications require it to handle both large datasets and large numbers of topics. Although distributed CPU systems have been used at this scale, GPU-based systems have emerged as a promising alternative because of the high computational power and memory bandwidth of GPUs. However, existing GPU-based LDA systems cannot support large numbers of topics: they run algorithms on dense data structures, whose time and space complexity is linear in the number of topics.
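
To make the sparsity argument concrete, the sketch below illustrates the kind of decomposition sparsity-aware samplers rely on (an illustration under hypothetical names such as SparseRow, sampleToken, and samplePriorTopic, not code from the paper): the per-token distribution factors into a sparse part over the K_d topics actually present in the document and a dense prior part that is document-independent and can be precomputed per word, so sampling one token costs O(K_d) rather than O(K).

```cuda
// Minimal host-side sketch (our illustration, not SaberLDA's code) of the
// sparsity-aware decomposition behind sublinear sampling:
//   p(k | d, w) is proportional to (n_dk + alpha) * phi[w][k]
//     =  n_dk * phi[w][k]    (sparse: only the K_d topics present in doc d)
//     + alpha * phi[w][k]    (dense but document-independent; its total mass
//                             and a sampler for it can be built once per
//                             word, e.g. as an alias table)
#include <cstdlib>
#include <vector>

struct SparseRow {                 // hypothetical: nonzero topics of one document
    std::vector<int>   topic;      // topic ids, size K_d (K_d << K)
    std::vector<float> count;      // n_dk for those topics
};

// Draw a topic for one token. priorMass = alpha * sum_k phi[w][k] and
// samplePriorTopic (a sampler for the dense prior bucket) are assumed to be
// precomputed; only the sparse part is scanned, in O(K_d) time.
int sampleToken(const SparseRow& row, const float* phi_w,
                float priorMass, int (*samplePriorTopic)()) {
    std::vector<float> cum(row.topic.size());
    float mass = 0.f;
    for (size_t i = 0; i < row.topic.size(); ++i) {  // O(K_d), not O(K)
        mass += row.count[i] * phi_w[row.topic[i]];
        cum[i] = mass;
    }
    float u = (mass + priorMass) * (std::rand() / (float)RAND_MAX);
    if (u >= mass) return samplePriorTopic();        // landed in the prior bucket
    for (size_t i = 0; i < cum.size(); ++i)          // linear scan; a binary
        if (u < cum[i]) return row.topic[i];         // search also works here
    return row.topic.back();
}
```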

In this paper, we propose SaberLDA, a GPU-based LDA system that implements a sparsity-aware algorithm to achieve sublinear time complexity and scale well to a large number of topics. To address the challenges introduced by sparsity, we propose a novel data layout, a warp-based sampling kernel, and an efficient sparse count-matrix updating algorithm, which together improve locality, make efficient use of GPU warps, and reduce memory consumption. Experiments show that SaberLDA can learn from billion-token-scale data with up to 10,000 topics, almost two orders of magnitude more than previous GPU-based systems. With a single GPU card, SaberLDA learns 10,000 topics from a dataset of billions of tokens in a few hours, a task that was previously achievable only on clusters of tens of machines.
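
As a rough illustration of the warp-based sampling idea (our own hedged reconstruction, not SaberLDA's actual kernel; warpSampleTopic and its parameters are hypothetical), the 32 lanes of one warp can cooperatively scan a token's topic probabilities with warp shuffles and locate the sampled position with a ballot, so one warp serves one token with coalesced accesses and no divergent per-thread loop over topics:

```cuda
// Hedged sketch of a warp-based sampling step, assuming prob[] is already
// normalized and the same uniform draw u is passed to all 32 lanes.
__device__ int warpSampleTopic(const int*   topics,  // sparse topic list
                               const float* prob,    // normalized probabilities
                               int kd,               // list length K_d
                               float u)              // uniform draw in [0,1)
{
    const int lane = threadIdx.x & 31;
    float carry = 0.f;                          // mass of chunks already scanned
    for (int base = 0; base < kd; base += 32) {
        int   i = base + lane;
        float p = (i < kd) ? prob[i] : 0.f;
        for (int off = 1; off < 32; off <<= 1) {        // inclusive warp scan
            float n = __shfl_up_sync(0xffffffffu, p, off);
            if (lane >= off) p += n;
        }
        // First lane whose cumulative mass exceeds u holds the sampled topic.
        unsigned hit = __ballot_sync(0xffffffffu, i < kd && carry + p > u);
        if (hit) return topics[base + __ffs(hit) - 1];
        carry += __shfl_sync(0xffffffffu, p, 31);       // add this chunk's total
    }
    return topics[kd - 1];                      // guard against rounding error
}
```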



Published in

ACM SIGARCH Computer Architecture News, Volume 45, Issue 1 (ASPLOS '17), March 2017, 812 pages. ISSN: 0163-5964. DOI: 10.1145/3093337.

ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, April 2017, 856 pages. ISBN: 9781450344654. DOI: 10.1145/3037697.

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States
