Neural Topic Model Training with the REBAR Gradient Estimator

Author:

Kumar Amit1ORCID,Esmaili Nazanin2ORCID,Piccardi Massimo2ORCID

Affiliation:

1. University of Technology Sydney and Food Agility CRC Ltd, Sydney, NSW, Australia

2. University of Technology Sydney, Sydney, Australia

Abstract

Topic modelling is an important approach of unsupervised machine learning that allows automatically extracting the main “topics” from large collections of documents. In addition, topic modelling is able to identify the topic proportions of each individual document, which can be helpful for organizing the collections. Many topic modelling algorithms have been proposed to date, including several that leverage advanced techniques such as variational inference and deep autoencoders. However, to date topic modelling has made limited use of reinforcement learning, a framework that has obtained vast success in many other unsupervised learning tasks. For this reason, in this article we propose training a neural topic model using a reinforcement learning objective and minimizing the objective with the recently-proposed REBAR gradient estimator. Experiments performed over two probing datasets have shown that the proposed model has achieved improvements over all the compared models in terms of both model perplexity and topic coherence, and produced topics that appear qualitatively informative and consistent.

Funder

Food Agility CRC Ltd, funded under the Commonwealth Government CRC Program

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Reference54 articles.

1. Martín Abadi Ashish Agarwal Paul Barham Eugene Brevdo Zhifeng Chen Craig Citro Greg Corrado Andy Davis Jeffrey Dean Matthieu Devin Sanjay Ghemawat Ian Goodfellow Andrew Harp Geoffrey Irving Michael Isard Yangqing Jia Rafal Jozefowicz Lukasz Kaiser Manjunath Kudlur Josh Levenberg Dan Mané Rajat Monga Sherry Moore Derek Murray Chris Olah Mike Schuster Jonathon Shlens Benoit Steiner Ilya Sutskever Kunal Talwar Paul Tucker Vincent Vanhoucke Vijay Vasudevan Fernanda Viégas Oriol Vinyals Pete Warden Martin Wattenberg Martin Wicke Yuan Yu and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale machine learning on heterogeneous distributed systems. http://download.tensorflow.org/paper/whitepaper2015.pdf.

2. David Alvarez-Melis and Martin Saveski. 2016. Topic modeling in Twitter: Aggregating tweets by conversations. In Proceedings of the 10th International Conference on Web and Social Media. 519–522.

3. Clinical case-based retrieval using latent topic analysis;Arnold Corey;AMIA Annual Symposium Proceedings,2010

4. James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, and Skye Wanderman-Milne. 2018. JAX: composable transformations of Python+NumPy programs. Retrieved from http://github.com/google/jax.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3