Text classification by untrained sentence embeddings

Authors:

Daniele Di Sarli¹, Claudio Gallicchio¹, Alessio Micheli¹

Affiliation:

1. Department of Computer Science, University of Pisa, Largo B. Pontecorvo, Pisa, Italy

Abstract

Recurrent Neural Networks (RNNs) represent a natural paradigm for modeling sequential data such as natural language text. Indeed, RNNs and their variants have long been the architecture of choice in many applications; in practice, however, they require elaborate architectural components (such as gating mechanisms) and computationally heavy training processes. In this paper we address the question of whether it is possible to generate sentence embeddings via completely untrained recurrent dynamics, on top of which a simple learning algorithm is applied for text classification. This would make it possible to obtain models that are extremely efficient in terms of training time. Our work investigates the extent to which this approach can be used by analyzing the results on different tasks. Finally, we show that, within certain limits, it is possible to build extremely efficient models for text classification that remain competitive in accuracy with state-of-the-art reference models.
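
The following is a minimal sketch of the idea summarized in the abstract, not the authors' implementation: a fixed, untrained recurrent layer (in the spirit of reservoir computing) maps each sentence to an embedding, and only a simple linear classifier is trained on top. All names, dimensions, the spectral-radius rescaling, and the use of NumPy and scikit-learn are illustrative assumptions.

```python
# Sketch: sentence embeddings from untrained recurrent dynamics,
# followed by a trainable linear readout for classification.
# Hyperparameters and toy data below are assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

embed_dim, state_dim = 50, 300  # hypothetical word-vector and state sizes

# Untrained recurrent parameters: drawn at random and never updated.
W_in = rng.uniform(-0.1, 0.1, size=(state_dim, embed_dim))
W_rec = rng.uniform(-1.0, 1.0, size=(state_dim, state_dim))
# Rescale the recurrent matrix to a spectral radius below 1,
# a common stability heuristic in reservoir computing.
W_rec *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_rec)))

def sentence_embedding(word_vectors):
    """Run the untrained recurrence over a sequence of word vectors
    and return the final state as the sentence embedding."""
    h = np.zeros(state_dim)
    for x in word_vectors:
        h = np.tanh(W_in @ x + W_rec @ h)
    return h

# Toy usage: random "sentences" standing in for embedded text, plus labels.
sentences = [rng.normal(size=(rng.integers(5, 15), embed_dim)) for _ in range(40)]
labels = rng.integers(0, 2, size=40)

X = np.stack([sentence_embedding(s) for s in sentences])

# Only this simple readout is trained; the recurrent dynamics stay untrained,
# which is what makes training extremely cheap.
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```

In such a scheme, the recurrent weights never receive gradient updates; training reduces to fitting the linear readout, which is where the efficiency claim in the abstract comes from.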

Publisher

IOS Press

Subject

Artificial Intelligence


Cited by 3 articles.
