A Robustly Optimized BERT Pre-training Approach with Post-training

Published: 2021 Page: 471-484
ISSN: 0302-9743
Container-title: Lecture Notes in Computer Science

Author:

Liu Zhuang, Lin Wayne, Shi Ya, Zhao Jun

Publisher

Springer International Publishing

Link

https://link.springer.com/content/pdf/10.1007/978-3-030-84186-7_31

Reference: 28 articles.

1. Bowman, S.R., Pavlick, E., Grave, E.: Looking for Elmo’s friends: sentence-level pretraining beyond language modeling. CoRR abs/1812.10860 (2018)

2. Chen, Z., Liu, B.: Lifelong Machine Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 2nd edn. Morgan & Claypool Publishers, Williston (2018). https://doi.org/10.2200/S00832ED1V01Y201802AIM037

3. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein, J., Doran, C., Solorio, T. (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June, 2019, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/n19-1423

4. Hou, M., Chen, X., Huang, S., Xie, S., Zhou, G.: Generalizing deep multi-task learning with heterogeneous structured networks. In: Proceedings of ICLR (2020)

5. Joshi, M., Chen, D., Liu, Y., Weld, D.S., Zettlemoyer, L., Levy, O.: SpanBERT: improving pre-training by representing and predicting spans. CoRR abs/1907.10529 (2019). http://arxiv.org/abs/1907.10529

Cited by 72 articles.

1. Text-enhanced knowledge graph representation learning with local structure; Information Processing & Management; 2024-09

2. Spatial–Temporal Transformer Networks for Traffic Flow Forecasting Using a Pre-Trained Language Model; Sensors; 2024-08-25

3. Predicting Cryptocurrency Prices During Periods of Conflict: A Comparative Sentiment Analysis Using SVM, CNN-LSTM, and Pysentimento; Operations Research Forum; 2024-08-14

4. Joint Extraction Method for Hydraulic Engineering Entity Relations Based on Multi-Features; Electronics; 2024-07-28

5. A new weighted ensemble model-based method for text implication recognition; Multimedia Tools and Applications; 2024-07-10