POSPAN: Position-Constrained Span Masking for Language Model Pre-training

Published: 2023-10-21
Container-title: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

Author:

Zhang Zhenyu¹, Shen Lei¹, Zhao Yuming¹, Chen Meng¹, He Xiaodong¹

Affiliation:

1. JD AI Research, Beijing, China

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3583780.3615197

References: 31 articles.

1. On Losses for Modern Language Models

2. Christopher Clark, Kenton Lee, Ming-Wei Chang, Tom Kwiatkowski, Michael Collins, and Kristina Toutanova. 2019. BoolQ: Exploring the surprising difficulty of natural yes/no questions. arXiv preprint arXiv:1905.10044 (2019).

3. Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, and Guoping Hu. 2020. Revisiting Pre-Trained Models for Chinese Natural Language Processing. In Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16--20 November 2020 (Findings of ACL, Vol. EMNLP 2020), Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Linguistics, 657--668. https://doi.org/10.18653/v1/2020.findings-emnlp.58

4. Cui Yiming. 2021. Pre-Training with Whole Word Masking for Chinese BERT. IEEE Transactions on Audio, Speech and Language Processing. https://doi.org/10.1109/TASLP.

5. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2--7, 2019, Volume 1 (Long and Short Papers), Jill Burstein, Christy Doran, and Thamar Solorio (Eds.). Association for Computational Linguistics, 4171--4186.