CPT: a pre-trained unbalanced transformer for both Chinese language understanding and generation

Published: 2024-03-27 Issue: 5 Volume: 67
ISSN: 1674-733X
Container-title: Science China Information Sciences
Language: en
Short-container-title: Sci. China Inf. Sci.

Author:

Shao Yunfan,Geng Zhichao,Liu Yitao,Dai Junqi,Yan Hang,Yang Fei,Li Zhe,Bao Hujun,Qiu Xipeng

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s11432-021-3536-5.pdf

References: 45 articles.

1. Qiu X P, Sun T X, Xu Y G, et al. Pre-trained models for natural language processing: a survey. Sci China Tech Sci, 2020, 63: 1872–1897

2. Devlin J, Chang M, Lee K, et al. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019. 4171–4186

3. Liu Y, Ott M, Goyal N, et al. RoBERTa: a robustly optimized BERT pretraining approach. 2019. ArXiv:1907.11692

4. Lewis M, Liu Y, Goyal N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020. 7871–7880

5. Radford A, Narasimhan K, Salimans T, et al. Improving language understanding by generative pre-training. 2018. https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf

Cited by 2 articles.

1. A novel reconstruction method for displacement missing data of arch dam via hierarchical clustering and deep learning;Engineering Applications of Artificial Intelligence;2024-07

2. Query-Oriented Micro-Video Summarization;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-06