1. A context-enhanced transformer with abbr-recover policy for Chinese abbreviation prediction;Cao,2022
2. Relation-constrained decoding for text generation;Chen;Advances in Neural Information Processing Systems,2022
3. Scaling instruction-finetuned language models;Chung;Journal of Machine Learning Research,2024
4. Pre-training with whole word masking for Chinese BERT;Cui;IEEE/ACM Transactions on Audio, Speech, and Language Processing,2021
5. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin,2019