Affiliation:
1. Nanjing Normal University, International College for Chinese Studies, Nanjing, Jiangsu, China.
Abstract
This study examines paraphrase meta-language in linguistics in the age of artificial intelligence. The approach comprises text preprocessing, text representation based on the vector space model, statistical disambiguation, feature selection, and the application of latent Dirichlet allocation (LDA) topic modeling. The results show that these methods can effectively extract and interpret paraphrase meta-language. LDA analysis of 4,623,552 items of Twitter data and 532,565 linguistic documents reveals the thematic distribution and dynamic changes of paraphrase meta-language. In addition, the study empirically analyzes paraphrase meta-language based on lexical understanding and finds that the annotators' average accuracy falls within the expected range for all types of polysemous words. In the era of artificial intelligence, the study of paraphrase meta-language can bring new insights to linguistics, and it is especially valuable for understanding and processing large-scale linguistic data.
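As an illustration only, the first two steps named above (text preprocessing and vector space representation) might be sketched as follows. This is a minimal sketch and not the study's implementation: the toy corpus, the stop-word list, the tf * log(N / df) weighting, and the function names `preprocess`, `tfidf_vectors`, and `cosine` are all assumptions introduced for the example.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; not data from the study.
DOCS = [
    "The bank raised interest rates again.",
    "We walked along the river bank at dusk.",
    "Interest in river ecology is growing.",
]

# Tiny illustrative stop-word list (an assumption, not the study's list).
STOP_WORDS = {"the", "we", "at", "in", "is", "along", "again"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using tf * log(N / df) weighting."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    vocab, vectors = tfidf_vectors(DOCS)
    print(vocab)
    print(cosine(vectors[0], vectors[1]))
```

In this toy corpus the polysemous word "bank" appears in two documents with different senses, so the first two vectors overlap only on that term. Separating such senses would fall to the disambiguation and topic modeling steps, which this sketch does not cover.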