Contextual Sentence Similarity from News Articles
-
Published: 2024-03-14
Issue: 2
Volume: 10
Pages: 24-37
-
ISSN: 2456-3307
-
Container-title: International Journal of Scientific Research in Computer Science, Engineering and Information Technology
-
Language: English
-
Short-container-title: Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol.
Authors:
Chaturvedi Nikhil, Dubey Jigyasu
Abstract
Measuring sentence similarity is an important topic in natural language processing, and it is essential to gauge precisely how similar two sentences are. Existing methods for determining sentence similarity face two problems: labelled datasets are typically small, making them insufficient for training supervised neural models; and there is a training-test gap for unsupervised language modelling (LM) based models when computing semantic scores between sentences, because sentence-level semantics are not explicitly modelled during training. As a result, performance on this task remains limited. In this paper, we propose a novel paradigm, a robotics-method framework, to address these two concerns. The proposed framework is built on the essential premise that a sentence's meaning is determined by its context, and that sentence similarity can therefore be measured by comparing the probabilities of generating the two sentences given the same context. In an unsupervised way, the proposed approach can create high-quality, large-scale datasets with semantic similarity scores between pairs of sentences, bridging the train-test gap to a great extent. Extensive experiments show that the proposed framework outperforms existing baselines on a wide range of datasets.
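To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of context-conditioned scoring with an off-the-shelf causal language model. The model choice (GPT-2 via Hugging Face transformers), the helper names conditional_logprob and context_similarity, and the exponential agreement score are all illustrative assumptions.

```python
# Minimal sketch (illustrative only): judge how similarly two sentences "fit"
# the same context by comparing their conditional log-probabilities under an
# off-the-shelf causal language model.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def conditional_logprob(context: str, sentence: str) -> float:
    """Average token log-probability of `sentence` given `context`."""
    ctx_ids = tokenizer.encode(context)
    sent_ids = tokenizer.encode(" " + sentence)
    input_ids = torch.tensor([ctx_ids + sent_ids])
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    # The token at absolute position p is predicted by the logits at p - 1.
    for i, tok in enumerate(sent_ids):
        pos = len(ctx_ids) + i - 1
        total += log_probs[0, pos, tok].item()
    return total / len(sent_ids)

def context_similarity(s1: str, s2: str, contexts: list[str]) -> float:
    """Hypothetical score: sentences count as similar when their conditional
    log-probabilities track each other across the same contexts (1.0 = identical fit)."""
    diffs = [abs(conditional_logprob(c, s1) - conditional_logprob(c, s2))
             for c in contexts]
    return math.exp(-sum(diffs) / len(diffs))

if __name__ == "__main__":
    ctxs = ["The central bank announced a policy change on Tuesday."]
    print(context_similarity("Interest rates were raised by half a point.",
                             "The bank increased rates by 0.5 percent.",
                             ctxs))
```

In practice one would average over many contexts (e.g. surrounding sentences from news articles, as the title suggests); the single-context example above only illustrates the scoring mechanics.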
Publisher
Technoscience Academy