Semantic Similarity Comparison Between Production Line Failures for Predictive Maintenance

Published: 2023-02-15 Issue: 1 Volume: 3 Page: 1-11
ISSN: 2757-7422
Container-title: Advances in Artificial Intelligence Research

Author:

TEKGÖZ Hilal¹, İLHAN OMURCA Sevinç¹, KOÇ Kadir Yunus², TOPÇU Umut³, ÇELİK Osman²

Affiliation:

1. KOCAELİ ÜNİVERSİTESİ

2. IBSS Teknoloji ve Yazılım A.Ş.

3. Vestel Beyaz Eşya A.Ş.

Abstract

With the introduction of Industry 4.0 and the creation of smart factories, predictive maintenance has become even more important, and predictive maintenance systems are now widely used in the manufacturing industry. At the same time, text analysis and Natural Language Processing (NLP) techniques are attracting considerable attention from both research and industry because they connect natural language with industrial solutions, and the number of NLP studies in the literature is growing rapidly. Although NLP has been studied in the context of predictive maintenance systems, no studies were found on Turkish NLP for predictive maintenance. This study focuses on the similarity analysis of failure texts for use in the predictive maintenance system we developed for VESTEL, one of the leading consumer electronics manufacturers in Turkey. In the manufacturing industry, operators record descriptions of failures that occur on production lines as short texts. However, these descriptions are rarely used in predictive maintenance work. In this study, semantic text similarities between fault descriptions from the production line were compared using traditional word representations, modern word representations, and Transformer models. Levenshtein, Jaccard, Pearson, and Cosine were used as similarity measures, and their effectiveness was compared. The experimental data, consisting of failure texts, were obtained from a consumer electronics manufacturer in Turkey. The experimental results show that the Jaccard similarity metric is less successful at grouping semantically similar texts than the other three similarity measures. In addition, among the embedding methods, Multilingual Universal Sentence Encoder (MUSE), Language-agnostic BERT Sentence Embedding (LaBSE), Bag of Words (BoW), and Term Frequency - Inverse Document Frequency (TF-IDF) outperform FastText and Language-Agnostic Sentence Representations (LASER) at capturing the semantics of failure descriptions. In short, Pearson and Cosine are more effective at finding similar failure texts, and MUSE, LaBSE, BoW, and TF-IDF are more successful at representing the failure text.
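As an illustration only, and not the authors' implementation, the four similarity measures named in the abstract can be sketched in Python over a simple whitespace Bag-of-Words representation. The function names, the example sentences, the whitespace tokenization, and the choice to normalize Levenshtein distance by the longer string's length are all assumptions made for this sketch; the study itself also evaluates TF-IDF and sentence-embedding representations, which are not reproduced here.

```python
import math
from collections import Counter


def levenshtein_similarity(a: str, b: str) -> float:
    """Character-level edit distance, rescaled to [0, 1] (1.0 = identical).

    Assumption: distance is normalized by the length of the longer string.
    """
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaccard_similarity(a: str, b: str) -> float:
    """Overlap of the two token sets: |A ∩ B| / |A ∪ B|."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def bow_vectors(a: str, b: str):
    """Bag-of-Words count vectors over the shared vocabulary of both texts."""
    ca, cb = Counter(a.split()), Counter(b.split())
    vocab = sorted(set(ca) | set(cb))
    return [ca[w] for w in vocab], [cb[w] for w in vocab]


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson_similarity(u, v) -> float:
    """Pearson correlation, computed as cosine similarity of mean-centered vectors."""
    if not u:
        return 0.0
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return cosine_similarity([x - mean_u for x in u], [y - mean_v for y in v])


if __name__ == "__main__":
    # Hypothetical failure descriptions, not taken from the study's data.
    t1 = "conveyor belt motor stopped"
    t2 = "conveyor motor stopped suddenly"
    u, v = bow_vectors(t1, t2)
    print("Levenshtein:", levenshtein_similarity(t1, t2))
    print("Jaccard:    ", jaccard_similarity(t1, t2))
    print("Cosine:     ", cosine_similarity(u, v))
    print("Pearson:    ", pearson_similarity(u, v))
```

Levenshtein and Jaccard operate directly on the raw characters or token sets, whereas Cosine and Pearson operate on vectors. The latter two can therefore be applied unchanged to TF-IDF weights or dense sentence embeddings in place of the count vectors used here.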

Funder

TUBİTAK

Publisher

International Conference on Artificial Intelligence and Applied Mathematics in Engineering

References: 33 articles.

1. Chandrasekaran D, and Mago V. "Evolution of semantic similarity—a survey." ACM Computing Surveys (CSUR) 54.2, 1-37, 2021.

2. Wang Y, et al. "A comparison of word embeddings for the biomedical natural language processing." Journal of Biomedical Informatics 87, 12-20, 2018.

3. Liu J, Liu T, and Yu C. "NewsEmbed: Modeling news through pre-trained document representations." arXiv preprint arXiv:2106.00590, 2021.

4. Mikolov T, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781, 2013.

5. Pennington J, Socher R, and Manning C.D. "GloVe: Global vectors for word representation." Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.

Cited by 1 article.

1. Technical language processing for Prognostics and Health Management: applying text similarity and topic modeling to maintenance work orders. Journal of Intelligent Manufacturing, 2024-02-24.