A multi-dimensional fusion strategy similarity measure method for patent application technology disclosure document

Author:

Zhu Meilong, Li Mingda, Hou Kangwei, Wang Zhaohui, Long Xianjun

Abstract

The technology disclosure document of a patent application is an important basis for judging patent novelty and uniqueness. Automated evaluation can address the long turnaround and strong subjectivity of human evaluation. However, when text similarity algorithms based on corpora and deep learning are applied to patent application technology disclosure documents, they suffer from insufficient cross-library learning data and insufficient emphasis on core content, which limits their performance and practical application. In this paper, we propose a similarity evaluation method for patent application technology disclosure documents based on a multi-dimensional fusion strategy. First, in the text preprocessing stage, we propose word segmentation reconstruction and a similarity evaluation optimization strategy based on the weighted fusion of word frequency and part-of-speech scores. Then, we propose a similarity calculation method based on two new mapping spaces, dot matrix and image, to achieve a more diversified comprehensive evaluation. The algorithm was evaluated on four published text similarity matching datasets (with 0–5 or 0/1 labels) and a set of patent application technology disclosure documents. On the published datasets, the proposed method improves discrimination accuracy by about 10% over the traditional vector semantic model and matches the discriminative ability of lightweight deep learning models without requiring training. On the sample dataset of patent application technology disclosure documents, its discrimination accuracy exceeds that of the traditional vector semantic model (20%) and of various deep learning models (1%-8%), with relatively balanced precision and recall. Visual analysis on the patent disclosure dataset further supports the effectiveness and reliability of the similarity calculation in the dot matrix and image spaces, which offers a new approach to similarity evaluation between patent application technology disclosure documents.
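
The abstract does not give the fusion formula, so the following is only a minimal sketch of how word frequency and part-of-speech scores might be combined into token weights before a similarity comparison. The `POS_SCORE` table, the product `frequency * score`, and the use of cosine similarity are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical part-of-speech scores (an assumption, not the paper's values):
# content-bearing tags are weighted above function-like tags.
POS_SCORE = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.5, "OTHER": 0.2}


def weighted_vector(tagged_tokens):
    """Map (token, pos) pairs to {token: frequency * POS score}."""
    counts = Counter(token for token, _ in tagged_tokens)
    # If a token appears with several tags, the last tag seen is used.
    pos_of = {token: pos for token, pos in tagged_tokens}
    return {
        token: count * POS_SCORE.get(pos_of[token], POS_SCORE["OTHER"])
        for token, count in counts.items()
    }


def cosine_similarity(u, v):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(weight * v.get(token, 0.0) for token, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy, already segmented and tagged inputs (hypothetical data).
doc_a = [("sensor", "NOUN"), ("detects", "VERB"), ("the", "OTHER"), ("signal", "NOUN")]
doc_b = [("sensor", "NOUN"), ("measures", "VERB"), ("the", "OTHER"), ("signal", "NOUN")]
score = cosine_similarity(weighted_vector(doc_a), weighted_vector(doc_b))
print(f"weighted similarity: {score:.3f}")
```

Weighting nouns and verbs more heavily is one simple way to bias a comparison toward core technical content, which is the shortcoming of existing methods that the abstract points to; the method in the paper may weight and fuse these signals differently.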

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary
