Abstract
International Patent Classification (IPC) codes are assigned to patent documents. Because patent examiners assign them manually, selecting the appropriate codes from roughly 70,000 candidates takes considerable time and effort. Machine learning has therefore been studied as a way to automate patent classification. However, patent documents are very long, and training on all of the claims (the part that describes the content of the patent) exhausts the available memory even with a very small batch size. Most existing methods therefore discard some information, for example by using only the first claim as input. In this study, we propose a model that takes the content of all claims into account by extracting the important information from them as input. We also focus on the hierarchical structure of the IPC and propose a new decoder architecture that takes it into account. Finally, we conducted an experiment on actual patent data to verify prediction accuracy. The results showed a significant improvement in accuracy over existing methods, and we also discuss the practical applicability of the method.
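The hierarchy referred to above can be made concrete with a small sketch. An IPC symbol such as `G06F 17/30` nests five levels: section, class, subclass, main group, and subgroup. The helper `ipc_levels` and its regular expression are hypothetical names introduced here for illustration; they show one way per-level labels could be derived from a full symbol, and they are not the paper's implementation.

```python
import re

# Hypothetical helper, not taken from the paper: split a full IPC symbol
# into the labels of its five hierarchical levels.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_levels(symbol: str) -> dict:
    """Return the section, class, subclass, main group, and subgroup labels."""
    match = IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    prefix = section + cls + subclass
    return {
        "section": section,                          # e.g. G
        "class": section + cls,                      # e.g. G06
        "subclass": prefix,                          # e.g. G06F
        "main_group": f"{prefix} {main_group}/00",   # e.g. G06F 17/00
        "subgroup": f"{prefix} {main_group}/{subgroup}",
    }


levels = ipc_levels("G06F 17/30")
```

Each coarser label is a prefix of the finer ones, which is the property a hierarchy-aware decoder could exploit, for instance by predicting the section before the class and the class before the subclass.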
Publisher
Public Library of Science (PLoS)
Cited by
3 articles.