
A novel text representation which enables image classifiers to also simultaneously classify text, applied to name disambiguation

Published:2023-06-05
ISSN:0138-9130
Container-title:Scientometrics
language:en
Short-container-title:Scientometrics

Author:

Petrie Stephen M.,Julius T’Mir D.

Abstract

We introduce a novel method for converting text data into abstract image representations, which allows image-based processing techniques (e.g. image classification networks) to be applied to text-based comparison problems. We apply the technique to entity disambiguation of inventor names in US patents, obtaining a list of IDs which identify individual inventors with high accuracy. The method involves converting text from each pairwise comparison between two inventor name records into a 2D RGB (stacked) image representation. We then train an image classification neural network to discriminate between such pairwise comparison images. The trained neural network then labels each pair of records as either matched (same inventor) or non-matched (different inventors), producing highly accurate results. Our new text-to-image representation method could also be used more broadly for other text comparison problems, such as entity disambiguation of academic publications, or for problems that require simultaneous classification of both text and image datasets.
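
The pairwise text-to-image idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding, so everything here is an assumption made for illustration: the 37-symbol alphabet, the mapping of consecutive character pairs to pixels, the channel assignment (red for the first record, green for the second, blue for pixels shared by both), and the function names `bigram_bitmap`, `pair_image`, and `shared_fraction`. It is not the authors' implementation.

```python
import string

# Hypothetical alphabet: lowercase letters, digits, and space (37 symbols).
ALPHABET = string.ascii_lowercase + string.digits + " "
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
SIZE = len(ALPHABET)


def bigram_bitmap(text):
    """Return a SIZE x SIZE grid of 0/1 marking each consecutive character pair."""
    # Characters outside the assumed alphabet are dropped before pairing.
    cleaned = [ch for ch in text.lower() if ch in INDEX]
    grid = [[0] * SIZE for _ in range(SIZE)]
    for first, second in zip(cleaned, cleaned[1:]):
        grid[INDEX[first]][INDEX[second]] = 1
    return grid


def pair_image(record_a, record_b):
    """Stack two record bitmaps into one RGB image (rows of (R, G, B) tuples).

    Assumed channel layout: red marks pairs in record A, green marks pairs
    in record B, and blue marks pairs present in both records.
    """
    a = bigram_bitmap(record_a)
    b = bigram_bitmap(record_b)
    return [
        [(255 * a[r][c], 255 * b[r][c], 255 * (a[r][c] & b[r][c]))
         for c in range(SIZE)]
        for r in range(SIZE)
    ]


def shared_fraction(image):
    """Fraction of lit pixels whose blue (shared) channel is set."""
    lit = sum(1 for row in image for px in row if px[0] or px[1])
    shared = sum(1 for row in image for px in row if px[2])
    return shared / lit if lit else 0.0
```

In a pipeline like the one the abstract describes, images of this kind would be fed to a trained image classification network that labels each pair as matched or non-matched. The `shared_fraction` helper is only a sanity check on the encoding, not a substitute for that classifier.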

Funder

Swinburne University of Technology

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences,Computer Science Applications,General Social Sciences

Link

https://link.springer.com/content/pdf/10.1007/s11192-023-04712-7.pdf
