Affiliation:
1. Kazan Federal University
Abstract
This article describes the development of an automated system for constructing knowledge graphs from collections of mathematical documents in LaTeX format. The MathCollectionOntology, which defines the types of objects and relationships in the knowledge graphs, was developed. The toolkit introduced here includes methods for extracting mathematical terms, identifying and browsing document topics, extracting entities from LaTeX code, and calculating statistical parameters of the graph. The extracted entities include mathematical terms, topics generated with Latent Dirichlet Allocation (LDA), UDC codes, formulas used, author affiliations, cited literature, and others. Each extracted object is recorded in the knowledge graph through the specific relationship types defined in the MathCollectionOntology. A knowledge graph was constructed for a collection of articles published in the journal Izvestiya VUZov. Matematika (1114 Russian-language documents in LaTeX format). The thematic terms of the document topics were described, and the quantitative parameters of the constructed knowledge graph were obtained.
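The entity-extraction step summarized above can be illustrated with a minimal sketch. It is not the authors' toolkit: the LaTeX macros (`\udk`, `\affiliation`), the relation names (`hasUDC`, `hasAffiliation`, `cites`, `usesFormula`), and the sample document are all assumptions made for illustration, since the abstract does not list the actual MathCollectionOntology vocabulary. The sketch pulls a few fields out of LaTeX source with regular expressions, stores them as (subject, relation, object) triples, and counts triples per relation as a simple stand-in for the graph statistics.

```python
import re
from collections import Counter

# Hypothetical sample document; the macro names are assumptions, not the journal's actual style file.
SAMPLE_TEX = r"""
\udk{517.95}
\title{On a boundary value problem}
\author{A. Ivanov}
\affiliation{Kazan Federal University}
\begin{equation} u_t = u_{xx} \end{equation}
As shown in \cite{smith2010,lee2015}, the problem is well posed.
"""

# Hypothetical relation names mapped to patterns for single-argument macros.
FIELD_PATTERNS = {
    "hasUDC": re.compile(r"\\udk\{([^}]*)\}"),
    "hasTitle": re.compile(r"\\title\{([^}]*)\}"),
    "hasAuthor": re.compile(r"\\author\{([^}]*)\}"),
    "hasAffiliation": re.compile(r"\\affiliation\{([^}]*)\}"),
}


def extract_triples(doc_id, tex):
    """Return (subject, relation, object) triples extracted from LaTeX source."""
    triples = []
    # Simple metadata macros: one triple per occurrence.
    for relation, pattern in FIELD_PATTERNS.items():
        for value in pattern.findall(tex):
            triples.append((doc_id, relation, value.strip()))
    # A single \cite may hold several comma-separated keys.
    for keys in re.findall(r"\\cite\{([^}]*)\}", tex):
        for key in keys.split(","):
            triples.append((doc_id, "cites", key.strip()))
    # Displayed formulas inside equation environments.
    equation_re = r"\\begin\{equation\}(.*?)\\end\{equation\}"
    for body in re.findall(equation_re, tex, flags=re.DOTALL):
        triples.append((doc_id, "usesFormula", body.strip()))
    return triples


def relation_counts(triples):
    """Count triples per relation type, a basic quantitative parameter of the graph."""
    return Counter(relation for _, relation, _ in triples)


triples = extract_triples("doc1", SAMPLE_TEX)
counts = relation_counts(triples)
```

Regular expressions of this kind only handle macros without nested braces; a real collection would likely need a proper LaTeX parser and the ontology's own relation identifiers.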