Affiliation:
1. Bar Ilan University, Dept. of Information Science, Israel
Abstract
This paper presents a new methodology for AI‐based research and exploration of large genealogical corpora. The proposed approach is based on an automatic quantitative question‐answering (QA) system that enables researchers to ask questions in natural language and learn about trends related to individuals, families, and communities in the corpus under study. The proposed methodology includes: 1) an automatic method for training dataset generation, 2) a transformer‐based table selection method, and 3) an optimized transformer‐based quantitative QA model. The findings indicate that the proposed architecture outperforms state‐of‐the‐art models, achieving 87% accuracy on a large corpus of Jewish genealogical data. This study may have practical implications for genealogical information centers and museums, making genealogical data research easy and scalable for experts as well as the general public.
Subject
Library and Information Sciences, General Computer Science