Contribution Analysis of Large Language Models and Data Augmentations for Person Names in Solving Legal Bar Examination at COLIEE 2023

Published:2024-03-08 Issue:1 Volume:18 Page:123-143
ISSN:2523-3173
Container-title:The Review of Socionetwork Strategies
language:en
Short-container-title:Rev Socionetwork Strat

Author:

Onaga Takaaki,Fujita Masaki,Kano Yoshinobu

Abstract

This paper describes our system for COLIEE 2023 Task 4, which automatically answers Japanese legal bar exam problems. We propose an extension to our previous system in COLIEE 2022, which achieved the highest accuracy among all submissions using data augmentation. We focus on problems that include mentions of person names. In this paper, we present two main contributions. First, we incorporate LUKE as our deep learning component, which is a named entity recognition model trained on RoBERTa. Second, we fine-tune the pretrained LUKE model in multiple ways, comparing fine-tuning on training datasets that include alphabetical person names and ensembling different fine-tuning models. We confirmed that LUKE and its fine-tuned model on person type problems improve their accuracies. Our formal run results show that LUKE and our fine-tuning approach using alphabetical person names were effective, achieving an accuracy of 0.69 in the COLIEE 2023 Task 4 formal run.

Funder

KAKENHI

Secom Science and Technology Foundation

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s12626-024-00155-5.pdf

Reference: 19 articles.

1. Competition on legal information extraction/entailment (COLIEE-14), workshop on juris-informatics (JURISIN) 2014 (2014). http://webdocs.cs.ualberta.ca/miyoung2/jurisin_task/index.html

2. Devlin, J., Chang, M.W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis, Minnesota. https://doi.org/10.18653/v1/N19-1423. https://aclanthology.org/N19-1423

3. Fujita, M., Onaga, T., Ueyama, A., & Kano, Y. (2022). Legal textual entailment using ensemble of rule based and BERT based method with data augmentation by related article generation. In: Proceedings of the Sixteenth International Workshop on Jurisinformatics (JURISIN 2022), pp. 84–97.

4. Hoshino, R., Kiyota, N., & Kano, Y. (2019). Question answering system for legal bar examination using predicate argument structures focusing on exceptions. In: Proceedings of the Sixth International Competition on Legal Information Extraction/Entailment (COLIEE), pp. 38–42.

5. Kano, Y., Kim, M.Y., Goebel, R., & Satoh, K. (2017). Overview of COLIEE 2017. In K. Satoh, M.Y. Kim, Y. Kano, R. Goebel, T. Oliveira (Eds.) COLIEE 2017. 4th Competition on Legal Information Extraction and Entailment, EPiC Series in Computing, vol. 47, pp. 1–8. EasyChair. https://doi.org/10.29007/fm8f. https://easychair.org/publications/paper/Fglr