Contribution Analysis of Large Language Models and Data Augmentations for Person Names in Solving Legal Bar Examination at COLIEE 2023

Author:

Onaga Takaaki, Fujita Masaki, Kano Yoshinobu

Abstract

This paper describes our system for COLIEE 2023 Task 4, which automatically answers Japanese legal bar exam problems. The system extends our COLIEE 2022 system, which used data augmentation and achieved the highest accuracy among all submissions. We focus on problems that mention person names. We present two main contributions. First, we incorporate LUKE, a named entity recognition model trained on RoBERTa, as our deep learning component. Second, we fine-tune the pretrained LUKE model in multiple ways, comparing fine-tuning on training datasets that include alphabetical person names and ensembles of differently fine-tuned models. We confirmed that LUKE and its model fine-tuned on person-type problems improve accuracy. Our formal run results show that LUKE and our fine-tuning approach using alphabetical person names were effective, achieving an accuracy of 0.69 in the COLIEE 2023 Task 4 formal run.

Funder

KAKENHI

Secom Science and Technology Foundation

Publisher

Springer Science and Business Media LLC

