Building a Production-Ready Multi-Label Classifier for Legal Documents with Digital-Twin-Distiller

Published: 2022-01-29 Issue: 3 Volume: 12 Page: 1470
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Csányi Gergely Márk, Vági Renátó, Nagy Dániel, Üveges István, Vadász János Pál, Megyeri Andrea, Orosz Tamás

Abstract

One of the most time-consuming parts of an attorney’s job is finding similar legal cases. Categorizing legal documents by subject matter can significantly increase the discoverability of digitized court decisions. This is a multi-label classification problem, in which each relatively long text can belong to more than one legal category. This paper presents a solution in which the multi-label classification problem is decomposed into more than a hundred binary classification problems. Several approaches were tested, including different machine-learning and text-augmentation techniques, to produce a practically applicable model. The proposed models and methodologies were encapsulated and deployed as a digital twin in a production environment. The performance of the resulting machine-learning-based application matches, and could also improve on, human experts’ performance on this monotonous and labor-intensive task. It could increase the e-discoverability of the documents by about 50%.
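
The decomposition described in the abstract, one independent binary classifier per legal category, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny bag-of-words perceptron is only a stand-in for whichever binary models the paper actually evaluates, the names `BinaryPerceptron`, `train_binary_relevance`, and `predict_labels` are hypothetical, and the four toy documents and two labels are invented rather than taken from the paper's data.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately naive assumption)."""
    return text.lower().split()


class BinaryPerceptron:
    """Tiny bag-of-words perceptron; a stand-in for any binary classifier."""

    def __init__(self, epochs=10):
        self.epochs = epochs
        self.weights = {}
        self.bias = 0.0

    def score(self, tokens):
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokens)

    def fit(self, docs, targets):
        for _ in range(self.epochs):
            for tokens, y in zip(docs, targets):
                pred = 1 if self.score(tokens) > 0 else 0
                if pred != y:
                    # Standard perceptron update on a misclassified document.
                    step = 1.0 if y == 1 else -1.0
                    for t in tokens:
                        self.weights[t] = self.weights.get(t, 0.0) + step
                    self.bias += step
        return self

    def predict(self, tokens):
        return self.score(tokens) > 0


def train_binary_relevance(texts, label_sets):
    """Train one independent binary model per label."""
    docs = [tokenize(t) for t in texts]
    labels = sorted(set().union(*label_sets))
    models = {}
    for label in labels:
        # Each label becomes its own yes/no problem over the same documents.
        targets = [1 if label in s else 0 for s in label_sets]
        models[label] = BinaryPerceptron().fit(docs, targets)
    return models


def predict_labels(models, text):
    """A document receives every label whose binary model fires."""
    tokens = tokenize(text)
    return {label for label, model in models.items() if model.predict(tokens)}


# Invented toy corpus: two hypothetical categories, one unlabeled document.
texts = [
    "tenant lease eviction dispute",
    "employee dismissal wage claim",
    "tenant employee lease wage dispute",
    "traffic fine appeal",
]
label_sets = [{"housing"}, {"labor"}, {"housing", "labor"}, set()]

models = train_binary_relevance(texts, label_sets)
print(predict_labels(models, "tenant employee lease wage dispute"))
```

Because each label is trained in isolation, a document can receive zero, one, or several labels, and any single label's model can be retrained or replaced without touching the others. With more than a hundred categories, as in the abstract, the same loop would simply produce more than a hundred such models.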

Funder

National Research, Development and Innovation Office

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/12/3/1470/pdf

References: 60 articles.

1. Multi-label legal document classification: A deep learning-based approach with label-attention and domain-specific pre-training

2. Long-length Legal Document Classification; Wan; arXiv, 2019

3. Text summarization from legal documents: a survey

Cited by 5 articles.

1. Toward Sustainable Development: Exploring the Value and Benefits of Digital Twins; Telecom; 2024-08-12

2. How Could Semantic Processing and Other NLP Tools Improve Online Legal Databases?; TalTech Journal of European Studies; 2023-12-01

3. Can Triplet Loss Be Used for Multi-Label Few-Shot Classification? A Case Study; Information; 2023-09-23

4. Multi-Label Quantification; ACM Transactions on Knowledge Discovery from Data; 2023-08-10

5. Explainable machine learning multi-label classification of Spanish legal judgements; Journal of King Saud University - Computer and Information Sciences; 2022-11