Abstract
Legal professionals need an automatic and convenient legal document recommendation system (LDRS) that identifies similar judgments, so that they can prepare advantageous and strategic arguments in court. Doc2Vec learns a semantically rich embedding (i.e., vector) space from the textual information in a judgment corpus. During Doc2Vec learning, the use of prior domain-specific knowledge can potentially enhance the embedding representation. This research therefore proposes a pre-learned word embedding based LDRS (P-LDRS) that learns the Doc2Vec embedding using Legal domain-specific pre-learned word embeddings, which carry Legal semantic knowledge. However, learning judgment embeddings from the substantial body of existing Legal documents poses a scalability issue for Doc2Vec. To address it, the proposed P-LDRS also provides functionality to learn the judgment embeddings in a distributed manner over a cluster of computing nodes, using frameworks such as MapReduce and Spark. An empirical analysis is performed with a non-distributed and a distributed variant of the proposed P-LDRS to validate its effectiveness and scalability. The experimental results show that the proposed non-distributed P-LDRS performs significantly better than a traditional Doc2Vec based LDRS, with an Accuracy of 0.88, F1-Score of 0.82 and MCC Score of 0.73. They also demonstrate that the proposed distributed P-LDRS improves time efficiency and achieves a stable Accuracy of ≈ 0.88, F1-Score of ≈ 0.83 and MCC Score of ≈ 0.72 as the number of nodes increases.
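For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how Accuracy, F1-Score and the Matthews correlation coefficient (MCC) are conventionally computed from binary confusion-matrix counts. It assumes a binary similar/not-similar evaluation, which the abstract does not state explicitly; the function name `binary_metrics` and its parameters are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, f1, mcc) from binary confusion-matrix counts.

    Hypothetical helper for illustration only; the paper's own
    evaluation code and data are not available here.
    """
    total = tp + fp + fn + tn
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / total if total else 0.0

    # F1: harmonic mean of precision and recall, written in count form.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC: correlation between predicted and true labels, in [-1, 1].
    # By convention it is set to 0 when the denominator is zero.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return accuracy, f1, mcc
```

Unlike Accuracy and F1, MCC uses all four confusion-matrix cells, which is one reason it is often reported alongside them when classes are imbalanced.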
Publisher
Springer Science and Business Media LLC
Subject
General Earth and Planetary Sciences, General Environmental Science
Cited by
10 articles.