A Big Data Pipeline and Machine Learning for Uniform Semantic Representation of Data and Documents From IT Systems of the Italian Ministry of Justice

Published:2022-06-29 Issue:1 Volume:14 Page:1-31
ISSN:1938-0259
Container-title:International Journal of Grid and High Performance Computing
language:en

Author:

Di Martino Beniamino¹, Colucci Cante Luigi², D'Angelo Salvatore³, Esposito Antonio⁴, Graziano Mariangela², Marulli Fiammetta⁵, Lupi Pietro⁶, Cataldi Alessandra⁷

Affiliation:

1. Dept. of Engineering, Università della Campania “Luigi Vanvitelli”, Italy & Dept. of Computer Science, University of Vienna, Austria & Dept. of Computer Science and Information Engineering, Asia University, Taiwan & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

2. Dept. of Engineering, Università della Campania “Luigi Vanvitelli”, Italy

3. Dept. of Engineering, Università della Campania “Luigi Vanvitelli”, Italy & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

4. Università della Campania “Luigi Vanvitelli”, Italy & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

5. Dept. of Mathematics, Università della Campania “Luigi Vanvitelli”, Italy & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

6. Tribunale di Napoli, Ministero della Giustizia, Italy

7. Directorate General for Automated Information Systems (DGSIA), Ministero della Giustizia, Italy

Abstract

This paper presents a Big Data pipeline that handles both structured and unstructured data made available by the Italian Ministry of Justice regarding its Telematic Civil Process. The complexity and volume of the data provided by the Ministry require Big Data analysis techniques, in concert with Machine Learning and Deep Learning frameworks, in order to analyse the data correctly and to obtain meaningful information that could support the Ministry in better managing Civil Processes. The pipeline has two main objectives: to provide a consistent workflow of activities to be applied to the incoming data, aimed at extracting information useful for the Ministry's decision-making tasks; and to homogenize the incoming data so that they can be stored in a centralized and coherent Datalake to be used as a reference for further analysis.

Publisher

IGI Global

Subject

Computer Networks and Communications

Reference: 26 articles.

1. Deep learning applications and challenges in big data analytics;M. M. Najafabadi;Journal of Big Data,2015

2. Aprosio, A. P., & Moretti, G. (2016). Italy goes to Stanford: a collection of CoreNLP modules for Italian. arXiv preprint arXiv:1609.06204.

3. An associative engines based approach supporting collaborative analytics in the Internet of Cultural Things;P. Benedusi;Proceedings of the 3rd International Workshop on Cloud and Distributed System Application and the 10th International 3PGCIC-2015 Conference,2015

4. Enriching Word Vectors with Subword Information

5. Inducing Relational Knowledge from BERT

Cited by 14 articles.

1. A semantic-based methodology for the management of document workflows in e-government: a case study for judicial processes;Knowledge and Information Systems;2024-03-14

2. Towards a Semantic Annotation Software Design for Images and Texts;Lecture Notes on Data Engineering and Communications Technologies;2024

3. Semantic, Business Process and Natural Language Processing for eBuilding;Lecture Notes on Data Engineering and Communications Technologies;2024

4. Text Annotation Tools: A Comprehensive Review and Comparative Analysis;Lecture Notes on Data Engineering and Communications Technologies;2024

5. Towards a Methodology for Comparing Legal Texts Based on Semantic, Storytelling and Natural Language Processing;Lecture Notes on Data Engineering and Communications Technologies;2024