A Big Data Pipeline and Machine Learning for Uniform Semantic Representation of Data and Documents From IT Systems of the Italian Ministry of Justice

Author:

Di Martino Beniamino (1), Colucci Cante Luigi (2), D'Angelo Salvatore (3), Esposito Antonio (4), Graziano Mariangela (2), Marulli Fiammetta (5), Lupi Pietro (6), Cataldi Alessandra (7)

Affiliation:

1. Dept. of Engineering, Università della Campania "Luigi Vanvitelli", Italy & Dept. of Computer Science, University of Vienna, Austria & Dept. of Computer Science and Information Engineering, Asia University, Taiwan & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

2. Dept. of Engineering, Università della Campania “Luigi Vanvitelli”, Italy

3. Dept. of Engineering, Università della Campania "Luigi Vanvitelli", Italy & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

4. Università della Campania “Luigi Vanvitelli”, Italy & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

5. Dept. of Mathematics, Università della Campania “Luigi Vanvitelli”, Italy & Consorzio Interuniversitario Nazionale per l'Informatica, Italy

6. Tribunale di Napoli, Ministero della Giustizia, Italy

7. Directorate General for Automated Information Systems (DGSIA), Ministero della Giustizia, Italy

Abstract

This paper presents a Big Data Pipeline that handles both structured and unstructured data made available by the Italian Ministry of Justice concerning its Telematic Civil Process. The complexity and volume of the data provided by the Ministry require Big Data analysis techniques, combined with Machine Learning and Deep Learning frameworks, in order to analyse the data correctly and obtain meaningful information that could support the Ministry in better managing civil processes. The Pipeline has two main objectives: first, to provide a consistent workflow of activities to be applied to the incoming data, aimed at extracting information useful for the Ministry's decision-making tasks; second, to homogenize the incoming data so that they can be stored in a centralized and coherent data lake that serves as a reference for further analysis.

Publisher

IGI Global

Subject

Computer Networks and Communications

Cited by 9 articles.
