Affiliation:
1. Politecnico Di Milano, Milano, Italy
Abstract
The article presents a textual Big Data analytics solution developed in a real setting as a part of a high-capacity document digitization and storage system. A software based on machine learning techniques performs automated extraction and processing of textual contents. The work focuses on performance and data confidence evaluation and describes the approach to computing a set of indicators for textual data quality. It then presents experimental results.
Funder
Seamless Project
Regione Lombardia
EU Horizon 2020 Research and Innovation Programme
Working Age
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design,Computer Science Applications,Information Systems,Conservation
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献