Ontology-Based Mapping for Automated Document Management

Published:2015-04-03 Issue:1 Volume:6 Page:1-22
ISSN:2158-656X
Container-title:ACM Transactions on Management Information Systems
language:en
Short-container-title:ACM Trans. Manage. Inf. Syst.

Author:

Lee Yen-Hsien¹, Hu Paul Jen-Hwa², Tu Ching-Yi³

Affiliation:

1. National Chiayi University

2. University of Utah

3. ASE Group Kaohsiung

Abstract

Document clustering is crucial to automated document management, especially for the fast-growing volume of textual documents available digitally. Traditional lexicon-based approaches depend on document content analysis and measure the overlap of the feature vectors representing different documents, which cannot effectively address word mismatch or ambiguity problems. Alternative query expansion and local context discovery approaches have been developed but suffer from limited efficiency and effectiveness, because the large number of expanded terms creates noise and increases the dimensionality and complexity of the overall feature space. Several techniques extend lexicon-based analysis by incorporating latent semantic indexing but produce less comprehensible clustering results and questionable performance. We instead propose a concept-based document representation and clustering (CDRC) technique and empirically examine its effectiveness using 433 articles concerning information systems and technology, randomly selected from a popular digital library. Our evaluation includes two widely used benchmark techniques and shows that CDRC outperforms them. Overall, our results reveal that clustering documents at an ontology-based, concept-based level is more effective than techniques using lexicon-based document features and can generate more comprehensible clustering results.
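
The word mismatch problem described in the abstract can be illustrated with a small sketch. This is not the CDRC technique itself, whose details are not given here. It only shows, under stated assumptions, how the overlap of lexical feature vectors drops to zero when two documents use different words for the same idea, and how mapping terms to shared concepts restores the overlap. The `TOY_CONCEPTS` table, the two sample documents, and all function names are hypothetical stand-ins for a real ontology lookup.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical term-to-concept table (assumption: stands in for an ontology).
TOY_CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle",
    "repair": "maintenance", "servicing": "maintenance",
}

def term_vector(text: str) -> Counter:
    """Lexicon-based representation: raw term counts."""
    return Counter(text.lower().split())

def concept_vector(text: str) -> Counter:
    """Concept-based representation: each term is replaced by its concept
    when the toy table knows it, otherwise the term is kept unchanged."""
    return Counter(TOY_CONCEPTS.get(tok, tok) for tok in text.lower().split())

doc_a = "car repair"
doc_b = "automobile servicing"

# No shared terms, so the lexical overlap is exactly zero.
lexical_sim = cosine(term_vector(doc_a), term_vector(doc_b))

# Both documents map to the same two concepts, so the similarity is close to 1.
concept_sim = cosine(concept_vector(doc_a), concept_vector(doc_b))

print(f"lexical: {lexical_sim:.3f}, concept: {concept_sim:.3f}")
```

In this toy setting the concept mapping also keeps the feature space small (two concepts instead of four terms), in contrast to the query expansion approaches the abstract criticizes for adding terms and dimensions.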

Funder

National Science Council Taiwan

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science,Management Information Systems

Link

https://dl.acm.org/doi/pdf/10.1145/2688488

References: 63 articles.

1. Local Feedback in Full-Text Retrieval Systems

2. Partitioning-based clustering for Web document categorization

3. Disambiguation by short contexts

Cited by 4 articles.

1. Review of ambiguity problem in text summarization using hybrid ACA and SLR;Intelligent Systems with Applications;2024-06

2. An Efficient Document Clustering Approach for Devising Semantic Clusters;Cybernetics and Systems;2023-02-11

3. Automating Research Data Management Using Machine-Actionable Data Management Plans;ACM Transactions on Management Information Systems;2022-06-30

4. Design of an Inclusive Financial Privacy Index (INF-PIE): A Financial Privacy and Digital Financial Inclusion Perspective;ACM Transactions on Management Information Systems;2021-03