Assisting the appraisal of e-mail records with automatic classification

Author:

Vellino André, Alberts Inge

Abstract

Purpose: This paper aims to investigate how automatic classification can assist employees and records managers with the appraisal of e-mails as records of value for the organization.

Design/methodology/approach: The study performed a qualitative analysis of the appraisal behaviours of eight records management experts and used it to train a series of support vector machine (SVM) classifiers to replicate the decision process for identifying e-mails of business value. Automatic classification experiments were performed on a corpus of 846 e-mails from two of these experts' mailboxes.

Findings: Despite the highly contextual nature of record value, these experiments show that classifiers can achieve a high degree of accuracy. Unlike existing manual practices in corporate e-mail archiving, machine classification models are not highly dependent on features such as the identity of the sender and receiver or on threading, forwarding or importance flags. Rather, the dominant discriminating features are textual features from the e-mail body and subject field.

Research limitations/implications: The need to automatically classify corporate e-mails is growing in importance, as e-mail remains one of the prevalent recordkeeping challenges.

Practical implications: Automated methods for identifying e-mail records promise to be of significant benefit to organizations that need to appraise e-mail for long-term preservation and access on demand.

Social implications: The research adopts an innovative approach to assist employees and records managers with the appraisal of digital records. By doing so, the research fosters new insights on the adoption of technological strategies to automate recordkeeping tasks, an important research gap.

Originality/value: Our experiments show that an SVM classifier can be trained to replicate an expert's decision process for identifying e-mails of business value with a reasonably high degree of accuracy. In principle, such a classifier could be integrated into a corporate Electronic Document and Records Management System (EDRMS) to improve the quality of e-mail records appraisal.
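
The approach described in the abstract, an SVM that relies on text from the subject and body rather than on sender, receiver or flags, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state the paper's feature extraction, kernel, parameters or tooling, the six toy e-mails are invented, and the trainer is a plain linear SVM fitted by subgradient descent on the hinge loss, not the authors' implementation.

```python
import re
from collections import Counter

# Invented toy e-mails (hypothetical data). Label +1 = record of business
# value, -1 = no business value.
EMAILS = [
    {"subject": "Contract approval", "body": "Signed contract attached; budget decision approved by director"},
    {"subject": "Policy decision", "body": "Final policy decision on records retention schedule"},
    {"subject": "Budget report", "body": "Quarterly budget report submitted to the committee"},
    {"subject": "Lunch today", "body": "Anyone want pizza for lunch"},
    {"subject": "Parking reminder", "body": "Please move your car from visitor parking"},
    {"subject": "Happy birthday", "body": "Cake in kitchen this afternoon"},
]
LABELS = [1, 1, 1, -1, -1, -1]


def featurize(email):
    """Bag-of-words counts from subject and body only (no sender or flags)."""
    text = email["subject"] + " " + email["body"]
    return Counter(re.findall(r"[a-z]+", text.lower()))


def score(w, x):
    """Linear decision function: dot product of weights and word counts."""
    return sum(w.get(term, 0.0) * count for term, count in x.items())


def train_linear_svm(emails, labels, epochs=20, eta=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss (no bias term)."""
    w = {}
    data = [(featurize(e), y) for e, y in zip(emails, labels)]
    for _ in range(epochs):
        for x, y in data:
            margin = y * score(w, x)
            # Regularisation step: shrink every weight slightly.
            for term in w:
                w[term] *= 1.0 - eta * lam
            # Hinge-loss step: only examples inside the margin update w.
            if margin < 1.0:
                for term, count in x.items():
                    w[term] = w.get(term, 0.0) + eta * y * count
    return w


def predict(w, email):
    """Return +1 (business value) or -1 (no business value)."""
    return 1 if score(w, featurize(email)) > 0 else -1


w = train_linear_svm(EMAILS, LABELS)

# Terms with the largest positive weights are the strongest indicators
# of business value in this toy model.
top_terms = sorted(w, key=w.get, reverse=True)[:5]
print(top_terms)
print(predict(w, {"subject": "Signed contract", "body": "Budget approved"}))
```

Inspecting the largest weights, as `top_terms` does, is one simple way to see which textual features dominate a linear model. A real experiment would use a much larger labelled corpus, held-out evaluation and an established SVM library rather than this hand-rolled trainer.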

Publisher

Emerald

Subject

Library and Information Sciences, Management Information Systems


Cited by 17 articles.