Abstract
Imagine returning from an excused absence, whether due to Covid-19 or a similar force majeure event, and immediately facing more than 300 unread emails. Being overwhelmed by email has become part of office workers’ daily routine. A substantial body of research has demonstrated effective methods to categorize email messages, detect potential harassment, and even send replies automatically. Even so, email remains an interesting type of text to analyze and still poses many challenges. After discussing the challenges inherent in the problem, this paper proposes a method that addresses one specific challenge: creating folders from incoming email messages and then classifying emails automatically. By combining basic methods, techniques, and algorithms, an intuitive program is developed that performs this task on a given public email dataset. The method is expected to open prospects for future investigation and for improvements in performance and robustness.
Publisher
Darcy & Roy Press Co. Ltd.