Improving Logging Prediction on Imbalanced Datasets

Author:

Lal Sangeeta (1), Sardana Neetu (1), Sureka Ashish (2)

Affiliation:

1. Jaypee Institute of Information Technology Noida, Department of CSE & IT, Noida, Uttar-Pradesh, India

2. ABB Corporate Research Center, Bangalore, India

Abstract

Deciding what to log is an important yet difficult decision for open-source software (OSS) developers. Machine-learning models are useful in improving several steps of OSS development, including logging. Several recent studies propose machine-learning models to predict logged code constructs. The prediction performance of these models is limited by the class-imbalance problem, since the number of logged code constructs is small compared with the number of non-logged code constructs. No previous study analyzes the class-imbalance problem for logged code construct prediction. First, the authors analyze the performance of the J48, RF, and SVM classifiers in predicting logged catch-blocks and if-blocks on imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, the LogIm model improves the performance of the baseline classifiers J48, RF, and SVM by 7.38%, 9.24%, and 4.6% for catch-block logging prediction, and by 12.11%, 14.95%, and 19.13% for if-block logging prediction.
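The abstract does not describe how LogIm combines its ensemble and its threshold, so the following is only a minimal sketch of the general idea of threshold tuning over an ensemble on imbalanced data, not the authors' implementation. It assumes that each base model outputs a probability that a code block is logged, that the ensemble simply averages those probabilities, and that the decision threshold is chosen to maximize F1 on the minority (logged) class. All function names and the toy numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def average_probabilities(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the positive-class probabilities produced by several models."""
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


def f1_score(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """F1 for the positive (logged) class; 0.0 when there are no true positives."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(
    probs: Sequence[float], labels: Sequence[int], candidates: Sequence[float]
) -> Tuple[float, float]:
    """Return the candidate threshold with the highest F1 (first one wins ties)."""
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        preds = [1 if p >= t else 0 for p in probs]
        score = f1_score(labels, preds)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


# Toy, made-up data: 6 non-logged blocks (0) and 2 logged blocks (1).
labels = [0, 0, 0, 0, 0, 0, 1, 1]
model_a = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.45, 0.6]
model_b = [0.2, 0.1, 0.3, 0.2, 0.1, 0.3, 0.35, 0.7]
model_c = [0.0, 0.3, 0.2, 0.1, 0.3, 0.2, 0.40, 0.5]

ensemble_probs = average_probabilities([model_a, model_b, model_c])
threshold, score = best_threshold(ensemble_probs, labels, [0.25, 0.35, 0.5])
print(threshold, score)
```

On this toy data, the conventional 0.5 cut-off misses one of the two logged blocks, while a lower cut-off recovers it without adding false positives. In practice the threshold would have to be chosen on held-out validation data rather than on the data used for evaluation.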

Publisher

IGI Global

Subject

Software

Cited by 6 articles.

1. An exploratory semantic analysis of logging questions;Journal of Software: Evolution and Process;2021-06-16

2. Logging Analysis and Prediction in Open Source Java Project;Research Anthology on Usage and Development of Open Source Software;2021

3. Three-level learning for improving cross-project logging prediction for if-blocks;Journal of King Saud University - Computer and Information Sciences;2019-10

4. A Three Dimensional Empirical Study of Logging Questions from Six Popular Q & A Websites;E-INFORMATICA;2019

5. Feature Selection Techniques to Counter Class Imbalance Problem for Aging Related Bug Prediction;Proceedings of the 11th Innovations in Software Engineering Conference;2018-02-09
