PR-Miner

Author:

Li Zhenmin¹, Zhou Yuanyuan¹

Affiliation:

1. University of Illinois at Urbana-Champaign, Urbana, IL

Abstract

Programs usually follow many implicit programming rules, most of which are too tedious to be documented by programmers. When these rules are violated by programmers who are unaware of or forget about them, defects can be easily introduced. Therefore, it is highly desirable to have tools to automatically extract such rules and also to automatically detect violations. Previous work in this direction focuses on simple function-pair based programming rules and additionally requires programmers to provide rule templates.

This paper proposes a general method called PR-Miner that uses a data mining technique called frequent itemset mining to efficiently extract implicit programming rules from large software code written in an industrial programming language such as C, requiring little effort from programmers and no prior knowledge of the software. Benefiting from frequent itemset mining, PR-Miner can extract programming rules in general forms (without being constrained by any fixed rule templates) that can contain multiple program elements of various types such as functions, variables and data types. In addition, we also propose an efficient algorithm to automatically detect violations to the extracted programming rules, which are strong indications of bugs.

Our evaluation with large software code, including Linux, PostgreSQL Server and the Apache HTTP Server, with 84K to 3M lines of code each, shows that PR-Miner can efficiently extract thousands of general programming rules and detect violations within 2 minutes. Moreover, PR-Miner has detected many violations to the extracted rules. Among the top 60 violations reported by PR-Miner, 16 have been confirmed as bugs in the latest version of Linux, 6 in PostgreSQL and 1 in Apache. Most of them violate complex programming rules that contain more than 2 elements and are thereby difficult for previous tools to detect. We reported these bugs and they are currently being fixed by developers.
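The idea in the abstract can be illustrated with a minimal Python sketch. It is not PR-Miner's algorithm: it uses brute-force enumeration instead of an efficient frequent itemset miner, and the function names, program elements, and thresholds (`min_support`, `min_confidence`, `max_size`) are all hypothetical values chosen for illustration. Each function is treated as a set of program elements; element sets that co-occur in many functions are taken as candidate rules, and a function that contains part of such a set but not the rest is reported as a possible violation.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset of up to max_size elements (brute force) and keep
    those that appear in at least min_support functions."""
    counts = {}
    for items in transactions.values():
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}

def find_violations(transactions, frequent, min_confidence):
    """For frequent sets A < B with support(B) / support(A) >= min_confidence,
    report functions that contain A but not all of B."""
    violations = []
    for b, support_b in frequent.items():
        for size in range(1, len(b)):
            for combo in combinations(sorted(b), size):
                a = frozenset(combo)
                support_a = frequent[a]  # subsets of a frequent set are frequent
                if support_a == support_b:
                    continue  # rule A => B is never broken in this code base
                confidence = support_b / support_a
                if confidence < min_confidence:
                    continue
                for name, items in transactions.items():
                    if a <= items and not b <= items:
                        violations.append(
                            (name, sorted(a), sorted(b - items), confidence)
                        )
    return sorted(violations)

# Hypothetical input: function name -> set of program elements it uses.
transactions = {
    "f1": {"lock", "update_count", "unlock"},
    "f2": {"lock", "update_count", "unlock"},
    "f3": {"lock", "update_count", "unlock", "log"},
    "f4": {"lock", "update_count", "unlock"},
    "f5": {"lock", "update_count"},  # never releases the lock
}

frequent = frequent_itemsets(transactions, min_support=4)
violations = find_violations(transactions, frequent, min_confidence=0.8)

for name, present, missing, confidence in violations:
    print(f"{name}: has {present}, missing {missing} (confidence {confidence:.2f})")
```

In this toy input only `f5` is flagged, because it uses `lock` and `update_count` without the `unlock` that accompanies them in the other four functions. Because the rules are plain element sets rather than fixed function pairs, the same code handles rules with three or more elements, which is the generality the abstract emphasizes.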

Publisher

Association for Computing Machinery (ACM)

References: 31 articles.

1. A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. 1986.

2. Mining specifications

3. Automatic generation of invariants and intermediate assertions

Cited by 88 articles.

1. API Misuse Detection via Probabilistic Graphical Model;Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis;2024-09-11

2. Towards a Block-Level ML-Based Python Vulnerability Detection Tool;Acta Cybernetica;2024-07-22

3. Fuzzing API Error Handling Behaviors using Coverage Guided Fault Injection;Proceedings of the 19th ACM Asia Conference on Computer and Communications Security;2024-07

4. Boosting API Misuse Detection via Integrating API Constraints from Multiple Sources;Proceedings of the 21st International Conference on Mining Software Repositories;2024-04-15

5. ASKDetector: An AST-Semantic and Key Features Fusion based Code Comment Mismatch Detector;Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension;2024-04-15
