Abstract
Jupyter notebooks have emerged as the predominant tool for data scientists to develop and share machine learning solutions, primarily using Python as the programming language. Despite their widespread adoption, a significant fraction of these notebooks, when shared on public repositories, suffer from insufficient documentation and a lack of coherent narrative. These shortcomings compromise the readability and understandability of the notebooks. To address this, this paper introduces HeaderGen, a tool-based approach that automatically augments code cells in these notebooks with descriptive markdown headers derived from a predefined taxonomy of machine learning operations. Additionally, it systematically classifies and displays function calls in line with this taxonomy. The mechanism that powers HeaderGen is an enhanced call graph analysis technique that builds upon the foundational analysis available in PyCG. To improve precision, HeaderGen extends PyCG's analysis with return-type resolution of external function calls, type inference, and flow-sensitivity. Furthermore, leveraging type information, HeaderGen employs pattern matching on the code syntax to annotate code cells. We conducted an empirical evaluation on 15 real-world Jupyter notebooks sourced from Kaggle. The results indicate high accuracy in call graph analysis, with a precision of 95.6% and a recall of 95.3%. Header generation achieves a precision of 85.7% and a recall of 92.8% with respect to headers created manually by experts. A user study corroborated the practical utility of HeaderGen, revealing that users found it helpful in tasks related to comprehension and navigation. To further evaluate the type inference capability of static analysis tools, we introduce TypeEvalPy, a framework for evaluating type inference tools for Python with a built-in micro-benchmark containing 154 code snippets and 845 type annotations in the ground truth. Our comparative analysis of four tools revealed that HeaderGen outperforms the other tools in exact matches with the ground truth.
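The cell-annotation step described above can be illustrated with a minimal sketch. Note that this is an assumption-laden simplification: the taxonomy entries, the function `annotate_cell`, and the matching rules below are hypothetical stand-ins for HeaderGen's actual taxonomy and its call-graph-backed analysis, which additionally resolves return types and call targets rather than matching on names alone.

```python
import ast

# Hypothetical, heavily simplified taxonomy of ML operations.
# HeaderGen's real taxonomy and matching are far more elaborate.
TAXONOMY = {
    "read_csv": "Data Loading",
    "fillna": "Data Preparation",
    "fit": "Model Training",
    "predict": "Prediction",
}

def annotate_cell(cell_source: str) -> str:
    """Return a markdown header for a code cell based on the calls it makes."""
    tree = ast.parse(cell_source)
    labels = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Handle both attribute calls (df.fillna) and plain names (fit).
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", "")
            label = TAXONOMY.get(name)
            if label and label not in labels:
                labels.append(label)
    return "## " + ", ".join(labels) if labels else "## Uncategorized"

header = annotate_cell("df = pd.read_csv('train.csv')\ndf = df.fillna(0)")
print(header)  # → ## Data Loading, Data Preparation
```

A purely syntactic match like this mislabels calls when different libraries reuse method names (e.g. `fit` on a scaler versus a model), which is precisely why HeaderGen augments the analysis with type inference and return-type resolution.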
Publisher
Springer Science and Business Media LLC