MEM and MEM4PP: New Tools Supporting the Parallel Generation of Critical Metrics in the Evaluation of Statistical Models

Published: 2022-10-12 Issue: 10 Volume: 11 Page: 549
ISSN: 2075-1680
Container-title: Axioms
Language: en
Short-container-title: Axioms

Author:

Homocianu Daniel, Tîrnăucă Cristina

Abstract

This paper describes MEM and MEM4PP, two new Stata tools (commands). They support the automatic reporting and selection of the best regression and classification models by adding supplemental performance metrics based on statistical post-estimation and custom computation. In particular, MEM provides the following metrics: the maximum acceptable variance inflation factor (maxAcceptVIF) together with the maximum computed variance inflation factor (maxComputVIF) for ordinary least squares (OLS) regression; the maximum absolute value of the correlation coefficient in the predictors' correlation matrix (maxAbsVPMCC); the area under the receiver operating characteristic curve (AUC-ROC); the p-value and chi-squared statistic of the goodness-of-fit (GOF) test for logit and probit; and the maximum probability thresholds (maxProbNlogPenultThrsh and maxProbNlogLastThrsh) from Zlotnik and Abraira risk-prediction nomograms (nomolog) for logistic regressions. MEM also identifies the list of variables automatically when run after most regression commands. After simple successive invocations of MEM (in a .do file acting as a batch file), the collected results are printed to the console or exported to designated files (one .csv for all models in a batch). MEM4PP is the parallel-processing version of MEM. It starts from the same batch (the same .do file, with its path provided as a parameter) and launches separate instances of Stata that generate the same results in parallel (one .csv for each model in a batch). The paper also includes examples using real-world data from the World Values Survey (covering 1981 to 2020, version 1.6). These examples show how MEM and MEM4PP support testing predictor independence, checking for reverse causality, selecting the best model based on these metrics, and, ultimately, replicating all these steps.
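
As an illustration of the two collinearity metrics named in the abstract, the sketch below computes a maximum variance inflation factor and a maximum absolute pairwise correlation for a small set of predictors. It is not MEM's implementation: it is written in Python rather than Stata, the function names (`max_vif`, `max_abs_corr`) and the sample data are hypothetical, and it relies on the standard identity that the VIF of predictor j equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The maximum acceptable VIF (maxAcceptVIF) is not reproduced here, since the abstract does not define how it is derived.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlation matrix of the predictors (one sequence per predictor)."""
    k = len(columns)
    return [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (pure Python)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def max_abs_corr(columns):
    """Largest absolute off-diagonal entry of the correlation matrix."""
    corr = correlation_matrix(columns)
    k = len(corr)
    return max(abs(corr[i][j]) for i in range(k) for j in range(i + 1, k))


def max_vif(columns):
    """Largest VIF, read from the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return max(inv[i][i] for i in range(len(inv)))


# Hypothetical predictor columns, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
x3 = [1.0, 0.0, 0.0, 1.0, 0.0, 1.0]

print("max |r|:", max_abs_corr([x1, x2, x3]))
print("max VIF:", max_vif([x1, x2, x3]))
```

With only two predictors, the largest VIF reduces to 1 / (1 - r^2), where r is their correlation, which gives a quick sanity check on the inversion route.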

Publisher

MDPI AG

Subject

Geometry and Topology, Logic, Mathematical Physics, Algebra and Number Theory, Analysis

Link

https://www.mdpi.com/2075-1680/11/10/549/pdf

References (48 articles)

1. Markdoc: Literate Programming in Stata

2. Insights into the area under the receiver operating characteristic curve (AUC) as a discrimination measure in species distribution modelling

3. A chi-square goodness-of-fit test for continuous distributions against a known alternative

4. Multi-collinearity in Regression Analyses Conducted in Epidemiologic Studies; Vatcheva; Epidemiology, 2016

5. Introduction to Panel Data, Multiple Regression Method, and Principal Components Analysis Using Stata: Study on the Determinants of Executive Compensation—A Behavioral Approach Using Evidence from Chinese Listed Firms; Gao, 2019

Cited by 3 articles.

1. Life Satisfaction: Insights from the World Values Survey; Societies; 2024-07-15

2. Pairwise Collinearity Detection Using Parallel Algorithms: Preliminary Details; SSRN Electronic Journal; 2024

3. Exploring the Predictors of Co-Nationals’ Preference over Immigrants in Accessing Jobs—Evidence from World Values Survey; Mathematics; 2023-02-03