A Review of the F-Measure: Its History, Properties, Criticism, and Alternatives

Published:2023-10-06 Issue:3 Volume:56 Page:1-24
ISSN:0360-0300
Container-title:ACM Computing Surveys
language:en
Short-container-title:ACM Comput. Surv.

Author:

Christen Peter¹, Hand David J.², Kirielle Nishadi¹

Affiliation:

1. The Australian National University, Australia

2. Imperial College London, UK

Abstract

Methods to classify objects into two or more classes are at the core of various disciplines. When a set of objects with their true classes is available, a supervised classifier can be trained and employed to decide if, for example, a new patient has cancer or not. The choice of performance measure is critical in deciding which supervised method to use in any particular classification problem. Different measures can lead to very different choices, so the measure should match the objectives. Many performance measures have been developed, and one of them is the F-measure, the harmonic mean of precision and recall. Originally proposed in information retrieval, the F-measure has gained increasing interest in the context of classification. However, the rationale underlying this measure appears weak, and unlike other measures, it does not have a representational meaning. The use of the harmonic mean also has little theoretical justification. The F-measure also stresses one class, which seems inappropriate for general classification problems. We provide a history of the F-measure and its use in computational disciplines, describe its properties, and discuss criticism of the F-measure. We conclude with alternatives to the F-measure, and recommendations on how to use it effectively.
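The abstract's definition of the F-measure as the harmonic mean of precision and recall can be illustrated with a minimal sketch. The function name `f_measure` and the example counts below are illustrative assumptions and are not taken from the paper; the formulas are the standard definitions of precision, recall, and their harmonic mean.

```python
def f_measure(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) from binary confusion counts.

    tp, fp, fn are true positives, false positives, and false negatives.
    True negatives do not appear anywhere in the calculation, which is one
    sense in which the F-measure stresses a single (positive) class.
    """
    # Guard against empty denominators; returning 0.0 here is a convention
    # chosen for this sketch, not something prescribed by the paper.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical counts, chosen only for illustration.
p, r, f = f_measure(tp=80, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

With these hypothetical counts, precision is 80/100 and recall is 80/120, and the harmonic mean is algebraically equal to 2·tp / (2·tp + fp + fn) = 160/220. Adding any number of true negatives to the data would leave all three values unchanged.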

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3606367

References: 60 articles.

1. Frequency-tuned salient region detection

2. Evaluation of some coefficients for use in numerical taxonomy of microorganisms;Austin Brian;International Journal of Systematic and Evolutionary Microbiology,1977

3. Thomas Benton. 2001. Theoretical and empirical models. Ph.D. Dissertation. Department of Mathematics, Imperial College, London.

4. (Almost) all of entity resolution

Cited by 22 articles.

1. Deep learning model for optimizing control and planning in stochastic manufacturing environments;Expert Systems with Applications;2024-12

2. Detection and comparison of reversible shape transformations in responsive polymers using deep learning and knowledge transfer by identifying stimulus-triggering characteristic points;Engineering Applications of Artificial Intelligence;2024-10

3. A grid-wise approach for accurate computation of Standardized Runoff Index (SRI);Science of The Total Environment;2024-10

4. NCBench: providing an open, reproducible, transparent, adaptable, and continuous benchmark approach for DNA-sequencing-based variant calling;F1000Research;2024-09-12

5. Predicting eye-tracking assisted web page segmentation;Multimedia Tools and Applications;2024-09-09