A Unified Definition of Mutual Information with Applications in Machine Learning

Published:2015 Volume:2015 Page:1-12
ISSN:1024-123X
Container-title:Mathematical Problems in Engineering
language:en
Short-container-title:Mathematical Problems in Engineering

Author:

Zeng Guoping¹

Affiliation:

1. Elevate, 4150 International Plaza, Fort Worth, TX 76109, USA

Abstract

There are various definitions of mutual information. Essentially, these definitions can be divided into two classes: (1) definitions with random variables and (2) definitions with ensembles. However, there are some mathematical flaws in these definitions. For instance, Class 1 definitions either neglect the probability spaces or assume the two random variables have the same probability space. Class 2 definitions redefine marginal probabilities from the joint probabilities. In fact, the marginal probabilities are given by the ensembles and should not be redefined from the joint probabilities. Both Class 1 and Class 2 definitions assume that a joint distribution exists. Yet they all ignore the important fact that the joint distribution, or joint probability measure, is not unique. In this paper, we first present a new unified definition of mutual information that covers all the various definitions and fixes their mathematical flaws. Our idea is to define the joint distribution of two random variables by taking the marginal probabilities into consideration. Next, we establish some properties of the newly defined mutual information. We then propose a method to calculate mutual information in machine learning. Finally, we apply our newly defined mutual information to credit scoring.
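The non-uniqueness point in the abstract can be illustrated with a minimal sketch. It uses the standard textbook formula for discrete mutual information, I(X;Y) = sum over (x, y) of p(x,y) log[p(x,y) / (p(x) p(y))], and not the unified definition proposed in the paper. The function name `mutual_information` and the two example tables are assumptions made for illustration only. Both tables have the same marginals (0.5, 0.5), yet they give different mutual information, which is why the marginals alone do not determine the joint distribution.

```python
import math


def mutual_information(joint):
    """Standard discrete mutual information I(X;Y) in nats.

    `joint` is a nested list where joint[i][j] = P(X = i, Y = j).
    This is the textbook formula, not the paper's unified definition.
    """
    # Marginals obtained by summing rows (for X) and columns (for Y).
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]

    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # terms with zero probability contribute nothing
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


# Two hypothetical joint distributions with identical marginals (0.5, 0.5).
independent = [[0.25, 0.25],
               [0.25, 0.25]]  # X and Y independent
coupled = [[0.5, 0.0],
           [0.0, 0.5]]        # X and Y always equal

mi_independent = mutual_information(independent)
mi_coupled = mutual_information(coupled)

print(f"independent joint: I(X;Y) = {mi_independent:.4f} nats")
print(f"coupled joint:     I(X;Y) = {mi_coupled:.4f} nats")
```

Under these assumptions the independent table gives zero mutual information and the coupled table gives log 2 nats, even though both share the same marginal probabilities.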

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2015/201874.pdf

Reference: 20 articles.

1. Application of the mutual information criterion for feature selection in computer-aided diagnosis

2. A Mathematical Theory of Communication

Cited by 25 articles.

1. A Novel Multivariate Feature Ranking Method for Incomplete Categorical Data Based on Weighted Frequency;2024 16th International Conference on Computer and Automation Engineering (ICCAE);2024-03-14

2. Bioclimatic similarity between species locations and their environment revealed by dimensionality reduction analysis;Ecological Informatics;2024-03

3. Identification of the associations between genes and quantitative traits using entropy-based kernel density estimation;Genomics & Informatics;2022-06-30

4. Automated Ocular Artifacts Removal Framework Based on Adaptive Chirp Mode Decomposition;IEEE Sensors Journal;2022-03-15

5. Improved radiation expression profiling in blood by sequential application of sensitive and specific gene signatures;International Journal of Radiation Biology;2021-11-12