An in-depth analysis of logarithmic data transformation and per-class normalization in machine learning: Application to unsupervised classification of a turbidite system in the Canterbury Basin, New Zealand, and supervised classification of salt in the Eugene Island minibasin, Gulf of Mexico

Author:

Ha Thang N. (1), Lubo-Robles David (1), Marfurt Kurt J. (1), Wallet Bradley C. (2)

Affiliation:

1. The University of Oklahoma, School of Geosciences, Norman, Oklahoma 73019, USA (corresponding author).

2. Aramco Service Company, Aramco Research Center, Houston, Texas 77084, USA.

Abstract

In a machine-learning workflow, data normalization is a crucial step that compensates for the large variation in data ranges and averages associated with different types of input measured with different units. However, most machine-learning implementations do not provide data normalization beyond the z-score algorithm, which subtracts the mean from the distribution and then scales the result by dividing by the standard deviation. Although the z-score converts data with Gaussian behavior to have the same shape and size, many of our seismic attribute volumes exhibit log-normal, or even more complicated, distributions. Because many machine-learning applications are based on Gaussian statistics, we have evaluated the impact of more sophisticated data normalization techniques on the resulting classification. To do so, we provide an in-depth analysis of data normalization in machine-learning classifications by formulating and applying a logarithmic data transformation scheme to the unsupervised classifications (including principal component analysis, independent component analysis, self-organizing maps, and generative topographic mapping) of a turbidite channel system in the Canterbury Basin, New Zealand, as well as implementing a per-class normalization scheme to the supervised probabilistic neural network (PNN) classification of salt in the Eugene Island minibasin, Gulf of Mexico. Compared to the simple z-score normalization, a single logarithmic transformation applied to each input attribute significantly increases the spread of the resulting clusters (and the corresponding color contrast), thereby enhancing subtle details in projection and unsupervised classification. However, this same uniform transformation produces less-confident results in supervised classification using PNNs. We find that more accurate supervised classifications can be found by applying class-dependent normalization for each input attribute.
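The three normalization schemes named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`zscore`, `log_zscore`, `per_class_zscore`, `skewness`), the optional `shift` parameter, the synthetic log-normal attribute, and the two-class labeling are all hypothetical, and the per-class scheme shown is only one plausible reading of "class-dependent normalization".

```python
import numpy as np


def zscore(x):
    """Subtract the mean, then divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def log_zscore(x, shift=0.0):
    """Apply a logarithm, then z-score the result.

    `shift` is an assumed convenience for attributes that are not strictly
    positive; the abstract does not specify how such values are handled.
    """
    x = np.asarray(x, dtype=float) + shift
    if np.any(x <= 0):
        raise ValueError("values must be positive after applying shift")
    return zscore(np.log(x))


def per_class_zscore(x, labels):
    """Z-score the whole attribute once per class.

    Returns a dict mapping each class label to the attribute normalized
    with the mean and standard deviation of that class's samples only.
    """
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    normalized = {}
    for c in np.unique(labels):
        members = x[labels == c]
        normalized[c.item()] = (x - members.mean()) / members.std()
    return normalized


def skewness(x):
    """Sample skewness; close to zero for a symmetric (Gaussian-like) shape."""
    return float(np.mean(zscore(x) ** 3))


# Synthetic stand-in for a log-normally distributed seismic attribute.
rng = np.random.default_rng(0)
attribute = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

# Hypothetical two-class labeling, used only to exercise per_class_zscore.
labels = (attribute > np.median(attribute)).astype(int)

plain = zscore(attribute)        # rescaled, but still strongly skewed
logged = log_zscore(attribute)   # close to Gaussian after the logarithm
by_class = per_class_zscore(attribute, labels)
```

Comparing `skewness(plain)` with `skewness(logged)` shows why the z-score alone does not help with log-normal inputs: it changes the location and scale of the distribution but not its shape, whereas the logarithm removes most of the asymmetry before scaling.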

Publisher

Society of Exploration Geophysicists

Subject

Geology,Geophysics

References: 36 articles.
