Regularized Pairwise Relationship based Analytics for Structured Data

Published:2023-05-26 Issue:1 Volume:1 Page:1-27
ISSN:2836-6573
Container-title:Proceedings of the ACM on Management of Data
language:en
Short-container-title:Proc. ACM Manag. Data

Author:

Luo Zhaojing¹, Cai Shaofeng¹, Wang Yatong², Ooi Beng Chin¹

Affiliation:

1. National University of Singapore, Singapore, Singapore

2. University of Electronic Science and Technology of China, Chengdu, China

Abstract

In line with the increasing machine learning model inference accuracy, deep learning (DL) models have been increasingly applied to structured data for a wide spectrum of real-world applications, including product recommendations, online advertisement, healthcare analytics and risk analysis. However, unlike unstructured data, structured data is high-dimensional and sparse and therefore engenders a large number of parameters in DL, making DL models more prone to overfitting. To alleviate the overfitting problem, various regularization methods have been designed to constrain the model parameters as a means to control the model complexity. Unfortunately, these methods are often restricted to regularizing the parameter values directly without considering the intrinsic correlations and dependencies between attribute fields of structured data which is however key to effective structured data modeling. In this paper, we re-examine DL for structured data from a new perspective of attribute interactions. In particular, we seek to explicitly model and regularize the pairwise relationships between attribute fields of structured data, in a field-adaptive manner, via a proposed attentive and interpretable framework called ATT-Reg. Specifically, in this framework, a set of attentive weight matrices are introduced to each attribute field for modeling obviously different relationships with its neighboring attribute fields. Further, we derive from the Bayesian viewpoint a novel Attentive Regularization method for imposing adaptive regularization strengths on different pairs of attribute fields, based on the informativeness of their relationship, which is calculated using both data-driven information and functional dependency (FD) knowledge. Such adaptive regularization facilitates each attribute field to learn discriminative and diversified representations for more effective predictive analytics. 
We also develop a feature attribution method for supporting more interpretable predictions. We validate the effectiveness of our ATT-Reg on six real-world datasets. Extensive experimental results show that ATT-Reg achieves significant improvement over state-of-the-art graph models, attentive models as well as regularization methods and supports an excellent degree of interpretation.

Funder

Singapore Ministry of Education Academic Research Fund Tier 3

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3588936

References (64 articles)

1. Yuichiro Anzai. 2012. Pattern recognition and machine learning. Elsevier.

2. Nabiha Asghar and Amira Ghenai. 2015. Automatic discovery of functional dependencies and conditional functional dependencies: a comparative study. University of Waterloo (2015).

3. Or Biran and Courtenay Cotton. 2017. Explanation and justification in machine learning: A survey. In IJCAI-17 workshop on explainable AI, Vol. 8. 8--13.

4. Using Word Embedding to Enable Semantic Queries in Relational Databases

5. Model slicing for supporting complex analytics with elastic inference cost and resource constraints

Cited by 5 articles.

1. Applications and Challenges for Large Language Models: From Data Management Perspective;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

2. DMRNet: Effective Network for Accurate Discharge Medication Recommendation;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

3. Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems;Proceedings of the VLDB Endowment;2024-01

4. ECGGAN: A Framework for Effective and Interpretable Electrocardiogram Anomaly Detection;Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2023-08-04

5. MINT: Detecting Fraudulent Behaviors from Time-Series Relational Data;Proceedings of the VLDB Endowment;2023-08