Patch seriation to visualize data and model parameters

Published:2023-09-09 Issue:1 Volume:15 Page:
ISSN:1758-2946
Container-title:Journal of Cheminformatics
language:en
Short-container-title:J Cheminform

Author:

Lasfar Rita,Tóth Gergely

Abstract

We developed a new seriation merit function for enhancing the visual information of data matrices. A local similarity matrix is calculated, in which the average similarity of neighbouring objects is evaluated in a limited variable space, and a global function is constructed to maximize the local similarities and cluster them into patches by simple row and column ordering. The method identifies data clusters effectively when the similarity of objects is caused by some variables and these variables differ between the distinct clusters. The method can be used in the presence of missing data and on data arrays with more than two dimensions. We show the feasibility of the method on different data sets: QSAR, chemical, material science, food science, cheminformatics and environmental data in two- and three-dimensional cases. The method can be used during the development and the interpretation of artificial neural network models by seriating different features of the models. It helps to identify interpretable models by elucidating clusters of objects, variables and hidden layer neurons.
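
The idea in the abstract (score how similar neighbouring objects are within a limited set of adjacent variables, then reorder rows and columns to raise the overall score) can be illustrated with a minimal sketch. This is not the authors' merit function: the similarity measure (1 minus absolute difference on data scaled to [0, 1]), the squared window average, the `half_width` parameter, the random-swap hill climbing and the names `patch_merit` and `seriate` are all assumptions made for illustration, and missing data is not handled here.

```python
import numpy as np

def patch_merit(X, half_width=1):
    """Toy merit (assumption, not the published function).

    For each pair of adjacent rows, average their similarity over a small
    window of adjacent columns, square it so that coherent patches score
    higher than scattered matches, and take the mean over all positions.
    X is assumed to be scaled to [0, 1].
    """
    n, m = X.shape
    sim = 1.0 - np.abs(X[1:] - X[:-1])          # (n-1, m) similarity of row neighbours
    total = 0.0
    for j in range(m):
        lo, hi = max(0, j - half_width), min(m, j + half_width + 1)
        local = sim[:, lo:hi].mean(axis=1)      # local similarity in the column window
        total += np.sum(local ** 2)
    return total / ((n - 1) * m)

def seriate(X, n_iter=2000, half_width=1, seed=0):
    """Hill climbing over row and column orders using random pair swaps."""
    rng = np.random.default_rng(seed)
    rows = np.arange(X.shape[0])
    cols = np.arange(X.shape[1])
    best = patch_merit(X[np.ix_(rows, cols)], half_width)
    for _ in range(n_iter):
        perm = rows if rng.random() < 0.5 else cols
        a, b = rng.choice(len(perm), size=2, replace=False)
        perm[a], perm[b] = perm[b], perm[a]
        score = patch_merit(X[np.ix_(rows, cols)], half_width)
        if score >= best:
            best = score                         # keep the swap
        else:
            perm[a], perm[b] = perm[b], perm[a]  # undo the swap
    return rows, cols, best

if __name__ == "__main__":
    # Two hidden blocks, scrambled by random row and column permutations.
    rng = np.random.default_rng(1)
    X = np.zeros((6, 6))
    X[:3, :3] = 1.0
    X[3:, 3:] = 1.0
    Xs = X[rng.permutation(6)][:, rng.permutation(6)]
    rows, cols, best = seriate(Xs, n_iter=500)
    print("merit before:", patch_merit(Xs), "after:", best)
    print(Xs[np.ix_(rows, cols)])
```

Because a swap is only kept when the score does not drop, the returned merit is never lower than that of the input ordering. A simple hill climber like this can stall in local optima, so the reordered matrix is not guaranteed to show clean patches.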

Funder

NKFI

Eötvös Loránd University

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences,Computer Graphics and Computer-Aided Design,Physical and Theoretical Chemistry,Computer Science Applications

Link

https://link.springer.com/content/pdf/10.1186/s13321-023-00757-1.pdf

References (49 articles)

1. Petrie WMF (1899) Sequences in prehistoric remains. J Anthropol Inst Great Br Irel 29:295–301

2. Bertin J (1981) Graphics and graphic information processing. Walter de Gruyter, Berlin, Boston. https://doi.org/10.1515/9783110854688

3. Brower JC, Kile KM (1988) Seriation of an original data matrix as applied to palaeoecology. Lethaia 21:79–93. https://doi.org/10.1111/j.1502-3931.1988.tb01756.x

4. Arabie P, Hubert LJ (1996) An overview of combinatorial data analysis. In: Arabie P, Hubert LJ, De Soete G (eds) Clustering and classification. World Scientific, River Edge, pp 5–63

5. Liiv I (2010) Seriation and matrix reordering methods: an historical overview. Stat Anal Data Min 3:70–91. https://doi.org/10.1002/sam.10071

Cited by 1 article.

1. The difference of model robustness assessment using cross‐validation and bootstrap methods;Journal of Chemometrics;2024-01-11