Author:
Ersavas Tansel, Smith Martin A., Mattick John S.
Abstract
Convolutional Neural Networks (CNNs) have been central to the Deep Learning revolution and played a key role in initiating the new age of Artificial Intelligence. However, in recent years newer architectures such as Transformers have dominated both research and practical applications. While CNNs still play critical roles in many of the newer developments such as Generative AI, they are far from being thoroughly understood and utilised to their full potential. Here we show that CNNs can recognise patterns in images with scattered pixels and can be used to analyse complex datasets by transforming them into pseudo images with minimal processing for any high dimensional dataset, representing a more general approach to the application of CNNs to datasets such as in molecular biology, text, and speech. We introduce a pipeline called DeepMapper, which allows analysis of very high dimensional datasets without intermediate filtering and dimension reduction, thus preserving the full texture of the data, enabling detection of small variations normally deemed ‘noise’. We demonstrate that DeepMapper can identify very small perturbations in large datasets with mostly random variables, and that it is superior in speed and on par in accuracy to prior work in processing large datasets with large numbers of features.
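The idea of turning a high dimensional record into a pseudo image can be illustrated with a minimal sketch. This is not the DeepMapper implementation: the function name `to_pseudo_image`, the zero padding, and the row-by-row square layout are assumptions made purely for illustration. The sketch folds a flat feature vector into a square grid without filtering or reducing any features, which is the kind of input a CNN expects.

```python
import math


def to_pseudo_image(features, pad_value=0.0):
    """Fold a flat feature vector into a square 2D grid (a 'pseudo image').

    Hypothetical helper for illustration only; the layout and padding
    choices are assumptions, not the method described in the paper.
    """
    n = len(features)
    # Smallest square side that can hold all n features.
    side = math.isqrt(n)
    if side * side < n:
        side += 1
    # Pad the tail so every feature is kept and the grid is complete.
    padded = list(features) + [pad_value] * (side * side - n)
    # Slice the padded vector row by row.
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Seven features fit into a 3 x 3 grid with two padded cells.
image = to_pseudo_image([0.1, 0.7, 0.3, 0.9, 0.5, 0.2, 0.8])
for row in image:
    print(row)
```

Every original value appears in exactly one cell, so no information is discarded before the network sees the data; only the padded cells are artificial.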
Funder
Australian Government Research Training Program Scholarship
Fonds de Recherche du Quebec Santé
University of New South Wales
Publisher
Springer Science and Business Media LLC