Abstract
There has been remarkable progress in the field of deep learning, particularly in areas such as image classification, object detection, speech recognition, and natural language processing. Convolutional Neural Networks (CNNs) have emerged as a dominant model of computation in this domain, delivering exceptional accuracy in image recognition tasks. Inspired by their success, researchers have explored the application of CNNs to tabular data. However, CNNs trained on structured tabular data often yield subpar results. Hence, there has been a demonstrated gap between the performance of deep learning models and shallow models on tabular data. To that end, Tabular-to-Image (T2I) algorithms have been introduced to convert tabular data into an unstructured image format. T2I algorithms enable the encoding of spatial information into the image, which CNN models can effectively utilize for classification. In this work, we propose two novel T2I algorithms, Binary Image Encoding (BIE) and correlated Binary Image Encoding (cBIE), which preserve complex relationships in the generated image by leveraging the native binary representation of the data. Additionally, cBIE captures more spatial information by reordering columns based on their correlation to a feature. To evaluate the performance of our algorithms, we conducted experiments using four benchmark datasets, employing ResNet-50 as the deep learning model. Our results show that the ResNet-50 models trained with images generated using BIE and cBIE consistently outperformed or matched models trained on images created using the previous State of the Art method, Image Generator for Tabular Data (IGTD).
Publisher
School of Statistics, Renmin University of China
Reference37 articles.
1. A comparative analysis of correlation approaches in finance;The Journal of Derivatives,2013
2. On over-fitting in model selection and subsequent selection bias in performance evaluation;Journal of Machine Learning Research,2010