Affiliation:
1. Tel Aviv University, Tel Aviv, Israel
2. eBay Research, Netanya, Israel
Abstract
Tabular embedding methods have become increasingly popular because they improve results on a variety of tasks, including classic database tasks and machine learning predictions. However, most current methods treat these embedding models as "black boxes," making it difficult to understand what the models capture. Our research proposes a novel approach to interpreting these models, aiming to provide local and global explanations in terms of the original data and to detect potential flaws in the embedding models. The proposed solution applies to any tabular embedding algorithm, since it relies only on a black-box view of the embedding model. Furthermore, we propose methods for comparing different embedding models, which can help identify data biases that might undermine the models' credibility without the user's knowledge. We evaluate our approach on multiple datasets and multiple embeddings, demonstrating that the proposed explanations provide valuable insights into the behavior of tabular embedding methods. By making these models more transparent, we believe our research will contribute to the development of more effective and reliable embedding methods for a wide range of applications.
Funder
BSF - the US-Israel Binational Science Foundation
ISF - the Israel Science Foundation
Publisher
Association for Computing Machinery (ACM)
References (54 articles)
1. 2015. Flights Dataset. https://www.kaggle.com/usdot/flight-delays?select=flights.csv.
2. 2020. Spotify Dataset. https://www.kaggle.com/datasets/mrmorj/dataset-of-songs-in-spotify.
3. 2023. TabEE git repository. https://github.com/KathyRaz/TabEE.
4. DIFF: a relational interface for large-scale data explanation
5. TabNet: Attentive Interpretable Tabular Learning