Affiliation:
1. Department of Computer Science, Swansea University, Swansea SA2 8PP, UK
2. UK Hydrographic Office, Taunton TA1 2DN, UK
Abstract
We present a novel approach to providing greater insight into the characteristics of an unlabelled dataset, increasing the efficiency with which labelled datasets can be created. We leverage dimension-reduction techniques in combination with autoencoders to create an efficient feature representation for image tiles derived from remote sensing satellite imagery. The proposed methodology consists of two main stages. Firstly, an autoencoder network is utilised to reduce the high-dimensional image tile data into a compact and expressive latentfeature representation. Subsequently, features are further reduced to a two-dimensional embedding space using the manifold learning algorithm Uniform Manifold Approximation and Projection (UMAP) and t-distributed Stochastic Neighbour Embedding (t-SNE). This step enables the visualization of the image tile clusters in a 2D plot, providing an intuitive and interactive representation that can be used to aid rapid and geographically distributed image labelling. To facilitate the labelling process, our approach allows users to interact with the 2D visualization and label clusters based on their domain knowledge. In cases where certain classes are not effectively separated, users can re-apply dimension reduction to interactively refine subsets of clusters and achieve better class separation, enabling a comprehensively labelled dataset. We evaluate the proposed approach on real-world remote sensing satellite image datasets and demonstrate its effectiveness in achieving accurate and efficient image tile clustering and labelling. Users actively participate in the labelling process through our interactive approach, leading to enhanced relevance of the labelled data, by allowing domain experts to contribute their expertise and enrich the dataset for improved downstream analysis and applications.
Subject
Computational Mathematics,Computational Theory and Mathematics,Numerical Analysis,Theoretical Computer Science