Abstract
The image acquisition process involves finding regions of interest and defining feature vectors that capture the visual features of the image. This encompasses local and global delineations of specific areas of interest, enabling image classification through the extraction of both high-level and low-level information. The proposed approach first converts the input image to grayscale and then computes the Harris determinants and the Hessian matrix. Blob structuring is then performed to identify potential regions of interest that can adequately describe texture, color, and shape at different representation levels, and the Harris corner detector is used to identify keypoints within these regions. Moreover, a scale adaptation method is applied to the determinants of the Harris matrix and the Laplacian operator to extract scale-invariant features. In parallel, the input image is processed through the VGG-19, DenseNet, and AlexNet architectures to extract features representing diverse levels of abstraction. Furthermore, the RGB channels of the input image are extracted and their color values are computed. All extracted features (local, global, and color) are then integrated into a single feature set and encoded through a bag-of-words model to rank and retrieve images based on their shared visual characteristics. The proposed technique is tested on challenging datasets including Caltech-256, CIFAR-10, and Corel-1000. It achieves high precision, recall, and F-score rates in most image categories by leveraging the complementary strengths of multiple feature extraction techniques.
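To make the keypoint step concrete, the following is a minimal sketch of a textbook Harris corner response computed on a grayscale conversion of an RGB image. It is an illustration under stated assumptions, not the implementation used in this work: the helper names (`to_grayscale`, `box_blur`, `harris_response`), the luminance weights, the box window used in place of a Gaussian, and the constant `k = 0.05` are all hypothetical choices made for the example.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an RGB image to grayscale (assumed ITU-R BT.601 luminance weights)."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])

def box_blur(a, radius=1):
    """Average each pixel over a (2*radius+1)^2 window, using edge padding."""
    padded = np.pad(a, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros(a.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (size * size)

def harris_response(gray, k=0.05, radius=1):
    """Harris response R = det(M) - k * trace(M)^2 of the local structure tensor M."""
    gray = gray.astype(float)
    iy, ix = np.gradient(gray)          # derivatives along rows (y) and columns (x)
    sxx = box_blur(ix * ix, radius)     # windowed sums of gradient products
    syy = box_blur(iy * iy, radius)
    sxy = box_blur(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic example: a white square on a black background.
rgb = np.zeros((20, 20, 3))
rgb[5:15, 5:15, :] = 1.0
response = harris_response(to_grayscale(rgb))

corner, edge, flat = response[5, 5], response[5, 10], response[0, 0]
print(corner > 0, edge < 0, flat == 0)
```

In this sketch, a positive response marks a corner-like pixel, a negative response marks an edge, and a response near zero marks a flat region. Thresholding the response map and keeping local maxima would give candidate keypoints; scale adaptation, as described above, would repeat the computation across several smoothing scales.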