Abstract
Recognizing food images presents unique challenges because the spatial layout of ingredients varies and their shapes change with different cooking and cutting methods. This study introduces an approach for recognizing multiple ingredients segmented from food images. The method localizes candidate ingredient regions using locating and sliding-window techniques. These regions are then assigned to ingredient classes by a convolutional neural network (CNN)-based single-ingredient classification model trained on a dataset of single-ingredient images. To address the challenge of processing speed in multi-ingredient recognition, a novel model pruning method is proposed to enhance the efficiency of the classification model. Multi-ingredient identification is then achieved through a decision-making scheme that incorporates a novel top-n algorithm and integrates the classification results from the various candidate regions to improve ingredient recognition accuracy. The single-ingredient image dataset, designed in accordance with the “New Food Ingredients List FOODS 2021”, comprises 9,982 images across 110 categories and emphasizes variety in ingredient shapes. In addition, a multi-ingredient image dataset is developed to evaluate the performance of our approach. Experimental results validate the effectiveness and efficiency of our method and show that it is competitive with state-of-the-art (SOTA) methods in recognizing multiple ingredients. Furthermore, the CNN-based pruned model is found to enhance the ingredient segmentation accuracy of food images. This marks a significant advancement in the field of food image analysis.