Author:
Monika, Sharma Yogesh K., Tomar Deepak S., Pateriya R. K.
Abstract
The number of malware samples created grows every day, posing a significant danger to Android systems, which hold a large share of the operating system market. This surge in malware creation also makes it challenging to analyse and detect malicious applications. Machine learning techniques are commonly used for malware detection, but developing an effective system requires a reliable dataset for training and testing. This paper provides an overview of the datasets most commonly used in malware detection research conducted between 2015 and 2020, evaluating them on their performance, usability, availability, and effectiveness. By analysing and comparing these datasets, the paper aims to provide insights into the selection of appropriate datasets for future research in this area.