Author:
Kozachok A. V., Spirin A. A., Golembiovskaya O. M.
Abstract
Recently, the number of confidential data leaks caused by insiders has increased. Since modern DLP systems cannot detect and prevent leakage of information transmitted in encrypted or compressed form, an algorithm is proposed to classify pseudorandom sequences produced by data encryption and compression algorithms. The classifier is built with a random forest algorithm. The feature space consists of an array of occurrence frequencies of 9-bit binary subsequences together with statistical characteristics of the byte distribution of the sequences. The presented algorithm achieved an accuracy of 0.99 in classifying pseudorandom sequences. The proposed algorithm will improve existing DLP systems by increasing the accuracy with which encrypted and compressed data are classified.
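The feature space described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the 9-bit subsequences are sampled or which byte-distribution statistics are used, so the overlapping 1-bit stride and the choice of mean, variance, and Shannon entropy are assumptions made here for illustration. The function names are hypothetical.

```python
import math
from collections import Counter

WINDOW_BITS = 9  # subsequence length named in the abstract


def nine_bit_frequencies(data: bytes) -> list:
    """Relative frequency of each of the 512 possible 9-bit patterns.

    Assumption: windows overlap and advance one bit at a time.
    """
    counts = [0] * (1 << WINDOW_BITS)
    mask = (1 << WINDOW_BITS) - 1
    window = 0
    bits_seen = 0
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            window = ((window << 1) | ((byte >> shift) & 1)) & mask
            bits_seen += 1
            if bits_seen >= WINDOW_BITS:
                counts[window] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def byte_statistics(data: bytes) -> list:
    """Mean, variance, and Shannon entropy of the byte values.

    Assumption: these three statistics stand in for the unspecified
    'statistical characteristics of the byte distribution'.
    """
    if not data:
        return [0.0, 0.0, 0.0]
    n = len(data)
    mean = sum(data) / n
    variance = sum((b - mean) ** 2 for b in data) / n
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(data).values())
    return [mean, variance, entropy]


def feature_vector(data: bytes) -> list:
    """Concatenate the 512 pattern frequencies with the 3 byte statistics."""
    return nine_bit_frequencies(data) + byte_statistics(data)
```

Vectors built this way, one per encrypted or compressed sample, could then be passed with their class labels to any random forest implementation, for example scikit-learn's `RandomForestClassifier`.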
Publisher
Tomsk State University of Control Systems and Radioelectronics (TUSUR)
Cited by
2 articles.