Multi-Identity Recognition of Darknet Vendors Based on Metric Learning

Authors:

Wang Yilei (1), Hu Yuelin (2), Xu Wenliang (2), Zou Futai (2)

Affiliations:

1. Research Institute, State Grid Zhejiang Electric Power Co., Ltd., Hangzhou 311152, China

2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Abstract

Dark web vendor identification can be framed as an authorship aliasing problem: determining whether accounts on different markets belong to the same real-world vendor, so that cybercriminals involved in dark web market transactions can be located. Existing open-source datasets for dark web marketplaces are outdated and do not reflect real-world conditions, and existing data labeling methods are laborious, produce inaccurate labels, and offer limited support for cross-market research. Identifying a vendor's multiple identities involves a large number of categories with only a few samples each, which makes traditional multiclass classification models hard to apply. To address these issues, this paper proposes a metric learning-based method for dark web vendor identification. Product data are collected from 21 currently active English-language dark web marketplaces, and multi-dimensional features are extracted from product titles, descriptions, and images. Pseudo-labeling combined with manual labeling improves labeling accuracy over previous labeling methods. The proposed method uses a Siamese neural network with metric learning to learn the similarity between vendors and thereby recognize vendors' multiple identities. On the constructed dataset, the method achieves an average F1-score of 0.889 and an accuracy of 97.535%. The contributions of this paper are a method for collecting and labeling dark web marketplace data, and an approach that overcomes the limitations of traditional multiclass classifiers to recognize vendors' multiple identities effectively.
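The abstract does not give the loss function, distance measure, or decision rule used by the Siamese network, so the following is only a minimal sketch of the general idea under stated assumptions: a contrastive loss over Euclidean distance, with a fixed distance threshold deciding whether two accounts belong to the same vendor. The function names, the margin, the threshold, and the 3-dimensional embeddings are all hypothetical and chosen for illustration. The sketch shows why pairwise metric learning suits a setting with many vendors and few samples per vendor: the model compares pairs instead of predicting one of a fixed set of classes.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two account embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a: Sequence[float], b: Sequence[float],
                     same_vendor: bool, margin: float = 1.0) -> float:
    """Contrastive loss for one pair (an assumed objective, not the paper's).

    Same-vendor pairs are penalized by their squared distance, which pulls
    them together. Different-vendor pairs are penalized only while they are
    closer than `margin`, which pushes them apart.
    """
    d = euclidean_distance(a, b)
    if same_vendor:
        return d ** 2
    return max(0.0, margin - d) ** 2


def is_same_vendor(a: Sequence[float], b: Sequence[float],
                   threshold: float = 0.5) -> bool:
    """Match two accounts by thresholding their embedding distance."""
    return euclidean_distance(a, b) < threshold


# Hypothetical embeddings for three marketplace accounts. In a real system
# these would come from the shared (Siamese) encoder applied to title,
# description, and image features.
account_a = [0.10, 0.80, 0.30]
account_b = [0.12, 0.79, 0.31]  # close to account_a
account_c = [0.90, 0.10, 0.70]  # far from account_a

if __name__ == "__main__":
    print(is_same_vendor(account_a, account_b))
    print(is_same_vendor(account_a, account_c))
```

Because the decision depends only on the distance between two embeddings, an account from a vendor never seen during training can still be compared against known accounts without retraining a classifier head.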

Funder

The State Grid Science and Technology Program

Publisher

MDPI AG

