Affiliation:
1. Huawei Technologies France SASU
Abstract
The recent success of Artificial Intelligence (AI) is rooted in several concomitant factors, namely theoretical progress coupled with an abundance of data and computing power. Large companies can take advantage of a deluge of data that is typically withheld from the research community due to privacy or business sensitivity concerns, and this is particularly true for networking data. The lack of high-quality data is therefore often recognized as one of the main factors currently preventing networking research from fully leveraging the potential of AI methodologies.
Following numerous requests from the scientific community, we release AppClassNet, a commercial-grade dataset for benchmarking traffic classification and management methodologies. AppClassNet is significantly larger than the datasets generally available to the academic community in terms of both the number of samples and the number of classes, and reaches scales similar to the popular ImageNet dataset commonly used in the computer vision literature. To avoid leaking user- and business-sensitive information, we suitably anonymized the dataset, and we show empirically that it still represents a relevant benchmark for algorithmic research. In this paper, we describe the public dataset and our anonymization process. We hope that AppClassNet can help other researchers address more complex commercial-grade problems in the broad field of traffic classification and management.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Software
Cited by
3 articles.
1. DataZoo: Streamlining Traffic Classification Experiments;Proceedings of the 2023 on Explainable and Safety Bounded, Fidelitous, Machine Learning for Networking;2023-12-05
2. Many or Few Samples?: Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification;2023 7th Network Traffic Measurement and Analysis Conference (TMA);2023-06-26
3. Few Shot Learning Approaches for Classifying Rare Mobile-App Encrypted Traffic Samples;IEEE INFOCOM 2023 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS);2023-05-20