Affiliation:
1. Università di Milano, Italy
Abstract
We describe the techniques developed to gather and distribute, in a highly compressed yet accessible form, a series of twelve snapshots of the .uk web domain. Ad hoc compression techniques made it possible to store the twelve snapshots using just 1.9 bits per link, with constant-time access to temporal information. Our collection makes it possible to study the temporal evolution of link-based scores (e.g., PageRank), the growth of online communities, and, in general, time-dependent phenomena related to the link structure.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Management Information Systems
Reference
7 articles.
1. UbiCrawler: a scalable fully distributed Web crawler
2. The webgraph framework I
3. Efficient Storage and Retrieval by Content and Address of Static Files
4. R. M. Fano. On the number of bits required to implement an associative memory. Memorandum 61, Computer Structures Group, Project MAC, MIT, Cambridge, Mass., 1971.
5. Efficient decoding of prefix codes
Cited by
102 articles.
1. TeGraph+: Scalable Temporal Graph Processing Enabling Flexible Edge Modifications;IEEE Transactions on Parallel and Distributed Systems;2024-08
2. Wings: Efficient Online Multiple Graph Pattern Matching;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
3. NewSP: A New Search Process for Continuous Subgraph Matching over Dynamic Graphs;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
4. Graph Computation with Adaptive Granularity;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
5. WebGraph: The Next Generation (Is in Rust);Companion Proceedings of the ACM Web Conference 2024;2024-05-13