History and hosting Compressed graph
A compact and highly efficient representation of the graph dataset, suited for scale-up analysis on high-end machines with large amounts of memory. The graph is compressed in the Boldi-Vigna representation and is designed to be loaded by the WebGraph framework, specifically through our swh-graph library.
- Comments
- This is a compressed graph of only the "history and hosting" layer (origins, snapshots, releases, revisions) plus the root directory (or, rarely, content) of every revision/release; most other directories and contents are excluded.
- Dataset size
- 1 TiB
- Export date
- S3 URL
- s3://softwareheritage/graph/2022-12-07-history/compressed/
- Deprecated
- False
Download the dataset
For Amazon S3 links, you'll need to install either awscli or swh.datasets.
aws s3 cp --recursive --no-sign-request s3://softwareheritage/graph/2022-12-07-history/compressed/ 2022-12-07-history-compressed
# OR
swh datasets download-graph 2022-12-07-history
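Since the dataset is about 1 TiB, it can be worth checking free disk space before starting the transfer. The following is a minimal sketch, assuming a POSIX shell with `df` and `awk` available. The function names `graph_s3_url`, `available_kib`, `has_room`, and `download_graph` are hypothetical helpers introduced here for illustration; they are not part of awscli or swh.datasets.

```shell
#!/bin/sh
# Sketch only: guard a large S3 download with a free-space check.
# All function names below are illustrative, not part of any official tool.

# 1 TiB expressed in KiB (1024 * 1024 * 1024 KiB), matching the listed dataset size.
REQUIRED_KIB=$((1024 * 1024 * 1024))

# Build the S3 URL of a compressed graph from its dataset name,
# following the URL pattern shown on this page.
graph_s3_url() {
  printf 's3://softwareheritage/graph/%s/compressed/\n' "$1"
}

# Print the space available (in KiB) on the filesystem holding directory $1.
# "df -P" forces the single-line POSIX format; "-k" reports 1024-byte blocks.
available_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if directory $1 has at least $2 KiB available.
has_room() {
  [ "$(available_kib "$1")" -ge "$2" ]
}

# Download dataset $1 into directory $2 (default: "<name>-compressed"),
# refusing to start if the target filesystem has less than 1 TiB free.
download_graph() {
  name=$1
  dest=${2:-$name-compressed}
  mkdir -p "$dest" || return 1
  if ! has_room "$dest" "$REQUIRED_KIB"; then
    echo "less than 1 TiB available under $dest" >&2
    return 1
  fi
  aws s3 cp --recursive --no-sign-request "$(graph_s3_url "$name")" "$dest"
}
```

With these functions sourced, `download_graph 2022-12-07-history` would run the same `aws s3 cp` command as above, but only after the space check passes. The 1 TiB threshold is taken from the listed dataset size and leaves no headroom, so a larger margin may be prudent.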