Author and committer timestamps were shifted back by 1 or 2 hours, depending on the Europe/Paris timezone offset; see https://gitlab.softwareheritage.org/swh/devel/swh-graph/-/issues/4788
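As a rough illustration of where the "1 or 2 hours" comes from: Europe/Paris is UTC+1 in winter and UTC+2 during daylight saving time. The Python sketch below looks up that offset for a given instant and adds it back to a shifted timestamp. The helper names `paris_utc_offset` and `estimate_original_timestamp` are hypothetical, and treating the shift as exactly equal to the Paris offset at the stored instant is an assumption, not something confirmed here; the linked issue remains the authoritative description. Results may also be off by an hour for timestamps close to a daylight saving transition.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs system tz data or the `tzdata` package

PARIS = ZoneInfo("Europe/Paris")


def paris_utc_offset(ts: datetime) -> timedelta:
    """Return the Europe/Paris UTC offset in effect at the instant `ts`.

    `ts` must be timezone-aware; a naive datetime would be interpreted
    in the machine's local timezone, which is not what we want here.
    """
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return ts.astimezone(PARIS).utcoffset()


def estimate_original_timestamp(shifted: datetime) -> datetime:
    """Hypothetical correction: add the Paris offset back to a shifted timestamp.

    ASSUMPTION: the shift equals the Paris offset at the stored instant.
    """
    return shifted + paris_utc_offset(shifted)


if __name__ == "__main__":
    winter = datetime(2021, 1, 15, 12, 0, tzinfo=timezone.utc)
    summer = datetime(2021, 7, 15, 12, 0, tzinfo=timezone.utc)
    print(paris_utc_offset(winter))  # offset during standard time (CET)
    print(paris_utc_offset(summer))  # offset during daylight saving time (CEST)
    print(estimate_original_timestamp(winter).isoformat())
```

Before applying any such correction to real data, check the issue above to confirm which fields and which export are affected.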
This is a compressed graph of only the "history and hosting" layer (origins, snapshots, releases, revisions) and the root directory (or, rarely, content) of every revision/release; most directories and contents are excluded.
If you use these datasets for research purposes, please cite the following paper:
Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli.
The Software Heritage Graph Dataset: Public software development under one roof.
In proceedings of MSR 2019: The 16th International Conference on Mining Software Repositories, May 2019, Montreal, Canada. Co-located with ICSE 2019.