Author and committer timestamps were shifted back by 1 or 2 hours, depending on the Europe/Paris timezone offset; see https://gitlab.softwareheritage.org/swh/devel/swh-graph/-/issues/4788
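The "1 or 2 hours" figure matches the UTC offset of Europe/Paris, which is +1 hour in winter (CET) and +2 hours in summer (CEST). The following is a minimal sketch of how to tell which of the two applies to a given instant, assuming the EU summer-time rule in force since 1996 (summer time runs from 01:00 UTC on the last Sunday of March to 01:00 UTC on the last Sunday of October). The function names are hypothetical and are not taken from the linked issue, which remains the authoritative description of the affected timestamps and how to handle them.

```python
from datetime import datetime, timedelta, timezone


def last_sunday_01utc(year: int, month: int) -> datetime:
    """Return 01:00 UTC on the last Sunday of a 31-day month (March or October)."""
    day = datetime(year, month, 31, 1, 0, tzinfo=timezone.utc)
    # weekday(): Monday is 0, Sunday is 6; step back to the nearest Sunday.
    return day - timedelta(days=(day.weekday() + 1) % 7)


def paris_utc_offset_hours(instant: datetime) -> int:
    """Europe/Paris UTC offset in hours at a timezone-aware instant.

    Assumption: the EU rule in force since 1996; earlier dates followed
    different rules and are not handled here.
    """
    instant = instant.astimezone(timezone.utc)
    start = last_sunday_01utc(instant.year, 3)   # summer time begins
    end = last_sunday_01utc(instant.year, 10)    # summer time ends
    return 2 if start <= instant < end else 1


if __name__ == "__main__":
    winter = datetime(2021, 1, 15, 12, 0, tzinfo=timezone.utc)
    summer = datetime(2021, 7, 15, 12, 0, tzinfo=timezone.utc)
    print(paris_utc_offset_hours(winter), paris_utc_offset_hours(summer))
```

A time zone database lookup (for example Python's `zoneinfo` module) would also cover dates before 1996; the hand-written rule is used here only to keep the sketch free of external data.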
If you use this dataset for research purposes, please acknowledge Software Heritage as recommended in the publications page, which means doing the following two things:
Add a footnote on the title page of your paper, formatted as: “This work was made possible by Software Heritage, the universal source code archive: https://www.softwareheritage.org”
Cite the following papers:
Roberto Di Cosmo and Stefano Zacchiroli.
Software Heritage: why and how to preserve software source code.
In Shoichiro Hara, Shigeo Sugimoto, and Makoto Goto, editors, Proceedings of the 14th International Conference on Digital Preservation, iPRES 2017, Kyoto, Japan, September 25-29, 2017.
URL: https://hdl.handle.net/11353/10.931064.
Antoine Pietri, Diomidis Spinellis, and Stefano Zacchiroli.
The Software Heritage graph dataset: public software development under one roof.
In MSR 2019: The 16th International Conference on Mining Software Repositories, 138–142. IEEE, 2019.
doi:10.1109/MSR.2019.00030.
This is a compressed graph containing only the "history and hosting" layer (origins, snapshots, releases, revisions) and the root directory (or, rarely, content) of every revision/release; most directories and contents are excluded.
If you use this dataset for research purposes, please acknowledge Software Heritage as recommended in the publications page, which means doing the following two things:
Add a footnote on the title page of your paper, formatted as: “This work was made possible by Software Heritage, the universal source code archive: https://www.softwareheritage.org”
Cite the following papers:
Stefano Zacchiroli.
A large-scale dataset of (open source) license text variants.
In 19th IEEE/ACM International Conference on Mining Software Repositories, MSR 2022, Pittsburgh, PA, USA, May 23-24, 2022, 757–761. ACM, 2022.
URL: https://doi.org/10.1145/3524842.3528491, doi:10.1145/3524842.3528491.