temporal-reddit-reply dataset
This is a temporal network of Reddit comments, derived from a large collection of comments
curated by Jack Hessel et al. using data from Jason Baumgartner at pushshift.io.
In this temporal network, an edge (i, j, t) means that user i commented on user j's post or comment at
time t. Users whose accounts were deleted were removed from the data.
Nodes are indexed from 1 to n. Some basic summary statistics of the dataset are as follows:
- number of nodes: 8,396,162
- number of timestamped edges: 636,295,809
- number of static edges: 517,201,096
- time span of dataset: 10.06 years
- temporal-reddit-reply.tar.gz (temporal network and node labels)
- Sampling Methods for Counting Temporal Motifs.
Paul Liu, Austin R. Benson, and Moses Charikar.
Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), 2019.
- Science, AskScience, and BadScience: On the Coexistence of Highly Related Communities.
Jack Hessel, Chenhao Tan, and Lillian Lee.
Proceedings of the Tenth International AAAI Conference on Web and Social Media (ICWSM), 2016.
- Jason Baumgartner. pushshift.io.