threads-ask-ubuntu dataset
This is a temporal higher-order network dataset, which here means a
sequence of timestamped simplices, where each simplex is a set of
nodes. In this dataset, nodes are users on askubuntu.com, and a
simplex consists of the users participating in a thread that lasts
for at most 24 hours. Each timestamp is the time of the post in
milliseconds, normalized so that the earliest post is at time 0. The
projected graph is a weighted undirected graph in which the weight of
an edge is the number of simplices in which its two endpoints
co-appear. Some basic statistics of this dataset are:
- number of nodes: 125,602
- number of timestamped simplices: 192,947
- number of unique simplices: 167,001
- number of edges in projected graph: 187,157
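
As a rough illustration of the definitions above, the following Python sketch builds timestamped simplices from toy data, normalizes the timestamps, and derives the projected graph. The toy records and the helper names (normalize_times, projected_graph) are hypothetical and are not part of the dataset or its tooling. The sketch also assumes that normalization means subtracting the earliest timestamp and that repeated simplices each add to an edge weight; the source does not spell out either detail.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: (post time in ms, set of user ids).
# These values are made up for illustration; they are not from the dataset.
raw = [
    (1_000_500, {1, 2, 3}),
    (1_000_000, {1, 2}),
    (1_002_000, {2, 3}),
    (1_003_000, {1, 2, 3}),
    (1_004_000, {4}),
]


def normalize_times(simplices):
    """Shift timestamps so the earliest is 0 (assumed meaning of 'normalized')."""
    t0 = min(t for t, _ in simplices)
    shifted = [(t - t0, frozenset(nodes)) for t, nodes in simplices]
    return sorted(shifted, key=lambda item: item[0])


def projected_graph(simplices):
    """Count, for each unordered node pair, the simplices containing both."""
    weights = Counter()
    for _, nodes in simplices:
        # Singleton simplices yield no pairs, so they add nodes but no edges.
        for u, v in combinations(sorted(nodes), 2):
            weights[(u, v)] += 1
    return weights


timestamped = normalize_times(raw)
unique_simplices = {nodes for _, nodes in timestamped}
nodes = set().union(*unique_simplices)
edges = projected_graph(timestamped)

print("number of nodes:", len(nodes))
print("number of timestamped simplices:", len(timestamped))
print("number of unique simplices:", len(unique_simplices))
print("number of edges in projected graph:", len(edges))
```

On the toy data, the simplex {1, 2, 3} appears twice, so it counts as two timestamped simplices but one unique simplex. This is the same distinction the statistics above draw between 192,947 timestamped and 167,001 unique simplices.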
The data is available in two archives:
- threads-ask-ubuntu.tar.gz (timestamped simplices and simplex labels)
- threads-ask-ubuntu-proj-graph.tar.gz (weighted projected graph)
Reference:
- Simplicial closure and higher-order link prediction.
Austin R. Benson, Rediet Abebe, Michael T. Schaub, Ali Jadbabaie, and Jon Kleinberg.
Proceedings of the National Academy of Sciences (PNAS), 2018.