stackoverflow-answers dataset
The stackoverflow-answers network is a hypergraph in which nodes are questions
and each hyperedge is the set of questions answered by a single user on Stack Overflow.
Each node is labeled with the tags used in its question, so a node often has multiple labels.
Some summary statistics of the dataset are:
- number of nodes: 15,211,989
- number of hyperedges: 1,103,243
- mean / median hyperedge size: 23.7 / 5
- rank of hypergraph (maximum hyperedge size): 61,315
- number of node classes: 56,502
Reference:
- Minimizing Localized Ratio Cut Objectives in Hypergraphs.
Nate Veldt, Austin R. Benson, and Jon Kleinberg.
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2020.
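The summary statistics listed above can be recomputed from any in-memory representation of the hypergraph. The following is a minimal sketch, not the authors' code: it assumes the hyperedges have already been loaded as sets of node ids and the labels as a mapping from node id to a set of tags. The function name `hypergraph_summary` and the toy data are hypothetical, and the on-disk file format of the dataset is not assumed here.

```python
from statistics import mean, median


def hypergraph_summary(hyperedges, node_labels):
    """Compute basic statistics for a labeled hypergraph.

    hyperedges  -- iterable of collections of node ids (assumed non-empty)
    node_labels -- dict mapping node id -> set of labels (tags)
    """
    hyperedges = [set(e) for e in hyperedges]
    sizes = [len(e) for e in hyperedges]

    # Assumption: nodes are counted as those that appear in at least one hyperedge.
    nodes = set().union(*hyperedges)

    # A node class is any distinct label that appears on at least one node.
    classes = set()
    for labels in node_labels.values():
        classes.update(labels)

    return {
        "num_nodes": len(nodes),
        "num_hyperedges": len(hyperedges),
        "mean_size": mean(sizes),
        "median_size": median(sizes),
        "rank": max(sizes),  # rank = maximum hyperedge size
        "num_classes": len(classes),
    }


# Toy example (made-up data): four users, five questions.
toy_hyperedges = [{1, 2, 3}, {2, 4}, {1, 2, 3, 4, 5}, {5}]
toy_labels = {
    1: {"python"},
    2: {"python", "numpy"},
    3: {"c++"},
    4: {"numpy"},
    5: {"java", "c++"},
}
summary = hypergraph_summary(toy_hyperedges, toy_labels)
print(summary)
```

On the full dataset, the hyperedge size distribution is heavily skewed (mean 23.7 versus median 5, with a maximum of 61,315), so reporting the median and the rank alongside the mean is more informative than the mean alone.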