phs-email-W3C dataset
This is a hypergraph dataset with core-fringe structure constructed
from emails on W3C mailing lists. Nodes are labeled as either "core"
or "fringe", with core nodes corresponding to email addresses with a
w3c.org domain. Each hyperedge is the set of email addresses that
appeared together on a single email. Each hyperedge has at least
one core node, so the core forms a hitting set for the hypergraph. We
studied ways of recovering core labels from network structure, i.e.,
the problem of finding a planted hitting set. Some summary statistics of
the network are:
- number of nodes: 14,317
- number of hyperedges: 19,821
- number of core nodes: 1,509
- rank of hypergraph (maximum hyperedge size): 25
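
The hitting-set property and the statistics above can be checked with a
short script. The following is a minimal sketch in Python, not part of the
dataset itself. It assumes the hyperedges have already been loaded as a list
of node-ID sets and the core labels as a set of node IDs. The function names
(is_hitting_set, summary) and the toy data are hypothetical and used only
for illustration.

```python
def is_hitting_set(hyperedges, core):
    """Return True if every hyperedge contains at least one core node."""
    core = set(core)
    return all(core & set(edge) for edge in hyperedges)


def summary(hyperedges, core):
    """Compute the kinds of summary statistics listed in this README."""
    # Nodes are everything that appears in some hyperedge, plus the core.
    nodes = set().union(*map(set, hyperedges)) | set(core)
    return {
        "nodes": len(nodes),
        "hyperedges": len(hyperedges),
        "core_nodes": len(set(core)),
        # Rank is the size of the largest hyperedge.
        "rank": max((len(set(edge)) for edge in hyperedges), default=0),
    }


# Toy hypergraph (made-up data, not taken from the dataset).
toy_edges = [{1, 2, 3}, {2, 4}, {1, 5, 6, 7}]
toy_core = {1, 2}

print(is_hitting_set(toy_edges, toy_core))
print(summary(toy_edges, toy_core))
```

On the toy data, the core {1, 2} intersects all three hyperedges, so the
check succeeds. Dropping node 2 from the core would leave the hyperedge
{2, 4} without a core node, and the check would fail.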