pvc-email-W3C dataset
This is a dataset with a "planted vertex cover" core-periphery
structure. Every edge in the network contains at least one node in the
core. The network comes from the corpus of crawled W3C mailing lists.
The core nodes correspond to email addresses
with w3.org in the domain. The dataset contains the identification of
the core nodes, the timestamped emails, and the email addresses. Some
summary statistics of the network are:
- number of nodes: 20,081
- number of edges: 31,874
- number of core nodes: 1,994
- minimum vertex cover size: 1,107
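
The planted vertex cover property described above (every edge contains at
least one core node) can be checked with a few lines of Python. This is a
minimal sketch, not part of the dataset: the function names and the toy
edges are hypothetical, and loading the real files is left out because
their names and layout are not described here. Edges are assumed to be
given as tuples of node identifiers and the core as a collection of node
identifiers.

```python
def is_vertex_cover(edges, core):
    """Return True if every edge contains at least one node in `core`."""
    core = set(core)
    return all(any(node in core for node in edge) for edge in edges)


def uncovered_edges(edges, core):
    """Return the edges that contain no core node (empty if `core` is a cover)."""
    core = set(core)
    return [edge for edge in edges if not any(node in core for node in edge)]


# Toy example (illustrative values only, not taken from the dataset).
toy_edges = [(1, 2), (1, 3), (2, 4)]
toy_core = {1, 2}

print(is_vertex_cover(toy_edges, toy_core))
print(uncovered_edges(toy_edges, toy_core))
```

Run on the real network, the same check should report no uncovered edges
for the 1,994 core nodes. Because the minimum vertex cover has only 1,107
nodes, the core is a vertex cover but not a minimum one.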