sos-email-Enron-core dataset
This dataset is a collection of sequences of sets. Each sequence is
derived from the recipients of emails sent by a particular email
address at Enron. We restrict the dataset to the "core" group of
employees whose email inboxes were made public by the FERC
investigation of the company (each sequence corresponds to one
employee's emails). All sequences contain at least 10 sets, and only
sets of size at most 5 are considered. Some basic statistics of this
dataset are:
- number of sequences: 93
- number of unique elements appearing in sets: 141
- number of sets: 10,428
- number of unique sets: 649
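
The construction rules and statistics above can be sketched in code. This is a minimal illustration, not the authors' preprocessing script: the function names `filter_sequences` and `summarize`, the in-memory representation (a list of sequences, each a list of `frozenset`s of integer element IDs), and the order of the two filters (drop oversized sets first, then drop short sequences) are all assumptions made for this example. The on-disk file format is not described here, so no loader is shown.

```python
from typing import Dict, FrozenSet, List

# Assumed in-memory representation: one sequence per sender,
# each sequence an ordered list of recipient sets.
Sequence = List[FrozenSet[int]]


def filter_sequences(
    sequences: List[Sequence], max_set_size: int = 5, min_sets: int = 10
) -> List[Sequence]:
    """Keep sets with at most `max_set_size` elements, then keep
    sequences with at least `min_sets` remaining sets.

    The filter order is an assumption; the description above does not state it.
    """
    kept = []
    for seq in sequences:
        small_sets = [s for s in seq if len(s) <= max_set_size]
        if len(small_sets) >= min_sets:
            kept.append(small_sets)
    return kept


def summarize(sequences: List[Sequence]) -> Dict[str, int]:
    """Compute the four basic statistics listed for the dataset."""
    all_sets = [s for seq in sequences for s in seq]
    unique_elements = set().union(*all_sets) if all_sets else set()
    return {
        "num_sequences": len(sequences),
        "num_unique_elements": len(unique_elements),
        "num_sets": len(all_sets),             # counts repeated sets
        "num_unique_sets": len(set(all_sets)),  # distinct sets only
    }


# Toy data (made up for illustration, not taken from the dataset).
toy = [
    [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 2})],
    [frozenset({4}), frozenset({1, 2, 3, 4, 5, 6})],
]
stats = summarize(filter_sequences(toy, max_set_size=5, min_sets=1))
print(stats)
```

Using `frozenset` makes each set hashable, so repeated recipient sets collapse naturally when counting unique sets. Applied to the full dataset with the default thresholds, `summarize` would be expected to reproduce the four figures listed above, provided the assumptions about representation and filter order hold.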

Sequences of sets.
Austin R. Benson, Ravi Kumar, and Andrew Tomkins.
Proceedings of KDD, 2018.