sos-tags-math-sx dataset
This dataset is a collection of sequences of sets. Stack Exchange is a
collection of question-and-answer websites. Users post questions and
annotate them with up to 5 tags. In this dataset, each set holds the
tags applied to one question, and each sequence is the time-ordered
list of tag sets for the questions asked by a single user on
Mathematics Stack Exchange.
All sequences contain at least 10 sets, and only sets of size at most
5 are considered. Some basic statistics of this dataset are:
- number of sequences: 15,726
- number of unique elements appearing in sets: 1,650
- number of sets: 517,810
- number of unique sets: 122,099

Reference:
Sequences of sets.
Austin R. Benson, Ravi Kumar, and Andrew Tomkins.
Proceedings of KDD, 2018.
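
The construction and the four statistics described above can be
illustrated with a small Python sketch. It is not the authors' code and
does not read the dataset's files, whose format is not described here.
The input records of the form (user, timestamp, tags), the function
names build_sequences and summarize, and the toy data are all
hypothetical. The sketch also assumes that sets with more than 5 tags
are dropped before the minimum sequence length is checked.

```python
from collections import defaultdict

MAX_SET_SIZE = 5     # "only sets of size at most 5 are considered"
MIN_SEQ_LENGTH = 10  # "all sequences contain at least 10 sets"


def build_sequences(posts, min_len=MIN_SEQ_LENGTH, max_size=MAX_SET_SIZE):
    """Group hypothetical (user, timestamp, tags) records into
    time-ordered sequences of tag sets, one sequence per user."""
    by_user = defaultdict(list)
    for user, timestamp, tags in posts:
        tag_set = frozenset(tags)
        # Assumption: oversized (or empty) sets are simply skipped.
        if 0 < len(tag_set) <= max_size:
            by_user[user].append((timestamp, tag_set))

    sequences = {}
    for user, items in by_user.items():
        if len(items) >= min_len:
            items.sort(key=lambda pair: pair[0])  # order by timestamp
            sequences[user] = [tag_set for _, tag_set in items]
    return sequences


def summarize(sequences):
    """Compute the four basic statistics listed in the description."""
    all_sets = [s for seq in sequences.values() for s in seq]
    return {
        "sequences": len(sequences),
        "unique_elements": len(set().union(*all_sets)) if all_sets else 0,
        "sets": len(all_sets),
        "unique_sets": len(set(all_sets)),
    }


# Toy data (made up); min_len is lowered to 2 so the example stays short.
posts = [
    ("alice", 3, ["calculus", "limits"]),
    ("alice", 1, ["algebra"]),
    ("alice", 2, ["limits", "calculus"]),
    ("bob", 1, ["probability"]),
]
seqs = build_sequences(posts, min_len=2)
stats = summarize(seqs)
print(stats)
```

In the toy data, bob has a single question and is filtered out, while
alice's two questions tagged with calculus and limits count as two sets
but only one unique set, because a set ignores tag order.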