Task I: Citation Prediction
- Winner: J N Manjunatha, Raghavendra Pandey, Sivaramakrishnan R., and M Narasimha Murty (1329)
- Second place: Claudia Perlich, Foster Provost, and Sofus Macskassy (1360)
- Third place: David Vogel (1398)
The number in parentheses after each winner is the L_1 difference between the
solution and the submission.
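As an illustration of the scoring rule, the L_1 difference can be sketched as a sum of absolute per-paper differences. This is a minimal sketch rather than the organizers' scoring script: the function name, the dictionary layout, and the treatment of papers missing from a submission (counted as a prediction of 0) are all assumptions.

```python
def l1_difference(solution, submission):
    """Sum of absolute differences between true and predicted values.

    Both arguments map a paper id to a number. Assumption: a paper that
    is missing from the submission is treated as a prediction of 0.
    """
    return sum(
        abs(true_value - submission.get(paper, 0))
        for paper, true_value in solution.items()
    )


# Hypothetical paper ids and values, for illustration only.
solution = {"paper-a": 3, "paper-b": -2, "paper-c": 0}
submission = {"paper-a": 1, "paper-b": -2, "paper-c": 4}

score = l1_difference(solution, submission)  # |3-1| + |-2+2| + |0-4|
```

A lower score means the submission is closer to the solution, which matches the ordering of the winners above.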
The solution for Task 1 is now available. The first column is the hep-th
arxiv-id and the second column is (# of citations from May-July) - (# of
citations from Feb-April) for all papers that received at least 6 citations
between Feb and April. The full list of new citations for all papers between
May and July is also available.
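The second column described above could be derived from raw citation lists roughly as follows. This is a hedged sketch: the input format (one cited paper id per citation event) and the function name are assumptions, and the actual solution file may have been built differently.

```python
from collections import Counter


def citation_change(feb_apr_cited, may_jul_cited, min_citations=6):
    """Return {paper: (May-July count) - (Feb-April count)}.

    Each input is a list with one entry (the cited paper's id) per
    citation received in that period. Only papers with at least
    `min_citations` citations in Feb-April are kept.
    """
    before = Counter(feb_apr_cited)
    after = Counter(may_jul_cited)  # missing papers count as 0
    return {
        paper: after[paper] - count
        for paper, count in before.items()
        if count >= min_citations
    }


# Hypothetical data: "x" qualifies (6 early citations), "y" does not.
feb_apr = ["x"] * 6 + ["y"] * 2
may_jul = ["x"] * 4 + ["y"] * 9

change = citation_change(feb_apr, may_jul)
```

Here `change` contains only paper "x" with a value of -2, since "y" falls below the 6-citation threshold.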
Task II: Data Cleaning
- Winner: David Vogel (421,582)
- Second place: Sunita Sarawagi, Kapil M. Bhudhia, Sumana Srinivasan, and V.G.Vinod Vydiswaran (516,242)
- Third place: Martine Cadot and Joseph di Martino (538,013)
The number in parentheses after each winner is the size of the symmetric
difference between the submission and the solution.
The solution for Task 2 is a citation graph provided by SLAC/SPIRES for hep-ph
papers available as a zip file. Papers in the
left column cite papers in the right column.
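The symmetric-difference score can be illustrated by treating each citation graph as a set of directed (citing, cited) pairs and counting the pairs that appear in exactly one of the two graphs. This is a sketch under assumptions: the function name and the tuple representation are hypothetical, and the paper ids are made up.

```python
def symmetric_difference_size(submission_edges, solution_edges):
    """Count edges present in exactly one of the two citation graphs.

    Each edge is a (citing_paper, cited_paper) tuple, mirroring the
    left-column-cites-right-column layout of the solution file.
    """
    return len(set(submission_edges) ^ set(solution_edges))


# Hypothetical paper ids, for illustration only.
solution_edges = [("p1", "p2"), ("p1", "p3"), ("p2", "p3")]
submission_edges = [("p1", "p2"), ("p3", "p2"), ("p2", "p3")]

size = symmetric_difference_size(submission_edges, solution_edges)
```

In this example the score is 2: the submission misses ("p1", "p3") and adds ("p3", "p2"). Direction matters, so a reversed citation counts as a different edge.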
Task III: Download Estimation
- Winner: Janez Brank and Jure Leskovec (21,232)
- Second place: Joseph Milana, Joseph Sirosh, Joel Carleton, Gabriela Surpi, Daragh Hartnett, and Michinari Momma (21,950.6)
- Third place: Kohsuke Konishi (23,759)
The number in parentheses after each winner is the L_1 difference between the
contestant's submission and the solution.
The actual download counts for the top 150 papers (50 from each of the three
missing periods) are available here. The left
column is the number of downloads the paper received in its first 60 days and
the right column is the hep-th arxiv-id.
Task IV: Open Task
- Winner: Amy McGovern, Lisa Friedland, Michael Hay, Brian Gallagher, Andrew Fast, Jennifer Neville, and David Jensen. "Exploiting Relational Structure to Understand Publication Patterns in High-Energy Physics"
- Second place: Shou-de Lin and Hans Chalupsky. "Using Unsupervised Link Discovery Methods to Find Interesting Facts and Connections in a Bibliography Dataset"
- Third place: Shawndra Hill and Foster Provost. "The Myth of the Double-Blind Review"
The submissions for Task 4 were evaluated by a small program committee
consisting of the three KDD Cup 2003 co-chairs, Mark Craven (University of
Wisconsin-Madison), David Page (University of Wisconsin-Madison), and Soumen
Chakrabarti (Indian Institute of Technology Bombay).