We study distributional similarity measures for the purpose of improving probability estimation for unseen cooccurrences. Our contributions are three-fold: an empirical comparison of a broad range of measures; a classification of similarity functions based on the information that they incorporate; and the introduction of a novel function that is superior at evaluating potential proxy distributions.
@inproceedings{Lee:99a,
  author = {Lillian Lee},
  title = {Measures of Distributional Similarity},
  year = {1999},
  pages = {25--32},
  booktitle = {Proceedings of the ACL},
  note = {25-year Test of Time award, ACL 2024}
}
This material is based upon work supported by the National Science Foundation under Grant No. IRI9712068. Any opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily reflect the views or official policies, either expressed or implied, of any sponsoring institutions, the U.S. government, or any other entity.