For the sake of simplicity: unsupervised extraction of lexical simplifications from Wikipedia.

Mark Yatskar, Bo Pang, Cristian Danescu-Niculescu-Mizil and Lillian Lee.

Proceedings of NAACL HLT, 2010. Short paper.



Links: PDF, Poster, Data, xkcd



ABSTRACT:

                                   

We report on work in progress on extracting lexical simplifications (e.g., “collaborate” → “work together”), focusing on utilizing edit histories in Simple English Wikipedia for this task. We consider two main approaches: (1) deriving simplification probabilities via an edit model that accounts for a mixture of different operations, and (2) using metadata to focus on edits that are more likely to be simplification operations. We find our methods to outperform a reasonable baseline and yield many high-quality lexical simplifications not included in an independently-created manually prepared list.
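As a rough illustration of approach (2), the sketch below keeps only edits whose revision comment suggests a simplification and collects the word or phrase substitutions they contain. The abstract does not say which metadata is used, so the comment test (a match on "simpl") is an assumption made for this example, as are the function names and the (comment, before, after) input format. The sketch does not implement the edit model of approach (1).

```python
import difflib
import re
from collections import Counter

# Assumption for illustration: an edit is treated as a likely simplification
# when its comment mentions "simpl" (simplify, simplified, simpler, ...).
SIMPL_COMMENT = re.compile(r"simpl", re.IGNORECASE)


def phrase_substitutions(before, after):
    """Return (old phrase, new phrase) pairs for token spans replaced between two sentences."""
    a, b = before.split(), after.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag == "replace"
    ]


def extract_candidates(revisions):
    """Count substitution pairs over revisions whose comment looks like a simplification.

    `revisions` is a hypothetical iterable of (comment, before, after) tuples.
    """
    counts = Counter()
    for comment, before, after in revisions:
        if not SIMPL_COMMENT.search(comment):
            continue  # skip edits with no sign of simplification intent
        counts.update(phrase_substitutions(before, after))
    return counts


if __name__ == "__main__":
    demo = [
        ("simplified wording", "They collaborate on research", "They work together on research"),
        ("fix vandalism", "The cat sat down", "The dog sat down"),
    ]
    for pair, n in extract_candidates(demo).most_common():
        print(pair, n)
```

Counting pairs across many filtered edits gives a crude ranking of candidate simplifications. A real system would also need sentence alignment between revisions, which this sketch takes as given.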



BibTeX ENTRY:

                                    

@InProceedings{Yatskar+al:10a,
   author={Mark Yatskar and Bo Pang and Cristian Danescu-Niculescu-Mizil and
   Lillian Lee},
   title={For the sake of simplicity: {Unsupervised} extraction of lexical
   simplifications from {Wikipedia}},
   booktitle={Proceedings of NAACL HLT},
   year={2010},
   pages={365--368},
   annote={Short paper}
}