MSD-1030: A Well-built Multi-Sense Evaluation Dataset for Sense Representation Models

Introduction

MSD-1030 is a word similarity dataset with a high proportion of multi-sense words, designed to support more reliable evaluation of sense embeddings.
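Word similarity datasets are commonly used by scoring each word pair with a model and correlating the model scores with the human ratings. The following is a minimal sketch of that idea, not the evaluation protocol of the MSD-1030 paper. It assumes the MaxSim convention (the similarity of two words is the highest cosine similarity over all of their sense vectors) and Spearman rank correlation. The names SENSES and PAIRS, the vectors, and the ratings are hypothetical toy values; the actual file layout is described in readme.txt.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def max_sim(senses_a, senses_b):
    """MaxSim: the best cosine over every pairing of the two words' sense vectors."""
    return max(cosine(u, v) for u in senses_a for v in senses_b)

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy sense embeddings: "bank" has two senses, the others one.
SENSES = {
    "bank": [[1.0, 0.0], [0.0, 1.0]],
    "money": [[0.9, 0.1]],
    "river": [[0.2, 0.8]],
    "stone": [[-1.0, 0.2]],
}

# Hypothetical (word1, word2, human rating) triples, not taken from MSD-1030.
PAIRS = [
    ("bank", "money", 8.0),
    ("bank", "river", 7.5),
    ("money", "river", 2.0),
    ("bank", "stone", 1.0),
]

gold = [rating for _, _, rating in PAIRS]
predicted = [max_sim(SENSES[a], SENSES[b]) for a, b, _ in PAIRS]
print("Spearman correlation:", spearman(predicted, gold))
```

To try this on the real data, SENSES would be replaced by a trained sense embedding model and PAIRS by the pairs and ratings read from the downloaded files. Other conventions, such as averaging over sense pairs instead of taking the maximum, fit the same structure by swapping out max_sim.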

Download

Click here to download the dataset. The readme.txt file describes each file and its format.

How to Cite This Resource

Please cite the following paper when referring to MSD-1030 in academic publications.

Ting-Yu Yen, Yang-Yin Lee, Yow-Ting Shiue, Hen-Hsen Huang, and Hsin-Hsi Chen. 2020. MSD-1030: A Well-built Multi-Sense Evaluation Dataset for Sense Representation Models. In Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020), May 11-16, 2020, Palais du Pharo, Marseille, France.