Linguistic Data Consortium
“The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories. LDC was formed in 1992 to address the critical data shortage then facing language technology research and development. Initially, LDC’s primary role was as a repository and distribution point for language resources. Since that time, and with the help of its members, LDC has grown into an organization that creates and distributes a wide array of language resources. LDC also supports sponsored research programs and language-based technology evaluations by providing resources and contributing organizational expertise. LDC is hosted by the University of Pennsylvania and is a center within the University’s School of Arts and Sciences. LDC’s connection with Penn provides a strong foundation for the Consortium’s research and outreach to an active and diverse member community.” – From About LDC
UVA does not pay membership fees, but is listed as the umbrella organization. Users should create an individual account listing University of Virginia as their institution; a local administrator will then approve the account. (Jenn Huck, Data Librarian, and Erin Pappas, Linguistics Librarian, are the local administrators.) Licensing fees are paid by the end user, by invoice or credit card; the library does not pay for any data or corpora.
There is no cost to create an account or to download data already in the account. (Please note that the account contains some “E” datasets that belong to a specific Evaluation someone participated in previously; these are not for public download.) Users can request data as non-members: each dataset in the LDC catalog has a “View Fees” button that shows the fee schedule. Users can either license data by credit card through the online transaction system or request that an invoice be sent. Some corpora are free.
See Licensing Data as a Nonmember for information about obtaining data, licensing, and paying for data.
Questions? Contact email@example.com.
Last Updated: 30 Oct 2019