The following file is a list of words from a wide range of Indo-European languages. The words are given here without diacritics. The list was used for a ‘lexicostatistical’ study of Indo-European. Lexicostatistics is a method that comparative linguists use to estimate the date at which related languages diverged. Rather than a complete dictionary, it uses a short list of 100 or 200 words that are known to change at a slow rate. The words are analyzed using classic comparative linguistic methods to extract sets of ‘cognates’, that is, words that can be related by consistent sound changes. Then the number of cognates shared by each pair of languages is counted. Because the rate of evolution of Indo-European is known from historical records, this study is an important baseline. It can be used to derive chronologies for language groups for which there is no historical record, for instance the languages of New Guinea or the indigenous languages of the Americas.
Comparative Indo-European Database, collected by Isidore Dyen (875,998 bytes).
Data Copyright © 1997 by Isidore Dyen, Joseph Kruskal, and Paul Black. Redistributable for academic, non-commercial purposes.
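The counting step of a lexicostatistical study can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure or file format used for this database: the toy data, the language names X, Y and Z, and the function names are all hypothetical. The last function applies the classic glottochronological conversion t = ln(c) / (2 ln(r)), where c is the shared fraction and r is an assumed retention rate per millennium; whether the original study used this formula is not stated here.

```python
import math
from itertools import combinations

# Hypothetical toy data: meaning -> {language: cognate-class label}.
# Languages and labels are placeholders, not taken from the database.
TOY_CLASSES = {
    "water": {"X": 1, "Y": 1, "Z": 2},
    "two":   {"X": 1, "Y": 1, "Z": 1},
    "dog":   {"X": 1, "Y": 2, "Z": 2},
    "fire":  {"X": 1, "Y": 1, "Z": 2},
}


def shared_fraction(classes_by_meaning, lang_a, lang_b):
    """Fraction of meanings for which both languages have the same cognate class."""
    compared = 0
    shared = 0
    for classes in classes_by_meaning.values():
        # Only count meanings attested in both languages.
        if lang_a in classes and lang_b in classes:
            compared += 1
            if classes[lang_a] == classes[lang_b]:
                shared += 1
    if compared == 0:
        raise ValueError("no meanings attested in both languages")
    return shared / compared


def pairwise_shared(classes_by_meaning):
    """Shared-cognate fraction for every unordered pair of languages."""
    languages = sorted({lang for c in classes_by_meaning.values() for lang in c})
    return {
        (a, b): shared_fraction(classes_by_meaning, a, b)
        for a, b in combinations(languages, 2)
    }


def divergence_millennia(shared, retention_rate):
    """Classic glottochronology estimate: t = ln(c) / (2 * ln(r)).

    retention_rate is an assumption supplied by the caller (the fraction of
    the word list one language retains per millennium).
    """
    if not 0 < shared <= 1:
        raise ValueError("shared fraction must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention rate must be in (0, 1)")
    return math.log(shared) / (2 * math.log(retention_rate))


if __name__ == "__main__":
    for pair, fraction in pairwise_shared(TOY_CLASSES).items():
        print(pair, fraction)
```

Keeping the retention rate as an explicit argument makes the calibration visible: the Indo-European data, whose history is documented, is what would supply that value before the same arithmetic is applied to families with no written record.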
This is a set of small vocabularies of some of the more obscure members of Indo-European, mostly languages only known from very ancient inscriptions: