Rates of word evolution: The less a word is used, the faster it evolves
Today’s Nature has a couple of lovely little papers on language evolution. The first of these is Frequency of word-use predicts rates of lexical evolution throughout Indo-European history by some colleagues and friends of mine, Mark Pagel, Quentin Atkinson, and Andy Meade.
Here, Mark et al. begin to explore a rather important issue in historical linguistics: how fast do words evolve? To put it simply, one common way of inferring historical relationships between languages is by looking for systematic sound correspondences in words. For example, the English word “water” is related to the German word “Wasser” by a simple sound change from “t” to “s”. We can track these changes to find sets of homologous words or “cognates”, which are words that have come from some common ancestral language.
For example, I’ve mapped the words meaning “hand” in a number of Austronesian languages (from my research project) in the figure below. The colors of the words mark the different homologous sets. The white set, denoting some form like lima, is very widespread and represents a form that’s been passed down from the ancestor of all Austronesian languages, Proto-Austronesian, which had an inferred form like *(qa)lima. You can also see a few other sets in that figure, including a few forms in Micronesia (colored red and orange), and a small blue set in Maluku.
So, given that we can use these bits of information to track historical relationships and do cool things with them, one of the big questions is how stable these things are. As time goes by, these words will be evolving, changing, and dying out, making it harder and harder to find the systematic sound correspondences needed. Most linguists would argue that, in their experience, these sorts of historical relationships in the lexicon become washed out by around 10,000 years. Indeed, in the picture above, none of these languages is older than around 6,000 years, and some of them, such as the Micronesian group, can’t be older than around 2,000 years. The word for “hand” is also exceptionally stable over time.
To try and quantify this, Mark, Quentin and Andy have estimated the rates of evolution of a number of words in the Indo-European languages (i.e. most of the languages in Europe, including things like English, French and German). To do this, they used a sample of basic vocabulary to estimate a phylogenetic tree of these languages representing their historical relationships. They then estimated the replacement rate of each cognate set along the tree. That is, how long does it take for one homologous/cognate set to be replaced by another, unrelated form?
Their results show a 100-fold difference in the rates of word evolution within their basic vocabulary sample. Some of the words like “two”, “who”, “tongue”, “night” etc show a very slow rate of evolution with around one cognate set replacement per 10,000 years. Other words like “dirty”, “to stab”, and “guts” changed much faster with around 9 cognate set replacements per 10,000 years.
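To get a feel for what those rates mean, here is a rough sketch that turns “replacements per 10,000 years” into a half-life and a chance of surviving a given time span. It assumes replacements happen at a constant rate (a simple Poisson process), which is my simplification for illustration and not necessarily the model used in the paper. The function names are made up for this example.

```python
import math

def half_life_years(replacements_per_10k_years):
    """Time after which a word has a 50% chance of having been replaced.

    Assumes a constant replacement rate (Poisson process); this is an
    illustrative simplification, not the paper's model.
    """
    rate_per_year = replacements_per_10k_years / 10_000
    return math.log(2) / rate_per_year

def retention_probability(replacements_per_10k_years, years):
    """Chance that no replacement has happened after `years` years."""
    rate_per_year = replacements_per_10k_years / 10_000
    return math.exp(-rate_per_year * years)

# The two approximate rates quoted in the post.
for label, rate in [("slow word (about 1 per 10,000 years)", 1.0),
                    ("fast word (about 9 per 10,000 years)", 9.0)]:
    print(label,
          "half-life:", round(half_life_years(rate)), "years;",
          "retained after 10,000 years:",
          round(retention_probability(rate, 10_000), 4))
```

Under this assumption the half-life scales inversely with the rate, so a word replaced nine times as often has one ninth of the half-life.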
Big question: why is there this difference? One of the theories is that how fast a word evolves is linked to how often it’s used, so words that are used a lot don’t change as much as words that are used rarely. To test this hypothesis, Mark & co worked out how often their words were used in four large spoken and written language databases (aka “corpora”) for English, Spanish, Russian and Greek.
In each of these four languages, there was a strong, significant negative correlation (from r = -0.32 to r = -0.41, p < 0.0001) between the frequency of words in the corpora and their rates of evolution. That is, words that are used more today had slower rates of evolution, supporting the above hypothesis.
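For anyone who wants to see what that kind of test looks like, here is a minimal sketch of a Pearson correlation between word frequency and replacement rate. The numbers are invented purely for illustration and are not from the paper or from any corpus. Taking the log of frequency is a common choice for word counts and is an assumption here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# INVENTED toy data: uses per million words, and cognate replacements
# per 10,000 years, for six hypothetical words.
freq_per_million = [5000, 1200, 300, 80, 20, 5]
replacement_rate = [0.5, 1.2, 2.0, 3.5, 6.0, 8.5]

log_freq = [math.log10(f) for f in freq_per_million]
r = pearson(log_freq, replacement_rate)
print("correlation between log frequency and rate:", round(r, 2))
```

A negative r means the more frequent words in the toy sample have the lower rates, which is the direction of the effect reported above. The toy data are far tidier than real data, so the value itself means nothing.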
One potential flaw here was that different parts of speech are used with different frequencies, so it could be possible that they were just showing this effect. To check this, they recategorised their entries into classes like nouns, verbs, pronouns, conjunctions, prepositions or special adverbs (“what”, “when”, “where”, “how”, “there”, and “not”). Using a regression model, they controlled for this effect and showed that the correlation between word-use and rate of evolution still held for each class of word.
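One simple way to “control for” word class can be sketched like this: subtract each class’s own average from both variables, then fit a slope to what is left. That gives the same slope as a regression with a separate intercept for each class. This is only an illustration of the idea, not necessarily the regression model the authors used, and the data are invented.

```python
from collections import defaultdict

def center_within_groups(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# INVENTED toy data: two word classes with different average frequencies
# and rates, but the same frequency/rate trend inside each class.
word_class = ["noun", "noun", "noun", "verb", "verb", "verb"]
log_freq = [1.0, 2.0, 3.0, 0.5, 1.5, 2.5]
rate = [5.0, 4.0, 3.0, 8.0, 7.0, 6.0]

within_slope = slope(center_within_groups(log_freq, word_class),
                     center_within_groups(rate, word_class))
print("slope within word classes:", within_slope)
```

If the relationship were only a side effect of some classes being both rarer and faster-changing, the within-class slope would be close to zero. A clearly negative slope means the trend also holds inside each class.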
This is beautifully elegant stuff, and it has a lot of potential. One of the huge debates in linguistics revolves around deep history - as mentioned above, most linguists argue that all historical signal is lost by around 10,000 years. However, as this paper shows, there are some very stable words that might be able to be used to push things a little bit deeper…
Update: Watch Nature interview Mark and Quentin here.
Posted on October 10th, 2007 by Simon Greenhill
2 Responses to “Rates of word evolution: The less a word is used, the faster it evolves”
October 11th, 2007 at 7:31 pm
Reminds me of this article Philological Considerations on the Whence of the Maori from 1873:
October 12th, 2007 at 3:29 am
What a cool paper! Thanks for pointing that out, Conal, I’m reading it now
–Simon