HENRY the Human Evolution News Relay

22 May 2008

“Lexomics” – Breaking the language barrier

Emma Marris in today's Nature reviews my field of research, and chats to a number of my friends and colleagues:

In the past five to ten years, more and more non-linguists such as Pagel have used the computational tools with which they model evolution to take a crack at languages. And one can see why. Like biological species, languages slowly change and sometimes split over time. Darwin's Galapagos finches evolved either large beaks or small; Latin amor became French amour and Italian amore. Darwin himself noted the 'curious parallel' between the evolution of languages and species in The Descent of Man, and Selection in Relation to Sex.

The advent of molecular genetics provided a new depth to the analogy. Just as the four nucleotides of DNA can produce a staggering variety of creatures, the alphabets of the world's languages can generate an infinite number of sentences. These alphabets, the words they make, and the sounds and grammar rules that frame them are passed down from parent to child in a process that, at least superficially, resembles the inheritance of DNA.

Even some complications are the same. Just as species can shade off into a maddening continuum of subspecies, populations and hybrids, languages dissolve into an untidy collection of dialects and intermediate forms. And the rampant borrowing of words between languages resembles, graphically at least, the promiscuous horizontal gene transfer that microbes engage in.

The full story is here, and I've written about some of the (our) research here before.

Comments (2) Trackbacks (0)
  1. The trouble with “lexomics”, as some of the commenters on the Nature article pointed out, is that the process is Lamarckian, not Darwinian; it’s driven, not followed. If I was a lithping king, I could make all my thubjects lithp without too much trouble.

    The other major problem is that there aren’t any fossils*. All the ancestors are hypothetical proto-languages. If you take all the most common characteristics of an existing clade (or as many as you can find) and distill them down to the lowest common denominators, you’ll end up with a ‘proto-language’.

    But you can’t be at all sure that major characteristics of the original ancestral language have not been entirely lost, or preserved in only a minority of the existing remnant languages. (Which you ignored, just because they were a minority).

    Then, to trace the ‘descendants’ from this hypothetical language is absurd.

    Even then, though, I hope some of the newer generation of linguists (you, Simon? – please) can use the mechanical/statistical techniques used by geneticists to resolve some major ‘language family tree’ problems, like the star-like pattern of supposed descendants from proto-Austronesian and proto-Oceanic.

    regards

    Richard
    *Except where we have surviving scripts. But it was pointed out a long time ago that if the Comparative Method was used, retrospectively, on the Romance languages, the resulting proto-language would NOT be Latin.

  2. Hi Richard, thanks for the comment(s)!

    The trouble with “lexomics”, as some of the commenters on the Nature article pointed out, is that the process is Lamarckian, not Darwinian; it’s driven, not followed. If I was a lithping king, I could make all my thubjects lithp without too much trouble.

    Sure, but this is entirely irrelevant. The methods we’re using (phylogenetics) are top-down views of evolution and just reconstruct the history. They’re relying on a system that evolves through “descent with modification” and not necessarily “survival of the fittest”. We discuss this in more detail in a paper called The Pleasures and Perils of Darwinizing Culture.

    The same goes for fossils – phylogenetic methods absolutely don’t need fossils. If we have them, then we can incorporate them. However, modern languages carry around remnants of their ancestors in them. That is, two languages that are closely related should share a lot of features with their ancestor. One of the benefits of phylogenetics is that we can infer ancestral states using probabilistic models to estimate the most likely ancestral variant. In fact, the Bayesian likelihood methods we’re using deliberately sum across all possible ancestral state assignments at every node on every tree. Yes, it is just an estimate, but it’s the closest we can get without a time machine. I’m working on a paper explaining this in more detail, though it’s quite a way off yet; in the meantime, Quentin and Russell’s paper here might help explain a bit.
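
    To make the "sum across all possible ancestral state assignments" idea a bit more concrete, here is a minimal Python sketch. It is only an illustration, not the software or model we actually use: it assumes a single binary trait (say, a cognate being present or absent), a symmetric two-state Markov model, one fixed tree, and one fixed rate. The languages A, B, C, the branch lengths, and all function names are made up for the example. A real Bayesian analysis would also average over many trees and parameter values.

```python
import math

def transition_prob(i, j, rate, t):
    """Symmetric two-state Markov model (an assumption for this sketch):
    probability of ending in state j after time t, having started in state i."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return same if i == j else 1.0 - same

def partial_likelihoods(node, rate):
    """Return [L(0), L(1)]: the probability of the observed tip states below
    `node`, given that `node` itself is in state 0 or 1. Summing over the
    child's state at every branch is what implicitly sums over every
    possible assignment of ancestral states (Felsenstein's pruning)."""
    if "state" in node:  # a leaf, i.e. an observed modern language
        return [1.0 if s == node["state"] else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child, branch_length in node["children"]:
        child_l = partial_likelihoods(child, rate)
        for s in (0, 1):
            result[s] *= sum(
                transition_prob(s, c, rate, branch_length) * child_l[c]
                for c in (0, 1)
            )
    return result

def root_posterior(tree, rate, prior=(0.5, 0.5)):
    """Posterior probability of each state at the root, plus the total
    likelihood of the observed data under this tree and rate."""
    l = partial_likelihoods(tree, rate)
    joint = [prior[s] * l[s] for s in (0, 1)]
    total = sum(joint)
    return [j / total for j in joint], total

# Hypothetical tree ((A:1, B:1):1, C:2): A and B have the trait, C does not.
example_tree = {
    "children": [
        ({"children": [({"state": 1}, 1.0), ({"state": 1}, 1.0)]}, 1.0),
        ({"state": 0}, 2.0),
    ]
}

if __name__ == "__main__":
    posterior, likelihood = root_posterior(example_tree, rate=0.3)
    print("P(root = absent), P(root = present):", posterior)
    print("likelihood of the data:", likelihood)
```

    The point is that the ancestor is never "picked" by majority vote: every possible history is weighted by how probable it is, and the estimate for the root comes with an explicit uncertainty attached.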

    Hope this helps!
    –Simon
