Why Rao et al.'s work proves nothing.

Richard Sproat
September, 2009; January, 2010

The popular press continues to wax enthusiastic about the recent work by Rajesh Rao (U. Washington) and colleagues (Science, PNAS), which most discussions claim "proves" that the Indus Valley symbol system was a true writing system -- because it has structure. In fact, they seem incapable of shutting up about it, with new stories appearing almost every week. I won't give links here, since that would only give Rao more free press than he already has: a Google search on "Rao Indus Valley" or things of that nature will bring up many links.

(A particularly stupid article in The National, based in Abu Dhabi, featuring one of Rao's collaborators, Ronojoy Adhikari, just appeared on January 7, 2010. Among the many things we learn in that article are that English writing is syllabic, Chinese writing has "an ideogram for every conceivable word", and Rao and colleagues' software reported in their Science article "worked for 45 minutes". My own reimplementation of Rao et al.'s work took a few seconds to run on any given corpus of the size they were dealing with. So unless the reporter who wrote the article got his facts completely wrong, which is entirely possible, they must have been doing something very, very wrong.)

The popular discussion also frequently misrepresents the arguments from the 2004 paper that Farmer, Witzel and I wrote arguing that the system was not linguistic at all. A sampling of such discussion would have it that we were acting "perhaps out of befuddlement and frustration" with attempts to make sense of the symbols, or that we argued that the symbols were just pretty pictures. Such speculations are amusing only insofar as they underscore the degree to which the reporters in question did not do their homework.

Notably absent in the vast majority of the popular write-ups is any evidence whatsoever that the reporters attempted to get input from the one group of experts who are most qualified to evaluate the work: computational linguists.

Not only the popular press, but also various bloggers have come out on the side of Rao et al.: the discussion in such venues often makes it abundantly clear that the discussants really do not understand how language works, how writing systems work, how non-linguistic symbol systems work, and what kind of evidence (if any) statistical arguments are likely to be able to provide.

Clearly people want it to be true that the Indus symbols constitute the writing system of a mysterious ancient civilization. Clearly people want to believe they can be deciphered. Clearly people want to believe that Rao and colleagues have used advanced computational techniques to uncover a crucial path towards this eventual goal. Unfortunately, the fact that these beliefs make for a pretty story does not mean that there is any basis for any of them.

In lieu of a much longer response I will just point out three simple things:

If you understand just these three points you will understand enough to critically evaluate the contribution that Rao et al. have supposedly made.

Feel free to write to me at are double you ess at ex oh bee ay dot com if you think you can argue that Rao et al. have demonstrated that the Indus symbols must have been writing, and could not have been a non-linguistic system. I warn you though that the argument had better be a good one: I have seen quite a few attempts to support the work with further argumentation and nothing so far holds up under even mild scrutiny.