language context, ambiguity and inference

This article on today’s MIT front page sketches an argument that I’ve been thinking about for a while, ever since the IBM Watson Jeopardy contest — that natural language processing is hopeless to the extent that there is additional context (side information) not visible to the machine.

Many prominent linguists … have argued that language is … poorly designed for communication [and] … point to the existence of ambiguity: In a system optimized for conveying information…, they argue, each word would have just one meaning. … Now, a group of MIT cognitive scientists has turned this idea on its head … [T]hey claim that ambiguity actually makes language more efficient, by allowing for the reuse of short, efficient sounds that listeners can easily disambiguate with the help of context.

Although the article is only talking about homophonic ambiguity at the word level, the same argument applies at every level of language, up to full messages whose decoding requires potentially deep context.
(Read the article)
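The argument has a clean information-theoretic reading: when speaker and listener share context, the listener’s side information pays part of the decoding cost, so one short codeword can safely serve several meanings. Here is a toy sketch of that point in Python; the distribution is entirely invented, and the only claim is that conditioning on context lowers the entropy of the word.

```python
import math
from collections import defaultdict

# Toy joint distribution P(context, word). The numbers are invented
# purely for illustration; the short sound "bank" is ambiguous
# (river edge vs. financial institution), but context resolves it.
joint = {
    ("finance", "bank"):  0.25,
    ("finance", "loan"):  0.25,
    ("river",   "bank"):  0.25,
    ("river",   "canoe"): 0.25,
}

def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} dict."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Marginals P(word) and P(context).
p_word, p_ctx = defaultdict(float), defaultdict(float)
for (c, w), p in joint.items():
    p_word[w] += p
    p_ctx[c] += p

# Conditional entropy H(word | context) = sum_c P(c) H(word | context=c).
h_cond = sum(
    pc * entropy({w: p / pc for (c2, w), p in joint.items() if c2 == c})
    for c, pc in p_ctx.items()
)

print(f"H(word)           = {entropy(p_word):.2f} bits")   # 1.50
print(f"H(word | context) = {h_cond:.2f} bits")            # 1.00
```

It prints H(word) = 1.50 bits but H(word | context) = 1.00 bits: once context is shared, reusing the short sound “bank” costs the listener nothing.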

watson v. mit

(Photo of the event: http://cache.boston.com/resize/bonzai-fba/Globe_Photo/2011/02/14/1297740468_0202/539w.jpg)

So, being at the event captured in the photo above, I got to ask a question toward the end. Actually, I asked two. The first was whether Watson, once it rang in, would use the remaining three seconds or so to continue computing. Gondek said it would if that helped. In actual competition that doesn’t appear to have been the case, as the buzz-in thresholding condition ensured that further computation would not have been helpful. The second question was a follow-up on Watson’s identified weakness: learning “common sense” knowledge. I asked what path AI research would take to tackle such knowledge, which is, by its very definition, “not in the books.” Gondek said that IBM is building up semantic information (e.g., a “report” is something that can be “turned in” and “assessed,” etc.) from corpora. That wasn’t exactly what I was asking, however.
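Back to the first question for a moment. My reading of the thresholding answer is a decision rule of roughly this shape: ring in only when the top candidate’s confidence already clears a cutoff, so whatever computation happens after the buzz rarely changes the answer. The sketch below is my own guess at such a rule; the names, the fixed threshold, and the sample numbers are all hypothetical, not IBM’s.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    answer: str
    confidence: float  # estimated probability the answer is correct

# Hypothetical fixed cutoff; the real system presumably tuned this
# against game state, but a constant is enough to show the logic.
BUZZ_THRESHOLD = 0.80

def should_buzz(candidates: list[Candidate]) -> Candidate | None:
    """Ring in only if the best candidate already clears the bar.

    Because buzzing is gated on confidence, computation done after
    the buzz would rarely overturn the top answer, consistent with
    what the broadcast matches showed.
    """
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= BUZZ_THRESHOLD else None

# Confident enough to buzz on the first clue, but not the second.
print(should_buzz([Candidate("Toronto", 0.31), Candidate("Chicago", 0.92)]))
print(should_buzz([Candidate("Toronto", 0.31), Candidate("Chicago", 0.45)]))
```

Under a rule like this, the seconds after the buzz matter only in marginal cases near the cutoff.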

To return to the second question: my point was to ask whether all “knowledge” is written down. There is such a thing as experiential “knowledge,” which humans take years to learn and be trained in through parenting (i.e., to “mature”). If only there were a handbook on life, or if life could be learned by reading a series of textbooks, then perhaps I’d believe that the kind of general-purpose AI most people are probably imagining (rather than an expert/Q&A system) could be achieved along the lines of current methods.

fail

Must have been a piece of work by MIT students… Windsor Street near Mass. Ave.