<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Some stuff &#187; mit</title>
	<atom:link href="http://blog.yhuang.org/?feed=rss2&#038;tag=mit" rel="self" type="application/rss+xml" />
	<link>https://blog.yhuang.org</link>
	<description>here.</description>
	<lastBuildDate>Wed, 27 Aug 2025 08:50:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>language context, ambiguity and inference</title>
		<link>https://blog.yhuang.org/?p=788</link>
		<comments>https://blog.yhuang.org/?p=788#comments</comments>
		<pubDate>Thu, 19 Jan 2012 23:49:49 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ambiguity]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[constraint]]></category>
		<category><![CDATA[inference]]></category>
		<category><![CDATA[language context]]></category>
		<category><![CDATA[mit]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[today]]></category>

		<guid isPermaLink="false">http://scripts.mit.edu/~zong/wpress/?p=788</guid>
		<description><![CDATA[This article on today&#8217;s MIT front page sketches an argument that I&#8217;ve been thinking about for a while, ever since the IBM Watson Jeopardy contest &#8212; that natural language processing is hopeless to the extent that there is additional context (side information) not visible to the machine. Many prominent linguists &#8230; have argued that language [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://web.mit.edu/newsoffice/2012/ambiguity-in-language-0119.html">This article</a> on today&#8217;s MIT front page sketches an argument that I&#8217;ve been thinking about for a while, ever since the <a href="?p=306">IBM Watson</a> Jeopardy contest &#8212; that natural language processing is hopeless to the extent that there is additional context (side information) not visible to the machine.</p>
<blockquote><p>Many prominent linguists &#8230; have argued that language is &#8230; poorly designed for communication [and] &#8230; point to the existence of ambiguity: In a system optimized for conveying information&#8230;, they argue, each word would have just one meaning. &#8230; Now, a group of MIT cognitive scientists has turned this idea on its head &#8230; [T]hey claim that ambiguity actually makes language more efficient, by allowing for the reuse of short, efficient sounds that listeners can easily disambiguate with the help of context.</p></blockquote>
<p>Although this is just talking about homophonic ambiguity at the word level, the same applies to all levels of language, including full messages whose decoding requires potentially deep context.<br />
<span id="more-788"></span><br />
It isn&#8217;t surprising that human speakers use context &#8212; ad-hoc or shared understanding &#8212; to save effort. It is annoying to have to spell out everything and delightful when the listener understands anyway. It is like a communication problem, but with a power constraint on the encoder and a complexity constraint on the decoder. The listener (decoder) is really trying to choose the most probable meaning from all the possibilities, and the speaker (encoder) attempts to construct &#8212; specifically for the decoder&#8217;s context &#8212; an utterance whose induced distribution over meanings is peaked at the intended one, so that the intended meaning carries the largest probability.</p>
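<p>The decoder side of this analogy can be sketched as MAP (maximum a posteriori) decoding, with the shared context acting as the prior over meanings. The homophone, contexts, and probabilities below are invented for illustration; they are not from the article:</p>

```python
# A minimal sketch of context-dependent disambiguation as MAP decoding:
# the listener picks the meaning m maximizing P(m | context) * P(utterance | m).
# All numbers here are hypothetical, chosen only to illustrate the idea.

# context-dependent priors over meanings of the homophone "bank"
prior = {
    "finance": {"institution": 0.9, "riverside": 0.1},
    "fishing": {"institution": 0.1, "riverside": 0.9},
}

# likelihood of the speaker uttering "bank" given each intended meaning
likelihood = {"institution": 0.8, "riverside": 0.8}

def decode(utterance_likelihood, context):
    """Return the most probable meaning of the utterance under the shared context."""
    posterior = {m: prior[context][m] * utterance_likelihood[m]
                 for m in utterance_likelihood}
    return max(posterior, key=posterior.get)

print(decode(likelihood, "finance"))  # -> institution
print(decode(likelihood, "fishing"))  # -> riverside
```

<p>The same short, &#8220;efficient&#8221; sound decodes to different meanings purely because the prior shifts with context &#8212; which is exactly the reuse argument in the quoted study.</p>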
<p>A few thoughts on this. One, successful encoder-decoder pairs synchronize context. Two, viewed in this way, double entendres and certain jokes are not mere oddities of language, but <strong>sophisticated communication schemes</strong>!</p>
<p>One could go further and argue that there is even a counterpart to steganography, because there is a certain amusement in hiding most of the intended information within the context, while making the surface message, stripped of that context, empty or obfuscated for any other decoder. An encoder that pulls this off knows its decoder well, and a decoder that successfully recovers the message possesses either a large amount of shared context with the encoder or a sophisticated inference capability.</p>
]]></content:encoded>
			<wfw:commentRss>https://blog.yhuang.org/?feed=rss2&#038;p=788</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>watson v. mit</title>
		<link>https://blog.yhuang.org/?p=306</link>
		<comments>https://blog.yhuang.org/?p=306#comments</comments>
		<pubDate>Thu, 17 Feb 2011 02:52:42 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[common sense]]></category>
		<category><![CDATA[condition]]></category>
		<category><![CDATA[experiential knowledge]]></category>
		<category><![CDATA[general purpose]]></category>
		<category><![CDATA[image]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[mit]]></category>
		<category><![CDATA[sense knowledge]]></category>
		<category><![CDATA[watson]]></category>

		<guid isPermaLink="false">http://allegro.mit.edu/~zong/wpress/?p=306</guid>
		<description><![CDATA[So being at the event captured in the image, I got to ask a question toward the end. Actually I asked two questions. The first was whether Watson would ring in and use the remaining 3 seconds or whatever to continue to compute. Gondek said it would if it helped. In actual competition it doesn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p><img src="wp-content/uploads/images/539w.jpg" alt="http://cache.boston.com/resize/bonzai-fba/Globe_Photo/2011/02/14/1297740468_0202/539w.jpg" /></p>
<p>So being at the event <a href="http://www.boston.com/business/technology/articles/2011/02/15/computer_holds_its_own_in_1st_jeopardy_contest/">captured in the image</a>, I got to ask a question toward the end. Actually I asked two questions. The first was whether Watson would ring in and use the remaining 3 seconds or whatever to continue to compute. Gondek said it would if it helped. In actual competition that doesn&#8217;t appear to have been the case, as the buzz-in thresholding condition ensured that further computation would not have been helpful. The second question was a follow-up on the identified weakness of Watson &#8212; learning &#8220;common sense&#8221; knowledge. I asked what path AI research would take to tackle such knowledge, which is, by its very definition, &#8220;not in the books.&#8221; Gondek said that IBM is building up semantic information (e.g. a &#8220;report&#8221; is something that can be &#8220;turned in&#8221; and &#8220;assessed,&#8221; etc.) from corpora. That wasn&#8217;t exactly what I was asking, however.</p>
<p>My point was whether all &#8220;knowledge&#8221; is written down. There is such a thing as experiential &#8220;knowledge,&#8221; and humans take years to learn it through experience and parenting (i.e., to &#8220;mature&#8221;). If only there were a handbook on life, or if life could be learned by reading a series of textbooks, then perhaps I&#8217;d believe that the kind of general-purpose AI that most people are probably imagining (rather than an expert/Q&#038;A system) could be achieved along the lines of current methods.</p>
]]></content:encoded>
			<wfw:commentRss>https://blog.yhuang.org/?feed=rss2&#038;p=306</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>fail</title>
		<link>https://blog.yhuang.org/?p=239</link>
		<comments>https://blog.yhuang.org/?p=239#comments</comments>
		<pubDate>Mon, 15 Feb 2010 06:48:24 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Ave]]></category>
		<category><![CDATA[mass ave]]></category>
		<category><![CDATA[mit]]></category>
		<category><![CDATA[piece]]></category>
		<category><![CDATA[windsor street]]></category>
		<category><![CDATA[work]]></category>

		<guid isPermaLink="false">http://scripts.mit.edu/~zong/wpress/?p=239</guid>
		<description><![CDATA[Must have been a piece of work by MIT students&#8230; Windsor Street near Mass. Ave.]]></description>
			<content:encoded><![CDATA[<p>Must have been a piece of work by MIT students&#8230; Windsor Street near Mass. Ave.</p>
<p><img src="wp-content/uploads/images/fail.jpg" /></p>
]]></content:encoded>
			<wfw:commentRss>https://blog.yhuang.org/?feed=rss2&#038;p=239</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
