language context, ambiguity and inference

This article on today’s MIT front page sketches an argument that I’ve been thinking about for a while, ever since the IBM Watson Jeopardy contest — that natural language processing is hopeless to the extent that there is additional context (side information) not visible to the machine.

Many prominent linguists … have argued that language is … poorly designed for communication [and] … point to the existence of ambiguity: In a system optimized for conveying information…, they argue, each word would have just one meaning. … Now, a group of MIT cognitive scientists has turned this idea on its head … [T]hey claim that ambiguity actually makes language more efficient, by allowing for the reuse of short, efficient sounds that listeners can easily disambiguate with the help of context.

Although this is just talking about homophonic ambiguity at the word level, the same applies to all levels of language, including full messages whose decoding requires potentially deep context.


Should pandas be extinct? There is a lot to be said for the panda’s maladaptation to nature, but this comment is interesting:

Panda’s evolutionary path has lead them down a very narrow cul de sac from which there is no escape. By all Darwinian rights they should become extinct.

But I like pandas, as do many others. Pandas have the one survival trait that is very important in today’s environment – they have value to human beings.

And on that note,

How about DOGS people? I mean come on, those little dogs do not serve any purpose in this life!! Billions of dollars go to them for food and cleaning, drugs etc.

chivava or something, why waste resources on DOGS? esecially little one with nothing to do but look pretty!?

Indeed, looked at in this way, pandas and dogs are very well adapted — to humans, who determine their survival. It’s an increasingly singular niche that likely all animals will fall into. In a million years, if humans are still around, probably all animals will look like stuffed toys.

conformal cyclic cosmology

Something in the news here today, referring also to Penrose’s paper from several years ago.

In my limited understanding, Penrose suggests that the universe goes through these cycles of what can be interpreted as infinite expansions “followed by” big bangs, where the cycle renewal “happens” in a mathematical sense: in the way spacetime is metrized. He says that in the infinite future, when all massive particles will have evaporated, we will be returned to a situation without a notion of space or time (since all things are lightlike, I suppose). From this, the very large scale of the given final universe can be reinterpreted as the very small beginning of the next universe. It’s an interesting thought.


How do escalators work? I’ve wondered for years how escalators recycle their step blocks internally. At one point I thought they slid past each other on all four faces to save on turning radius (because I thought the blocks locked along grooves). Today I saw an escalator under repair, and now the answer is clear. It’s much simpler than that: the blocks just turn along a track in the most obvious way imaginable.


over-the-top legalese

I read this in a document today:

All masculine and singular pronouns shall include the feminine, neuter and plural thereof, and vice-versa, wherever the sense of the language so requires.

senate voting model graph

There was a talk today that referenced this paper by Banerjee, El Ghaoui, and d’Aspremont on obtaining sparse graphical models for parameterized distributions.

This undirected graphical model stating conditional independence relationships of senate voting behavior was shown.

If two nodes A and B are connected only through a set of nodes C, then A and B are independent, conditioned on C. Basically it says if you want to predict anything about B from A and C, then C is enough, because A won’t tell you anything more. As pretty as the graph looks, this is a rather odd visualization. Without seeing the (Ising) model parameters, especially which edge weights are positive and which are negative, this graph is hard to interpret, and the conclusions in the paper are especially questionable to me. In particular, being in the middle of this graph does not necessarily imply “moderation” or “independence” (unlike in, let’s say, this graph). We would expect moderates to exhibit weak dependence on either party’s large cliques. But if, for example, the edge weight between Allen and B. Nelson is a strongly negative one (which it very well may be, since the two parties are not otherwise connected via negatively weighted edges), then the graph seems to imply that how the two parties vote can largely be predicted from the votes of the likes of Allen or B. Nelson; in that sense, they are the indicators for their parties, disagreeing on exactly those party-disambiguating issues.
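To make the separation rule concrete, here is a minimal sketch on a toy three-node Ising chain A - C - B. The node names, the two edge weights, and all function names are my own illustrative assumptions; none of it comes from the paper or its senate data. It enumerates the joint distribution by brute force and checks two things: conditioned on C, A and B factorize, while marginally they remain dependent, with the sign of that dependence set by the signs of the edge weights.

```python
import itertools
import math

# Hypothetical three-node Ising chain A - C - B (weights are illustrative only).
# Spins take values in {-1, +1}; there is deliberately no direct A-B edge.
EDGES = {("A", "C"): 0.8, ("C", "B"): -1.2}
NODES = ["A", "C", "B"]


def unnormalized(state):
    """exp(sum of w_ij * x_i * x_j) over the edges of the graph."""
    return math.exp(sum(w * state[i] * state[j] for (i, j), w in EDGES.items()))


def joint():
    """Normalized joint distribution over all 2^3 spin assignments."""
    table = {}
    for values in itertools.product([-1, 1], repeat=len(NODES)):
        table[values] = unnormalized(dict(zip(NODES, values)))
    z = sum(table.values())
    return {k: v / z for k, v in table.items()}


def prob(p, **fixed):
    """Marginal probability that the named nodes take the given values."""
    return sum(
        v for k, v in p.items()
        if all(k[NODES.index(n)] == x for n, x in fixed.items())
    )


def max_conditional_gap(p):
    """Largest |P(A,B|C) - P(A|C) P(B|C)| over all spin assignments."""
    gap = 0.0
    for a, b, c in itertools.product([-1, 1], repeat=3):
        pc = prob(p, C=c)
        lhs = prob(p, A=a, B=b, C=c) / pc
        rhs = (prob(p, A=a, C=c) / pc) * (prob(p, B=b, C=c) / pc)
        gap = max(gap, abs(lhs - rhs))
    return gap


def marginal_gap(p):
    """|P(A=1,B=1) - P(A=1) P(B=1)|; nonzero means marginal dependence."""
    return abs(prob(p, A=1, B=1) - prob(p, A=1) * prob(p, B=1))


P = joint()
print("conditional gap:", max_conditional_gap(P))  # zero up to rounding
print("marginal gap:", marginal_gap(P))            # clearly nonzero
```

With one positive and one negative weight, A and B end up anti-correlated even though no edge joins them directly, which is why a drawing of the edges alone, without signs, says so little.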

There is some additional funny stuff going on. According to the paper, a missing vote counts as a “no” because they only solved the problem for binary and Gaussian distributions. I also count only about 80 nodes in there, while there are 100 senators. The graph structure also seems a bit too sparse, but this may be intentional, in order to drop weak dependencies from the graph. One does wonder, though, whether the results would have looked this good without manual fudging.

Unrelatedly, this reminds me of another famous academic paper graph, the high school dating graph:

If you look carefully, there is some oddball stuff going on here, too.

google wave lacks structure

Got an invitation to Google Wave today. The problem I find immediately is the lack of structure. Say what you will about the restrictions of email or IM, but those same restrictions, namely time-flow or thread-flow, are also well-enforced structures that keep things sane. Wave takes these away and substitutes “playback.” Unfortunately, playback is not natural. (The other way is to fall back on social convention to keep order, but that rarely works with more than 2 peers.)

I think there are two options here. Either structure needs to be explicitly enforced or presentation needs to be refined.

In the former, perhaps it is better to only allow replies in certain places. Perhaps it is better to only allow edits in certain places. Perhaps it is better to separate the two and keep the distinction between edit mode, thread mode, and conversation mode, and only allow mixing in very restricted settings (or require some extra steps to discourage its use). After all, in preparing a shared endeavor, the purpose should be defined and known ahead of time.

In the latter, perhaps a lot of hiding and collapsing should be used. Perhaps hyperlinks should be used for in-place edits that often hijack a topic. And now that subthreads can sprout like a tree, it makes little sense to retain the linear structure of conversations. Perhaps a topic-based graph or a conversation stack would be the more appropriate presentation metaphor.

Wave is a good idea, but not well thought out. In its attempt to differentiate, it has forsaken usability for chaotic flexibility, which would have had redeeming value, were it matched by equally ambitious presentation/visualization.

chrome os, wave, collaboration

Something in the news says Chrome OS got a demo today. I don’t even care, since I don’t think what’s being demonstrated — a glorified PDA with internet connection — is, by itself, very interesting. What’s important is what runs on it that can’t be run in another way or with as much ease. What might that be? It seems to me this “novel experience” (not necessarily novel technology) is in the roadmap of Google and other big companies — but only in pieces spread among them, with none of them seeming to have the entirety of it. And that is ridiculous…

So Google has the ideas. Microsoft has the delivery mechanism in the form of the installed base and the ready platform with the ability to propagate via a simple update. Apple has the hardware designs and marketing to get people to adopt. Yet each is missing the critical pieces held by the others. And so we stall in 2009 as each company tries to replicate some existing thing that another company is already good at.

(This very good article gives too much credit to Google, I believe. The situation is a lot more symmetrical and Google should not be elevated to a privileged position. The current Chrome OS for netbooks, I believe, is a clear misread of the market. People want a better phone, not a worse computer, and Google will likely fail with this if they make the latter without the former (Android?) catching on first. I think the “PC companies” are not that far behind either. It’s much harder for inexperienced Google to make a good cloud client than for, say, Microsoft to deliver good cloud integration. In some sense, Microsoft’s lack of execution on this front is due to politics, i.e. lack of willpower to lose a cash cow until it is inevitable, not due to technical barriers.)

product integrals

Nice. Used one today.

bank of america

Well, it gets resolved (in the stupidest way), if you’re patient.

Welcome to an online chat session at Bank of America. Please hold while we connect you to the next available Bank of America Online Banking Specialist. Your chat may be monitored and recorded for quality purposes. Your current wait time is approximately 2 minutes. Thank you for your patience.

We are currently experiencing a high volume of chat sessions. All Online Banking Specialists are currently assisting other customers. We apologize for any inconvenience. Thank you for your patience.

Thank you for choosing Bank of America. You are now being connected to a Bank of America Online Banking Specialist.
