nchoosetwo and collaborative ranking

Walking around campus these days, there are cryptic-looking things like

\(\binom{n}{2}\mathrm{.com}\) and \(\binom{n}{2} \ni \binom{i}{u}\)

obviously referring to a dating site — currently it’s restricted to MIT and Harvard students. This one tries out an idea that I’ve heard discussed numerous times in different contexts, but that apparently nobody had gone and done in all these years. Instead of running a matching algorithm, it asks third parties (i.e., matchmakers) as well as the interested parties themselves to suggest matches. The thing that is supposed to keep this low-risk is anonymity: a match isn’t revealed until the two primary parties mutually accept or their lists intersect.
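The reveal rule described above amounts to a mutual-membership check. Here is a minimal sketch of that mechanic — all names and the storage layout are hypothetical, since the site’s internals are a black box:

```python
# Hypothetical sketch of the reveal rule: a pair surfaces only when each
# person appears on the other's suggestion list, whether self-entered or
# proposed by a matchmaker on their behalf.

def should_reveal(suggestions, a, b):
    """Reveal the pair (a, b) only if each is on the other's list."""
    return b in suggestions.get(a, set()) and a in suggestions.get(b, set())

suggestions = {
    "alice": {"bob", "carol"},  # names alice listed or had suggested for her
    "bob":   {"alice"},
    "carol": set(),
}

assert should_reveal(suggestions, "alice", "bob")        # mutual: reveal
assert not should_reveal(suggestions, "alice", "carol")  # one-sided: stays hidden
```

Until both sides hold the entry, neither party learns the other was suggested — that asymmetry of knowledge is what keeps participation low-risk.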

As with all things that involve anonymity, this invites trollish and antisocial behavior. I’d already registered three aliases on moira for exactly this purpose — ok, ok, so they’ve since suppressed that antic after people raised concerns, though these and other ramifications should perhaps have been worked through a bit more carefully pre-launch.

The spam potential remains. A matchmaker’s identity isn’t revealed unless both people accept her suggestion, so pranks and insults can be carried out with some impunity. One way around this may be to graft social-graph data onto the system for collaborative filtering (if they manage to obtain such data…). If they do, perhaps the suggestions of more closely connected people should weigh more, along with those of matchmakers with a track record of successful matches. Perhaps a suggestion should carry even more weight when multiple matchmakers concur. This is extremely intriguing, because eliminating spam is equivalent to predicting who is a likely match, and collaborative filtering for this problem is an unexplored direction.
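The weighting scheme speculated about above could be sketched very simply: score a suggested pair by summing, over all matchmakers who proposed it, each matchmaker’s graph closeness times her historical success rate. All of the numbers and names here are invented for illustration:

```python
# Hypothetical spam-resistance sketch: weigh each matchmaker's suggestion
# by (1) her social-graph closeness to the suggested pair and (2) her past
# success rate; concurring matchmakers add up, so spam from distant,
# unreliable accounts scores low.

def match_score(matchmakers, closeness, success_rate):
    """matchmakers: ids of everyone proposing the same pair."""
    return sum(closeness[m] * success_rate[m] for m in matchmakers)

closeness = {"m1": 0.9, "m2": 0.2, "m3": 0.5}     # assumed proximity in the graph
success_rate = {"m1": 0.6, "m2": 0.1, "m3": 0.5}  # fraction of accepted past matches

# Two well-connected, reliable matchmakers concurring beat one weak one.
strong = match_score(["m1", "m3"], closeness, success_rate)  # 0.54 + 0.25
weak = match_score(["m2"], closeness, success_rate)          # 0.02
assert strong > weak
```

A real system would learn these weights from accepted matches rather than hard-coding them, which is exactly where the collaborative-filtering angle comes in.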

The more fundamental question is why such a site is even necessary.
(Read the article)

zuckerberg of facebook, the machiavellian

According to this article, At Last — The Full Story Of How Facebook Was Founded, Mark Zuckerberg’s founding of Facebook was slightly devious, but no doubt that little bit of ruthlessness, along with his generally risk-taking behavior at Harvard, made his business succeed.

The story goes that on November 30, 2003, he was asked to write some code for an on-campus dating site called the Harvard Connection. Then on December 7, 2003, Zuckerberg wrote in private communication:

Check this site out: www.harvardconnection.com and then go to harvardconnection.com/datehome.php. Someone is already trying to make a dating site. But they made a mistake haha. They asked me to make it for them. So I’m like delaying it so it won’t be ready until after the facebook thing comes out.

Although it isn’t clear to me that Zuckerberg did not conceive of the “facebook thing” long before that meeting, for somebody who claimed he did not want to work under anybody else, it is at least curious that he would agree to spend time writing code for a similar site. On the other hand, he was skeptical of a straight dating site and designed Facebook to be not primarily about dating but something more stealthily innocent (the right call). So it is possible that he was mostly concerned with the timing of the two sites’ launches rather than their content.

In any case, the real shock is that the whole thing took only two months. The facebook domain was registered on January 11, 2004, and the site launched on February 4, 2004.

on analogic reality

I was on UK’s Daily Mail newspaper web site and suddenly this thing caught my eye. It was a flash ad promoting the newspaper, a veritable montage of “interesting things” that, through implication, the newspaper reported on. This is very common on local TV stations, which promote themselves this way. Innocuous enough, except I thought … wait a minute, isn’t this the infamous Kingdome implosion?

Sure enough, it was, and you can see it on YouTube.

A local Seattle event is a bit far afield for the UK, and sure enough, the Daily Mail never reported on it back in the day, so far as I can tell. A bit of artistic liberty, surely, but perhaps there is more to it.
(Read the article)

stuff on internet tv

NBC hosts flash versions of all of its TV series on its site. The interface is generally good, but there are some quirks about when ads must be viewed. The ads are forcibly inserted between chapters of an episode: whenever a chapter boundary is crossed in the forward direction, the overlay ad plays and the underlying video is paused. (As an aside, this kludgy architecture actually makes me believe it is possible to disable the ads…) So, for example, if you’ve already watched to a certain chapter, then backtrack, then return to it, the boundary is crossed again in the forward direction and the ad must play again. This shouldn’t happen. Also, suppose you want to start playing a late chapter. The ad immediately before that chapter plays, but if the video was freshly loaded, the ad at the beginning of the video is also forcibly played before a chapter can even be selected. This also seems like incorrect behavior.
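The behavior I’d expect instead is easy to state: remember which boundaries have already triggered an ad, and never replay them. A minimal sketch — the class and method names are mine, since NBC’s player is a black box:

```python
# Sketch of the expected behavior: an ad fires the first time a chapter
# boundary is crossed going forward, and never again, so backtracking or
# jumping straight to a late chapter stays ad-free on rewatch.

class AdScheduler:
    """Tracks which chapter boundaries have already run their ad."""

    def __init__(self):
        self.played = set()

    def crossing_forward(self, boundary):
        """Called when playback crosses a boundary in the forward
        direction. Returns True only on the first crossing."""
        if boundary in self.played:
            return False
        self.played.add(boundary)
        return True

sched = AdScheduler()
assert sched.crossing_forward(1)      # first crossing: ad plays
assert not sched.crossing_forward(1)  # backtrack and return: no repeat
assert sched.crossing_forward(3)      # jump to a late chapter: one ad only
```

One set membership test per boundary is all it would take to fix both quirks described above.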

But I understand… playing ads too many times in error is of no concern to NBC, of course.

So what’s on NBC’s site? Most shows have only the latest episodes, on a time delay (so as not to preempt live broadcast and DVD sales, I suppose), but a few have all past episodes available. One that I’ve been watching is Friday Night Lights. I never actually saw any ads for it, but all of Season 1 was surprisingly good. Unfortunately, Season 2 was crap and total BS. Sadly, there is only so much good material for writers to crank out.

human deCAPTCHA service

About 10 years ago, when .NET was put out as a strategy for providing software services over the internet, I quipped that across the API interface it’s just a black box: you’ll never know whether actual humans are answering your queries and passing the data back, as long as it’s in the right format! Imagine if “Jeeves” were an actual person answering what you “Ask”ed, or if some translation tool were actually human-powered. It’d be pretty cool in a horrible way, like a reverse Turing test. Students of the humanities might even call it “dehumanizing,” but we’re all evil engineers, so who cares… hohoho

But guess what, this is an actual industry. Here is an article that shows, to my great amazement, that people have not only taken this concept to heart to solve the real problem (for spammers and hackers) of automated CAPTCHA decoding by farming it out to low-wage humans, but they’ve even managed to load-balance the whole thing to reduce latency! What … the hell!
(Read the article)

Transcription: How Chinese Wikipedia fell into disarray

The evolution of the Chinese-language Wikipedia has followed a tortuous path. I suppose I’ve been around since the beginning, but really only to watch from the sidelines. In the beginning, mainland users dominated in numbers, but in the last year or two, with the on-again-off-again filtering of mainland Chinese users, the site has shifted towards more users from Taiwan, Hong Kong, and elsewhere.

In recent months, some changes were made to the site with interesting implications. These changes are peculiar to the Chinese-language site, but there is something to be learned from them.
(Read the article)