Pandora's Bots: Watson Kicks Around the Humans Again, Skips Victory Lap

The third and final night of the IBM Watson Jeopardy Challenge (http://www-943.ibm.com/innovation/us/watson/) was last night, and IBM's green-haired wonder (I started thinking of the five emo-lines on its avatar as hair) pulled it out again. The individual score for the second game was much closer by Final Jeopardy than the first game - Watson had $23,440, and Ken had managed to rack up $18,200 (Brad had $5,600 and seemed rather cavalier about the whole thing - after all, given the prizes and the charity split, he was guaranteed $100,000 just for showing up).

Some analysis of the second game is making much of Watson's "changed" wagering behavior (it bet $17,973, where it had low-balled Final Jeopardy in the first game), but it's not very surprising. Watson was always basing its wagers on the two-game total score victory conditions. By Final Jeopardy in the second game (in fact, rather earlier than that, as I'm sure Watson knew better than I did), Watson had already mathematically eliminated the other two contestants. Its large wager was the most it could risk while still guaranteeing overall victory. Ken low-balled for similar reasons. He knew there was zero chance of Watson overbidding, and thus zero chance of him taking first place overall. He might have tried for a symbolic victory in the second game, but I suspect he was being more practical and guaranteeing that he took second, for a much larger prize.
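For the curious, here is a minimal sketch of that "most it could risk" arithmetic. The second-game scores are the ones quoted above; the first-game totals ($35,734 for Watson, $4,800 for Ken, $10,400 for Brad) are not given in this post and are my recollection of the broadcast, so treat them as assumptions. The function name and argument layout are made up for illustration, and this is certainly not how Watson's own wagering model works internally.

```python
def max_safe_wager(my_prior, my_current, rivals):
    """Largest Final Jeopardy wager that still guarantees the best two-game total.

    my_prior   -- my total from the earlier game
    my_current -- my score in the current game, before Final Jeopardy
    rivals     -- list of (prior_game_total, current_game_score) pairs
    """
    my_total = my_prior + my_current
    # Worst case for me: a rival bets their whole current score and is right,
    # which doubles that score.
    best_rival = max(prior + 2 * current for prior, current in rivals)
    # I must finish at least $1 ahead even if I answer incorrectly,
    # and I cannot wager more than my current-game score.
    return max(0, min(my_current, my_total - best_rival - 1))


# First-game totals below are assumed (recalled, not taken from this post).
watson = max_safe_wager(35_734, 23_440, [(4_800, 18_200), (10_400, 5_600)])
print(watson)  # 17973

# The same reasoning for Ken protecting second place against Brad.
ken = max_safe_wager(4_800, 18_200, [(10_400, 5_600)])
print(ken)  # 1399
```

If those first-game totals are right, the result for Watson matches its actual $17,973 wager to the dollar, which fits the idea that the bet was computed against the two-game total rather than being a change in temperament.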

Performance-wise, Watson seemed to have a few more category misunderstandings than during the first game. It had a few flubs, and there were many clues where I suspect that, if Ken had not beaten it to the buzzer, Watson's answer would have been rejected for poor phrasing. Both humans seemed to be doing significantly better in the buzzer contest this game than in the first - I bet my speculation about them practicing in their hotel rooms before bed has some truth to it. Based on Watson's would-have-been-wrong answers, though, Ken might actually have gotten closer to Watson's total if he'd been slower on the buzzer, letting Watson blow it and lose money, then buzzing in for cleanup. But given Watson's high percentage of correct answers, it seems clear Ken had to do whatever he could to maximize his chances of being allowed to answer (and score) at all.

Overall, I don't have much more to add to my analysis from the first game. Watson's technology is impressive in practical terms (and, as Ray K. mentions, is only going to improve), but it's not impressive in conceptual terms - the intelligence displayed isn't general and isn't reliable (say, for unsupervised question-answering that might have consequences), and it's not clear whether it's adaptable or learning. Still, it's much better at answering Jeopardy clues than I am, and demonstrably better at playing Jeopardy than two noted champions of the game.

I was a bit worried that, once past my excitement, my dismissive analysis of the technology was just another example of AI's "moving target" problem, which seems to stem from an innate human desire for exceptionalism - intelligence tends to be re-defined to exclude whatever the most recent AI has managed to accomplish. I don't think that's the case here - the output of Watson is both impressive and laughable (because when Watson is wrong, it's really wrong), but the means used to produce it don't seem generally applicable. As Noam Chomsky put it when asked (by Gavin Schmitt), "I'm not impressed by a bigger steamroller."

The question is, how much is IBM's technology here going to advance the field? The hardware required to run Watson alone is apparently north of one million dollars. But I wonder how much of that is involved in the knowledge library and the relevant searches...could they pare down the natural language parts of the system into something more manageable and affordable? Either way, the costs are going to come down with time and volume, which IBM itself mentioned. Ideally, the language innovations they've come up with can be incorporated into more general, learning-based approaches which will produce more robust results.

The avatar is cute, though. I wonder if there will be Watson costumes this Halloween?

2011/02/17
