Tim O’Reilly, ever a few steps ahead of the rest of us, has some thoughtful musings on what’s behind Google’s new 411 service. Tim doesn’t cast this as an open source move, but rather a Web 2.0 move designed to build up a treasure trove of data against which to build better speech recognition:

“But it also seems to me that there’s a hidden story here about the speech recognition itself. I was talking recently to Eckart Walther of Yahoo!, who used to be at Tellme, and he pointed out that speech recognition took a huge leap in capability when automated speech recognition started being used for directory assistance. All of a sudden, there were millions of voices, millions of accents to train speech recognition systems on, and much less need for the individual user to train the system. This is reminiscent of a comment that Peter Norvig, Director of Research at Google, made to me last year about automated translation, and why it’s getting better. ‘We don’t have better algorithms. We just have more data.’ In short, I’m speculating that the 1-800-GOOG-411 service is designed to harvest voice data to build Google’s own speech database….”

Fascinating insight, whether true or not, and it shows why Sebastopol is the mecca for forward thinking in technology. It also makes me wonder if MySQL could somehow be “learning” from all the transactions that take place in its databases. Not that it does, but that it should, if a way were devised to do so without violating customer privacy. Surely there must be value in the way its users are tweaking the system/tables/etc.?
In short, maybe there’s value for open source companies in learning how their software is modified…if for no other reason than to figure out how to create a more generally useful project in the first place.