Microsoft: from research to reality

feature
Jan 3, 2003 | 13 mins

Senior VP Rick Rashid talks about wireless, natural language, and search research projects

RICK RASHID, SENIOR vice president at Microsoft, heads up Microsoft’s research group, which is charged with researching many of the innovative technologies transferred to key Microsoft products. Rashid met with InfoWorld Test Center Director Steve Gillmor, Lead Analyst Jon Udell, and Test Center Technical Director Tom Yager to discuss his group’s projects, particularly in the areas of wireless, natural language, and search technologies.

InfoWorld: Would you agree that there is more of a connection now between Microsoft Research technologies and the architecture and products coming from Microsoft?

InfoWorld: It was interesting to hear Bill Gates’ comments on the Charlie Rose Show about the notion of interface uniformity and the way people are struggling with issues of hierarchy and hierarchy associations. What do you see as the alternative?

Rashid: One of the alternatives is obviously search or, more generally, query [and] the notion of having information where the index is fairly deep — meaning that it’s not just indexed on a small number of properties — [because] mentally your tendency is to associate things you want to find with other things or other circumstances. One of our research teams has been looking at different user interfaces built around these ideas. One of the ways people organize their thinking about the things that they’re doing in documents is through time. So you can say “I would like to be able to look for a particular document that I have seen in the last day or that I worked on yesterday or that I worked on [at] the same time I was working on this other document.” One of the things that [has been] pretty effective is that you can bring up a little calendar, [and] if you put pictures of what the weather looked like, people remember the days better, because they remember the weather on those particular days.

… [The difficulty with] hierarchies historically is that if you get a very elaborate hierarchy, it’s hard to know why something is where it’s supposed to be. The Dewey Decimal System is the ultimate in not-completely understandable hierarchies. If you look at the way people use the Internet [or] the way people use their corporate networks today, a lot of it is done through search and association. If I go to the Internet to find something, I might go to a Web site and try to navigate down from there. But more often than not, I’m probably going to a search engine first to find something.

InfoWorld: In other words, the phenomenon where Google becomes a spell checker, an appointment-management tool, and an ad hoc UI?

Rashid: Something like Google is a great way of finding information that isn’t particularly well structured, but the one thing that something like Google really lacks is indexing depth in terms of the information they’re working with. They don’t know a lot about the data. If you think about longer-term user interfaces for computers, there’s a huge amount of data that we actually know about documents, for example. If a computer created the document, it knows when it was done [and] who it was done by. You know where pieces of the document likely came from: Were they cut and pasted from other sources? What was on the screen at the same time? There’s lots of related information which you could use to deeply index and remember about things so that searches can be that much more effective. With a search on something like Google you don’t have any strong relationships, you can’t go find things in time, you don’t know how things were related. Unless relatively simple command and word associations work, Google really won’t help you. … We’ve done [some work] on people hierarchies, work structures, and this notion of what connects people in an organization. Who’s communicated with whom, what mailing list do they tend to be on? What is their org structure relationship to each other? These all create an interconnection fabric between people in a large organization, and you can represent that visually.
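The deep indexing Rashid describes — retrieving documents by creation time, authorship, and provenance rather than keywords alone — can be sketched in a few lines. This is an illustrative toy, not any Microsoft system; the class and field names are invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

@dataclass
class Document:
    title: str
    author: str
    created: datetime
    pasted_from: List[str] = field(default_factory=list)  # provenance: sources content was cut from

class DeepIndex:
    """Toy index over document metadata, not just document text."""

    def __init__(self):
        self.docs: List[Document] = []

    def add(self, doc: Document):
        self.docs.append(doc)

    def worked_on_near(self, when: datetime, window_days: int = 1):
        """Find documents touched within `window_days` of a remembered moment."""
        window = timedelta(days=window_days)
        return [d for d in self.docs if abs(d.created - when) <= window]

    def derived_from(self, source: str):
        """Find documents whose content was cut and pasted from `source`."""
        return [d for d in self.docs if source in d.pasted_from]

index = DeepIndex()
index.add(Document("budget.xls", "alice", datetime(2002, 12, 30), ["q3-report.doc"]))
index.add(Document("memo.doc", "bob", datetime(2002, 11, 1)))

# "The document I worked on around New Year's Eve" — a time-based query,
# which a keyword engine with no metadata cannot answer.
recent = index.worked_on_near(datetime(2002, 12, 31))
```

The same pattern extends to the people-hierarchy fabric he mentions: edges for "mailed", "shares a list with", and "reports to" become additional indexed relations to query over.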

InfoWorld: Doesn’t the language-understanding engine in Microsoft Word go to the question of extracting relationships from within what would otherwise be unstructured content?

Rashid: Right. [We] built a system that can effectively try to prioritize the messages that you received based on their content. What they do is use this natural-language engine to do a certain level of analysis to extract the sets of features. And then they can effectively try to match the features against what historically you have deemed important vs. things that you haven’t. It’s basically a learning engine that’s built using natural-language technology to do a certain level of analysis, and then on top of that uses statistical techniques to be able to extract just those portions of documents that relate to a particular question.
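The learning engine Rashid outlines — extract features from a message, then match them against what the user has historically deemed important — can be approximated with a naive-Bayes-style log-odds score over word features. This is a hedged sketch of the general technique, not Microsoft's actual natural-language engine; every name here is invented.

```python
from collections import Counter
import math

def features(text):
    """Crude feature extraction: lowercased words, punctuation stripped."""
    return [w.lower().strip(".,!?") for w in text.split()]

class PriorityModel:
    def __init__(self):
        self.important = Counter()    # word counts from mail the user marked important
        self.unimportant = Counter()  # word counts from mail the user ignored

    def train(self, text, is_important):
        target = self.important if is_important else self.unimportant
        target.update(features(text))

    def score(self, text):
        """Sum of smoothed log-odds per word; positive means 'looks important'."""
        imp_total = sum(self.important.values()) + 1
        unimp_total = sum(self.unimportant.values()) + 1
        s = 0.0
        for w in features(text):
            p_imp = (self.important[w] + 1) / imp_total
            p_unimp = (self.unimportant[w] + 1) / unimp_total
            s += math.log(p_imp / p_unimp)
        return s

model = PriorityModel()
model.train("server outage production down", True)
model.train("lunch menu cafeteria special", False)
```

A mail client would sort the inbox (or gate notifications) by this score; the statistical layer Rashid mentions sits on top of richer linguistic features than bare words.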

InfoWorld: How has that surfaced in Office 11? Is it different than what you have in Office .Net today?

Rashid: The first place that stuff surfaced was in the Outlook Mobile Manager, which we released about a year ago as an add-on for corporations and individuals who wanted to prioritize their mail and the notifications sent to them. It’s a little applet that basically tries to do a level of characterization, and you specify what’s important to you, what isn’t, and some information about it.

InfoWorld: I see many recurring things in the projects Microsoft Research is working on: natural language, speech recognition, text-to-speech translation, security. What are the areas of potential breakthroughs in the projects you’re working on?

Rashid: We’re doing research in something like 55 or 60 different research areas. It covers a pretty broad collection of things, many of which are basic areas of research in computer science that have been going on for a long time. Natural language is a good example: The field has been doing natural language research for 50 some odd years, and they’ll probably continue to do it for another 50 some odd years. … If you look at the quality of speech recognition today, [it] is dramatically better than it was 10 years ago. The engines are dramatically better, they work faster, they handle a broader collection of accents and noise conditions and circumstances. On the other hand, they’re still not at the point that you just simply talk at something and it’s going to understand what you said. And they’re not going to get there without solving problems beyond acoustic recognition. Computers are largely recognizing things that to them are gibberish. They don’t have a model of the semantics of what’s being said, they’re just listening to the sounds. So one of the things we’re going to have to do to make speech work well is to integrate speech better with language understanding and domain-specific knowledge.

InfoWorld: Is 802.11 technology a significant project for Microsoft Research?

Rashid: We’ve been doing a lot of work with 802.11 and other underlying wireless technologies. [While] 802.11 is incredibly cool and lets you do a tremendous number of things, [it] wasn’t actually designed for some of the things people would like it to do. For example, the notion of building a mesh network: 802.11b is not really designed for that purpose, [and] there are lots of issues in trying to build mesh networks using 802.11b simply because of the way the signaling protocol works between devices and between devices and the base station. There are lots of areas like that where we’ve been trying to see how things can be improved. We’ve been doing a lot of work in peer-to-peer, which is another area of technology that is particularly important in wireless, where you’d want to be able to create these ad hoc networks of wireless devices and interconnect them. But then you have to [figure out] how do they establish identities, how do they communicate with each other, how do they form groups, and how can the communication be robust? We’ve also been heavily involved in the IPv6 work at the IETF, and the IPv6 stack for Windows originally came from the research team.

InfoWorld: What is the value proposition of a mesh network?

Rashid: There are lots of different ways that you can imagine using mesh networks, from the very simple ad hoc [networks] to the more purposeful, [where] somebody actually wants to create a business out of using wireless 802.11 technologies to create a mesh infrastructure in some community. If a bunch of people walk into a room and there is no base station, they can hear each other potentially with their 802.11 devices. That’s one part of the value proposition: people come together, they have these devices, they’d like to be able to talk with each other, how do you make that work well? Strictly speaking, you can’t do it today; there isn’t a good software infrastructure to support it properly. In a number of places people have been trying to set up mesh networks between their homes. If you go to my neighborhood in Redmond, [Wash.], there are enough people with 802.11 networks that they actually do overlap with each other and naturally form a kind of mesh. One opportunity would be to [use] that mesh to allow people to share directly with each other rather than having to drop data back through a DSL line or through a cable modem, [and] instead achieve much higher bandwidth in person-to-person or house-to-house communication. Some people have looked at the notion of creating mesh networks specifically as a way of creating a networking infrastructure. Yet another set of issues comes up there, having to do with things like antenna design and channel management and a bunch of other things associated with that.

One of the problems with wireless is that you don’t know who you can talk to. It isn’t like a wired network; if I’m on a wired network and I send out a little note saying “OK, here I am,” I know that the people who are on the same physical wired network are going to hear me. [With a] wireless network, I don’t know who could hear me and who can’t. And those relationships change dynamically as devices move around. You can even get into asymmetric situations where I can hear you and you can’t hear me. You have to look at algorithms that allow you to create maximally connected sub-networks within a mesh and to be able to route dynamically between them, taking into account the fact that the communication channels are not fixed, meaning that they can come and go and the individual nodes may come and go. That’s where most of the complication comes in. There are algorithms that people have been devising to do this. We’ve got some great work going on in our Cambridge Research Lab. There’s been some really good work done at MIT. But it is kind of a hard problem.
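One step of the problem Rashid describes can be made concrete: because radio links may be asymmetric (A hears B but B cannot hear A), a mesh layer might first keep only mutual links, then group nodes into maximal connected sub-networks that can route among themselves. This is a simplified illustration of that idea under invented names, not any lab's actual algorithm, and it ignores the dynamics of nodes coming and going.

```python
def symmetric_links(hears):
    """hears[a] = set of nodes a can hear; keep only bidirectional links."""
    links = {a: set() for a in hears}
    for a, heard in hears.items():
        for b in heard:
            if a in hears.get(b, set()):  # b also hears a -> usable both ways
                links[a].add(b)
    return links

def connected_subnets(links):
    """Group nodes into maximal connected components over symmetric links."""
    seen, groups = set(), []
    for start in links:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first flood fill from `start`
            n = stack.pop()
            if n in group:
                continue
            group.add(n)
            stack.extend(links[n] - group)
        seen |= group
        groups.append(group)
    return groups

# A and B hear each other; C hears A, but A does not hear C (asymmetric),
# so C ends up isolated even though it receives A's transmissions.
hears = {"A": {"B"}, "B": {"A"}, "C": {"A"}}
subnets = connected_subnets(symmetric_links(hears))
```

In a real mesh, `hears` changes continuously as devices move, so the hard part is recomputing and re-routing incrementally — which is where the research complexity lies.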

InfoWorld: With authenticating ad hoc networks in general, isn’t the technology we have today too slow? I’m thinking particularly of some enhancements that Microsoft put in .Net Server 2002 to accelerate authentication on a wireless network.

Rashid: As you say, there are questions of how quickly you can establish reliable connections and advertise them to others and maintain them and so forth. But some of the problems have to do with the fact that you don’t have a lot of the fundamental things people have used in networking. [People] have assumed that connectivity was basically there, or that once established it was going to stay there for some period of time. [In] the wireless ad hoc environments that’s not necessarily the case. So you need to have a set of algorithms that are much more robust. People have been looking at new ways of thinking about naming, addressing, and routing using hyper-cube-like techniques. But it is complicated, and it is not quite the way it used to be done. That’s why I’m never going to be out of a job.

InfoWorld: Once we’ve all agreed that a loosely coupled event-driven system is the way to go, how are we going to debug them?

Rashid: The technology that we’re using for analyzing and testing systems is undergoing a pretty dramatic change. Increasingly we’re able to specify certain properties of large bodies of code that we can then prove to be true or not. We’re now beginning to use this [technology] in analyzing some of the device drivers and things for the next generation of Windows. … I’m talking about formal proofs of specific properties and being able to look at 100,000 lines of code in a device driver and rigorously prove that those specific properties are true or false. That’s something that’s a relatively new capability in the field.

We’re [also] doing things like automatic test case prioritization, where we analyze and maintain in a database the relationships between the tests that are run and the pieces of the code that those tests touch. So when a change is made to the software, we can now know what tests need to be run that are relevant for those particular areas of software, and we can prioritize the test process to maximize the probability that we’ll find errors in a given period of time. We’re also working on things like executable specifications, which are ways of specifying interfaces and protocols in particular. It’s a programming language, but it’s a very abstract programming language that you can write specifications in, and the spec is maybe 30, 40 times more condensed than a C program would be. This particular technology right now works best for things like API interactions and protocols, but we’re hoping eventually to be able to use it more broadly in our products. Another thing we’re doing is a tremendous amount of monitoring of code as it executes, so that when problems do arise, you can get an audit trail of what happened and try to debug the behaviors. I’m not saying there’s anything wrong with your point that it’s hard to debug loosely coupled event-driven systems. But the reality is we’re there, [and] we’re trying to build the technical infrastructure to be able to analyze and debug these large-scale distributed systems.
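The test-prioritization idea Rashid describes reduces to a coverage map: record which source files each test exercises, and when files change, run only the tests whose coverage intersects the change, largest overlap first. The sketch below is illustrative (the function and file names are invented), not the database-backed system he mentions.

```python
def select_tests(coverage, changed_files):
    """coverage: {test_name: set of source files it exercises}.
    Returns relevant tests, those touching the most changed files first."""
    changed = set(changed_files)
    # Keep only tests whose covered files intersect the change set.
    relevant = {t: files & changed for t, files in coverage.items() if files & changed}
    # Bigger overlap with the change = higher priority to run early.
    return sorted(relevant, key=lambda t: len(relevant[t]), reverse=True)

coverage = {
    "test_auth": {"auth.c", "session.c"},
    "test_net": {"socket.c"},
    "test_full": {"auth.c", "socket.c", "session.c"},
}
# A commit touched auth.c and session.c; test_net is safely skipped.
to_run = select_tests(coverage, ["auth.c", "session.c"])
```

A production system would maintain the coverage map automatically from instrumented test runs and weight tests by historical fault-detection rate, not just overlap size.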