New Directions for Microsoft

This article originally appeared in the Jan/Feb 1992 issue of Language Industry Monitor

Do all roads lead to Redmond?

Redmond, WA: “We don’t have much to show yet, but we are always happy to talk about natural language processing.” This was the way Karen Jensen fielded an inquiry into the possibility of visiting her and her colleagues in Microsoft’s new natural language group, one of the groups of researchers in Microsoft’s basic research program. So, one morning in mid-December, Jensen, George Heidorn, Stephen Richardson, and Mark Langley gathered around a table for an informal chat in their small conference room in building 20 of the sprawling Microsoft “campus,” where bulldozers and cement trucks seem almost as numerous as imported sports cars. Permanent construction crews labor to build new office space for this enormously successful and still rapidly growing company, while thousands of programmers work on the operating systems and applications of your future.

Microsoft’s new research program is under the command of computer scientist Rick Rashid, who came from Carnegie Mellon University, where he created the MACH system, an increasingly popular flavor of UNIX. The research program, which has received a fair amount of publicity in recent months, falls under Advanced Technology and Business Development (ATBD) at Microsoft. The director of ATBD is physicist Nathan Myhrvold. Basic research is a bold new direction for Microsoft, which has hitherto concentrated solely on product development. Myhrvold has said he hopes eventually to have a staff of a hundred people working here.

Three of the first came from IBM. Jensen, Heidorn, and Richardson were associated for many years with the renowned Thomas J. Watson Research Center in New York, where they developed the CRITIQUE mainframe text-proofing system. This past summer they packed their bags and headed for Twin Peaks country to join Microsoft.
Mark Langley and Diana Peterson round out the current group there.

Hard software problems

Mark Langley, a compiler specialist who will be providing some software engineering infrastructure for the group, explains the motive for establishing the new lab: “Microsoft has enjoyed great success as a development company. We’ve done very well with word processors and spreadsheets, but these products are now mature. There is less obvious room for growth; we also realize, however, that there are certain problems, like natural language processing, which can’t be solved simply by throwing more processor cycles at them. Through basic research, Microsoft would like to solve some hard software problems.”

The first order of business for the group will be tooling up. They come from IBM barehanded and suffering from an intriguing, contractually induced bout of collective amnesia concerning previous activities. Mark Langley will be assisting the other members of the group in developing the programming tools they will require to do natural language processing. To ensure that the results of their efforts get into circulation in a timely and efficient fashion, Research Program Manager Diana Peterson will function as the liaison between the group and the many application development groups within Microsoft.

After years of seeing their efforts mired in the bureaucratic inertia of a monolithic multinational, Jensen, Heidorn, and Richardson are clearly delighted with the entrepreneurial spirit that prevails at Microsoft and are enthusiastic about the prospects of developing software that may one day actually get marketed and used by people. “With Microsoft, we’ll have a very big audience,” says George Heidorn, rubbing his hands. “How many copies of MS-DOS and Windows are out there?”

For computational linguists, the three have agreeably pragmatic natures and seem determined to solve real problems for real people.
Karen Jensen has written in the past: “Existing linguistic theories are of limited usefulness to broad-coverage, real-world computational grammars, perhaps largely because existing theorists focus on limited notions of ‘grammaticality’ rather than the goal of dealing, in some fashion, with any piece of input text. We need to deal with huge amounts of data, with many (and messy) details.” Envisaging life beyond Chomsky, she went on to suggest that “the true goal of real-world grammatical analysis should be re-defined: a grammar should try to describe ‘all,’ but not ‘only,’ the grammatical strings of a language.”

Speculation

As the visit comes to a close, Jensen steers the discussion briefly into a speculative mode. What kind of things would the world like to see result from their research? In response to that query, some useful, incremental enhancements to our standard working environments spring to mind: for example, smarter text processing based on an improved awareness of linguistic phenomena, or a standardized method of storing linguistic information together with text to facilitate subsequent processing, such as content scanning or translation. How about automatic programming? (Or would that conflict with Bill Gates’ long-cherished desire to see BASIC established as the universal macro-language?) Or the groundwork for advanced support in Microsoft software for third-party NLP subsystems, such as speech recognition, speech synthesis, and automatic translation? Or…

COPYRIGHT © 1992 BY LANGUAGE INDUSTRY MONITOR