BIM: testing the NLP waters


This article originally appeared in the May-June 1992 issue of Language Industry Monitor.

As databases get bigger and more complex, users need commensurately more sophisticated tools to retrieve information from them. For Belgian systems house BIM, that is where natural language processing comes in.

“Five years ago, I came to BIM with the belief that all the software applications which could benefit from natural language processing should make use of it,” says Jean-Louis Binot. “But only now do I think we finally have the technology to realize that.” As manager of the R&D group at BIM, Binot has overseen the development of Loqui, a natural language interface for databases. If it is up to Binot, Loqui will represent one more tool in an arsenal of many to help usher in the next generation of information technology systems.
    Located on the outskirts of Brussels, BIM is a fifteen-year-old, privately owned company of about 180 people. It has a strong R&D program, supported partly by its bread-and-butter systems integration activities and partly by government funding. Among other things, BIM is the developer of “a world-class Prolog” environment, ProLog by BIM, which has been designated the standard Prolog by IBM for its RS/6000 line of RISC-based Unix workstations. Until recently, BIM focussed primarily on the Belgian market; it is now starting to look further afield, with offices established in France, Holland and the United States. Among the happy mixture of Flemings, Walloons, and foreigners at BIM, English alternates with French as the medium of communication.

SQL forsaken
This past March, BIM announced the commercial release of Loqui, which runs on Unix workstations. Loqui differs in a number of significant ways from similar products, such as NLI’s Natural Language and AICorp’s Intellect. For one, it does not use the standard database query language SQL to search and retrieve information from databases. Rather, it interacts with BIM Prolog, the language in which the package has been developed. BIM Prolog, in turn, has access to standard RDBMSs via its proprietary interfaces.
    Second, BIM does not position Loqui as a shrink-wrapped solution for end-users, but rather as an environment for developers to build and maintain natural language front-ends for large systems. “It’s more a technology than a tool,” explains Lieve Debille of BIM R&D. Significantly, the team refer to developing a Loqui application as “porting” it to a given context. They regard this as a job for a specialist, not an end-user.
    Demonstrating Loqui on a Sun SPARCstation at BIM’s offices, Debille explains that their goal in developing Loqui was to allow users to access a database without knowing anything about such matters as field and table names. “We’re trying to free end-users from having to know about the structure of a database,” she explains. The physical location of information as well as a basic set of operators are all mapped to “plain English” words. This active vocabulary can be viewed on-screen to give users a quick indication of the kind of questions they can ask. “Essentially, we are putting a supplementary database between the main database and the user,” says Debille. The size of the vocabulary and the number of synonyms depend on the specific application.
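    The idea of a supplementary vocabulary layer can be illustrated with a small sketch. It is not Loqui’s implementation, which is written in BIM Prolog; the table name, column names and synonyms below are all hypothetical, and Python is used purely for illustration.

```python
# Hypothetical vocabulary layer: plain-English words mapped to database
# locations. None of these names come from Loqui itself.
VOCABULARY = {
    "salary": ("employee", "salary"),
    "pay": ("employee", "salary"),           # synonym of "salary"
    "department": ("employee", "dept_name"),
    "division": ("employee", "dept_name"),   # synonym of "department"
}


def locate(word):
    """Return the (table, column) a user-facing word refers to, or None."""
    return VOCABULARY.get(word.lower())


def active_vocabulary():
    """List the words a user may use, as an on-screen hint might show them."""
    return sorted(VOCABULARY)
```

    In such a scheme the user only ever sees the words returned by `active_vocabulary()`; the table and column names stay hidden behind `locate()`.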
    Loqui has two important features, explains Debille. The first is the response generator, designed to offer unambiguous replies on the basis of dialog rules. “Such a system should be cooperative and intelligent,” says Debille. “If you ask a question about an employee who is not in the employee database, the system should not simply return a negative answer. It should clearly indicate that the subject is not in the database.” This leads Debille to explain the group’s reason for shunning SQL in favor of their own Prolog interface: while SQL offers a certain degree of control over when and why database queries fail, Prolog makes a more cooperative dialog possible. “Global answers are just not enough,” she maintains.
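    The difference between a bare negative answer and a cooperative one can be sketched in a few lines. This is an illustration only, not Loqui’s dialog rules: the employee records, the `answer` function and the wording of the replies are all assumptions.

```python
# Hypothetical data; Loqui itself queries a real database through BIM Prolog.
EMPLOYEES = {"Dupont": {"manager": "Peeters"}, "Peeters": {}}


def answer(name, attribute):
    """Reply to "What is the <attribute> of <name>?" in a cooperative way."""
    record = EMPLOYEES.get(name)
    if record is None:
        # The question presupposes an employee who is not in the database.
        return f"There is no employee named {name} in the database."
    if attribute not in record:
        # The employee exists, but the requested fact is not recorded.
        return f"No {attribute} is recorded for {name}."
    return f"The {attribute} of {name} is {record[attribute]}."
```

    The point of the sketch is that the two failure cases produce different replies, so the user can tell a wrong presupposition from a genuinely empty result.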

Anaphora but not ellipses
The second feature is a contextual interpreter which enables Loqui to handle different types of anaphoric reference. “Experience shows, however, that many SQL users tend to back away from the idea of using anaphora,” says Debille. “Strangely enough, people seem to distrust a technique which represents an enormous timesaver in man-machine interaction. It’s a challenge to get users who are accustomed to SQL to break their request down into a number of simple questions employing anaphora instead of formulating one long SQL-type query. The advantage of this approach is that it is much easier to figure out where you went wrong if you don’t get the reply you want. You can much more easily backtrack.”
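    Breaking one long query into a chain of simple questions linked by anaphora can be sketched as follows. This is a toy illustration, not Loqui’s contextual interpreter: the records and the `Dialog` class are hypothetical, and the predicates stand in for questions that a real system would first have to parse.

```python
# Hypothetical records; field names are invented for this sketch.
EMPLOYEES = [
    {"name": "Dupont", "region": "East", "salary": 52000},
    {"name": "Peeters", "region": "East", "salary": 61000},
    {"name": "Janssens", "region": "West", "salary": 58000},
]


class Dialog:
    """Keeps the previous answer so a pronoun can refer back to it."""

    def __init__(self, records):
        self.records = records
        self.antecedent = records  # what "them" refers to right now

    def ask(self, predicate, anaphoric=False):
        # An anaphoric question filters the previous answer;
        # a fresh question starts again from the whole database.
        source = self.antecedent if anaphoric else self.records
        self.antecedent = [r for r in source if predicate(r)]
        return [r["name"] for r in self.antecedent]


dialog = Dialog(EMPLOYEES)
# "Who works in the East?"
first = dialog.ask(lambda r: r["region"] == "East")
# "Which of them earn more than 60000?"
second = dialog.ask(lambda r: r["salary"] > 60000, anaphoric=True)
```

    Because each step returns its own answer, a user who gets an unexpected result can see at which question the chain went wrong, which is the backtracking advantage Debille describes.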
    While Loqui can handle long chains of antecedents, it cannot handle ellipsis. You cannot, for example, ask “Who earns the most in the East?” followed by “And in the West?” Bart Vandecapelle says that they had incorporated this functionality into Loqui at one point, but removed it because it was not reliable enough. For a commercial application, an accuracy rate of fifty or even seventy percent is not enough, he says. The group is keenly aware of some of the misrepresentations and unfulfilled promises in the field of NL and has no interest in propagating further misunderstanding.
    While Debille and Vandecapelle, both Eurotra Belgium veterans, address technical development, Bruno Schröder is handling strategic business development within the NL group: marketing, in other words. It will be his task to interest potential customers in Loqui. Where will he be looking? “We’re seeing large corporations now beginning to use their computers for more than just accounting and transaction-processing purposes. There’s a lot of interest now in document image processing and full text retrieval as well as logistics and decision support. It’s becoming clear, however, that something like only ten percent of the employees in a company use the corporate information system. This information is inaccessible to both management and staff because of the difficulty in retrieving it; they are dependent on SQL programmers within the centralized MIS departments.”
    “Moreover, the form-based query facilities common to traditional database systems are simply too rigid for these new applications. The structure of the information is too complex for them to be effective. We think the key to making this information more easily accessible will be NL. Decision support, project management: these applications are well suited to NL.” Schröder cites as an example a huge public database being developed in Germany containing information about pollution; a system of this size and complexity would be a ripe candidate for an NL interface such as Loqui. He believes that an NL interface is the best way to allow people to easily obtain diverse information from large, complex databases.

Multimodal interfaces
Jean-Louis Binot carries the discussion a step further. “In reality, we see NL being just one aspect of many in a user interface. This is what we refer to as multimodality. Certain information can more easily be displayed graphically, so why not use that technology? Ultimately, with such a system, graphic objects should also be active vocabulary, so you can refer to them using words as well.” Asked whether the drive to establish standards and develop common resources within the NLP world is simply an academic exercise, Binot replies that it is indeed an important development, something that is sorely needed. The lack of tools is a burden; you should not have to redefine a noun phrase every time you write a grammar. Another ongoing project at BIM, a Eurotra spinoff called ET9, is in fact the development of a linguist workbench. While it is still about twenty months from completion, Binot says that the project’s benefactors at the EC would like to see it become a de facto standard within the NLP world.
    Binot feels there is much yet to be done in terms of awareness. “The French have a sense of the Language Industry,” he points out. “There, you have the government taking a leadership role with prestigious, high-profile projects like the French electronic Yellow Pages developed by GSI-Erli. But in Belgium and in other European countries, there isn’t that kind of high-level interest yet.”

BIM, Kwikstraat 4, B-3078 Everberg, Belgium. Tel +32 2 759 5925, Fax +32 2 759 4795

COPYRIGHT © 1992 BY LANGUAGE INDUSTRY MONITOR
