By Paul Buitelaar

This article originally appeared in the Sep/Oct 1991 issue of Language Industry Monitor

Minting new coins may come first

Ljubljana: Computational linguist Jelena Meznaric returned home in late July from the Prague Summer School in Computational and Formal Semantics to find that a broadcasting tower on a government building behind her house had been bombed. Although a fragile ceasefire has since been declared among the warring states of Yugoslavia, the war has clearly left its mark on the physical infrastructure of Meznaric's native Slovenia. With this in mind, we asked Meznaric about the effects of the strife on Slovenian research in computational linguistics and, by extension, on the technologizing of the Slovene language.

As with events over the past three years in other Eastern European countries, rapid change and impending civil war in Slovenia have been a disaster for research programs. “Already our wages have been reduced to a subsistence-level two hundred dollars a month,” says Meznaric. “There's no money for new equipment. And there will be no resources for new projects for a long while.” Although funding for the Jozef Stefan Institute has always come from the Slovenian Ministry of Science, not from the previous Yugoslavian federal government in Belgrade, the shattered economy and the familiar tale of funds diverted to other, more urgent (read: military) pursuits have left Slovenian government resources severely depleted.

At the Institute in Ljubljana, a total of eight researchers, among them (computational) linguists, computer scientists, and logicians, form the Natural Language lab, of which Meznaric is deputy head. Next-door neighbor to the NLP group is famed Prolog guru Ivan Bratko, who heads the Institute's AI lab. The main project underway in the Natural Language lab is the compilation of a Computer Readable Lexicon (CRL) of standard Slovene.
Project leader is Tomaz Erjavec, who, together with Meznaric and colleagues, is currently implementing the CRL in a unification-based fashion, using Quintus Prolog on VAX machines. Meznaric says the group is compiling the lexicon from scratch, having been denied access to machine-readable wordlists by technophobic Slovenian publishers. In general, the software they use is of English-language origin, with scant support for the rich collection of diacritics found in the Slovene language. Crude workarounds come in the form of brackets to indicate accented characters.

According to Meznaric, an important application of the CRL project will be a spelling checker for Slovene. Some added intelligence will come in the form of morphosyntactic knowledge derived from the CRL entries, which should help flag some of the more treacherous spelling errors in Slovene. Although a small quantity of software has already been developed for the language, Meznaric finds current efforts amateurish and of limited value.

As government funding dries up, the lab may be able to attract sponsorship in the form of subsidies from Slovenian industry. PETROL (the Slovenian oil company) and Ljubljanska Banka are attracted by commercially interesting applications such as the spellchecker.

Natural Language Lab, Dept. E4, Institute Jozef Stefan, Jamova 39, 6111 Ljubljana, Slovenia, Yugoslavia

COPYRIGHT © 1991 BY LANGUAGE INDUSTRY MONITOR