This sidebar originally appeared in the Nov-Dec 1993 issue of Language Industry Monitor.

With a budget of Ecu3,210,000 for 238.5 person-months, MULTEXT is by far the largest of the LRE shared-cost projects (the industrials pay half of their own way). In brief, a two-tier group of academic and industrial partners from six countries will be testing and extending the TEI specifications to encompass multilingual corpora. They will also design and develop modular, largely language-independent software tools for corpus creation and analysis. In addition, the consortium will assemble a multilingual corpus of English, French, Spanish, German, Italian and Dutch for test purposes. Like several other LRE II projects, MULTEXT boasts a built-in evaluation mechanism as well: at a later stage in the project, its industrial participants will demonstrate the usefulness of these corpus-based technologies in appropriate applications. While an increasing number of corpus tools and English-language corpora are becoming available, there is a dearth of such resources for other languages, and this is one of the problems MULTEXT will be addressing.

(See the article that this sidebar accompanied.)