SECC: A Simplified English Checker/Corrector

Commission of the European Communities (LRE)
G. Adriaens, L. Pauwels, L. Macken, I. Pype, E. Van Everbroeck, B. Tersago
Other participants: Siemens Nixdorf (Liège), Alcatel Bell (Antwerp), Cap Gemini (Paris)

The SECC project addresses the area of NLP applications, more precisely the development of ``electronic authoring and publishing systems providing style and terminological guidance, specifically designed for non-native authors, using controlled sublanguage''. It will develop a grammar and style checker for simplified/controlled English (SE), meant to serve two purposes. In the first place, it is a writing tool for technical writers who have to produce easily readable, unambiguous texts (manuals, for instance) that should be understandable by a wide audience of non-native speakers/writers of English. In the second place, the tool can serve as a front-end or pretranslator for machine translation products, helping to improve translation quality and reduce post-editing work by simplifying the input (short sentences, clear and simple syntactic structures, unambiguous word usage, etc.).

The SECC tool will be based on existing NLP technologies, and it will reuse NLP and linguistic resources. As to the technologies, SECC will be built within a machine translation framework (i.e., using the NLP environment of the METAL MT system): the task of the tool will be conceived as translation from English to a subset of it (SE). In this respect, SECC will not limit itself to outputting diagnoses about mistakes; it will also correct (translate) erroneous sentences as much as possible. A solid 140-rule grammar of SE developed in the context of the telecommunication subdomain of telephony (from Alcatel Bell), as well as a union of electronically available existing basic SE lexicons plus technical terminology, will together form the ``transfer'' modules of the checker/corrector. The tool will do syntactic and lexical (terminological) checking, on all levels of the text. Special attention will be paid to mistakes by non-native (viz. Dutch, French and German) writers of SE.
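
The idea of checking as ``translation'' from English into SE can be illustrated with a minimal Python sketch. It is not the METAL-based implementation: the word list `LEXICAL_TRANSFER`, the limit `MAX_WORDS`, and the names `Diagnosis` and `check_and_correct` are hypothetical and chosen only for illustration. The sketch replaces non-approved words with approved ones (a toy lexical ``transfer'' step) and reports a diagnosis for every change and for overlong sentences.

```python
import re
from dataclasses import dataclass

# Hypothetical lexical "transfer" table: non-approved word -> approved SE word.
# These entries are illustrative and are not taken from the SECC lexicons.
LEXICAL_TRANSFER = {
    "utilize": "use",
    "commence": "start",
    "terminate": "stop",
}

# Assumed sentence-length limit; the real SE grammar may use another value.
MAX_WORDS = 20


@dataclass
class Diagnosis:
    rule: str     # which kind of check fired ("lexical" or "length")
    message: str  # human-readable explanation


def check_and_correct(sentence):
    """Return (corrected_sentence, diagnoses) for a single sentence."""
    diagnoses = []

    def replace(match):
        word = match.group(0)
        approved = LEXICAL_TRANSFER.get(word.lower())
        if approved is None:
            return word  # word is not in the table: leave it unchanged
        diagnoses.append(
            Diagnosis("lexical", f"'{word}' is not approved; use '{approved}'")
        )
        # Preserve the capitalisation of the first letter.
        return approved.capitalize() if word[0].isupper() else approved

    corrected = re.sub(r"[A-Za-z]+", replace, sentence)

    if len(corrected.split()) > MAX_WORDS:
        diagnoses.append(
            Diagnosis("length", f"sentence has more than {MAX_WORDS} words")
        )
    return corrected, diagnoses
```

Calling `check_and_correct("Utilize the switch to commence the test.")` yields a corrected sentence together with two lexical diagnoses, which mirrors the combination of diagnosis and correction described above. Syntactic rules, which make up most of an SE grammar, would need a parser and are outside the scope of this sketch.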

The different interfaces for the user form an important AI/IT subpart of the project. The SECC tool will run both as a DTP-independent application in batch mode (checking of complete texts) and as an integrated application inside Interleaf6 in batch mode and interactive mode (checking subparts of a text while it is written) on Sun workstations. Developments here concern the complexities of the communication of the DTP package with the NLP application, user-friendly interfaces using the Motif standard, hypertext-like presentation of the checker's output, and internal representation of the output using the SGML standard.
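
As a rough illustration of an SGML-style internal representation of the checker's output, the following Python sketch serializes one checked sentence as tagged text. The element names (`sentence`, `original`, `corrected`, `diagnosis`) and the function `to_sgml` are assumptions made for this example; the project text does not specify the actual document type definition.

```python
from html import escape


def to_sgml(sentence_id, original, corrected, messages):
    """Serialize one checked sentence as SGML-like markup (hypothetical tags)."""
    lines = [f'<sentence id="{sentence_id}">']
    # escape() protects markup characters such as '<' and '&' in the text.
    lines.append(f"  <original>{escape(original)}</original>")
    lines.append(f"  <corrected>{escape(corrected)}</corrected>")
    for message in messages:
        lines.append(f"  <diagnosis>{escape(message)}</diagnosis>")
    lines.append("</sentence>")
    return "\n".join(lines)
```

A structured representation of this kind could be rendered by different front-ends, for instance a batch report or a hypertext-like view in which each diagnosis links back to its sentence.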