Computer-Assisted Translation Basics

About Computer-Assisted Translation (CAT)

Firms in the U.S. are increasingly doing business on a global scale and need to produce technical documentation in multiple languages. A rough census of the Internet shows that the demand for multilingual content increases annually. It is reasonable to assume that the demand for multilingual technical communication increases likewise.

Annual Demand for Multilingual Internet Content

Year	Number of Websites
2000	10,000,000
2004	57,000,000
2005	74,000,000
2006	101,000,000
2007	155,000,000
2008	186,000,000
2009	255,000,000

In the past, translation fell almost exclusively to humans, who processed individual documents and updated them one at a time. Not surprisingly, the finite supply of human translators and rapid advances in technology have made computers indispensable for mediating and managing translation. In practice, computer-assisted translation (CAT) can execute complex and tedious tasks quickly and accurately, leaving the human translator to concentrate on maximizing output and tuning the fluency and usability of translated documents. CAT also supports the whole translation process, from granular, word-to-word matches to managing libraries of translated content.

CAT and Terminology

CAT can compare source and translated documents for consistent use of terminology and then store these translations so that the same rules can be applied when aligning other source/translation pairs. The terminology management function gives the translator a means of automatically searching a database for terms appearing in a document to ensure that the correct source/target term combination has been used. The alignment function takes completed translations, divides the source and target texts into segments, and compares them to determine how closely they match. Positive matches are used to build up a database of correct translations that can be reused. While terminology management can’t take context or localization into account, it can free the translator to focus on improving the usability of the translated material (Lionel Lim, personal communication, April, 2011).
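The two functions described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular CAT product works: the termbase, the translation memory entries, the function names, and the 0.75 match threshold are all assumptions made for the example, and the character-level similarity from the standard library stands in for whatever matching a real tool uses.

```python
from difflib import SequenceMatcher

# Hypothetical termbase: approved source term -> approved target term.
TERMBASE = {"drill": "taladro", "bolt": "perno"}

# Hypothetical translation memory: aligned source segment -> stored translation.
TRANSLATION_MEMORY = {
    "Tighten the bolt.": "Apriete el perno.",
    "Switch off the drill before cleaning.": "Apague el taladro antes de limpiarlo.",
}


def check_terminology(source, target, termbase=TERMBASE):
    """Return source terms whose approved translation is missing from the target.

    Uses naive substring matching, so it ignores context and inflection.
    """
    source_lower, target_lower = source.lower(), target.lower()
    return [
        term
        for term, approved in termbase.items()
        if term in source_lower and approved.lower() not in target_lower
    ]


def best_match(segment, memory=TRANSLATION_MEMORY, threshold=0.75):
    """Return (stored_source, stored_target, score) for the closest stored
    segment, or None if nothing reaches the (assumed) threshold."""
    best = None
    for stored_source, stored_target in memory.items():
        score = SequenceMatcher(None, segment.lower(), stored_source.lower()).ratio()
        if best is None or score > best[2]:
            best = (stored_source, stored_target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


# A new sentence that is close to a stored one reuses the stored translation.
match = best_match("Switch off the drill before cleaning it.")
missing = check_terminology("Switch off the drill.", "Apague la perforadora.")
```

Here `match` carries the stored translation for the human to review, and `missing` lists the terms whose approved translation was not used. As the paragraph above notes, a check like this cannot judge context, so it only flags candidates.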

CAT and Natural Language

CAT parses text according to predefined linguistic rules that codify grammatical and stylistic requirements. Simplified Technical English and other versions of constrained English make rules-based translation easier because source texts written with a limited vocabulary leave fewer opportunities for mistranslation. One drawback is that rules-based methods generally sacrifice fluency in favor of predictable output. The statistical translation method, by contrast, relies on the global analysis of one language (e.g., English) and the same analysis of a second language (e.g., Spanish). The translation is based on the statistical likelihood that translated material will approximate the meaning of the source material. With the purely statistical method, there is no guarantee that identical source phrases will produce identical translated output.

Rule-Based Processing	Statistical-Based Processing
Consistent and predictable quality	Unpredictable quality
Knows grammatical rules	Does not know grammar
Lack of fluency	Good fluency
Hard-to-handle exceptions to rules	Good for catching exceptions to rules
High development and customization costs	Low-cost development (Babelfish, Google Translate)

Tatiana Batova, personal communication, September, 2011
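The contrast in the table can be illustrated with a toy example in Python. It is only a sketch under heavy assumptions: real rule-based and statistical systems work on whole sentences with far richer models, and the word pairs, names, and counts below are invented for illustration.

```python
from collections import Counter, defaultdict

# Rule-based toy: a fixed, hand-written mapping always yields the same output.
RULES = {"drill": "taladro"}


def rule_based(word):
    """Look the word up in the fixed rules; unknown words pass through unchanged."""
    return RULES.get(word, word)


class StatisticalToy:
    """Pick the target word most often seen aligned with the source word."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, source_word, target_word):
        """Record one aligned source/target pair from (made-up) training data."""
        self.counts[source_word][target_word] += 1

    def translate(self, word):
        """Return the most frequently observed target word, or the word itself."""
        if not self.counts[word]:
            return word
        return self.counts[word].most_common(1)[0][0]


model = StatisticalToy()
model.observe("drill", "taladro")
first_answer = model.translate("drill")

# More data shifts the statistics, so the same input now gets a different output.
model.observe("drill", "ejercicio")
model.observe("drill", "ejercicio")
second_answer = model.translate("drill")
```

The rule-based function returns the same word every time, while the statistical toy changes its answer once the observed counts change, which is the predictability trade-off the table describes.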

CAT and Content Management

CAT is not only a language-to-language translator; the term also covers content management systems that organize translated material and maintain a database of segmented source text and its corresponding translation. CAT allows the user to change source text in one document, even while the work is in progress, and then propagate the new material to other documents. With CAT software in hand, the translator can automate much of the tedious work associated with translating and reusing copy across families of documents.
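One way to picture the propagation described above is a store in which documents hold references to shared segments rather than copies of the text. The following Python sketch assumes that design; the class, method names, segment IDs, and sample sentences are hypothetical and do not describe a specific product.

```python
from collections import defaultdict


class SegmentStore:
    """Toy store: each segment is kept once, and documents refer to it by ID."""

    def __init__(self):
        self.translations = {}              # segment ID -> (source text, target text)
        self.documents = defaultdict(list)  # document name -> ordered segment IDs

    def add_segment(self, segment_id, source, target):
        self.translations[segment_id] = (source, target)

    def add_document(self, name, segment_ids):
        self.documents[name] = list(segment_ids)

    def update_segment(self, segment_id, source, target):
        """Change a segment once and return the documents that pick up the change."""
        self.translations[segment_id] = (source, target)
        return sorted(name for name, ids in self.documents.items() if segment_id in ids)

    def render(self, name, translated=True):
        """Assemble a document from its segments, in source or target language."""
        index = 1 if translated else 0
        return " ".join(self.translations[sid][index] for sid in self.documents[name])


store = SegmentStore()
store.add_segment("warn", "Unplug the device.", "Desenchufe el dispositivo.")
store.add_segment("clean", "Clean the filter.", "Limpie el filtro.")
store.add_document("user_guide", ["warn", "clean"])
store.add_document("quick_start", ["warn"])

# Editing the shared segment once updates every document that reuses it.
affected = store.update_segment(
    "warn", "Unplug the device first.", "Desenchufe primero el dispositivo."
)
```

Because both documents reference the same segment, a single edit reaches the whole family of documents, and `affected` tells the translator which ones to review.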

Establish Guidelines

If you decide CAT is appropriate for your organization, establish some guidelines for authoring and managing content. For any global audience, use care when referring to geographical locations, references, holidays, celebrities, seasons and humor. Use simple sentences or simple compound sentences in your text, and avoid words that can be both a noun and a verb. For example, instead of using “drill” for both a thing and an action, consider using “bore” as the verb.
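Guidelines like these can be partly checked automatically before text goes to translation. The Python sketch below is one possible approach, with assumptions stated plainly: the word list contains only the “drill”/“bore” example from the text, the 20-word sentence limit is an arbitrary value chosen for illustration, and the check cannot tell whether a flagged word is actually used as a noun or a verb.

```python
import re

# Hypothetical house list: ambiguous word -> preferred unambiguous verb.
AMBIGUOUS_VERBS = {"drill": "bore"}
MAX_WORDS = 20  # assumed sentence-length limit, not taken from the source


def review_sentence(sentence):
    """Return a list of warnings for one source sentence."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    warnings = []
    if len(words) > MAX_WORDS:
        warnings.append(f"sentence has {len(words)} words (limit {MAX_WORDS})")
    for word, preferred in AMBIGUOUS_VERBS.items():
        if word in words:
            warnings.append(
                f"'{word}' can be a noun or a verb; "
                f"if it is the action, consider '{preferred}'"
            )
    return warnings


flagged = review_sentence("Drill a hole in the panel.")
clean = review_sentence("Bore a hole in the panel.")
```

The warnings are prompts for the author, not corrections: a human still decides whether “drill” names the tool or the action.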

For help deciding if CAT is appropriate for your organization, consider “Are You Ready For Machine Translation?”

No doubt, companies selling technology-intensive products and services to professional users need to place a premium on accurate, consistent translation. Even so, many firms do not recognize translation as a complex activity requiring a high level of skill, and are therefore not prepared to pay what it is worth. The up-front cost of translation may be high, but consider the cost of squandering credibility in non-English-speaking markets.

Sources

Duchamps, Katherine, and Lim, Lionel (April 2011). Proceedings from: Writing for Translation. Milwaukee, WI.

Craciunescu, Olivia, Gerding-Salas, Constanza, and Stringer-O’Keeffe, Susan (July 2004). Machine Translation and Computer-Assisted Translation: a New Way of Translating? Translation Journal Vol. 8, No. 3. http://translationjournal.net/journal/29computers.htm

Wikipedia topic entry: Computer Assisted Translation. http://en.wikipedia.org/wiki/Computer-assisted_translation