By Scott Bass, President, Advanced Language Translation

I am not sure what happened between 2001 and 2009. It could be that with a growing business and family, I just lost focus. But, I distinctly recall hearing about a new and exciting standard called XML Localization Interchange File Format (XLIFF) sometime around the end of 2001 or the start of 2002…

Well, nearly seven years later, the details of how I became familiar with XLIFF escape me. Working in the translation and localization industry, I find the standard of direct use to me and my colleagues. We have even managed to make use of XLIFF on a few occasions, since the tools we use for computer-aided translation natively support files in this format.

This article is not about XLIFF as a standard. As far as I am concerned, it is a fine standard and appears to be getting the requisite attention from the XLIFF Technical Committee within the Organization for the Advancement of Structured Information Standards (OASIS). My commentary is directed at everyone involved with content management system (CMS) tool development. (And I don’t mind using a shotgun where a scalpel may be more appropriate.)

In localization work, we interface with lots of different web content management systems, from those built by smaller web development companies to help their customers manage corporate sites to larger enterprise-wide deployments of third-party CMS software. In my experience, and from where I sit in the localization food…er, supply chain, most CMS developers recognize that they must offer hooks for localization, i.e., an easy way for localization vendors to access content within the authoring-to-publishing workflow. For this, we are quite grateful, because just a few short years ago, we didn’t even get a hook.

The real issue is the nature of the hook. Some CMS developers assume they should provide, or are directed by their customers to provide, a “translator interface” (scare quotes intended). Others are instructed (rightfully so) to simply make it easy to extract source content and readily reinsert translated content in the CMS database. The latter is the kind of hook we localizers like. We do not want to be given a translator interface because most developers are not translators and don’t have a clue as to how to design an efficient and translator-friendly UI. (If you are a CMS developer, don’t be offended; even the localization tool developers have yet to come up with a truly killer translation UI, and most translators still tremble at the thought of “Web-based translation.”)

The first thing CMS developers miss when building translation interfaces is support for translation memory. Non-translator developers typically assume that all that is required is a simple tabular interface that displays the original (source) text in the left column, with the translation entered in the right column. They will typically segment the text at the sentence level. This does allow someone to type a translation for insertion into the CMS pretty easily, and these forms are trivial for developers to construct. However, these simple forms offer no way to access a translation memory database that would enable translators to reuse common text or to leverage translated content from related projects, something translators have come to depend on after using desktop software that has been around since the early ’90s.

I have seen a few smaller Web development vendors attempt online translation interfaces that even supported variable matches, a rudimentary type of translation memory. The form that contained the translation UI would constantly poll the database while the translator moved from segment to segment, checking to see if the current source segment matched a previously translated one. This approach, while accomplishing the intended goal, does not perform as well as expected in real time and is too slow for most translators. In addition, online translation interfaces usually lack otherwise standard features such as target language spellchecking and access to integrated terminology management tools.
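To make the idea of a translation memory lookup concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular CMS or desktop tool works: the function name `best_match`, the in-memory dictionary standing in for the translation memory database, and the 0.75 similarity threshold are all assumptions of mine, and the similarity score comes from the standard library's `difflib` rather than from a real translation memory engine.

```python
from difflib import SequenceMatcher


def best_match(segment, memory, threshold=0.75):
    """Return (source, translation, score) for the closest stored segment.

    `memory` is a hypothetical stand-in for a translation memory:
    a dict mapping previously translated source segments to their translations.
    Returns None when nothing reaches the similarity threshold.
    """
    best = None
    for source, translation in memory.items():
        # ratio() is 1.0 for identical strings and lower for partial matches.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best


memory = {"Click the Save button.": "Klicken Sie auf Speichern."}
print(best_match("Click the Cancel button.", memory))
```

A desktop tool can run this kind of lookup against a local database as the translator moves between segments. A Web form that makes a round trip to the server for every segment pays a network cost each time, which is one plausible reason the online versions feel slow.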

At this point, I am bracing myself for the flood of emails from various Web and localization tool developers who, I am sure, will inform me of the respective capabilities of their online translation tools. Feel free to save yourself the carpal tunnel; I am aware of tools that do support these features. Unfortunately, your developers weren’t working with the respective CMS developers who attempted to implement their own poorly conceived solutions.

Here is the ultimate point:

To CMS developers: Don’t get into the translation business! Trust me; it’s crowded and full of people with liberal arts degrees. Don’t try taking the translator away from his or her desktop. You will only expose yourself to incessant whining. Focus instead on utilizing well-established standards such as XLIFF. Create robust and easy-to-use import and export functionality. Throw in a little administrative functionality, too, that will enable CMS admins to quickly batch, bundle, deliver, track and receive XLIFF files from localization service providers. Also be sure to include smart hooks that allow translators to see the content where it appears, either online or in documents; context means everything to translators.

To localization tool developers: Start thinking about working with the CMS developers. They are smart enough to build complex software just like you. It won’t take long for them to figure out translation memory engines. Once they have that, you’re sunk.