Development/Tutorials/Localization/i18n Mistakes: Difference between revisions

From KDE TechBase

Revision as of 05:19, 10 January 2007

Avoiding Common Localization Pitfalls
Tutorial Series   Localization
Previous   Writing Applications With Localization in Mind
What's Next   Incorporating i18n Into the Build System
Further Reading   n/a

Abstract

A few common pitfalls prevent applications from being properly translated or otherwise localized. These include pixel-based layouts, "word puzzles", and code that does not handle Unicode characters properly. This tutorial covers each of these issues, explaining what to avoid and what to do instead.


Pitfall #1: Pixel Based Layouts

English text is often very compact compared to other languages, where the translated text can be substantially longer. The interface must therefore be able to adjust its size to accommodate the length of the translations loaded at runtime. If it cannot, messages end up misaligned or truncated.

The answer is to use layout managers. Qt provides a number of such layout managers pre-made for you. They include QHBoxLayout, QVBoxLayout, QGridLayout and QStackedLayout, all of which are subclasses of QLayout. You may also create your own QLayout based classes, but this is generally not needed.

These layout classes manage the pixel positioning of widgets for you at runtime, so your interface adjusts properly whatever the length of the translated strings. For more information, see the documentation for QLayout.
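As an illustration, a small window built with nested layouts might look like the sketch below. It assumes Qt 4 or later header names; the function name createConfirmWindow and the widget texts are made up for this example, and in a real KDE application the texts would be wrapped in i18n().

```cpp
#include <QApplication>
#include <QHBoxLayout>
#include <QLabel>
#include <QPushButton>
#include <QVBoxLayout>
#include <QWidget>

// Builds a small confirmation window whose geometry is computed by
// layout managers instead of hard-coded pixel coordinates.
// (Hypothetical example function; the texts are placeholders.)
QWidget *createConfirmWindow()
{
    QWidget *window = new QWidget;

    QLabel *question = new QLabel("Do you want to replace the existing file?");
    QPushButton *replaceButton = new QPushButton("Replace");
    QPushButton *cancelButton = new QPushButton("Cancel");

    // The buttons sit side by side; the stretch pushes them to the right.
    QHBoxLayout *buttonRow = new QHBoxLayout;
    buttonRow->addStretch();
    buttonRow->addWidget(replaceButton);
    buttonRow->addWidget(cancelButton);

    // The label goes above the button row. Installing the layout on the
    // window makes the window the parent of every widget added to it.
    QVBoxLayout *mainLayout = new QVBoxLayout(window);
    mainLayout->addWidget(question);
    mainLayout->addLayout(buttonRow);

    return window;
}

int main(int argc, char *argv[])
{
    QApplication app(argc, argv);
    QWidget *window = createConfirmWindow();
    window->show();
    const int result = app.exec();
    delete window;
    return result;
}
```

No position or size is specified anywhere, so a longer translated label or button text simply makes the window wider.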

Pitfall #2: Word Puzzles

Another thing to avoid is concatenating pieces of sentences together like this:

QString msg = i18n("Do you want to replace ") + oldFile + i18n(" with ") + newFile + "?";

Such "word puzzles" are very hard or even impossible to translate. The structure of the sentence will often be completely different in another language and must therefore be controlled by the translator. When the order of words and phrases is hard-coded as in the above example, the translator cannot create a proper translation.

Adding to this problem, a translator will only see parts of the sentence while translating and will have to guess at what belongs together.

The solution, thankfully, is quite simple: use QString::arg(). Translators then see the entire sentence while translating, so they can produce a good translation, and they can change the order of the arguments freely. Because of this latter advantage, QString::arg() is also recommended over sprintf and similar functions.

The above example written properly would then look like this:

QString msg = i18n("Do you want to replace %1 with %2?").arg(oldFile, newFile);

Note
Avoid inserting anything other than numbers or nouns with this method, since in some languages the translation depends on the inserted words. It is therefore best to make each translatable string as close to a complete sentence as possible.
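To see why numbered placeholders help, consider the following plain C++ sketch. The helper fillPlaceholders is a hypothetical, simplified stand-in for QString::arg() (it only understands %1 to %9), and the reordered pattern is an English rewording used in place of a real translation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for QString::arg(): replaces each "%N" marker
// (N = 1..9) in `pattern` with the N-th entry of `args`. Markers that
// have no matching argument are copied through unchanged.
std::string fillPlaceholders(const std::string &pattern,
                             const std::vector<std::string> &args)
{
    std::string result;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const bool isMarker = pattern[i] == '%' && i + 1 < pattern.size()
                              && pattern[i + 1] >= '1' && pattern[i + 1] <= '9';
        if (isMarker) {
            const std::size_t index =
                static_cast<std::size_t>(pattern[i + 1] - '1');
            if (index < args.size()) {
                result += args[index];
                ++i; // skip the digit that followed '%'
                continue;
            }
        }
        result += pattern[i];
    }
    return result;
}

// The source string: the old file is mentioned first.
std::string originalMessage(const std::string &oldFile,
                            const std::string &newFile)
{
    return fillPlaceholders("Do you want to replace %1 with %2?",
                            {oldFile, newFile});
}

// A pattern with the opposite word order, as a translator might need.
// The calling code and the argument order stay exactly the same.
std::string reorderedMessage(const std::string &oldFile,
                             const std::string &newFile)
{
    return fillPlaceholders("Should %2 take the place of %1?",
                            {oldFile, newFile});
}
```

Only the pattern differs between the two functions, which is the part a translator controls. With concatenation, the word order would be fixed in the source code instead.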


Similarly, messages that contain a version string or other frequently changing parts should use QString::arg() to insert them into the message. The translatable string then stays the same when the value changes, so translators do not have to update their translations.

Since KDE is translated into more than 65 languages, a single string change causes at least 65 people to open the file, find the changed message, check carefully whether this is the only thing that has changed, change the translation, save the file again and commit the changed file into the code repository. All in all, such a small change can create hours of work which could easily be avoided.

Pitfall #3: Lack of Unicode Support

Whenever source code handles strings using a datatype (such as char) or class (such as std::string) that cannot handle Unicode, translations will break.

To avoid this, never call QString::latin1() or QString::ascii() on translated strings. This also applies to information resulting from user input such as passwords, URLs and filenames. If you really need a plain char* representation of a string, it is better to use QString::utf8().
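The difference between the two kinds of conversion can be sketched in plain C++. The functions toLatin1Lossy and toUtf8 below are hypothetical stand-ins written for this example, not Qt's implementations, and they assume the input contains only valid Unicode code points.

```cpp
#include <string>

// Hypothetical stand-in for a Latin-1 conversion: a code point above
// U+00FF does not fit into one byte, so it is replaced by '?'.
std::string toLatin1Lossy(const std::u32string &text)
{
    std::string bytes;
    for (char32_t cp : text)
        bytes += cp <= 0xFF ? static_cast<char>(cp) : '?';
    return bytes;
}

// Hypothetical stand-in for a UTF-8 conversion: every code point is
// encoded as one to four bytes, so no information is lost.
std::string toUtf8(const std::u32string &text)
{
    std::string bytes;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            bytes += static_cast<char>(cp);
        } else if (cp < 0x800) {
            bytes += static_cast<char>(0xC0 | (cp >> 6));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            bytes += static_cast<char>(0xE0 | (cp >> 12));
            bytes += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            bytes += static_cast<char>(0xF0 | (cp >> 18));
            bytes += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            bytes += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return bytes;
}
```

A three-letter Cyrillic word passed through toLatin1Lossy comes out as three question marks, while toUtf8 yields six bytes from which the original text can be recovered. A translated string or a file name that goes through a Latin-1 conversion suffers the same kind of loss.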

Note
For more information on character sets and Unicode, see the Unicode tutorial.


KIO slaves may also provide paths and file names encoded using UTF-8. It is up to the programmer, however, to pass properly encoded file names to any KIO method in question. The correct way to do this is not to guess at the user's filesystem encoding but to use QFile::encodeName() and QFile::decodeName() instead.
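A minimal sketch of how these two functions are typically used at the boundary between Qt strings and byte-oriented C APIs follows. It assumes Qt 4 or later, where QFile::encodeName() returns a QByteArray; the helper names openForReading and displayName are invented for this example.

```cpp
#include <QByteArray>
#include <QFile>
#include <QString>
#include <cstdio>

// Opens a file through a C API that expects a byte string.
// QFile::encodeName() converts the Unicode name into the encoding Qt
// uses for file names on this system, so the code never guesses it.
// (Hypothetical helper; the caller must fclose() the result.)
std::FILE *openForReading(const QString &fileName)
{
    const QByteArray encoded = QFile::encodeName(fileName);
    return std::fopen(encoded.constData(), "rb");
}

// Converts a file name received as raw bytes (for example from a
// directory listing done with C functions) back into a QString that
// can be displayed or passed on to other Qt and KDE calls.
// (Hypothetical helper.)
QString displayName(const char *rawName)
{
    return QFile::decodeName(rawName);
}
```

Keeping the conversion in one place like this means the rest of the application works only with QString and never needs to know the filesystem encoding.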

Tip
You can turn KIO's UTF-8 file name support on for testing by exporting the KDE_UTF8_FILENAMES environment variable in your shell's startup file (e.g. ~/.bashrc).


Success!

If you avoid the three common categories of pitfalls detailed in this tutorial, your application should be fully localizable by the various KDE translation teams around the world, which opens it up to the majority of people on the planet.