Localization

From KDE TechBase


Welcome, adventurer, to the universe of KDE localization! Brought along a generous reserve of breathing air?

Warning
This assembly of articles is intended to become the central source on localization of KDE, but is far from being there yet. In the meantime, the authoritative body of documentation is the KDE Translation Howto.


The Milky Way

This page is the hub of articles on all things concerning localization of KDE. The topics covered are intended to detail facts, issue advice and satisfy curiosity about KDE localization ways, and to do so for "all audiences": from those newly interested in free software localization, through the more experienced seeking to get involved in KDE localization, to veteran KDE translators curious about new developments.

To achieve this aim, the topics are by necessity not structured linearly. When you read a textbook, in the introduction you can frequently find the authors' instructions on how to "follow" the book: possible coherent chains of chapters, which chapter is a prerequisite for another, and which are optional. So it is with the topics presented here, taken to the extreme, with many entry and exit points depending on individual interests.

The hub currently connects three spiralling arms of topics:

Concepts
Expositions of widespread concepts in free software localization in general and KDE localization in particular. General mechanics and formats, advanced technical possibilities, organization and communication processes.
Tools
Descriptions of tools which may benefit the localization process, from command line scripts to full-featured GUI applications. Their possible roles in support of localization concepts.
Workflows
Instructions and procedures on how to contribute to KDE localization, on various levels of engagement. What are the need-to-know concepts in different scenarios, and which tools are appropriate to carry them through.
Tip
If you are fresh in the trade, do not feel intimidated by the expanse. Not even crack KDE translators are expected to be intimately acquainted with all of this, or to need it. Instead, if you are eager to start churning out results, head to the rookie workflow and follow the leads therein. If you just want to fix some translation issue, the best way currently is to create a bug report at Bugzilla for the "i18n" product, choosing your language as the "Component".


Note to Editors

It is important to correctly place certain bits of information in the localization universe. The reader should be made aware of what is a local KDE convention, what is a special feature of the tool they use, and what is a general concept and its embodiment in the specific KDE context.

While this is obviously a KDE-focused resource, it is nevertheless useful to provide examples of how some elements are handled outside of KDE. Through contrast and comparison, the reader may better understand the whys and hows of the presented material. Likewise, it may help those already familiar with other localization environments.

Concepts

For better or for worse, there is no lack of frameworks, formats, procedures and other vague notions that a KDE translator may stumble upon along the way. They may be a burden of sorts -- many things to keep in mind -- but also a source of fun, challenge, and deep satisfaction when creatively combined towards great efficiency of everyday work and mirror-perfect polish of the final output. Thus, expect the articles here not to withhold much.

Text Encoding
Text is the most basic object of localization. However, handling it at a low level -- encoding text -- such that the languages of the world are smoothly supported was historically not trivial. Read about the current standards, proper setups and errors due to text encoding which may pop up.
The PO Format
The PO format is the mainstay of free software translation. Regardless of the actual workflows and tools used, translators should maintain a good measure of familiarity with the underlying PO format. This article thoroughly describes the elements of the PO format and various uses of PO catalog files which embody it.
XML Markup
Parts of text are sometimes presented to the user in a special way: bold or italic, title sized, etc. XML-like text markup is a popular way of specifying such presentation, and translators will frequently find it embedded in the source texts. This article deals with XML markup from the translator's viewpoint.
Version Control
KDE evolves by integrating a lot of work contributed by a lot of people scattered around the planet, and that along parallel lines of development. To prevent information collapse into the gravity well of unhindered creativity, programmers employ version control systems -- and so do the translators.
Rings of Communication
Translators have many communication outlets at their disposal. Beyond the circle of one's own language team, translators from other teams and programmers are orbiting within close reach. Find out which issues are best dealt with at which level, so that the time spent in constructive discussion achieves the most positive impact.
Automatic Translation Assists
Automatic assists help translators to speed up the work, avoid common errors, and achieve consistency in style and terminology. They may be provided as standalone tools, or as features of specialized translation tools. These assists include spell checkers, glossaries, translation memories, etc.
Collecting and Using Statistics
Statistics provide overall indicators of the past localization progress, and means to extrapolate for the future. However, as always with statistics, for meaningful conclusions they should be used with care. Read about the sources of statistics, things to count, and effort estimates.
Translating Documentation
Compared to localizing application interfaces, translating their documentation both throws new elements into the fray and casts others in a different light. The need to keep the interfaces and documentation in sync demands another layer of attention. This article covers those peculiarities.
Writing Systems Distinctions
User interfaces have historically been centered and developed around an alphabetic writing system, using the Latin alphabet, and written from left to right. Drastic changes to any of these assumptions, such as right-to-left writing or using ideographs instead of an alphabet, require some consideration.
Language/Country Specific Formats
Applications frequently process and display pieces of information which, while universal in concept, are presented differently across cultures: times and dates, numbers, calendars, and so on. KDE apps gracefully handle such differences, relying on the language/country-based specifications in the KDE core.
Special Entries in Translation Catalogs
Most messages in translation catalogs are ordinary text intended for the user, but some are not. Programmers may use messages which let translators choose behavior for their language, add language-specific data, or state translation credits. Such special entries typically found in KDE catalogs are described here.
Translation Scripting with Transcript
Traditional UI translation frameworks are based on English as the pivotal language. This frequently leads to technical problems in target languages, where translators may be forced to choose between bad and worse. To alleviate such issues, KDE provides a way to act on translation at runtime -- the Transcript engine.
Localizing Non-Text Resources
While the first thought of localization is that of text translation, text is not the sole resource for localization. Any content presented to the user -- an icon, image, sound -- may require localization in certain cultural contexts. Learn how to localize non-text resources in KDE as the need arises.
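
As a concrete taste of the PO format and statistics topics listed above, here is a minimal sketch in Python. It is not how KDE or Gettext process catalogs: the catalog fragment, the file references in it, and the function name `classify_entries` are all hypothetical, and the parsing deliberately handles only single-line `msgid`/`msgstr` pairs separated by blank lines. Multi-line strings, plural forms, message contexts and escape sequences are ignored, so real catalogs should be handled with proper PO tooling such as Gettext's `msgfmt --statistics`.

```python
import re

# Hypothetical catalog fragment, for illustration only.
SAMPLE_PO = '''\
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

#: mainwindow.cpp:42
msgid "Open File"
msgstr "Apri file"

#: mainwindow.cpp:57
#, fuzzy
msgid "Save File"
msgstr "Salva"

#: mainwindow.cpp:63
msgid "Close"
msgstr ""
'''


def classify_entries(po_text):
    """Count translated, fuzzy and untranslated entries in a simplified PO text.

    Assumes entries are separated by blank lines and that each msgid and
    msgstr fits on a single line.
    """
    counts = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    for block in po_text.strip().split("\n\n"):
        msgid = re.search(r'^msgid "(.*)"$', block, re.MULTILINE)
        msgstr = re.search(r'^msgstr "(.*)"$', block, re.MULTILINE)
        if msgid is None or msgstr is None:
            continue  # not an entry this sketch understands
        if msgid.group(1) == "":
            continue  # the header entry has an empty msgid
        # A "#," comment line carries flags; "fuzzy" marks a translation
        # that needs review before it is used.
        is_fuzzy = re.search(r"^#,.*\bfuzzy\b", block, re.MULTILINE) is not None
        if msgstr.group(1) == "":
            counts["untranslated"] += 1
        elif is_fuzzy:
            counts["fuzzy"] += 1
        else:
            counts["translated"] += 1
    return counts


if __name__ == "__main__":
    print(classify_entries(SAMPLE_PO))
```

The three categories mirror the usual way translation progress is reported; the respective articles discuss what such counts can and cannot tell you.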

Tools

A great many tools exist to support the localization process. Some are quite general, while others are tightly coupled with the KDE localization process. Tools are thus presented in different ways. More general tools typically have reference documentation of their own, and here it is explained how they relate to concepts and workflows used in KDE. Custom, KDE-specific tools are explained in greater detail, sometimes to the point of these articles being their reference documentation.

Gettext Tools
GNU Gettext is the de facto reference implementation of the PO format. It is used to extract templates from the code and update PO catalogs, and has many tools for processing PO files. Its compiler, which turns the PO format into a binary format for application use, is the final arbiter of the validity of a PO file.
Lokalize
Lokalize is a computer-aided translation (CAT) tool, a full-featured GUI application for translators, written from scratch using the KDE4 framework. Aside from basic editing of PO files with nifty auxiliary details, it integrates support for glossary, translation memory, diff-modes for QA, project management, etc.
Emacs & Vim
Emacs and Vim are ubiquitous Unix text editors, in continuous use and development from times immemorial. Both are very different from today's typical editors, as well as from each other. Powerful and extensible, they have been pressed into many roles. One is power-assisted editing of PO files.
Translate Toolkit
Translate Toolkit is a host of command-line utilities, written mostly in Python, that expand on and extend Gettext's tools. They provide advanced search, selection and merging of PO files, and environment-specific validity checks. Also included are converters between various non-PO formats.
Subversion
Subversion, SVN for short, is the version control system currently used by the KDE project as a whole. Like the code, KDE localization data are stored in the central SVN repository. This article describes the use cases of Subversion for translators, as well as the repository organization of localized data.
Pology
Pology is a Python library and a set of command-line tools for processing PO files. The tools perform various specialized operations on PO files, some of which have explicit support for PO files within the KDE Translation Project. The library is geared towards quick and robust assembly of "field" scripts for processing PO files, in a version-controlled environment.
Lbundle Checker
For text resources translated through PO files there are well-established means of tracking changes as the underlying code evolves. This script provides a degree of such support for localized non-text resources, when organized as localization bundles.
Miscellaneous Scripts
The KDE repository contains many standalone scripts to check and process localized data, of various degrees of specificity -- many even tied to the exact repository organization. Collected here are the descriptions of such scripts which may be generally useful to translators.

Workflows

There is no single way to participate in KDE localization. Contributors will differ by the amount and direction of effort they put in, and the workflow articles are here to provide guidance for the frequently observed roles. Also presented are the technical and organizational details which have tidal influence on a translator's everyday workflow.

Rookie Translator
This is your first foray into localization and you're looking for the sign saying "Don't Panic", in large friendly letters? Find out about the essential prerequisites to start translating early, but productively.
Seasoned Translator
Translators who have surmounted the initial difficulties, and for whom the concepts and tools of the work have become daily routine, may start looking in new directions: how to better coordinate effort, cooperate with other language teams, test the localization quality in a live environment, etc.
Language Coordinator
Each language team in KDE needs one or a few people who take on tasks in addition to their basic work on localization. They take care that the team effort progresses smoothly, and speak for their teams in global matters. Language coordinators have write access to the KDE repository, and the responsibility to boot.
Global Coordinator
While cooperation can and does take place between translators from different language teams, more focused attention is needed to handle some global issues. To that end, some translators engage in maintenance tasks on a wider scale, and one of them is appointed the KDE Localization Coordinator.
Introducing a New Language
KDE is already being localized into a great many languages. Sometimes, however, a new language is to be introduced, which requires coordination between its translators and the core KDE team. Also, to ship a language as part of an official KDE release, some essentials must be satisfied.
Repository Automation
Daily in the KDE repository, the lumbering machinery scrutinizes code, updates translations to reflect changes, performs checks, and serves the results to translators. Learn to follow its hum, and know where to grease when the gears clog.
Translating in Summit
Handling two branches for translation, stable and trunk, can be tedious: porting fixes from one branch to the other, making sure team members work on the correct branch, etc. This is especially so when non-core modules, like extragear, are considered. Read about one possibility to curb this overhead.
Reviewing by Ascriptions
It is essential for quality that translations are reviewed: for context, terminology, grammar, style, etc. A given text is normally reviewed by persons other than its translator, which prompts the question of how to coordinate and track reviewing. This article presents the ascription system of reviews, building on the summit workflow.