Difference between revisions of "Development/Tutorials/Localization/i18n Semantics"

(Added @tooltip and @whatsthis contexts, /shell format modifier.)
Line 4: Line 4:
 
== Abstract ==
 
== Abstract ==
  
Typical way of formatting user visible strings in application interfaces, for a long time has been that of plain text or at most visual markup like HTML tags. In most textual content environments, shift to ''semantic'' markup has been recognized as superior to visual (for example, Docbook XML for documentation). Why not go down the same road for UI strings?
+
Typical way of formatting user visible strings in application interfaces, for a long time has been that of plain text or at most visual markup like HTML tags. In most textual content environments, shift to ''semantic'' markup has been recognized as superior to visual (for example, the Docbook XML for documentation). Why not go down the same road for UI strings?
  
 
== Semantic Markup by Examples ==
 
== Semantic Markup by Examples ==
  
In semantic model, text elements are marked for their ''meaning'', rather than for their visual appearance. Consider a few i18n examples of visual formatting:
+
In the semantic model, text elements are marked for their ''meaning'', rather than for their visual appearance. Consider a few i18n examples of visual formatting:
 
<code cpp>
 
<code cpp>
 
i18n("Move");
 
i18n("Move");
Line 21: Line 21:
 
 Using KDE UI text semantic markup (KUIT for short), these strings would be formatted like this:
 
 Using KDE UI text semantic markup (KUIT for short), these strings would be formatted like this:
 
<code cpp>
 
<code cpp>
i18nc("@action", "Move");
+
i18nc("@action:button", "Move");
  
i18nc("@item", "Descending");
+
i18nc("@item:inmenu", "Descending");
  
 
i18nc("@info", "<filename>%1</filename> does not exist", fname);
 
i18nc("@info", "<filename>%1</filename> does not exist", fname);
  
i18nc("@info",
+
i18nc("@info:whatsthis",
 
       "<title>History Sidebar</title>"
 
       "<title>History Sidebar</title>"
 
       "<para>You can configure the history sidebar here.</para>");
 
       "<para>You can configure the history sidebar here.</para>");
Line 34: Line 34:
 
Two distinct differences between visual and KUIT markup can be observed.
 
Two distinct differences between visual and KUIT markup can be observed.
  
The first is the use of context i18n calls, the <tt>i18nc()</tt>, to convey the broad semantic context of a string. "Move" has been assigned the <tt>@action</tt> context, which means that clicking on its widget in the interface will cause something to happen (e.g. a button text). String "Descending" is in <tt>@item</tt> context, which indicates that it is one of several possible choices (e.g. item in drop-down list).
+
The first is the use of context i18n calls, the <tt>i18nc()</tt>, to convey the broad usage context of a string by means of a ''context marker''. The first message above, "Move", has been assigned the <tt>@action:button</tt> marker, where <tt>@action</tt> is the ''main context'' that describes the text as an action to be taken (e.g. an operation on data or the opening of a new dialog), and <tt>:button</tt> is the ''interface subcue'' saying that this text is displayed on a pushbutton widget. The second message, "Descending", has been marked as semantically a list entry (<tt>@item</tt>), displayed in a menu (<tt>:inmenu</tt>). The interface subcue can be left out if none is appropriate.
  
The other difference is the use of semantic tags, which convey the meaning of the element. For example, the <tt><filename>%1</filename></tt> bit tells that the substituted text is the name of a file, and the <tt><title></tt> and <tt><para></tt> tags in the other message lay out a clear structure of longer informational texts.
+
The other difference is the use of semantic tags, which convey the meaning of the element. For example, the <tt><filename>%1</filename></tt> bit in the third message tells that the substituted text is the name of a file, and the <tt><title></tt> and <tt><para></tt> tags in the last message lay out a clear structure of longer informational texts.
  
{{note|The semantic context <tt>@</tt>-markers can be added when working with Qt Designer too. Each text label of a widget has a comment attribute, which can be used in same manner as context argument of <tt>i18nc()</tt> call.}}
+
{{note|The context marker can be added when working with Qt Designer too. Each text label of a widget has a comment attribute, which can be used in same manner as context argument of <tt>i18nc()</tt> call.}}
  
The context markers do not prevent programmers from providing the free-form context to translators. It is just separated by a whitespace from context marker, like this:
+
Even when a context marker is present, the programmer may sometimes want to provide an additional free-form clarification to translators, in order to shed more light on particularly ambiguous strings. It is simply separated by whitespace from the context marker proper, like this:
 
<code cpp>
 
<code cpp>
i18nc("@item Sorting order", "Descending");
+
i18nc("@item:inmenu Sorting order", "Descending");
 
</code>
 
</code>
  
 
== Advantages of Semantic Markup ==
 
== Advantages of Semantic Markup ==
  
In general, KUIT markup has advantages both to users and to translators of the application.
+
In general, KUIT markup has advantages both to users and to translators of the application.
  
 
 For the users, the use of semantic tags means consistent formatting of the same kinds of text. The notorious example of inconsistent visual formatting would be filenames and paths, which are sometimes put in as is, sometimes in quotes (and ordinary quotes at that, rather than proper English fancy quotes), and sometimes in bold tags. Furthermore, the text within the tag may be modified by the semantic markup; for example, the <tt><filename></tt> text is transformed from "/" path delimiters to platform-specific ones.
 
 For the users, the use of semantic tags means consistent formatting of the same kinds of text. The notorious example of inconsistent visual formatting would be filenames and paths, which are sometimes put in as is, sometimes in quotes (and ordinary quotes at that, rather than proper English fancy quotes), and sometimes in bold tags. Furthermore, the text within the tag may be modified by the semantic markup; for example, the <tt><filename></tt> text is transformed from "/" path delimiters to platform-specific ones.
  
Translators will benefit from both semantic context markers and tags. The "Move" string in the example above had <tt>@action</tt> context, for which the translator may use command form of the verb, while gerund form (like "Moving") may be more appropriate in the <tt>@title</tt> context, which would be used if the string was title of the menu, option group, etc. Tags will benefit translators in the similar way, as they may clear up the structure of the sentence, especially in presence of placeholder substitutions.
+
Translators will benefit from both context markers and tags. The "Move" string in the example above had <tt>@action</tt> context, for which the translator may use command form of the verb, while gerund form (like "Moving") may be more appropriate in the <tt>@title</tt> context, which would be used if the string was title of the menu, option group, etc. The interface subcue, like <tt>:button</tt> above, if present additionally enables translator to mentally picture the actual runtime appearance on the spot. Tags within the text will also benefit translators, as they may clear up the structure of the sentence, especially in presence of placeholder substitutions.
  
The context markers also serve a technical purpose. They decide whether what form of visual formatting is used. E.g. the <tt>@title</tt> context will use plain text, whereas <tt>@info</tt> will be formatted with HTML tags.
+
The context markers also serve a technical purpose: they decide what form of visual formatting is used. E.g. any <tt>@title</tt> context will use plain text, whereas <tt>@info</tt> contexts will mostly be formatted with HTML tags (this may depend on the interface subcue too).
  
None the least, while contexts and tags may be something more to learn for programmers, it removes the burden of thinking about the visual formatting to apply. "''Should I put the path in quotes or &lt;b>?''", "''Should the title be &lt;h2> or &lt;h3>?''", and so on.
+
Not least, semantic markup removes the burden of thinking about which visual formatting to apply while programming, like "''Should I put the path in quotes or &lt;b>?''", or "''Should the title be &lt;h2> or &lt;h3>?''", and so on.
  
 
== Context Markers ==
 
== Context Markers ==
  
KUIT presently defines the following semantic contexts:
+
A context marker consists of the main context and the interface subcue. The subcue need not be present, and should be provided only when the text clearly maps to what the subcue describes. KUIT presently defines the following main contexts and their possible interface subcues:
  
 
; <tt>@action</tt>
 
; <tt>@action</tt>
: Text to all clickable widgets that cause some action to be performed, like an operation on the data, view restructuring, or opening a dialog. The button texts and menu items (except submenus) all fall into this category.
+
: Text to all clickable widgets that cause some action to be performed, like an operation on the data, view restructuring, or opening a dialog. The button texts and menu entries (except submenu titles) all fall into this category.
 +
:
 +
:: <tt>:button</tt> - pushbuttons in windows and dialogs
 +
:: <tt>:inmenu</tt> - menu entries that perform an action
 +
:: <tt>:intoolbar</tt> - toolbar buttons
  
 
; <tt>@title</tt>
 
; <tt>@title</tt>
: Text that is semantically a title in the interface. These would include window titles, menu titles, option group names and tab names.
+
: Text that is semantically a title in the interface. These would include window titles, menu titles, tab names, option group names in configuration dialogs, and column names in list views.
 +
:
 +
:: <tt>:window</tt> - window title
 +
:: <tt>:menu</tt> - menu name
 +
:: <tt>:tab</tt> - tab name
 +
:: <tt>:group</tt> - option group
  
 
; <tt>@option</tt>
 
; <tt>@option</tt>
: Text to yes/no or on/off logical choices. These are the labels to GUI checkboxes.
+
: Text to options which user can turn on and off, or choose between. These are the labels to checkboxes (either in dialogs or in menus) and radio buttons.
 
+
:
; <tt>@item</tt>
+
:: <tt>:check</tt> - checkbox label
: Strings that can be considered one from a list of possible choices. Items in dropdown and combo boxes are obvious, but also some menu items (like encoding selection, or sort orderings), and especially ''radio-buttons'' are included here. (Radio buttons are just another way to convey a list of possibilities.)
+
:: <tt>:radio</tt> - radio-button label
  
 
; <tt>@label</tt>
 
; <tt>@label</tt>
: Text labels to other widgets in the interface, which are none of <tt>@action</tt>, <tt>@title</tt>, <tt>@option</tt> or <tt>@item</tt>. These include labels to sliders, counters, font and color choosers, combo, edit and text boxes.
+
: Text labels to various widgets in the interface, which are none of <tt>@action</tt>, <tt>@title</tt>, <tt>@option</tt>. These include labels to sliders, spinboxes, combo, list and text boxes, font and color choosers.
 +
:
 +
:: <tt>:slider</tt> - slider labels (but end-ranges are @item!)
 +
:: <tt>:spinbox</tt> - spinbox labels
 +
:: <tt>:listbox</tt> - list and combo boxes
 +
:: <tt>:textbox</tt> - text and edit boxes
 +
:: <tt>:chooser</tt> - chooser widgets (fonts, colors, etc.)
  
; <tt>@process</tt>
+
; <tt>@item</tt>
: Text which shows phases of a process in progress. For example, "Copying files..." in file-copy progress dialog, or "Computing checksum..." in CD burning application.
+
: Strings that can be considered one of a range of possible choices or properties. Entries in listings, dropdown and combo boxes are obvious, but also some menu items (e.g. encoding selection, sort orderings), end-labels to ranges (e.g. high/low, more/less), and object properties (e.g. file types, permissions).
 
+
:
; <tt>@tooltip</tt>
+
:: <tt>:inmenu</tt> - items presented as menu entries
: Text to hovering tooltips.
+
:: <tt>:inlistbox</tt> - items in list and combo boxes
 
+
:: <tt>:range</tt> - range labels to sliders
; <tt>@whatsthis</tt>
+
:: <tt>:property</tt> - object properties
: Text to "What's This?" explanations.
+
  
 
; <tt>@info</tt>
 
; <tt>@info</tt>
: Any general body of text for user's information, that does not fall under previous contexts. These are typically texts to message boxes.
+
: Any general text for user's information, that does not fall under previous contexts. These are for example tooltip and "What's This?" texts, text in message boxes, fields in status bar, and strings in progress dialogs.
 +
:
 +
:: <tt>:tooltip</tt> - hovering tooltips
 +
:: <tt>:whatsthis</tt> - "What's This?" explanations of widgets
 +
:: <tt>:status</tt> - texts in status displays (e.g. in status bar)
 +
:: <tt>:progress</tt> - the current state of an ongoing process
 +
:: <tt>:tipoftheday</tt> - introductory tips on application startup
  
 
== Semantic Tags ==
 
== Semantic Tags ==
  
=== Terminal tags ===
+
KUIT semantic tags come in several logical groups:
 +
* ''phrase tags'' - those that ascribe meaning to certain phrases and inserts
 +
* ''sentence tags'' - which describe the purpose of a complete sentence in text
 +
* ''structure tags'' - used to order longer text into paragraphs, titles, etc.
 +
 
 +
=== Phrase tags ===
  
The following KUIT tags are mostly-terminal, meaning that they will not admit any subtags (or just a few selected, where indicated):
+
Phrase tags are mostly terminal, meaning that by default they will not admit any subtags; where some subtags can be used, it is so indicated. KUIT defines the following phrase tags:
  
 
; <tt>&lt;application&gt;</tt>
 
; <tt>&lt;application&gt;</tt>
 
: Name of an application.
 
: Name of an application.
 
<code cpp>
 
<code cpp>
i18nc("@action", "Open with <application>%1</application>", appName);
+
i18nc("@action:inmenu",
 +
      "Open with <application>%1</application>", appName);
 
</code>
 
</code>
  
Line 103: Line 129:
 
 : Line-breaking body of code, for short listings.
 
 : Line-breaking body of code, for short listings.
 
<code cpp>
 
<code cpp>
i18nc("@info",
+
i18nc("@info:whatsthis",
 
       "You can try the following snippet:<bcode>"
 
       "You can try the following snippet:<bcode>"
 
       "\begin{equation}"
 
       "\begin{equation}"
Line 114: Line 140:
 
: Name of shell command or system call. Man section can be provided via <tt>section</tt> attribute.
 
: Name of shell command or system call. Man section can be provided via <tt>section</tt> attribute.
 
<code cpp>
 
<code cpp>
i18nc("@info", "This will call <command>%1</command> internally.", cmdName);
+
i18nc("@info",
 +
      "This will call <command>%1</command> internally.", cmdName);
  
 
i18nc("@info",
 
i18nc("@info",
Line 123: Line 150:
 
 : Email address. Without attributes, the tag text is the address. The address can also be given with the <tt>address</tt> attribute, in which case the tag text is the name or description attached to the address.
 
 : Email address. Without attributes, the tag text is the address. The address can also be given with the <tt>address</tt> attribute, in which case the tag text is the name or description attached to the address.
 
<code cpp>
 
<code cpp>
i18nc("@info", "Send bug reports to <email>%1</email>.", emailNull);
+
i18nc("@info",
 +
      "Send bug reports to <email>%1</email>.", emailNull);
  
 
i18nc("@info",
 
i18nc("@info",
Line 133: Line 161:
 
: Emphasize a word or phrase in the text.
 
: Emphasize a word or phrase in the text.
 
<code cpp>
 
<code cpp>
i18nc("@process", "Checking <emphasis>feedback</emphasis> circuits...");
+
i18nc("@info:progress",
 +
      "Checking <emphasis>feedback</emphasis> circuits...");
 
</code>
 
</code>
  
Line 139: Line 168:
 
: Environment variable. The $ sign will be prepended automatically in formatted text.
 
: Environment variable. The $ sign will be prepended automatically in formatted text.
 
<code cpp>
 
<code cpp>
i18nc("@info", "Assure that your <envar>PATH</envar> is properly set.");
+
i18nc("@info",
 +
      "Assure that your <envar>PATH</envar> is properly set.");
 
</code>
 
</code>
  
Line 155: Line 185:
 
: Inline code, like shell command lines.
 
: Inline code, like shell command lines.
 
<code cpp>
 
<code cpp>
i18nc("@info", "Executes <icode>svn merge</icode> with given revisions.");
+
i18nc("@info:tooltip",
 +
      "Execute <icode>svn merge</icode> on selected revisions.");
 
</code>
 
</code>
 
: The <tt><placeholder></tt> can be used as subtag.
 
: The <tt><placeholder></tt> can be used as subtag.
Line 162: Line 193:
 
: Path to GUI interface element. Use "/", "|" or "->" to delimit elements, which will be converted into canonical form.
 
: Path to GUI interface element. Use "/", "|" or "->" to delimit elements, which will be converted into canonical form.
 
<code cpp>
 
<code cpp>
i18nc("@info",
+
i18nc("@info:whatsthis",
 
       "The line colors can be changed under "
 
       "The line colors can be changed under "
 
       "<interface>Settings->Visuals</interface>.");
 
       "<interface>Settings->Visuals</interface>.");
Line 170: Line 201:
 
: Link to a URL-addressable resource. Without attributes, the tag text is the URL; alternatively, URL can be given by <tt>url</tt> attribute, and then the tag text serves as description.
 
: Link to a URL-addressable resource. Without attributes, the tag text is the URL; alternatively, URL can be given by <tt>url</tt> attribute, and then the tag text serves as description.
 
<code cpp>
 
<code cpp>
i18nc("@info", "Check the <link>%1</link> website.", urlKDE);
+
i18nc("@info:tooltip",
 +
      "Go to <link>%1</link> website.", urlKDE);
  
i18nc("@info", "Check <link url='%1'>the KDE website</link>.", urlKDE);
+
i18nc("@info:tooltip",
 +
      "Go to <link url='%1'>the KDE website</link>.", urlKDE);
 
</code>
 
</code>
 
: The variant with URL/description separation is preferred when applicable. The construct will be hyperlinked in rich text format.
 
: The variant with URL/description separation is preferred when applicable. The construct will be hyperlinked in rich text format.
Line 179: Line 212:
 
: An external message to be reported to the user.
 
: An external message to be reported to the user.
 
<code cpp>
 
<code cpp>
i18nc("@info", "Fortune cookie says: <message>%1</message>", trouble);
+
i18nc("@info",
 +
      "The fortune cookie says: <message>%1</message>", trouble);
 
</code>
 
</code>
  
Line 185: Line 219:
 
: By default, numbers supplied as arguments to i18n calls are formatted into localized form. If the number is supposed to be a numeric identifier instead, like a port number, use this tag to signal numeric-id context.
 
: By default, numbers supplied as arguments to i18n calls are formatted into localized form. If the number is supposed to be a numeric identifier instead, like a port number, use this tag to signal numeric-id context.
 
<code cpp>
 
<code cpp>
i18nc("@process", "Connecting to <numid>%1</numid>...", portNo);
+
i18nc("@info:progress",
 +
      "Connecting to <numid>%1</numid>...", portNo);
 
</code>
 
</code>
  
Line 191: Line 226:
 
: A placeholder text, either something to be replaced by the user, or a generic item in a list.
 
: A placeholder text, either something to be replaced by the user, or a generic item in a list.
 
<code cpp>
 
<code cpp>
i18nc("@info", "Replace <placeholder>name</placeholder> with your name.");
+
i18nc("@info",
i18nc("@item", "<placeholder>All images</placeholder>");
+
      "Replace <placeholder>name</placeholder> with your name.");
 +
 
 +
i18nc("@item:type",
 +
      "<placeholder>All images</placeholder>");
 
</code>
 
</code>
  
Line 204: Line 242:
 
: Combination of key to press. Separate the keys by "+" or "-", and the shortcut will be converted into canonical form.
 
: Combination of key to press. Separate the keys by "+" or "-", and the shortcut will be converted into canonical form.
 
<code cpp>
 
<code cpp>
i18nc("@info",
+
i18nc("@info:whatsthis",
       "Cycle through layouts by <shortcut>Alt+Space</shortcut>.");
+
       "Cycle through layouts using <shortcut>Alt+Space</shortcut>.");
 
</code>
 
</code>
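
The delimiter handling described for <tt>&lt;shortcut&gt;</tt> can be sketched in plain C++. This is only an illustration, not the KDE implementation: the function names are hypothetical, and the canonical delimiter is assumed to be "+" because the text does not say what the canonical form looks like.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits a shortcut on "+" or "-".
// A delimiter character that directly follows another delimiter is kept
// as a key name, so "Ctrl++" yields the keys "Ctrl" and "+".
std::vector<std::string> splitShortcut(const std::string &shortcut)
{
    std::vector<std::string> keys;
    std::string current;
    for (char c : shortcut) {
        if ((c == '+' || c == '-') && !current.empty()) {
            keys.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) {
        keys.push_back(current);
    }
    return keys;
}

// Hypothetical helper: rejoins the keys with one delimiter.
// Assumption: "+" stands in for whatever the real canonical form is.
std::string canonicalShortcut(const std::string &shortcut,
                              const std::string &delimiter = "+")
{
    std::string result;
    for (const std::string &key : splitShortcut(shortcut)) {
        if (!result.empty()) {
            result += delimiter;
        }
        result += key;
    }
    return result;
}
```

With this sketch, both <tt>Alt+Space</tt> and <tt>Alt-Space</tt> come out in the same form, which is the point of letting the markup, rather than each programmer, decide how shortcuts are displayed.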
  
 
=== Sentence tags ===
 
=== Sentence tags ===
  
Some sentences can be given a special meaning, by using the sentence tags. These tags will admit any terminal tags for subtags.
+
Sentence tags mark complete sentences in text, and will admit any phrase tags as subtags. The following are defined:
  
 
; <tt>&lt;note&gt;</tt>
 
; <tt>&lt;note&gt;</tt>
Line 231: Line 269:
 
: Do not explicitly add "Warning:", it will be added automatically. If you really need other label than "Warning", use attribute <tt>label</tt>, e.g. <tt>"<warning label='Danger'>...</warning>"</tt>.
 
: Do not explicitly add "Warning:", it will be added automatically. If you really need other label than "Warning", use attribute <tt>label</tt>, e.g. <tt>"<warning label='Danger'>...</warning>"</tt>.
  
=== Structuring tags ===
+
=== Structure tags ===
  
For structuring longer texts, the following tags are available:
+
Structure tags are used to split longer texts into titles, paragraphs, and lists. By default they can contain any phrase or sentence tags, unless indicated otherwise.
  
 
; <tt>&lt;para&gt;</tt>
 
; <tt>&lt;para&gt;</tt>
Line 250: Line 288:
 
: List item.
 
: List item.
  
Structuring tags (other than <tt>&lt;list&gt;</tt>) can contain any terminal or sentence tags.
+
If any of the structure tags is present, then there must be no text outside of structure tags. The following is not valid KUIT markup:
 
+
If any of the structuring tags is present, then there must be no text outside of these elements. The following is not valid KUIT markup:
+
 
<code cpp>
 
<code cpp>
 
// invalid markup
 
// invalid markup
Line 259: Line 295:
 
       "You can configure the history sidebar here."); // <para> missing
 
       "You can configure the history sidebar here."); // <para> missing
 
</code>
 
</code>
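
The rule above can be checked mechanically. The following is a minimal sketch, not the i18n-checker script or any KDE code: the function name is hypothetical, only <tt>title</tt>, <tt>para</tt> and <tt>list</tt> are treated as top-level structure tags, and the markup is assumed to be well-formed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical check: returns false when a structure tag is present
// and some non-whitespace text lies outside of all structure tags.
bool isValidStructure(const std::string &text)
{
    // Assumption: these are the top-level structure tags to look for.
    static const std::set<std::string> structureTags = {"title", "para", "list"};

    bool hasStructure = false;  // at least one structure tag was seen
    bool textOutside = false;   // visible text found outside structure tags
    int depth = 0;              // nesting depth inside structure tags

    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] == '<') {
            const std::size_t close = text.find('>', i);
            if (close == std::string::npos) {
                break;  // unterminated tag: stop scanning (sketch only)
            }
            std::string name = text.substr(i + 1, close - i - 1);
            const bool closing = !name.empty() && name[0] == '/';
            if (closing) {
                name.erase(0, 1);
            }
            name = name.substr(0, name.find(' '));  // drop any attributes
            if (structureTags.count(name) > 0) {
                hasStructure = true;
                depth += closing ? -1 : 1;
            }
            i = close + 1;
        } else {
            if (depth == 0 && !std::isspace(static_cast<unsigned char>(text[i]))) {
                textOutside = true;
            }
            ++i;
        }
    }
    return !(hasStructure && textOutside);
}
```

A message without any structure tags passes the check, since the rule only applies once a structure tag is used.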
 
Qt's rich text HTML tags can be used concurrently with KUIT tags, but this is not advised unless necessary. They may be needed, for example, to create tables or insert images, as KUIT does not implement this functionality at the moment.
 
  
 
== Limitations to Use of Semantic Markup ==
 
== Limitations to Use of Semantic Markup ==
Line 266: Line 300:
 
Semantic markup cannot be used in "dumb" strings, which do not pass through KDE's i18n subsystem. These would be, for example, strings in <tt>.desktop</tt> format files. But ''not'' the strings in UI files, as in Qt Designer they can be equipped with both context markers (via comment field to text properties) and semantic tags.
 
Semantic markup cannot be used in "dumb" strings, which do not pass through KDE's i18n subsystem. These would be, for example, strings in <tt>.desktop</tt> format files. But ''not'' the strings in UI files, as in Qt Designer they can be equipped with both context markers (via comment field to text properties) and semantic tags.
  
Sometimes, the visual formatting may not be quite appropriate for the output device; each context comes with a default formatting. For example, if the <tt>@info</tt> context is applied to a string which is used in a widget that does not handle rich text, it will come out with HTML tags displayed verbatim. To handle this, formatting can be explicitly signaled by <tt>/''format''</tt> modifier to context marker:
+
Qt's rich text HTML tags can be used concurrently with KUIT tags, but this is not advised unless necessary. They may be needed, for example, to create tables or insert images, as KUIT does not implement this functionality at the moment.
 +
 
 +
Sometimes, the visual formatting may not be quite appropriate for the output device; each context comes with a default formatting. For example, if the <tt>@info</tt> context is applied to a string which is used in a widget that does not handle rich text, it will come out with HTML tags displayed verbatim. To handle this, visual formatting can be explicitly signaled by <tt>/''format''</tt> modifier to context marker:
  
 
<code cpp>
 
<code cpp>
i18nc("@info/plain", "<filename>%1</filename> does not exist", fname);
+
i18nc("@info:message/plain",
 +
      "<filename>%1</filename> does not exist", fname);
 
</code>
 
</code>
  
 
Presently, the possible format modifiers are <tt>/plain</tt>, <tt>/rich</tt> and <tt>/shell</tt>. In particular, any strings which are output to the terminal should have explicit <tt>/shell</tt> format.
 
Presently, the possible format modifiers are <tt>/plain</tt>, <tt>/rich</tt> and <tt>/shell</tt>. In particular, any strings which are output to the terminal should have explicit <tt>/shell</tt> format.
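
To make the marker syntax concrete, here is a small sketch that extracts the format modifier from a context string. It is an illustration under stated assumptions, not KDE's parser: the function name is hypothetical, and unknown modifiers are simply treated as absent.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns "plain", "rich" or "shell" if the context
// marker carries that /format modifier, and an empty string otherwise.
std::string formatModifier(const std::string &context)
{
    // The marker proper ends at the first whitespace; anything after it
    // is the free-form note for translators.
    const std::string marker = context.substr(0, context.find_first_of(" \t"));

    const std::size_t slash = marker.find('/');
    if (slash == std::string::npos) {
        return std::string();  // no modifier: the context default applies
    }
    const std::string format = marker.substr(slash + 1);
    if (format == "plain" || format == "rich" || format == "shell") {
        return format;
    }
    return std::string();  // assumption: unknown modifiers are ignored
}
```

An empty result means that the default formatting of the context is used, so the modifier only needs to be written when that default does not suit the output device.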
 +
 +
== Should I Go For Semantic Markup? ==
 +
 +
Admittedly, KUIT markup is an additional thing to be learned and applied throughout the course of development. By now you may be wondering if it is worthwhile to invest time in that, particularly in view of two cases:
 +
* starting work on a new application, and
 +
* porting messages in existing applications.
 +
 +
You are strongly advised to use KUIT for new code. Compared to the total time spent on code, writing UI messages is only a small fraction. Context markers will help translators a lot, and message tags will provide consistent visual text formatting to your application.
 +
 +
When modifying existing code, there are two issues. First, it is obviously a daunting task to go through hundreds (or more) of messages and equip them with semantic markup. Second, when the messages change, the translators too will have to review their existing translations; however, it is not expected that the porting will take on such "epic" proportions that the translators cannot keep up. In summary, feel free to do as you see fit.
 +
 +
Additionally, for porting, keep in mind that it is not an all-or-nothing proposal. Any number of semantic messages is useful to translators, and users can only see a difference for the better. Thus, for example, deciding to make all ''new'' messages semantic and to slowly fix old messages over time is a perfectly fine strategy.
 +
 +
To make your job easier, there is a standalone i18n-checker script that you can run on your code yourself, and which is also run daily across the whole code repository (as part of the Krazy framework). It will report problems in KUIT markup, as well as check some other i18n nuances. It can be found in the KDE repository, in ((when it's done)).
 +
 +
Last but not least, there is also a chic effect to KUIT. Its wide use, together with some under-the-hood elements at the translators' disposal, will make KDE4's i18n layer without peer in the free or proprietary software world. Insofar as you consider localization excellence an important part of overall KDE excellence, this is something that may also tip your decision :) -- Your Friendly Translator.

Revision as of 21:12, 20 June 2007



This article is describing a system currently in RFC phase, pending design modifications. Skip it unless you would like to make comments and proposals yourself.


Abstract

For a long time, the typical way of formatting user-visible strings in application interfaces has been plain text, or at most visual markup such as HTML tags. In most textual content environments, a shift to semantic markup has been recognized as superior to visual markup (for example, the Docbook XML for documentation). Why not go down the same road for UI strings?

Semantic Markup by Examples

In the semantic model, text elements are marked for their meaning, rather than for their visual appearance. Consider a few i18n examples of visual formatting:

i18n("Move");

i18n("Descending");

i18n("<qt>%1 does not exist</qt>", fname);

i18n("

History Sidebar

You can configure the history sidebar here.");

Using KDE UI text semantic markup (KUIT for short), these strings would be formatted like this:

i18nc("@action:button", "Move");

i18nc("@item:inmenu", "Descending");

i18nc("@info", "<filename>%1</filename> does not exist", fname);

i18nc("@info:whatsthis",

     "<title>History Sidebar</title>"
     "<para>You can configure the history sidebar here.</para>");

Two distinct differences between visual and KUIT markup can be observed.

The first is the use of context i18n calls, the i18nc(), to convey the broad usage context of a string by means of a context marker. The first message above, "Move", has been assigned the @action:button marker, where @action is the main context that describes the text as an action to be taken (e.g. an operation on data or the opening of a new dialog), and :button is the interface subcue saying that this text is displayed on a pushbutton widget. The second message, "Descending", has been marked as semantically a list entry (@item), displayed in a menu (:inmenu). The interface subcue can be left out if none is appropriate.

The other difference is the use of semantic tags, which convey the meaning of an element. For example, the <filename>%1</filename> bit in the third message indicates that the substituted text is the name of a file, and the <title> and <para> tags in the last message lay out a clear structure for longer informational texts.

Note
The context marker can be added when working with Qt Designer too. Each text label of a widget has a comment attribute, which can be used in the same manner as the context argument of an i18nc() call.

Even when a context marker is present, the programmer may sometimes want to provide an additional free-form clarification to translators, in order to shed more light on particularly ambiguous strings. It is simply separated by whitespace from the context marker proper, like this:

i18nc("@item:inmenu Sorting order", "Descending");
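
The parts of a context string can be told apart with a few lines of C++. This is a minimal sketch only, not the parser used by KDE: the ContextMarker type and the parseContextMarker() function are hypothetical names, and only the main context, the interface subcue and the free-form note are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type, not part of any KDE API.
struct ContextMarker {
    std::string role;  // main context without the '@', e.g. "item"
    std::string cue;   // interface subcue, e.g. "inmenu"; empty if absent
    std::string note;  // free-form clarification for translators
};

// Splits "@role[:cue] free-form note" into its parts.
// A context that does not start with '@' is treated as a pure note.
ContextMarker parseContextMarker(const std::string &context)
{
    ContextMarker result;
    if (context.empty() || context[0] != '@') {
        result.note = context;
        return result;
    }

    // The marker proper ends at the first whitespace character.
    const std::size_t markerEnd = context.find_first_of(" \t");
    const std::string marker = context.substr(
        1, markerEnd == std::string::npos ? std::string::npos : markerEnd - 1);

    // The subcue, if any, follows the first colon.
    const std::size_t colon = marker.find(':');
    result.role = marker.substr(0, colon);
    if (colon != std::string::npos) {
        result.cue = marker.substr(colon + 1);
    }

    // Whatever follows the separating whitespace is the free-form note.
    if (markerEnd != std::string::npos) {
        const std::size_t noteStart = context.find_first_not_of(" \t", markerEnd);
        if (noteStart != std::string::npos) {
            result.note = context.substr(noteStart);
        }
    }
    return result;
}
```

For the string "@item:inmenu Sorting order", the sketch yields the main context "item", the subcue "inmenu" and the note "Sorting order".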

Advantages of Semantic Markup

In general, KUIT markup has advantages both to users and to translators of the application.

For the users, the use of semantic tags means consistent formatting of the same kinds of text. The notorious example of inconsistent visual formatting would be filenames and paths, which are sometimes put in as is, sometimes in quotes (and ordinary quotes at that, rather than proper English fancy quotes), and sometimes in bold tags. Furthermore, the text within the tag may be modified by the semantic markup; for example, the <filename> text is transformed from "/" path delimiters to platform-specific ones.
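
The kind of transformation mentioned for <filename> can be pictured with a short sketch. The function below is a hypothetical stand-in written for illustration, not what KDE does internally, and the target separator is passed in explicitly rather than detected from the platform.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: replaces every "/" in a path with the given
// platform separator, e.g. '\\' on Windows.
std::string toPlatformPath(std::string path, char separator)
{
    std::replace(path.begin(), path.end(), '/', separator);
    return path;
}
```

Because the message itself only says "this is a filename", the same source string can be displayed correctly on every platform without the programmer touching the path.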

Translators will benefit from both context markers and tags. The "Move" string in the example above had the @action context, for which the translator may use the command form of the verb, while the gerund form (like "Moving") may be more appropriate in the @title context, which would be used if the string was the title of a menu, option group, etc. The interface subcue, like :button above, if present, additionally enables the translator to mentally picture the actual runtime appearance on the spot. Tags within the text will also benefit translators, as they may clear up the structure of the sentence, especially in the presence of placeholder substitutions.

The context markers also serve a technical purpose: they decide what form of visual formatting is used. E.g. any @title context will use plain text, whereas @info contexts will mostly be formatted with HTML tags (this may depend on the interface subcue too).
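
As a rough picture of this decision, consider the sketch below. It is an assumption-laden illustration rather than the real logic: the names are hypothetical, the subcue is ignored, and every main context other than @info is assumed to fall back to plain text, which the text states only for @title.

```cpp
#include <cassert>
#include <string>

// Hypothetical set of visual formats for this sketch.
enum class VisualFormat { Plain, Rich };

// Hypothetical helper: picks a default format from the main context
// (given without the '@').
VisualFormat defaultFormat(const std::string &role)
{
    // From the text: @info contexts are mostly formatted with HTML tags.
    // Assumption: all other contexts use plain text, as @title does.
    return role == "info" ? VisualFormat::Rich : VisualFormat::Plain;
}
```

The programmer therefore never chooses HTML or plain text per message; the choice follows from the context marker.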

Not least, semantic markup removes the burden of thinking about which visual formatting to apply while programming, like "Should I put the path in quotes or <b>?", or "Should the title be <h2> or <h3>?", and so on.

Context Markers

A context marker consists of the main context and the interface subcue. The subcue need not be present, and should be provided only when the text clearly maps to what the subcue describes. KUIT presently defines the following main contexts and their possible interface subcues:

@action
Text to all clickable widgets that cause some action to be performed, like an operation on the data, view restructuring, or opening a dialog. The button texts and menu entries (except submenu titles) all fall into this category.
:button - pushbuttons in windows and dialogs
:inmenu - menu entries that perform an action
:intoolbar - toolbar buttons
@title
Text that is semantically a title in the interface. These would include window titles, menu titles, tab names, option group names in configuration dialogs, and column names in list views.
:window - window title
:menu - menu name
:tab - tab name
:group - option group
@option
Text to options which user can turn on and off, or choose between. These are the labels to checkboxes (either in dialogs or in menus) and radio buttons.
:check - checkbox label
:radio - radio-button label
@label
Text labels to various widgets in the interface, which are none of @action, @title, @option. These include labels to sliders, spinboxes, combo, list and text boxes, font and color choosers.
:slider - slider labels (but end-ranges are @item!)
:spinbox - spinbox labels
:listbox - list and combo boxes
:textbox - text and edit boxes
:chooser - chooser widgets (fonts, colors, etc.)
@item
Strings that can be considered one from a range of possible choices or properties. Entries in listings, dropdown and combo boxes are obvious, but also some menu items (e.g. encoding selection, sort orderings), end-labels to ranges (e.g. high/low, more/less), and object properties (e.g. file types, permissions).
:inmenu - items presented as menu entries
:inlistbox - items in list and combo boxes
:range - range labels to sliders
:property - object properties
@info
Any general text for user's information, that does not fall under previous contexts. These are for example tooltip and "What's This?" texts, text in message boxes, fields in status bar, and strings in progress dialogs.
:tooltip - hovering tooltips
:whatsthis - "What's This?" explanations of widgets
:status - texts in status displays (e.g. in status bar)
:progress - the current state of ongoing process
:tipoftheday - introductory tips on application startup
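
To make the anatomy of a context marker concrete, here is a minimal sketch of how a marker such as "@action:button" could be split into its main context and interface subcue. The ContextMarker struct and the parseContextMarker function are hypothetical names made up for this illustration; they are not part of the KDE API, and the real i18n subsystem may parse markers differently.

```cpp
#include <string>

// Hypothetical holder for the two parts of a marker; not a KDE type.
struct ContextMarker {
    std::string mainContext; // e.g. "action"
    std::string subcue;      // e.g. "button"; empty when no subcue is given
};

// Splits "@context:subcue" into its parts. Returns false if the text
// does not start with '@' or if the main context is empty.
bool parseContextMarker(const std::string &text, ContextMarker &out)
{
    if (text.empty() || text[0] != '@') {
        return false;
    }
    const std::string::size_type colon = text.find(':');
    if (colon == std::string::npos) {
        // No subcue present, which is allowed.
        out.mainContext = text.substr(1);
        out.subcue.clear();
    } else {
        out.mainContext = text.substr(1, colon - 1);
        out.subcue = text.substr(colon + 1);
    }
    return !out.mainContext.empty();
}
```

With this sketch, "@title" yields the main context "title" and an empty subcue, while a plain string like "Move" is rejected because it carries no marker.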

Semantic Tags

KUIT semantic tags come in several logical groups:

  • phrase tags - those that ascribe meaning to certain phrases and inserts
  • sentence tags - which describe the purpose of a complete sentence in text
  • structure tags - used to order longer text into paragraphs, titles, etc.

Phrase tags

Phrase tags are mostly terminal, meaning that by default they will not admit any subtags; where some subtags can be used, it is so indicated. KUIT defines the following phrase tags:

<application>
Name of an application.

i18nc("@action:inmenu",
      "Open with <application>%1</application>", appName);

<bcode>
Line-breaking body of code, for short listings.

i18nc("@info:whatsthis",
      "You can try the following snippet:<bcode>"
      "\\begin{equation}\n"                            // backslashes must be escaped in C++
      "  C_{D_i} = \\frac{C_z^2}{e \\pi \\lambda}\n"  // "\n" keeps the line breaks
      "\\end{equation}"
      "</bcode>");

<command>
Name of shell command or system call. Man section can be provided via section attribute.

i18nc("@info",
      "This will call <command>%1</command> internally.", cmdName);

i18nc("@info",
      "Consult man entry for <command section='1'>%1</command>", cmdName);

<email>
Email address. Without attributes, the tag text is the address. The address can also be given with the address attribute, in which case the tag text is the name or description attached to the address.

i18nc("@info",
      "Send bug reports to <email>%1</email>.", emailNull);

i18nc("@info",
      "Send praises to <email address='%1'>the author</email>.", emailMy);

The construct will be hyperlinked in rich text format.
<emphasis>
Emphasize a word or phrase in the text.

i18nc("@info:progress",
      "Checking <emphasis>feedback</emphasis> circuits...");

<envar>
Environment variable. The $ sign will be prepended automatically in formatted text.

i18nc("@info",
      "Assure that your <envar>PATH</envar> is properly set.");

<filename>
File or folder name or path. The path separators will be transformed into what is native to the platform.

i18nc("@info", "Cannot read <filename>%1</filename>.", filename);

i18nc("@info",
      "<filename><envar>HOME</envar>/.foorc</filename> does not exist.");

The <envar> can be used as subtag.
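
As an illustration of the separator transformation described for <filename>, the following sketch replaces "/" with a platform separator passed in by the caller. The function name toNativeSeparators and its parameters are assumptions made for this example only; how KUIT actually performs the conversion is not specified here.

```cpp
#include <algorithm>
#include <string>

// Replaces every '/' in the path with the given native separator.
// The separator is a parameter so the sketch stays platform-neutral.
std::string toNativeSeparators(std::string path, char nativeSeparator)
{
    std::replace(path.begin(), path.end(), '/', nativeSeparator);
    return path;
}
```

The point for message authors is that paths can always be written with "/" inside <filename>; the formatting layer is the one place that needs to know the platform convention.
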
<icode>
Inline code, like shell command lines.

i18nc("@info:tooltip",
      "Execute <icode>svn merge</icode> on selected revisions.");

The <placeholder> can be used as subtag.
<interface>
Path to a GUI element. Use "/", "|" or "->" to delimit elements; the delimiters will be converted into canonical form.

i18nc("@info:whatsthis",
      "The line colors can be changed under "
      "<interface>Settings->Visuals</interface>.");

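A rough sketch of what converting an <interface> path into canonical form could look like: split on any of the three accepted delimiters and rejoin with a single one. The function names are hypothetical, and the canonical delimiter is left as a parameter because the text above does not say which one KUIT uses.

```cpp
#include <string>
#include <vector>

// Splits an interface path on "/", "|" or "->".
std::vector<std::string> splitInterfacePath(const std::string &path)
{
    std::vector<std::string> elements;
    std::string current;
    for (std::string::size_type i = 0; i < path.size(); ++i) {
        const bool arrow = path[i] == '-' && i + 1 < path.size() && path[i + 1] == '>';
        if (path[i] == '/' || path[i] == '|' || arrow) {
            elements.push_back(current);
            current.clear();
            if (arrow) {
                ++i; // skip the '>' of "->"
            }
        } else {
            current += path[i];
        }
    }
    elements.push_back(current);
    return elements;
}

// Rejoins the elements with one delimiter (an assumption of this sketch).
std::string canonicalInterfacePath(const std::string &path, const std::string &delimiter)
{
    const std::vector<std::string> elements = splitInterfacePath(path);
    std::string result;
    for (std::string::size_type i = 0; i < elements.size(); ++i) {
        if (i > 0) {
            result += delimiter;
        }
        result += elements[i];
    }
    return result;
}
```
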
<link>
Link to a URL-addressable resource. Without attributes, the tag text is the URL; alternatively, URL can be given by url attribute, and then the tag text serves as description.

i18nc("@info:tooltip",
      "Go to <link>%1</link> website.", urlKDE);

i18nc("@info:tooltip",
      "Go to <link url='%1'>the KDE website</link>.", urlKDE);

The variant with URL/description separation is preferred when applicable. The construct will be hyperlinked in rich text format.
<message>
An external message to be reported to the user.

i18nc("@info",
      "The fortune cookie says: <message>%1</message>", trouble);

<numid>
By default, numbers supplied as arguments to i18n calls are formatted into localized form. If the number is supposed to be a numeric identifier instead, like a port number, use this tag to signal numeric-id context.

i18nc("@info:progress",
      "Connecting to <numid>%1</numid>...", portNo);

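The difference that <numid> makes can be sketched with a toy formatter: an ordinary number gets locale-style digit grouping, while a numeric identifier is printed verbatim. The functions groupDigits and formatArgument and the choice of grouping by threes are assumptions for illustration; the real localized number formatting is done by the i18n subsystem and depends on the locale.

```cpp
#include <string>

// Inserts a group separator every three digits, as many locales do.
std::string groupDigits(unsigned long value, char separator)
{
    const std::string digits = std::to_string(value);
    std::string grouped;
    for (std::string::size_type i = 0; i < digits.size(); ++i) {
        if (i > 0 && (digits.size() - i) % 3 == 0) {
            grouped += separator;
        }
        grouped += digits[i];
    }
    return grouped;
}

// Numeric identifiers (port numbers, IDs) are left untouched;
// everything else is formatted as a quantity.
std::string formatArgument(unsigned long value, bool isNumericId, char separator)
{
    return isNumericId ? std::to_string(value) : groupDigits(value, separator);
}
```

For a port number such as 8080, grouping would produce a misleading "8,080", which is exactly what marking the argument as a numeric identifier avoids.
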
<placeholder>
A placeholder text, either something to be replaced by the user, or a generic item in a list.

i18nc("@info",
      "Replace <placeholder>name</placeholder> with your name.");

i18nc("@item:type",
      "<placeholder>All images</placeholder>");

<resource>
General named resource. Names of documents, sessions, projects, toolbars, plugins, schemes and themes, accounts, etc.

i18nc("@info", "Apply color scheme <resource>%1</resource>?", colScheme);

<shortcut>
Combination of keys to press. Separate the keys by "+" or "-", and the shortcut will be converted into canonical form.

i18nc("@info:whatsthis",
      "Cycle through layouts using <shortcut>Alt+Space</shortcut>.");

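A minimal sketch of shortcut canonicalization, assuming that the canonical form simply uses one delimiter between key names. The function canonicalShortcut is a made-up name, and the sketch deliberately ignores shortcuts whose key is itself "+" or "-", which a real implementation would have to handle.

```cpp
#include <string>

// Rewrites both accepted delimiters ('+' and '-') to a single one.
// Limitation: a key that is literally '+' or '-' is not supported here.
std::string canonicalShortcut(const std::string &shortcut, const std::string &delimiter)
{
    std::string result;
    for (const char c : shortcut) {
        if (c == '+' || c == '-') {
            result += delimiter;
        } else {
            result += c;
        }
    }
    return result;
}
```
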
Sentence tags

Sentence tags mark complete sentences in text, and will admit any phrase tags as subtags. The following are defined:

<note>
The sentence is a side note of significance to the topic.

i18nc("@info",
      "Probably the best known of all duck species is the Mallard. "
      "It breeds throughout the temperate areas around the world. "
      "<note>Most domestic ducks are derived from Mallard.</note>");

Do not explicitly add "Note:", it will be added automatically. If you really need a label other than "Note", use the label attribute, e.g. "<note label='Trivia'>...</note>".
<warning>
The sentence is a warning.

i18nc("@info",
      "Really delete this key? "
      "<warning>This cannot be undone.</warning>");

Do not explicitly add "Warning:", it will be added automatically. If you really need a label other than "Warning", use the label attribute, e.g. "<warning label='Danger'>...</warning>".

Structure tags

Structure tags are used to split longer texts into titles, paragraphs, and lists. By default they can contain any phrase or sentence tags, unless indicated otherwise.

<para>
Text paragraph.
<title>
The title of the text. Must be the first tag if present, but can be omitted.
<subtitle>
Subtitle in the text. Must be followed by at least one <para>.
<list>
List of items. Can contain only <item> as subtags. List is considered an element of the paragraph, so the <list> must be found inside <para>.
<item>
List item.

If any of the structure tags is present, then there must be no text outside of structure tags. The following is not valid KUIT markup:

// invalid markup
i18nc("@info",
      "<title>History Sidebar</title>"
      "You can configure the history sidebar here."); // <para> missing

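The rule that no text may appear outside structure tags once any of them is used can be checked mechanically. The sketch below is a simplified illustration, not the checker used by KDE: the function name textIsInsideStructure is hypothetical, and it performs only a flat scan without validating tag nesting or attributes.

```cpp
#include <cctype>
#include <set>
#include <string>

// Returns true if the markup either uses no structure tags at all,
// or keeps all of its non-whitespace text inside structure tags.
bool textIsInsideStructure(const std::string &markup)
{
    static const std::set<std::string> structureTags = {
        "para", "title", "subtitle", "list", "item"};
    int depth = 0;               // currently open structure tags
    bool sawStructure = false;   // any structure tag encountered
    bool sawOutsideText = false; // text found while depth == 0
    std::string::size_type i = 0;
    while (i < markup.size()) {
        if (markup[i] == '<') {
            const std::string::size_type close = markup.find('>', i);
            if (close == std::string::npos) {
                break; // malformed tag; stop scanning in this sketch
            }
            std::string name = markup.substr(i + 1, close - i - 1);
            const bool closing = !name.empty() && name[0] == '/';
            if (closing) {
                name.erase(0, 1);
            }
            name = name.substr(0, name.find(' ')); // drop attributes
            if (structureTags.count(name) > 0) {
                sawStructure = true;
                depth += closing ? -1 : 1;
            }
            i = close + 1;
        } else {
            if (depth == 0 && !std::isspace(static_cast<unsigned char>(markup[i]))) {
                sawOutsideText = true;
            }
            ++i;
        }
    }
    return !(sawStructure && sawOutsideText);
}
```

Applied to the invalid example above, the check fails because the second sentence follows </title> without being wrapped in <para>; wrapping it makes the check pass.
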
Limitations to Use of Semantic Markup

Semantic markup cannot be used in "dumb" strings, which do not pass through KDE's i18n subsystem, such as strings in .desktop format files. Strings in UI files are not affected by this limitation: in Qt Designer they can be equipped with both context markers (via the comment field of text properties) and semantic tags.

Qt's rich text HTML tags can be used concurrently with KUIT tags, but this is not advised unless necessary. They may be needed, for example, to create tables or insert images, as KUIT does not implement this functionality at the moment.

Each context comes with a default visual formatting, which may sometimes be inappropriate for the output device. For example, if the @info context is applied to a string which is used in a widget that does not handle rich text, it will come out with HTML tags displayed verbatim. To handle this, the visual formatting can be explicitly requested by adding a /format modifier to the context marker:

i18nc("@info:message/plain",
      "<filename>%1</filename> does not exist", fname);

Presently, the possible format modifiers are /plain, /rich and /shell. In particular, any strings which are output to the terminal should have explicit /shell format.
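
The interplay between default formatting and the /format modifier can be sketched as follows. The Format enum and resolveFormat function are hypothetical, and the default rule used here (rich text for @info, plain text for everything else) is a deliberate simplification of the behavior described earlier, where the default may also depend on the interface subcue.

```cpp
#include <string>

enum class Format { Plain, Rich, Shell };

// Picks the visual format for a marker such as "@info:tooltip/plain".
// An explicit modifier wins; otherwise a per-context default applies.
Format resolveFormat(const std::string &marker)
{
    const std::string::size_type slash = marker.find('/');
    if (slash != std::string::npos) {
        const std::string modifier = marker.substr(slash + 1);
        if (modifier == "plain") {
            return Format::Plain;
        }
        if (modifier == "rich") {
            return Format::Rich;
        }
        if (modifier == "shell") {
            return Format::Shell;
        }
    }
    // Assumption of this sketch: @info defaults to rich text,
    // all other contexts default to plain text.
    return marker.compare(0, 5, "@info") == 0 ? Format::Rich : Format::Plain;
}
```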

Should I Go For Semantic Markup?

Admittedly, KUIT markup is an additional thing to be learned and applied throughout the course of development. By now you may be wondering if it is worthwhile to invest time into that, particularly in view of two cases:

  • starting work on a new application, and
  • porting messages in existing applications.

You are strongly advised to use KUIT for new code. Compared to the total time spent on code, writing UI messages is only a small fraction. Context markers will help translators a lot, and message tags will provide consistent visual text formatting to your application.

When modifying existing code, there are two issues. First, it is obviously a daunting task to go through hundreds (or more) of messages and equip them with semantic markup. Second, when messages change, translators have to review their existing translations; however, the porting is not expected to take on such "epic" proportions that translators cannot keep up. In short, feel free to do as you see fit.

Additionally, for porting, keep in mind that it is not an all-or-nothing proposal. Any number of semantic messages is useful to translators, and users can only see a change for the better. Thus, for example, deciding to make all new messages semantic and to fix old messages slowly over time is a perfectly fine strategy.

To make your job easier, there is a standalone i18n-checker script that you can run on your own code, and which is also run daily across the whole code repository (as part of the Krazy framework). It reports problems in KUIT markup and checks some other i18n nuances. It can be found in the KDE repository, in ((when it's done)).

Last but not least, there is also a chic effect to KUIT. Its wide use, together with some under-the-hood elements at translators' disposal, will make KDE4's i18n layer without peer in the free or proprietary software world. Insofar as you consider localization excellence an important part of the overall KDE excellence, this is something that may also tip your decision :) -- Your Friendly Translator.

