Localization/Tools/Pology: Difference between revisions

    From KDE TechBase
    (Point to to-be created article on embedded diffing.)
    No edit summary
     
    (10 intermediate revisions by 2 users not shown)
    Line 1: Line 1:
    {{Template:I18n/Language Navigation Bar|Localization/Tools/Pology}}


    {{LocalizationBrowser|
    {{LocalizationBrowser|
    Line 5: Line 4:
    name=Pology|
    name=Pology|
    prereqs=[[Localization/Tools/Gettext_Tools|Gettext Tools]]|
    prereqs=[[Localization/Tools/Gettext_Tools|Gettext Tools]]|
    related=[[Localization/Tools/Pology/PO_Embedded_Diffing|PO Diffing]], [[Localization/Workflows/PO_Summit|Summiting Translation Branches]], [[Localization/Workflows/PO_Ascription|Review by Ascription]]|
    }}
    }}


    == About ==
    == About ==


    Pology is a Python framework for custom processing of PO files. It aims to facilitate easy, fast and robust creation of scripts for tackling problems encountered in the "field", that is, in everyday translation work, and to collect all sorts of specific, narrow-purpose tools written to that end. It does not aim to be a collection of feature-complete, monolithic, general-purpose tools (though it may contain some that qualify). In particular, it does not attempt to handle any translation format other than PO: all of Pology's end-user tools and programming interfaces are geared towards the PO format and its conventions. The name itself should be parsed as PO-logy, "the study of POs".
    Pology is a Python library and a set of command-line tools for processing PO files. The library aims to enable easy, fast and robust creation of scripts for tackling problems encountered in the "field", in everyday translation work. The tools perform various specialized operations, beyond those offered by other PO-handling software. Some of the tools are designed to (or have provisions to) treat a collection of PO files as a single entity.


    Pology has not yet reached a release state, but it can already be used effectively for day-to-day work, especially through its end-user scripts. Obtaining and preparing Pology for use is simple: fetch its code repository, add the scripts that come with it to <tt>PATH</tt>, and optionally set <tt>PYTHONPATH</tt> so that you can write your own code based on Pology. The following commands should suffice:
    Pology makes no attempt to handle any translation file format other than PO. All of the end-user tools and library programming interfaces are geared towards the technical aspects, conventions and workflows around the PO format. For example, some tools explicitly take into account that PO files are frequently kept under version control, and provide functionality to support that workflow.


    <code bash>
    # Fetch the Pology source tree.
    $ svn co svn://anonsvn.kde.org/home/kde/trunk/l10n-support/pology
    # Make Pology's scripts available on the command line.
    $ export PATH=$PWD/pology/scripts:$PATH
    # Let Python find the pology package, for writing your own code.
    $ export PYTHONPATH=$PWD:$PYTHONPATH
    </code>

    Language- and project-specific support is present throughout Pology, and it is designed to be easily expanded in the future. In the context of KDE, for example, there is a tool for validating PO files within the KDE Translation Project: it recognizes the "role" of a particular PO file (e.g. a native KDE code PO file, a .desktop PO file, a DocBook PO file...) and then applies checks appropriate for that role. (This tool is run weekly on KDE servers, on PO files of all languages, and the results are announced on the <tt>kde-i18n-doc</tt> mailing list.)


    (Of course, for continuous use, the environment variables should rather be set in <tt>~/.bashrc</tt>, or in the configuration file of whatever shell you are using.) Once these steps have been performed, Pology is ready for use and scripting.
    The Pology source distribution can be fetched from its home page at:


    == Ready-Made Tools ==
    http://pology.nedohodnik.net


    Pology provides a number of tools for end use, with varying degrees of specificity, in the form of several scripts in the <tt>scripts/</tt> subfolder of Pology's source. Details of the operation of each script are provided in the Pology documentation; the following sections give an overview and some examples of their functionality.
    The home page also links to on-line documentation (user manual and library API), and provides instructions for browsing or fetching the current development code.
     
    === Sieving ===
     
    ((To be done.))
     
    === Diffing and Patching ===
     
    Line-oriented diffing and patching, as conducted by the <tt>diff(1)</tt> and <tt>patch(1)</tt> commands, is not quite appropriate for PO files. Because PO content is composed of variably formatted multiline entries, which combine translator-controlled, programmer-controlled, and automatically controlled elements, a line-oriented diff may indicate a difference where semantically there is none, or fail to show a real difference in a useful form. One could even claim that line diffing of PO files is almost useless, especially for the purpose of sending translation patches.
     
    For this reason, Pology contains two scripts, <tt>poediff</tt> and <tt>poepatch</tt>, which create and apply message-oriented, ''embedded'' diffs of PO files. These diffs can be used both for reviewing changes and for applying patches to PO files. The concept of embedded diffing and the details of operation of the mentioned scripts are described in a [[Localization/Tools/Pology/PO_Embedded_Diffing|separate article]].
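
The mismatch between line diffs and message content can be illustrated with a small, self-contained sketch. It uses only Python's standard <tt>difflib</tt> module and a deliberately simplified, hypothetical entry parser (<tt>parse_entry</tt>, written for this illustration; it is not part of Pology and ignores escape sequences). The two sample entries are made-up data that differ only in how <tt>msgstr</tt> is wrapped:

```python
import difflib

# Two serializations of the same PO entry: only the wrapping of msgstr differs.
# (Hypothetical sample data, not taken from any real catalog.)
OLD = '''msgid "Hello, world!"
msgstr "Zdravo, svete!"
'''
NEW = '''msgid "Hello, world!"
msgstr ""
"Zdravo, "
"svete!"
'''

def unquote_po_text(lines):
    """Join the quoted fragments of one PO field into a single string.

    Simplified: escape sequences are ignored; a real parser must handle them.
    """
    return "".join(line.strip()[1:-1] for line in lines)

def parse_entry(text):
    """Parse a single, simple PO entry into {field_name: text}."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith('"'):
            fields[current].append(line)  # continuation of the current field
        elif line.strip():
            current, _, rest = line.partition(" ")
            fields[current] = [rest]
    return {name: unquote_po_text(parts) for name, parts in fields.items()}

# The line-oriented view reports changes...
line_diff = list(difflib.unified_diff(OLD.splitlines(), NEW.splitlines(), lineterm=""))
# ...while the message-oriented view finds the entries identical.
same_message = parse_entry(OLD) == parse_entry(NEW)

print("\n".join(line_diff))
print("messages equal:", same_message)
```

The line diff flags the whole <tt>msgstr</tt> as changed although the translation text is the same, which is the kind of noise a message-oriented diff is meant to avoid.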
     
    === Reformatting ===
     
    ((To be done.))
     
    === Heavy Artillery ===
     
    ((To be done.))
     
    == Writing Own Tools ==
     
    Pology comes with detailed API documentation, but for a quick start into writing custom tools based on Pology, the following sections will describe and illustrate some of its more salient elements.
     
    === Catalogs and Messages ===
     
    For the obligatory hello-world demonstration, let us create a PO template named <tt>hello.pot</tt> containing that greeting as its single message:
     
    <code python>
    from pology.file.catalog import Catalog
    from pology.file.message import Message
     
    cat = Catalog("hello.pot", mode="w")
    msg = Message()
    msg.msgid = u"Hello, world!"
    cat.add(msg)
    cat.sync()
    </code>
     
    Most of these few lines are self-explanatory, except the last one: modifications to catalogs in Pology are never automatically written to disk; instead, the <tt>sync()</tt> method must be called to initiate the write. The catalog object remains usable after this, so you can continue to work with it normally, including syncing it again. In this example, syncing creates the file <tt>hello.pot</tt> in the current working directory.
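
The explicit-sync behaviour described above can be pictured with a tiny stand-in class. This is only a sketch of the general pattern: <tt>TinyCatalog</tt>, its attributes, and the boolean return value of its <tt>sync()</tt> are hypothetical and written for this illustration; they are not Pology's <tt>Catalog</tt> implementation.

```python
import os
import tempfile

class TinyCatalog:
    """Hypothetical illustration of explicit syncing; not Pology's Catalog."""

    def __init__(self, path):
        self.path = path
        self._msgids = []
        self._dirty = False

    def add(self, msgid):
        # Changes are only recorded in memory here.
        self._msgids.append(msgid)
        self._dirty = True

    def sync(self):
        """Write pending changes to disk; return True if a write happened."""
        if not self._dirty:
            return False
        with open(self.path, "w", encoding="utf-8") as f:
            for msgid in self._msgids:
                # Simplified: msgid is assumed to contain no quotes or escapes.
                f.write('msgid "%s"\nmsgstr ""\n\n' % msgid)
        self._dirty = False
        return True

path = os.path.join(tempfile.mkdtemp(), "hello.pot")
cat = TinyCatalog(path)
cat.add("Hello, world!")
exists_before_sync = os.path.exists(path)  # nothing has been written yet
first_sync = cat.sync()                    # the file is created now
second_sync = cat.sync()                   # no pending changes, so no write
```

Keeping writes explicit means a script that only reads a catalog never touches the file on disk.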
     
    Practically, however, it is usually Gettext tools that will be used to create templates and catalogs, while a much more common use of Pology is to iterate over existing catalogs. The following code will open a catalog with various greetings, look for all messages that contain "hello" in the original text but do ''not'' contain "zdravo" in the translation, and report their content to standard output:
     
    <code python>
    from pology.file.catalog import Catalog
    from pology.misc.msgreport import report_msg_content
     
    cat = Catalog("greets.po")
    for msg in cat:
        # Case-insensitive match on the original text.
        if "hello" in msg.msgid.lower():
            # msgstr is always a list, so check every translation in it.
            if not any("zdravo" in text.lower() for text in msg.msgstr):
                report_msg_content(msg, cat)
    </code>
     
    Note how <tt>msgstr</tt> is represented as a list regardless of whether the message is plural or not, the only difference being the number of elements. This removes the special case of singular versus plural translations, and makes the programmer always think of plural messages (though the plural of the original text is accessed through the <tt>msgid_plural</tt> instance variable). The function <tt>report_msg_content</tt> writes the message to standard output, nicely formatted and preceded by a line stating the originating catalog and the message's reference line and entry number in it. But <tt>report_msg_content</tt> can do much more, e.g. highlight parts of the message in the shell, add notes and delimiters, and so on (its API documentation provides all the details). Since no changes were made to the catalog, it is perfectly fine, even appropriate, not to call <tt>sync()</tt> at the end.
     
    Of course, the previous snippet is just an illustration of iterating through catalogs and examining messages; in practice it is superfluous next to the functionality already provided by the <tt>find-messages</tt> sieve:
     
    <code bash>
    $ posieve find-messages -smsgid:'hello' -snmsgstr:'zdravo' greets.po
    </code>
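
To make the point about the list-valued <tt>msgstr</tt> concrete, here is a minimal sketch that runs without Pology installed. <tt>SimpleMessage</tt> and <tt>lacks_translation</tt> are hypothetical names introduced for this illustration, and the sample translations are made-up data; only the idea that <tt>msgstr</tt> is always a list comes from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SimpleMessage:
    """Hypothetical stand-in for a PO message; not Pology's Message class."""
    msgid: str
    msgid_plural: Optional[str] = None
    msgstr: List[str] = field(default_factory=list)

def lacks_translation(msg, needle):
    """True if no translation form of msg contains needle (case-insensitive)."""
    return not any(needle in text.lower() for text in msg.msgstr)

# A singular message has one element in msgstr...
singular = SimpleMessage("Hello!", msgstr=["Zdravo!"])
# ...a plural message has several, but the same code handles both.
plural = SimpleMessage(
    "%d hello", "%d hellos",
    msgstr=["%d pozdrav", "%d pozdrava", "%d pozdrava"],
)

for msg in (singular, plural):
    print(msg.msgid, "->", lacks_translation(msg, "zdravo"))
```

Because the check iterates over the list, no branch on "is this message plural?" is needed.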
     
    ((To be continued...))
     
    === Sieves ===
     
    ((To be done.))
     
    === Hooks ===
     
    ((To be done.))
     
    === Language Support ===
     
    ((To be done.))

    Latest revision as of 18:40, 15 July 2012

    Pology
    On Localization   Tools
    Prerequisites   Gettext Tools
    Related Articles   PO Diffing, Summiting Translation Branches, Review by Ascription
    External Reading   n/a

    About

    Pology is a Python library and a set of command-line tools for processing PO files. The library aims to enable easy, fast and robust creation of scripts for tackling problems encountered in the "field", in everyday translation work. The tools perform various specialized operations, beyond those offered by other PO-handling software. Some of the tools are designed to (or have provisions to) treat a collection of PO files as a single entity.

    Pology makes no attempt to handle any translation file format other than PO. All of the end-user tools and library programming interfaces are geared towards the technical aspects, conventions and workflows around the PO format. For example, some tools explicitly take into account that PO files are frequently kept under version control, and provide functionality to support that workflow.

    Language- and project-specific support is present throughout Pology, and it is designed to be easily expanded in the future. In the context of KDE, for example, there is a tool for validating PO files within the KDE Translation Project: it recognizes the "role" of a particular PO file (e.g. a native KDE code PO file, a .desktop PO file, a DocBook PO file...) and then applies checks appropriate for that role. (This tool is run weekly on KDE servers, on PO files of all languages, and the results are announced on the kde-i18n-doc mailing list.)

    The Pology source distribution can be fetched from its home page at:

    http://pology.nedohodnik.net
    

    The home page also links to on-line documentation (user manual and library API), and provides instructions for browsing or fetching the current development code.