Projects/Summer of Code/2007/Projects/KAider
Revision author: Shaforostoff
Revision as of 01:22, 30 July 2007
KAider is a computer-aided translation system that focuses on productivity and performance. The translator does only the creative work (delivering the message in his/her mother language in a laconic and easy-to-understand form). KAider implies a paragraph-by-paragraph translation approach (when translating documentation) and a message-by-message approach (when translating GUI). See KAider/Introduction
Current state
Already has:
- syntax highlighting
- spellcheck (problems with the dividing filter: it doesn't check the last word)
- search-n-replace, ignoring accel marks
- small features like quick tag insert, placing the text cursor right after the tag at the beginning (e.g. '<qt>|foobar</qt>' where "|" is the cursor), entry bookmarks
- viewer of the difference between the current msgid and the previous one (i.e. the msgid that the current msgstr is actually a translation of; for fuzzies generated with the --previous gettext option)
- merge mode for editors (QA) or when several translators work on the same file screenshot
- basic project manager functionality screenshot
- Translation Memory (threaded) with shortcuts for inserting suggestions into the current 'msgstr'. Scores are computed from the common/total length ratio, the removed+added length, and the count of removed+added parts
- the wordDiff algorithm is used for difference representation everywhere (based on the Longest Common Subsequence O(n*n) algorithm and my own experience)
- glossary with basic tbx format support. KAider displays relevant entries on-the-fly and provides shortcuts to insert them. Also, you can add new glossary terms via the context menu of the glossary screenshot
- webquery view, flexible thanks to kross
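The wordDiff and Translation Memory scoring items above can be illustrated with a small sketch. This is not KAider's actual code: the function names, the greedy backtracking order, and the 0.05 penalty per changed part are assumptions made for illustration only. It computes a word-level diff from the classic O(n*n) Longest Common Subsequence table and derives a toy score from the common/total length ratio and the count of removed+added parts.

```python
def lcs_table(a, b):
    """Classic O(len(a)*len(b)) dynamic-programming table of LCS lengths."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def word_diff(old, new):
    """Return (op, word) pairs; op is '=' (kept), '-' (removed) or '+' (added)."""
    a, b = old.split(), new.split()
    table = lcs_table(a, b)
    i = j = 0
    ops = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            ops.append(("=", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            ops.append(("-", a[i]))
            i += 1
        else:
            ops.append(("+", b[j]))
            j += 1
    ops.extend(("-", w) for w in a[i:])
    ops.extend(("+", w) for w in b[j:])
    return ops


def similarity(old, new, part_penalty=0.05):
    """Hypothetical score: common/total length ratio minus a penalty per changed part."""
    ops = word_diff(old, new)
    common = sum(len(w) for op, w in ops if op == "=")
    total = sum(len(w) for _, w in ops) or 1
    # A "part" is a run of consecutive removed/added words.
    parts = sum(1 for k, (op, _) in enumerate(ops)
                if op != "=" and (k == 0 or ops[k - 1][0] == "="))
    return max(0.0, common / total - part_penalty * parts)
```

Calling word_diff("Open the file", "Open a new file") marks "the" as removed and "a", "new" as added, which is the information a diff viewer needs to highlight the changed words.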
Compiling
After you set up the KDE environment (compiling kdelibs is enough):
cd trunk
svn up playground/devtools/kaider
su kde-devel
mkdir playground/devtools/kaider/build
cd playground/devtools/kaider/build
cmakekde ..
As root, run sshd, and then from the usual shell:
ssh -XC kde-devel@localhost kaider
Roadmap
- [basic framework DONE] project management+scripting API -- 2 weeks
- [DONE] context glossary -- 0.5-1 week
- [DONE] translation DB (QtSql) -- 2 weeks
- [DONE] mode for merging translations for editors (QA) -- 1 week
- [DONE] querying Google Translate for live glossary (kross) -- 2 weeks
- the remaining time is for perfection/polishing/small improvements and XLIFF + Qt Linguist support (see KBabel features to be implemented)
What I'm doing these days
- TM: last actions
- WebQuery for twin languages (like Ukrainian and Russian)
- improvements on ProjectView (dbus, etc), Glossary
Ideas
Current:
- project-wise and program-wise: webquery scripts, glossaries, TMs
- Glossary editing usability
- ... (more on papers around my table :)
Further work:
- Research on sentence segmentation rules (e.g. SRX)
- Automate submitting translation suggestions to translate.google.com [Kross action]
Not for KDE:
- Become a complete computer-aided translation system by providing e.g. actions to import and export OpenOffice, txt and other document formats by calling appropriate scripts/commands. Define a general kross actions interface for that.
- Make nice windoze package for the windowzerz
KBabel features to be implemented
...in the smarter way :). After or during the summer.
- Character selection tool integration (kdelibs rule); sort by the frequency
- persistent bookmarks for messages in a file saved in the project
- extended marking of .po and .pot files (e.g. the translator who currently works on the file and since when) saved in the project
- Search/Replace functions in multiple files at once.
- Spellchecking of multiple files at once.
- Opening source code by references in message comments [Kross action]
- A plugin framework for validation tools for consistency checks [Kross action triggered on saving]
- Sending the file using email [Kross (project) action]
- Automatic syntax check with msgfmt when saving and, if an error occurred, easy navigation to the messages that contain errors. Likewise, a syntax check (msgfmt --statistics) for existing files, to verify that the translated files will compile and therefore work when distributed [Kross (project) action]
- CVS and SVN support [Kross project action] (is 'svn ci' so hard?)
- Automatic comparisons and statistics of POT and PO files for a quick overview which and how many files are translated (or not) and which files may be obsolete + [Kross (project) action] that merges translations with updated template
- PO File Header change [Kross action (+triggered on saving)]
- Printing of selected messages (e.g. fuzzy) [Kross action]
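As an illustration of the msgfmt syntax-check item in the list above, here is a minimal sketch of how a script could run the check and collect line numbers for navigation. It is not part of KAider; the helper names are hypothetical, and the parser assumes msgfmt reports problems on stderr as "<file>:<line>: <message>" (optionally with a column), which is how GNU gettext usually formats them.

```python
import re
import subprocess

# Assumed diagnostic shape: "<file>:<line>: <message>" or "<file>:<line>:<col>: <message>".
ERROR_RE = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+)(?::\d+)?: (?P<msg>.*)$", re.MULTILINE)


def parse_msgfmt_errors(stderr_text):
    """Extract (file, line, message) tuples; summary lines are ignored."""
    return [(m["file"], int(m["line"]), m["msg"])
            for m in ERROR_RE.finditer(stderr_text)]


def check_po(path):
    """Run msgfmt in check mode and discard the compiled catalogue."""
    proc = subprocess.run(
        ["msgfmt", "--check", "--statistics", "-o", "/dev/null", path],
        capture_output=True, text=True)
    return proc.returncode == 0, parse_msgfmt_errors(proc.stderr)
```

A save hook could call check_po() and, when the first element is False, jump to the entry nearest to each reported line number.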
KBabel features NOT to be implemented
- Automatic ("rough" in KBabel terms) translation. Pure machine translation is a joke. All machine-made translations must be verified by a human (and let it be a translator rather than a user).
Why? Because sometimes one English string may have two or more different translations depending on the context.
What I'm going to do is implement _interactive_ (or message-by-message) rough translation. If the message is already translated somewhere else, KAider suggests the translations (several, not one!) and displays them in the helper window. The translator may then choose one of the suggestions by pressing Ctrl+1, Ctrl+2, ... or Ctrl+9, which will immediately insert it into msgstr (replacing the old translation if it exists).
What the old rough translation didn't provide is the ability to choose.
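The interactive approach can be sketched roughly as follows. This is an assumption-laden illustration, not the planned implementation: the in-memory dictionary, the function names, the 0.6 cutoff, and the use of Python's difflib.SequenceMatcher as the similarity measure are all stand-ins chosen for brevity.

```python
import difflib


def suggest(tm, msgid, limit=9, cutoff=0.6):
    """Return up to `limit` (score, source, translation) tuples, best first.

    `tm` is a hypothetical in-memory memory: {source string: [translations]}.
    """
    scored = []
    for source, translations in tm.items():
        score = difflib.SequenceMatcher(None, msgid, source).ratio()
        if score >= cutoff:
            # Keep every known translation so the translator can choose.
            scored.extend((score, source, t) for t in translations)
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]


def pick(suggestions, digit):
    """Map the digit of a Ctrl+<digit> shortcut (1..9) to the text to insert."""
    if 1 <= digit <= len(suggestions):
        return suggestions[digit - 1][2]
    return None
```

The limit of nine matches the Ctrl+1 to Ctrl+9 shortcuts; returning several candidates instead of the single best one is what gives the translator the ability to choose.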
Setup
- Create project, saving *.ktp file to l10n-kde4/<LangCode>
- Populate Glossary via GlossaryView context menu (.tbx file will be created automatically for you on the first entry addition).
- Populate Translation Memory by dropping .po files onto TM View