Development/Tutorials/Writing file analyzers

    From KDE TechBase
    Revision as of 20:10, 8 March 2007 by Vandenoever

    Note: this tutorial is not finished yet

    Writing KDE4 file analyzers

    File analyzers extract data from files for display in file dialogs and file managers. The same data is also used to search for files. KDE4 allows multiple analyzers per file type. Analyzers can extract text for indexing, and they can also retrieve other data such as the song title, album title, recipient, MD5 sum, the MIME type of a file, and much more.

    This tutorial describes how you can write new analyzers.

    Primer

    What are file analyzers?

    File analyzers in KDE4

    KDE4 uses stream-based file analyzers to retrieve text and metadata from files. This has a number of advantages over file-based methods. Stream-based access:

    • is faster for 90% of the file types,
    • allows easy analysis of embedded files such as email attachments or entries in zip files, RPMs and many other container file formats.

    Writing stream-based analyzers requires a different approach from the usual file-based methods. This tutorial explains how to go about it.
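
    To make the difference from file-based methods concrete, here is a minimal sketch of the stream-based idea. It is written in Python purely for illustration and is not the Strigi API. The names Md5Analyzer, MagicAnalyzer, analyze_stream and analyze_zip_entries are hypothetical. The data is read once, front to back, and every chunk is handed to each analyzer, so several analyzers can share one pass and no analyzer ever seeks.

```python
import hashlib
import io
import zipfile


class Md5Analyzer:
    """Hypothetical analyzer: builds an MD5 sum from chunks as they arrive."""

    def __init__(self):
        self._hash = hashlib.md5()

    def feed(self, chunk):
        self._hash.update(chunk)

    def result(self):
        return {"md5": self._hash.hexdigest()}


class MagicAnalyzer:
    """Hypothetical analyzer: guesses a type from the first 8 bytes only."""

    SIGNATURES = {
        b"\x89PNG\r\n\x1a\n": "image/png",
        b"PK\x03\x04": "application/zip",
    }

    def __init__(self):
        self._head = b""

    def feed(self, chunk):
        # Only the start of the stream matters; later chunks are ignored.
        if len(self._head) < 8:
            self._head = (self._head + chunk)[:8]

    def result(self):
        for magic, mimetype in self.SIGNATURES.items():
            if self._head.startswith(magic):
                return {"mimetype": mimetype}
        return {"mimetype": "application/octet-stream"}


def analyze_stream(stream, analyzers, chunk_size=4096):
    """Read the stream once and hand each chunk to every analyzer."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for analyzer in analyzers:
            analyzer.feed(chunk)
    merged = {}
    for analyzer in analyzers:
        merged.update(analyzer.result())
    return merged


def analyze_zip_entries(archive_stream):
    """Analyze every entry of a zip archive as its own forward-only stream.

    Note: Python's zipfile needs a seekable archive; only the entries are
    consumed as plain streams here.
    """
    results = {}
    with zipfile.ZipFile(archive_stream) as archive:
        for name in archive.namelist():
            with archive.open(name) as entry:
                results[name] = analyze_stream(
                    entry, [Md5Analyzer(), MagicAnalyzer()]
                )
    return results
```

    Because analyze_stream only calls read(), the same analyzers work on a file on disk, an in-memory buffer, or an entry inside an archive. That is the property that makes embedded files easy to handle.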

    Finding documentation

    Look for existing code

    For code examples, look at the file analyzers that are already implemented in /kdesupport/strigi/src/streamindexer/

    Testing your code

    Strigi comes with a simple command line tool, xmlindexer, for checking whether your plugins work. It extracts data from files and outputs it as simple XML. Call it like this:

    xmlindexer [FILE]
    

    or

    xmlindexer [DIR]
    

    xmlindexer is very fast, so I recommend running it under valgrind. This hardly slows down your workflow but helps keep memory management in good shape:

    valgrind xmlindexer [DIR]
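
    If you run this check often, a small wrapper can confirm that the tool exited cleanly and that its output parses. The sketch below is in Python and assumes only what is stated above: that xmlindexer writes XML to standard output. It further assumes the output is a single well-formed document, and it makes no assumption about element names. The helper name run_and_parse is hypothetical.

```python
import subprocess
import sys
import xml.etree.ElementTree as ET


def run_and_parse(command):
    """Run a command and parse its standard output as XML.

    Raises subprocess.CalledProcessError if the command exits with a
    non-zero status, and xml.etree.ElementTree.ParseError if the output
    is not well-formed XML. Returns the root element otherwise.
    """
    completed = subprocess.run(command, capture_output=True, check=True)
    return ET.fromstring(completed.stdout)
```

    For example, run_and_parse(["xmlindexer", "testfiles"]) would run the tool on a directory named testfiles (a placeholder) and return the root element of its output, or raise an exception if the run failed or the XML was broken.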