Projects/Strigi/Strigi status meeting 2008


Strigi Planning for 2008

Strigi is seeing more use now that it is a requirement for KDE4. We must make sure that Strigi meets users' expectations. This raises the question: how shall we improve Strigi this year?

Status

To see what we want to improve, let's look at the parts that Strigi provides:

  • libstreamanalyzer
  • metadata ontology
  • strigidaemon
  • qt4 dbus bindings
  • strigiclient

libstreamanalyzer

This is the library that collects metadata from files and embedded files. It does this by streaming the file content and applying an arbitrary number of analyzers to it. It writes the results to an abstract index and allows searching in this index. To use libstreamanalyzer with a particular storage backend, one writes an implementation of the index, e.g. a clucene implementation.
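The idea of streaming analysis with an abstract index can be illustrated with a minimal sketch. This is not libstreamanalyzer's API: the class and function names (`Index`, `MemoryIndex`, `SizeAnalyzer`, `LineCountAnalyzer`, `analyze`) are hypothetical and exist only for illustration. The point is that the content is read once, in chunks, and every analyzer sees each chunk, while the results go to an index whose backend can be swapped.

```python
import io
from abc import ABC, abstractmethod


class Index(ABC):
    """Hypothetical abstract index; analyzers never see the backend."""

    @abstractmethod
    def add(self, path, field, value):
        ...

    @abstractmethod
    def search(self, field, value):
        ...


class MemoryIndex(Index):
    """Toy in-memory backend standing in for e.g. a clucene backend."""

    def __init__(self):
        self.entries = []

    def add(self, path, field, value):
        self.entries.append((path, field, value))

    def search(self, field, value):
        # Return the sorted paths whose stored field equals the value.
        return sorted({p for p, f, v in self.entries if f == field and v == value})


class SizeAnalyzer:
    field = "size"

    def __init__(self):
        self.total = 0

    def feed(self, chunk):
        self.total += len(chunk)

    def result(self):
        return self.total


class LineCountAnalyzer:
    field = "line_count"

    def __init__(self):
        self.lines = 0

    def feed(self, chunk):
        self.lines += chunk.count(b"\n")

    def result(self):
        return self.lines


def analyze(path, stream, analyzer_classes, index, chunk_size=4096):
    """Read the stream once and pass every chunk to every analyzer."""
    analyzers = [cls() for cls in analyzer_classes]
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for analyzer in analyzers:
            analyzer.feed(chunk)
    for analyzer in analyzers:
        index.add(path, analyzer.field, analyzer.result())


index = MemoryIndex()
analyze("a.txt", io.BytesIO(b"one\ntwo\n"), [SizeAnalyzer, LineCountAnalyzer], index, chunk_size=3)
```

Because the analyzers only keep running state, the file never has to be held in memory as a whole, and adding an analyzer does not add another pass over the data.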

In KDE4, the most important use of Strigi is currently this library, which provides metadata to all KDE applications via the KFileMetaInfo class.

Because of the streaming analysis and the abstract index, this library is the most revolutionary part of Strigi.

metadata ontology

Metadata is meaningless unless it is described properly. In collaboration with Nepomuk and Xesam we tag the data in a hierarchical ontology that allows powerful queries to be performed on the metadata. Without this ontology, we would only be able to do text search.
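A small sketch may help show why a hierarchy allows more than text search. The field names and the parent table below are made up for illustration and are not the actual Xesam or Nepomuk ontology; the assumption is only that each field can name a more general parent field.

```python
# Hypothetical field hierarchy: each field maps to its more general parent.
PARENT = {
    "creator": None,
    "author": "creator",
    "composer": "creator",
    "title": None,
}


def is_a(field, ancestor):
    """True if field equals ancestor or descends from it in PARENT."""
    while field is not None:
        if field == ancestor:
            return True
        field = PARENT.get(field)
    return False


def query(entries, field, value):
    """Return paths whose field is the queried field or a more specific one."""
    return [path for path, f, v in entries if is_a(f, field) and v == value]


entries = [
    ("song.ogg", "composer", "Bach"),
    ("book.pdf", "author", "Bach"),
    ("note.txt", "title", "Bach"),
]
creators = query(entries, "creator", "Bach")
authors = query(entries, "author", "Bach")
```

A query on the general field finds both the composer and the author entry but not the title, which a plain text search for the same word could not distinguish.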

strigidaemon

Strigi comes with a clucene implementation of the index, which strigidaemon uses. Strigidaemon is a daemon that indexes user files and allows searching via a simple socket and via dbus. It is very lightweight and, like libstreamanalyzer, has very few dependencies.

Querying can be done via two interfaces: the native strigi dbus interface or the xesam interface. The former is, of course, complete. The latter is not complete yet. Completing it is one of the near-term goals of Strigi.

qt4 dbus bindings

The dbus bindings Strigi provides can be used as a library or generated by e.g. qtdbus.

strigiclient

This program started as a small client for demonstration purposes. Unfortunately it has improved little in recent times. We should also work on this, perhaps by improving Kerry.

Tasks ahead

These are the challenges ahead in the new year. I've sorted them in what I think is the order of importance. That order is not definitive, and today's meeting may add more points.

make indexing more configurable

People are seeing large indexes, and this is not good. They should be able to tell Strigi to index less. This is very important for a convenient user experience.
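One way to let users index less is an ordered list of include and exclude patterns. The sketch below only illustrates that idea. It does not describe Strigi's actual configuration format, and the function name `should_index`, the rule layout, and the example patterns are assumptions.

```python
from fnmatch import fnmatchcase


def should_index(path, rules):
    """Apply (pattern, include) rules in order; the first match decides.

    Paths that match no rule are indexed by default.
    """
    for pattern, include in rules:
        if fnmatchcase(path, pattern):
            return include
    return True


# Hypothetical rule set: skip caches and disk images, keep everything else.
rules = [
    ("*/.cache/*", False),
    ("*.iso", False),
    ("*/Documents/*", True),
]
```

With these rules, `should_index("/home/u/.cache/x.txt", rules)` is false and `should_index("/home/u/Documents/a.pdf", rules)` is true. Since the first matching rule wins, the order of the rules matters: an `.iso` file under `Documents` is still excluded here.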

make a powerful search client

We should adopt, or at least take a good look at, Kerry for searching in KDE4.

improve xesam support

This ties in with the search client, but also allows other apps to do searching more easily. We should have good docs on how to use the dbus bindings to do searching.

write more analyzers

Some people would put this point higher in the list. I think, however, that we already have a great many analyzers. More will be written, and we can partly rely on the community for that. It is more important that people can see and use the results of any analyzers they write. Take Plasma as an example: Aaron does not write many applets, he writes the framework that allows others to write them.