Development/Tutorials/Metadata/Nepomuk/TipsAndTricks: Difference between revisions
Revision as of 15:58, 23 August 2012
Tutorial Series | Nepomuk |
Previous | None |
What's Next | n/a |
Further Reading | Resource Handling with Nepomuk, RDF and Ontologies in Nepomuk |
Using ontology URIs in your code
One often needs the URI of a specific class or a specific property in one's code, and not all ontologies are provided by the very convenient Soprano::Vocabulary namespace.
The solution is rather simple: create your own vocabulary namespaces with Soprano's onto2vocabularyclass command line tool, which generates convenient vocabulary namespaces for you. The Soprano documentation shows how to use it manually or, even more simply, through a CMake macro.
Mind the Difference between QString and QUrl
Nepomuk::Resource provides two constructors: one taking a QString as identifier or URI and one taking a QUrl.
The latter one is really simple: the given URI is used as the resource URI. If the resource exists, its data is used, otherwise it will be created with exactly that URI.
The QString one is a bit trickier. It tries to be clever about the parameter and first checks whether it is a URI. If it is not a URI, or if no resource with that URI exists, the string is interpreted as an identifier (nao:identifier). Resource then checks whether a resource with that identifier exists. If so, its data is loaded; if not, a new resource with a random URI and that string as identifier is created.
However, be aware that nothing is written to Nepomuk until the first writing call to Resource such as setProperty or addType.
Debugging the created data
Soprano provides a command line client to connect to the storage service, called sopranocmd. It provides all the features one needs to debug data. It is recommended that you only use sopranocmd for running queries.
Running sopranocmd is cumbersome because of the large number of arguments it requires. This can be made simpler by adding the following alias:
alias nepomukcmd="sopranocmd --socket `kde4-config --path socket`nepomuk-socket --model main --nrl"
For example:
# nepomukcmd query \
"select ?r where { ?r nao:hasTag ?tag . \
?tag nao:prefLabel 'foobar'^^xsd:string . }"
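The alias above is convenient in an interactive shell, but bash does not expand aliases in scripts by default. As a minimal sketch, the same command can be wrapped in a shell function instead. It uses only the sopranocmd arguments from the alias above. Defining the wrapper as a function rather than an alias is an assumption about how you want to use it, and it should replace the alias rather than be defined alongside it.

```shell
# Function equivalent of the nepomukcmd alias, usable from scripts.
# The socket path is resolved on every call rather than once at
# definition time, and all extra arguments are passed on to sopranocmd.
nepomukcmd() {
    sopranocmd --socket "$(kde4-config --path socket)nepomuk-socket" \
        --model main --nrl "$@"
}
```

It is called exactly like the alias, for example `nepomukcmd query "select ..."`.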
Using Konqueror
The Nepomuk playground repository contains a KIO slave which can handle the nepomuk:/ protocol. It displays all properties of a Nepomuk resource, including its links to other resources and the backlinks. This is a convenient way of looking at the Nepomuk data. The KIO slave even supports removal of resources.
Using NepomukShell
NepomukShell is a maintenance and debugging tool which lives in its own git repository at nepomukshell. It is a simple tool that lets one browse all resources in Nepomuk. Additionally, it allows one to create subclasses and properties (Caution: only create subclasses and properties from PIMO classes and properties!) and to remove resources.
Constructing SPARQL queries
Hint: In most cases the Nepomuk Query API should be enough and prevent you from writing your own SPARQL which is hard to debug.
Whenever doing something a bit fancier with Nepomuk one has to use SPARQL queries via
Nepomuk::ResourceManager::instance()->mainModel()
->executeQuery( myQueryString,
Soprano::Query::QueryLanguageSparql );
Constructing these queries can be a bit cumbersome since one has to use a lot of class and property URIs from different ontologies. Also literals have to be formatted according to the N3 syntax used in SPARQL. Luckily Soprano provides the necessary tools to do exactly that: Soprano::Node::toN3, Soprano::Node::resourceToN3, and Soprano::Node::literalToN3 take care of all formatting and percent-encoding you need. Using those methods the code to create queries might look ugly but the resulting queries are more likely to be correctly encoded and introduce less code duplication.
Typically one would use QString::arg like so (be aware that the standard prefixes are NOT supported out-of-the-box as with sopranocmd):
using namespace Soprano;
QString myQuery
= QString("select ?r where { "
"?r %1 ?v . "
"?v %2 %3 . }")
.arg(Node::resourceToN3(Vocabulary::NAO::hasTag()))
.arg(Node::resourceToN3(Vocabulary::NAO::prefLabel()))
.arg(Node::literalToN3("foobar"));
This creates the same query as the one used above, but without any hard-coded ontology URIs.
Restarting Nepomuk and its Services
The Nepomuk services are controlled by the nepomukserver application which is started on KDE login. The nepomukserver will take care of starting and stopping all services.
It is possible to stop the server and all services altogether by simply calling a D-Bus method:
# qdbus org.kde.NepomukServer /nepomukserver \
org.kde.NepomukServer.quit
It can then be restarted by simply calling nepomukserver again. In many debugging situations it might be of interest to pipe the output of the server (and all services) to a file:
# nepomukserver 2> /tmp/nepomuk.stderr
Also interesting to know is that Nepomuk defines a set of debugging areas for the services and the server itself. Use kdebugdialog to enable or disable them.
Alternatively, one can stop and start single services. In most cases this is sufficient since each service runs in its own process. Thus, changes to a service plugin will be picked up directly:
# qdbus org.kde.NepomukServer /servicemanager \
org.kde.nepomuk.ServiceManager.stopService <servicename>
# qdbus org.kde.NepomukServer /servicemanager \
org.kde.nepomuk.ServiceManager.startService <servicename>
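When a service plugin is rebuilt often, the two calls above can be combined. The following is a minimal sketch. The function name nepomuk_restart_service is hypothetical, and the D-Bus calls are exactly the ones shown above.

```shell
# Hypothetical helper: stop a single Nepomuk service and start it again.
# $1 is the service name; the start call only runs if the stop succeeded.
nepomuk_restart_service() {
    qdbus org.kde.NepomukServer /servicemanager \
        org.kde.nepomuk.ServiceManager.stopService "$1" &&
    qdbus org.kde.NepomukServer /servicemanager \
        org.kde.nepomuk.ServiceManager.startService "$1"
}
```

It would be invoked as `nepomuk_restart_service <servicename>`.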
Listening to changes in the database
Write about the Resource Watcher.
Remove all Strigi-indexed data
Strigi produces a lot of data in Nepomuk. There might be times when one wants to remove all that data manually.
The little command below removes all data created by Strigi (caution: this could take a long time):
DOESN'T WORK - Update it
for a in `nepomukcmd --foo query "select distinct ?g where { \
?g <http://www.strigi.org/fields#indexGraphFor> ?r . }"`;
do nepomukcmd rmgraph "$a"; done
Starting Nepomuk Server from the Trunk in Ubuntu
Note: Starting with (K)ubuntu 10.10 (Maverick Meerkat), virtuoso-t is in /usr/bin, so the workaround described below is no longer necessary.
Ubuntu packages virtuoso slightly differently. It provides a package called virtuoso-nepomuk which installs the executable virtuoso-t in the /usr/lib/virtuoso/ directory for security purposes.
When running Nepomuk from the trunk, the nepomukserver is unable to find the virtuoso-t executable, and therefore the NepomukStorage Service fails to initialize. One way to fix this is to adjust the PATH environment variable.
PATH=/usr/lib/virtuoso:$PATH
export PATH
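Since newer releases already ship virtuoso-t in /usr/bin, the PATH change can be made conditional. This is a minimal sketch that assumes the only two install locations are the ones mentioned in this section. It leaves PATH untouched when virtuoso-t can already be found.

```shell
# Prepend the Ubuntu-specific directory only if virtuoso-t
# is not already reachable through the current PATH.
if ! command -v virtuoso-t >/dev/null 2>&1; then
    PATH=/usr/lib/virtuoso:$PATH
    export PATH
fi
```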
Debugging virtuoso-t
If virtuoso-t consumes a lot of CPU resources but there are no active queries, the analysis has to go a bit deeper. Virtuoso is started through Soprano with certain parameters which are set in a temporary ini-file (/tmp/virtuoso_XXXX.ini). To start Virtuoso with different parameters in the ini-file, Soprano has to be modified manually, e.g. to improve virtuoso-t's behaviour by editing backends/virtuoso/virtuosocontroller.cpp (Soprano) and setting NumberOfBuffers to 40000 (line 344) and SchedulerInterval to 0 (line 350).
After re-compiling Soprano, one has to attach gdb to virtuoso-t as soon as it starts consuming CPU and create a full threaded backtrace:
set logging file /tmp/virtuoso-t.out
set logging on
thread apply all bt full
Note: The above settings should only be used for debugging!
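Attaching gdb and typing the three commands by hand can be slow when the CPU spike is short. As a minimal sketch, the same gdb commands can be run non-interactively. The function name virtuoso_backtrace is hypothetical, and the sketch assumes that pidof is available and that you have permission to attach to the process.

```shell
# Hypothetical helper: attach gdb to the running virtuoso-t, write a
# full backtrace of all threads to /tmp/virtuoso-t.out, then detach.
virtuoso_backtrace() {
    # -s makes pidof return a single PID even if several processes match.
    virtuoso_pid=$(pidof -s virtuoso-t) || {
        echo "virtuoso-t is not running" >&2
        return 1
    }
    gdb -p "$virtuoso_pid" -batch \
        -ex "set logging file /tmp/virtuoso-t.out" \
        -ex "set logging on" \
        -ex "thread apply all bt full"
}
```

Run `virtuoso_backtrace` as soon as virtuoso-t starts consuming CPU. In batch mode gdb exits after the last command, which detaches it from the process.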