Projects/Nepomuk/DataFeeders: Difference between revisions

    From KDE TechBase
    Revision as of 13:41, 8 August 2012

    Some applications need to push large quantities of data into Nepomuk. They are typically called "feeder" applications as they provide Nepomuk with the data it requires. A database is only as powerful as the data it holds.

    While one can use the Resource class to push the data, it will be slow, as the Resource class is synchronous and writes back to the database after each command. What is needed is an asynchronous API: the application writes all of its data in one go, and Nepomuk then processes and merges that data into its internal database.

    SimpleResources

    Applications can use the SimpleResource class to model the data that they want to push. The SimpleResource class is not connected to the Nepomuk database, and is just a convenience wrapper around a QMultiHash. Any changes made to these SimpleResources are not reflected back to the database, unless explicitly specified.

    An example -

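        // NCO, NMM, NIE and NFO are the ontology vocabulary namespaces,
        // assumed here to be in scope already.

        // The artist, modelled as a contact.
        Nepomuk2::SimpleResource coldplay;
        coldplay.addType( NCO::Contact() );
        coldplay.addProperty( NCO::fullname(), "Coldplay" );
    
        // The album the song belongs to.
        Nepomuk2::SimpleResource album;
        album.addType( NMM::MusicAlbum() );
        album.addProperty( NIE::title(), "X&Y" );
    
        // The music file itself. It references the two resources above.
        // fileUrl is assumed to be the URL of the file, defined elsewhere.
        Nepomuk2::SimpleResource fileRes;
        fileRes.addType( NFO::FileDataObject() );
        fileRes.addType( NMM::MusicPiece() );
        fileRes.addProperty( NMM::performer(), coldplay );
        fileRes.addProperty( NMM::musicAlbum(), album );
        fileRes.addProperty( NIE::url(), fileUrl );
        fileRes.addProperty( NIE::title(), "What If" );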
    


    In the above example we wish to push data about the song "What If" by the popular English band "Coldplay". We create a separate SimpleResource for each resource that we want to push into Nepomuk, and then add the relevant metadata to it. These SimpleResources can reference each other.
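
    To make the "convenience wrapper around a QMultiHash" description more concrete, here is a minimal standalone sketch of the idea using only the C++ standard library. It is not the real Nepomuk2::SimpleResource: the names SimpleResourceSketch and makeSongSketch, the string-based property names and the "_:..." placeholder URIs are all assumptions made for illustration. It only shows that properties are kept in an in-memory multi-hash, that one property can hold several values, and that one resource references another through its URI.

```cpp
// Standalone sketch, NOT the real Nepomuk2::SimpleResource.
// Every name in this block is hypothetical.
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class SimpleResourceSketch {
public:
    explicit SimpleResourceSketch(std::string uri) : m_uri(std::move(uri)) {}

    const std::string& uri() const { return m_uri; }

    // Store one more value for a property; earlier values are kept.
    void addProperty(const std::string& property, const std::string& value) {
        assert(!property.empty());
        m_properties.emplace(property, value);
    }

    // Reference another resource by storing its URI as the value.
    void addProperty(const std::string& property, const SimpleResourceSketch& other) {
        addProperty(property, other.uri());
    }

    // All values stored for a property, in no particular order.
    std::vector<std::string> values(const std::string& property) const {
        std::vector<std::string> result;
        auto range = m_properties.equal_range(property);
        for (auto it = range.first; it != range.second; ++it)
            result.push_back(it->second);
        return result;
    }

    std::size_t propertyCount() const { return m_properties.size(); }

private:
    std::string m_uri;
    // Nothing here talks to a database: the data only lives in this hash.
    std::unordered_multimap<std::string, std::string> m_properties;
};

// Builds a reduced version of the song example and returns the file resource.
inline SimpleResourceSketch makeSongSketch() {
    SimpleResourceSketch coldplay("_:coldplay");
    coldplay.addProperty("rdf:type", "nco:Contact");
    coldplay.addProperty("nco:fullname", "Coldplay");

    SimpleResourceSketch song("_:song");
    song.addProperty("rdf:type", "nfo:FileDataObject");
    song.addProperty("rdf:type", "nmm:MusicPiece");  // second value, same key
    song.addProperty("nmm:performer", coldplay);     // reference by URI
    song.addProperty("nie:title", "What If");
    return song;
}
```

    Calling makeSongSketch() yields a resource with two values under "rdf:type" and a "nmm:performer" value that is simply the URI of the other resource. Nothing is written anywhere until some separate step decides to store it, which is the role the real API gives to saving.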

    At this point, all of this data is just stored in memory in a hash table. In order to push the data into Nepomuk, we group it all together using a SimpleResourceGraph, after which we can push the data by calling SimpleResourceGraph::save().

    Example -

        Nepomuk2::SimpleResourceGraph graph;
        graph << coldplay << album << fileRes;
    
        KJob* job = graph.save();
    


    The save operation returns a KJob which has already begun execution. The operation continues asynchronously, and on completion the job emits its result signal.

    The result signal passes the respective KJob as its argument. This job can then be checked for errors, which may have occurred if we tried to save invalid data. It is up to the programmer to make sure that the data is valid. Invalid data is completely ignored and an error is reported.

    StoreResources

    Identification

    Merging

    Who else is using it?

    The SimpleResource API is currently the de facto method of pushing data into Nepomuk. It is heavily used by our own file indexer and by KDE PIM. PIM uses the SimpleResource API to push emails, contacts and event information into Nepomuk.

    For more examples of how to use SimpleResource, we suggest you look at the comprehensive tests in the datamanagementmodel. (Link to be added.)

    Graph Handling

    Most developers do not need to worry about the graphs present in Nepomuk. However, for the sake of completeness we document what happens internally. Hopefully, this will help you better understand the intricacies of Nepomuk.

    When a SimpleResourceGraph is saved or passed to storeResources, each statement in the graph is checked for existence in the database. If that triple already exists, it is set aside and handled specially. All other triples are pushed into one big graph that is created with each call to storeResources.
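
    The split described above can be sketched in a few lines of standalone C++. This is not Nepomuk's storage code, only a toy model under stated assumptions: the "database" is a std::set of triples, and Triple, StoreResult and storeSketch are hypothetical names. What "handled specially" means for existing triples is not modelled; they are merely collected.

```cpp
// Standalone sketch, NOT Nepomuk's storage code.
// Every name in this block is hypothetical.
#include <cassert>
#include <set>
#include <string>
#include <tuple>
#include <vector>

struct Triple {
    std::string subject, predicate, object;

    // Ordering so that triples can live in a std::set.
    bool operator<(const Triple& o) const {
        return std::tie(subject, predicate, object) <
               std::tie(o.subject, o.predicate, o.object);
    }
};

struct StoreResult {
    std::vector<Triple> alreadyPresent;  // set aside for special handling
    std::vector<Triple> newGraph;        // stored together in one new graph
};

// One call models one storeResources() invocation.
inline StoreResult storeSketch(std::set<Triple>& database,
                               const std::vector<Triple>& incoming) {
    StoreResult result;
    for (const Triple& t : incoming) {
        assert(!t.subject.empty() && !t.predicate.empty());
        if (database.count(t) != 0) {
            // The statement exists already: do not store it a second time.
            result.alreadyPresent.push_back(t);
        } else {
            // A new statement: it becomes part of this call's graph.
            database.insert(t);
            result.newGraph.push_back(t);
        }
    }
    return result;
}
```

    If the database already holds the title of a song and a call passes in that title together with a new type statement, the title ends up in alreadyPresent and only the type statement lands in newGraph.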

    The newly created graph contains the following data -

        <nepomuk:/ctx/some-graph> a nrl:Graph .

    (The remaining statements of this example are still to be added.)

    When ..