Projects/Nepomuk/GraphConcepts: Difference between revisions

    From KDE TechBase

    Revision as of 07:57, 24 August 2012

    Nepomuk is based on the Semantic Web, and even though Semantic Web storage is usually described as a triple store, Nepomuk is NOT a triple store. We in fact store quadruples: Subject, Predicate, Object, Graph. The Graph part of each quadruple is always managed automatically by Nepomuk, so clients rarely (if ever) need to care about the graph.
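To make the difference between a triple and a quadruple concrete, here is a minimal sketch in plain Python. It does not use any Nepomuk API: the `Quad` type, the helper functions, and the resource and graph URIs are all made up for illustration.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """One stored statement: a triple plus the graph (context) it lives in."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Hypothetical data: two statements about one resource, stored in two graphs.
quads = [
    Quad("nepomuk:/res/a", "nao:prefLabel", "Holiday photo", "nepomuk:/ctx/1"),
    Quad("nepomuk:/res/a", "nao:numericRating", "8", "nepomuk:/ctx/2"),
]


def triples(quads):
    """Drop the graph column, which is all a typical client needs to see."""
    return [(q.subject, q.predicate, q.obj) for q in quads]


def in_graph(quads, graph):
    """Return only the statements stored in the given graph."""
    return [q for q in quads if q.graph == graph]
```

A client that only calls something like `triples(quads)` never notices the fourth column, which is why the graph can usually be ignored.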

    However, if you still want to know why graphs are present, read on.

    Graph Creation

    The first obvious question is: when are graphs created? Is everything just added into one big graph?

    Nepomuk works by creating a new graph for each atomic operation. These operations are the ones defined by the DataManagement functions:

    • addProperty
    • setProperty
    • createResource
    • storeResources


    Each time any of these operations is called and new data is pushed, a new graph stamped with the current date and time is created.
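The "one new graph per operation" rule can be sketched with a toy in-memory store. This is only a model of the behaviour described above, not Nepomuk's implementation: the `ToyStore` class, its `add_property` method, and the way the graph URI is generated are assumptions made for the example.

```python
import uuid
from datetime import datetime, timezone


class ToyStore:
    """Toy model: every write operation gets its own freshly created graph."""

    def __init__(self):
        self.quads = []        # list of (subject, predicate, object, graph)
        self.graph_meta = {}   # graph URI -> metadata about that graph

    def _new_graph(self):
        # Hypothetical URI scheme, modelled on the nepomuk:/ctx/<uuid> examples.
        graph = f"nepomuk:/ctx/{uuid.uuid4()}"
        self.graph_meta[graph] = {"nao:created": datetime.now(timezone.utc)}
        return graph

    def add_property(self, subject, predicate, values):
        """Store all values of one call in a single new graph and return it."""
        graph = self._new_graph()
        for value in values:
            self.quads.append((subject, predicate, value, graph))
        return graph


store = ToyStore()
first = store.add_property("nepomuk:/res/a", "nao:prefLabel", ["Holiday photo"])
second = store.add_property("nepomuk:/res/a", "nao:numericRating", ["8"])
```

Two calls produce two distinct graphs, each with its own creation time, even though both describe the same resource.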

    Graph contents

    The graph is always called the context of the statement. It is used to store some metadata about each triple. In our case we currently store the following information:

    • Graph Type
    • Creation Date
    • Modification Date
    • Graph Maintainer


    Example:

    <nepomuk:/ctx/c4d93812-7d8c-4c8f-b8a7-e1d4dbc5fed5>
            rdf:type                nrl:InstanceBase
            nao:created             2011-07-30T11:34:19.587Z
            nao:modified            2011-07-30T11:34:19.587Z
            nao:maintainedBy        nepomuk:/res/e2eb2efb-14ee-4038-ac24-698f916289b0
    


    The nao:created and nao:lastModified properties are identical to those used in resources, except that graphs are generally never modified: they are only created or destroyed.

    Graph Type

    Each graph has a type which depends on the type of data it holds. The three most commonly used types are:

    • nrl:Ontology - Used to hold ontology data.
    • nrl:InstanceBase - Used for normal data.
    • nrl:DiscardableInstanceBase - Used for data which can be reproduced and should not be backed up. This type is generally reserved for data created during indexing of files.
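As an illustration of why the graph type matters, the sketch below selects the statements a backup would keep by skipping everything stored in a discardable graph. The data, the `graph_types` mapping, and the `quads_to_back_up` function are hypothetical; a real backup would query the store rather than a Python list.

```python
# Hypothetical graph -> type mapping, using the three types listed above.
graph_types = {
    "nepomuk:/ctx/1": "nrl:InstanceBase",
    "nepomuk:/ctx/2": "nrl:DiscardableInstanceBase",
    "nepomuk:/ctx/3": "nrl:Ontology",
}

# Hypothetical statements as (subject, predicate, object, graph).
quads = [
    ("nepomuk:/res/a", "nao:numericRating", "8", "nepomuk:/ctx/1"),
    ("nepomuk:/res/a", "nie:mimeType", "image/jpeg", "nepomuk:/ctx/2"),
    ("nao:Tag", "rdf:type", "rdfs:Class", "nepomuk:/ctx/3"),
]


def quads_to_back_up(quads, graph_types):
    """Keep every statement whose graph is not marked as discardable."""
    return [
        q for q in quads
        if graph_types.get(q[3]) != "nrl:DiscardableInstanceBase"
    ]


kept = quads_to_back_up(quads, graph_types)
```

Here the user's rating survives, while the MIME type written during indexing is dropped because it can be regenerated from the file.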

    Agents

    Each graph has a property called nao:maintainedBy which specifies which application created that data:

    <nepomuk:/ctx/c4d93812-7d8c-4c8f-b8a7-e1d4dbc5fed5>
            rdf:type                nrl:DiscardableInstanceBase
            nao:created             2011-07-30T11:34:19.587Z
            nao:modified            2011-07-30T11:34:19.587Z
            nao:maintainedBy        nepomuk:/res/e2eb2efb-14ee-4038-ac24-698f916289b0
    
    <nepomuk:/res/e2eb2efb-14ee-4038-ac24-698f916289b0>
            rdf:type                nao:Agent
            rdf:type                rdfs:Resource
            nao:identifier          nepomukindexer
    


    Each application is represented as a nao:Agent with the application name specified as the nao:identifier. This name is determined automatically via KComponentData and can be spoofed; security was not one of our concerns.

    Each triple may be maintained by a number of different applications. This information is valuable for the advanced DataManagement functions such as removeDataByApplication.
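The following toy sketch shows how per-graph maintainer information could support removing one application's data. It is not the actual behaviour of removeDataByApplication: the function `remove_data_by_agent`, the `maintained_by` mapping, and the agent name `someeditor` are assumptions made for illustration.

```python
# Hypothetical graph -> agent identifier mapping (nao:maintainedBy).
maintained_by = {
    "nepomuk:/ctx/1": "nepomukindexer",
    "nepomuk:/ctx/2": "someeditor",
}

# The same triple is stored twice, once per application that asserted it.
quads = [
    ("nepomuk:/res/a", "nie:title", "Report", "nepomuk:/ctx/1"),
    ("nepomuk:/res/a", "nie:title", "Report", "nepomuk:/ctx/2"),
    ("nepomuk:/res/a", "nie:mimeType", "text/plain", "nepomuk:/ctx/1"),
]


def remove_data_by_agent(quads, maintained_by, agent):
    """Drop every statement stored in a graph maintained by the given agent."""
    doomed = {g for g, a in maintained_by.items() if a == agent}
    return [q for q in quads if q[3] not in doomed]


remaining = remove_data_by_agent(quads, maintained_by, "nepomukindexer")
```

Because the title was also asserted by a second application, it is still present after the indexer's data has been removed; only the statement that the indexer alone maintained disappears.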