Projects/Nepomuk: Difference between revisions

    From KDE TechBase
    (→‎Ideas: Moving the ideas to a separate page)
    (48 intermediate revisions by 2 users not shown)
    Line 1: Line 1:
    {{Template:I18n/Language Navigation Bar|Projects/Nepomuk}}
     


    [[Image:Nepomuk_logo_big.png|center|300px]]
    Line 5: Line 5:
    == About Nepomuk ==


    This page is dedicated to Nepomuk development ideas, progress, experiments, and is a general starting point for new developers.
    Nepomuk serves as a cross-application semantic storage backend. It aims to collect data from various sources (file indexing, the web, applications, and so on) and to link it all together into a cohesive map of data.


    For general information about the Nepomuk project see the [http://nepomuk.kde.org/ dedicated Nepomuk homepage].
    This page is dedicated to third-party documentation for Nepomuk. To learn more about Nepomuk from a user's point of view, head over to the [http://userbase.kde.org/Nepomuk Nepomuk page on UserBase]. To learn more about the Nepomuk community and how to get involved, head over to the [http://community.kde.org/Projects/Nepomuk Nepomuk Community Page].
     
     
    == Contact ==
     
    The Nepomuk project is maintained by [mailto:[email protected] Sebastian Trueg] of Mandriva.
     
    The "official" IRC channel is '''#nepomuk-kde''' on freenode.
     
    All development questions should be discussed on the [https://mail.kde.org/mailman/listinfo/nepomuk Nepomuk mailing list].


    == Documentation ==
    The [http://userbase.kde.org/Nepomuk Nepomuk page on UserBase] has information and troubleshooting for users.
    Any new project is intimidating, and jumping right into the [http://api.kde.org/4.x-api/kdelibs-apidocs/nepomuk-core/html/index.html API Documentation] can be scary. So we have prepared some articles that explain the different aspects of Nepomuk and even touch on some advanced features.
     
    The following links provide good reads for getting used to the Nepomuk system and its APIs.
    * [[Development/Tutorials/Metadata/Nepomuk|Development Tutorials]]
    *'''[[Development/Tutorials/Metadata/Nepomuk/TipsAndTricks|Nepomuk Tips and Tricks]]'''
    * [http://api.kde.org/4.x-api/kdelibs-apidocs/nepomuk/html/index.html Nepomuk API Documentation]
    * [http://soprano.sourceforge.net/apidox/trunk/index.html Soprano (RDF storage) API]
    * [http://trueg.wordpress.com/2009/06/02/nepomuk-and-some-cmake-magic/ Using the Nepomuk Resource Code generator and the Soprano Ontology class generator in cmake]


    The documentation of any project is always in progress as the code base is always evolving. If you feel that the documentation is lacking in some regard, please come talk to us. We'd love to hear your feedback, and the documentation might just get improved in the process.


    As Nepomuk is highly dependent on the data in its RDF store and on the ontologies used, one might consider reading up on RDF and the Nepomuk ontologies:
    '''Nepomuk Mailing List: ''' nepomuk@kde.org <br/>
    * [http://www.w3.org/TR/REC-rdf-syntax/ RDF Primer]
    '''IRC Channel:''' #nepomuk-kde on freenode
    * [http://www.semanticdesktop.org/ontologies Nepomuk Ontologies]
    * [http://dev.nepomuk.semanticdesktop.org/wiki/OntologyMaintenance Experimental Nepomuk Ontologies and Ideas for new ones]


    == Events ==
    === Introductory Material ===
    Start here if you're just getting started with Nepomuk and want a quick way to fetch some data.


    [[Projects/Nepomuk/CodingSprint2009|June 19-21, 2009 - Coding Sprint 2009 Freiburg, Germany]]
    * [[Projects/Nepomuk/QuickStart| Quick Start]]
    * [[Projects/Nepomuk/OntologyBasics| Basic Ontology concepts]]
    * [[Projects/Nepomuk/Uris| Questions about URIs]]


    [[Projects/Nepomuk/OpenSocialSemanticDesktopWorkshop2009|Open Social Semantic Desktop Workshop 2009 Freiburg, Germany]]
    === Managing Data ===
    This section includes more in-depth articles on how to manage the data in Nepomuk. As a starting point you should probably open up the [http://api.kde.org/4.x-api/kdelibs-apidocs/nepomuk-core/html/index.html Nepomuk API Documentation]. It is generally more up to date than the articles mentioned below.


    == ToDo  ==
    * [[Projects/Nepomuk/Resources| Using Resources]]
    * [[Projects/Nepomuk/ResourceWatcher| Monitoring Changes]]
    * [[Projects/Nepomuk/BulkChanges| Bulk Changes]]
    * [[Projects/Nepomuk/DataFeeders| Data Feeders]]


    Nepomuk is a rather young project with a notorious shortage of developers. There are many tasks and subprojects to get one's hands dirty on. Unlike other projects such as Plasma, however, developing for Nepomuk is not easy. One has to read up on a lot of things and fight some day-to-day annoyances. Still, helping with the development will improve the situation in any case.
    === File Indexing ===
    With 4.10, the file indexing architecture has changed substantially. We no longer rely on Strigi and have our own plugin-based interface.


    If you are interested in working on a task in this list, please contact [mailto:[email protected] Sebastian Trueg].
    * [[Projects/Nepomuk/IndexingPlugin| Writing an Indexing Plugin]]


    === Junior Jobs ===
    === Querying ===
    If you want to get into Nepomuk development quickly by taking over a small task have a look at our [[Projects/Nepomuk/JuniorJobs|Junior Job page]].
    As you advance into Nepomuk, you'll want to move beyond just fetching and pushing data and query Nepomuk for specialized data. Nepomuk can be queried in many different ways; the important part is to optimize your queries and make sure they run well on production systems, where the database may be very large.


    === Low level Nepomuk Development Tasks  ===
    * [[Projects/Nepomuk/QueryingMethods| Different ways to Query Nepomuk]]
    * [[Projects/Nepomuk/QueryLibrary| Nepomuk Query Library]]
    * [[Projects/Nepomuk/SparqlQueries| Sparql Queries]]


    The low-level development tasks are those that are not directly reflected in the GUI or even in the API used by most developers. However, they are important in terms of performance, scalability, and compatibility.  
    === Architectural Overview ===
    If you're looking to get more involved with the Nepomuk development process, you will probably need to understand our basic architecture and where to find all the relevant code.


    * [[Projects/Nepomuk/Repositories| Nepomuk Repositories]]
    * [[Projects/Nepomuk/ComponentOverview| Nepomuk Architectural Overview]]
    * [[Projects/Nepomuk/kioslaves| Nepomuk KIO Slaves]]


    ==== Add Inference Configuration to the Virtuoso Soprano Backend ====
    === Nepomuk Internals ===
    When you decide to dig even deeper.


    Virtuoso 5 provides inference on rdfs:subClassOf and rdfs:subPropertyOf. These are the most important ones and for now all we need in Nepomuk.
    * [[Projects/Nepomuk/GraphConcepts| Graph handling]]
    * [[Projects/Nepomuk/VirtuosoInternal| Virtuoso Internals]]
    * [[Projects/Nepomuk/OntologyExtention| Extending the Ontologies]]


    The current implementation of the Virtuoso Soprano backend does not enable inference. We need a configuration option to do exactly that. It could happen along the lines of the [http://soprano.sourceforge.net/apidox/trunk/soprano_backend_virtuoso.html existing config options] or with the introduction of dedicated inference configuration options on the Soprano::Backend level.
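To make the rdfs:subClassOf inference mentioned above concrete, here is a minimal sketch in Python. It is not Virtuoso or Soprano code; the function names and the `ex:` class names are hypothetical and only illustrate the idea that a resource typed with a class is also treated as an instance of every superclass of that class.

```python
from typing import Dict, Set


def superclasses(cls: str, sub_class_of: Dict[str, Set[str]]) -> Set[str]:
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also protects against cycles
        seen.add(current)
        stack.extend(sub_class_of.get(current, ()))
    return seen


def inferred_types(asserted: Set[str], sub_class_of: Dict[str, Set[str]]) -> Set[str]:
    """Expand the asserted rdf:type values with all inferred superclasses."""
    result: Set[str] = set()
    for cls in asserted:
        result |= superclasses(cls, sub_class_of)
    return result


# Hypothetical class hierarchy: ex:Photo -> ex:Image -> ex:Media
hierarchy = {"ex:Photo": {"ex:Image"}, "ex:Image": {"ex:Media"}}
photo_types = inferred_types({"ex:Photo"}, hierarchy)
```

With inference enabled, a query for all `ex:Media` resources would also match a resource that was only ever typed as `ex:Photo`. rdfs:subPropertyOf inference follows the same pattern, applied to properties instead of classes.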
    === Miscellaneous ===
    * [[Projects/Nepomuk/Nepomuk2Port| Porting to Nepomuk2]]
    * [[Projects/Nepomuk/ManagingNepomukProcesses| Managing Nepomuk Processes]]
    * [[Projects/Nepomuk/TestEnvironment| Nepomuk Test Environment]]
    * [[Development/Tutorials/Metadata/Nepomuk/TipsAndTricks| Nepomuk Tips and Tricks]]
    * [[Projects/Nepomuk/NepomukShow| Debugging Nepomuk Data]]


    ==== Outdated links ====


     
    The following links provide good reads for getting used to the Nepomuk system and its APIs. <br/>
     
    They are slightly outdated, but still contain some useful material.
    ==== Soprano Transaction Support  ====
    * [[Development/Tutorials/Metadata/Nepomuk|Development Tutorials]]
     
    * [[Projects/Nepomuk/Ideas|Random Ideas]]
    [http://soprano.sf.net/ Soprano] is the RDF database framework used in Nepomuk. Currently Soprano does not support transactions, i.e. sets of commands that can be rolled back. An [http://websvn.kde.org/branches/soprano/experimental experimental development] branch exists which already contains a new API for transaction support (while keeping BC).
    * [[Projects/Nepomuk/Qualified_Relations_Idea| Qualified Relations Idea]]
     
    * [[Projects/Nepomuk/ScenarioExamples| Scenario Examples]]
    It still lacks an implementation of the transaction support in the Soprano backends (Sesame2 and Virtuoso) and in the client/server architecture.
     
    Another idea is to create a new API based on the design that Sesame2 follows: Repository and RepositoryConnection classes. The former creates instances of the latter which then has all the actual data handling methods and acts as one transaction object.
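The Repository/RepositoryConnection split described above can be sketched roughly as follows. This is a Python illustration of the design idea only, not Soprano's or Sesame2's actual API; all class and method names are assumptions made for the example. Each connection buffers its own changes and acts as one transaction that can be committed or rolled back.

```python
from typing import Set, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)


class Repository:
    """Owns the committed statements and hands out connections."""

    def __init__(self) -> None:
        self._statements: Set[Statement] = set()

    def connection(self) -> "RepositoryConnection":
        return RepositoryConnection(self)


class RepositoryConnection:
    """Buffers changes; nothing reaches the repository until commit()."""

    def __init__(self, repository: Repository) -> None:
        self._repository = repository
        self._added: Set[Statement] = set()
        self._removed: Set[Statement] = set()

    def add(self, statement: Statement) -> None:
        self._removed.discard(statement)
        self._added.add(statement)

    def remove(self, statement: Statement) -> None:
        self._added.discard(statement)
        self._removed.add(statement)

    def contains(self, statement: Statement) -> bool:
        # A connection sees its own pending changes on top of committed data.
        if statement in self._removed:
            return False
        return statement in self._added or statement in self._repository._statements

    def commit(self) -> None:
        store = self._repository._statements
        store -= self._removed  # in-place update of the repository's set
        store |= self._added
        self.rollback()  # clear the buffers that have just been applied

    def rollback(self) -> None:
        self._added.clear()
        self._removed.clear()
```

A caller would obtain a connection with `repo.connection()`, issue any number of `add` and `remove` calls, and then either `commit()` them as a unit or `rollback()` to discard them. A real implementation would also need locking and conflict handling between concurrent connections, which this sketch leaves out.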
     
     
    === General Nepomuk ===
     
    ==== Handling of external storage  ====
     
    '''We already have the removablestorage service in kdebase which handles USB keys and such to a degree.'''
     
    Removable storage devices are a typical problem for the way Nepomuk handles files and file metadata. They can be mounted at different paths on different systems, yet one still wants to keep the metadata stored in Nepomuk. If possible, one would even want to be able to search for files saved on a USB stick when it is not plugged in.
     
    The [http://trueg.wordpress.com/2009/04/15/portable-meta-information-yet-again-only-this-time-there-is-code/ blog entry about removable storage in Nepomuk] already discusses this problem and shows some existing code in KDE's [http://websvn.kde.org/trunk/playground/base/removablestorageservice/ playground] which tries to tackle this problem.
     
    However, one actually needs more. The system would have to be embedded into KIO to make sure the metadata cache on the removable storage device is always up-to-date. Also it is directly related to the problem of relative vs. absolute file URLs.
     
     
    ==== Nepomuk Backup Service  ====
     
    Implementation details are discussed in [[Projects/Nepomuk/MetadataSharing]]
     
    We need a backup solution. The idea is the typical one: a Nepomuk service that allows specifying update intervals and triggering manual updates.
     
    The service should ignore all data extracted by Strigi, i.e. data that can be recreated deterministically. This can easily be determined by checking the context (named graph) in which the data statements are stored. Strigi stores all data extracted from a file in one context, which is marked via ''http://www.strigi.org/fields#indexGraphFor'' as the index graph for the file in question. Thus, a query along the lines of the following would work:
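    <pre>PREFIX strigi: &lt;http://www.strigi.org/fields#&gt;
    # Select every statement whose graph is not a Strigi index graph.
    select ?s ?p ?o ?g where {
        graph ?g { ?s ?p ?o . } .
        OPTIONAL { ?g strigi:indexGraphFor ?x . } .
        FILTER(!BOUND(?x)) .
    }</pre>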
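The same selection rule can be illustrated outside SPARQL. The following Python sketch works on an in-memory list of quads and assumes, as an illustration, that the `indexGraphFor` statements are visible alongside the data (as in a store whose default graph is the union of all named graphs). The `Quad` type, the function name, and the sample URIs are hypothetical; only the predicate URI comes from the text above.

```python
from typing import Iterable, List, NamedTuple

# Predicate named in the text above; everything else here is illustrative.
INDEX_GRAPH_FOR = "http://www.strigi.org/fields#indexGraphFor"


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str


def backup_candidates(quads: Iterable[Quad]) -> List[Quad]:
    """Return the quads whose graph is not marked as a Strigi index graph."""
    quads = list(quads)
    # A graph counts as an index graph if it is the subject of an
    # indexGraphFor statement, in whichever graph that statement lives.
    index_graphs = {q.subject for q in quads if q.predicate == INDEX_GRAPH_FOR}
    return [q for q in quads if q.graph not in index_graphs]


sample = [
    Quad("file:///a.txt", "ex:extractedText", "hello", "urn:g:index1"),
    Quad("urn:g:index1", INDEX_GRAPH_FOR, "file:///a.txt", "urn:g:meta"),
    Quad("file:///a.txt", "ex:rating", "8", "urn:g:user"),
]
kept = backup_candidates(sample)
```

Here the statement stored in `urn:g:index1` is dropped because that graph is marked as an index graph, while the user-created rating and the graph metadata are kept for the backup.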
     
    Other features could include replacement of the home directory like it is done in KConfig. This way the data could be re-imported in another user account.
     
     
    ==== Nepomuk Toolbox ====
    Provide a GUI that allows calling methods such as ''optimize'' and ''rebuildIndex'' on the storage service. The latter method is not committed yet due to the KDE 4.3 feature freeze, but will be afterwards.
     
    It would also be useful to have Nepomuk register such operations (including the data conversion when changing backends) via the notification system.
     
    == Development status ==
     
    See [[Projects/Nepomuk/DevelopmentStatus]].
     
    http://techbase.kde.org/Projects/Nepomuk/Ideas
     
    == Subpages of {{FULLPAGENAME}}==
    {{Special:PrefixIndex/{{FULLPAGENAME}}/}}

    Revision as of 20:30, 3 December 2012


    About Nepomuk

    Nepomuk serves as a cross-application semantic storage backend. It aims to collect data from various sources (file indexing, the web, applications, and so on) and to link it all together into a cohesive map of data.

    This page is dedicated to third-party documentation for Nepomuk. To learn more about Nepomuk from a user's point of view, head over to the Nepomuk page on UserBase. To learn more about the Nepomuk community and how to get involved, head over to the Nepomuk Community Page.

    Documentation

    Any new project is intimidating, and jumping right into the API Documentation can be scary. So we have prepared some articles that explain the different aspects of Nepomuk and even touch on some advanced features.

    The documentation of any project is always in progress as the code base is always evolving. If you feel that the documentation is lacking in some regard, please come talk to us. We'd love to hear your feedback, and the documentation might just get improved in the process.

    Nepomuk Mailing List: nepomuk@kde.org
    IRC Channel: #nepomuk-kde on freenode

    Introductory Material

    Start here if you're just getting started with Nepomuk and want a quick way to fetch some data.

    Managing Data

    This section includes more in-depth articles on how to manage the data in Nepomuk. As a starting point you should probably open up the Nepomuk API Documentation. It is generally more up to date than the articles mentioned below.

    File Indexing

    With 4.10, the file indexing architecture has changed substantially. We no longer rely on Strigi and have our own plugin-based interface.

    Querying

    As you advance into Nepomuk, you'll want to move beyond just fetching and pushing data and query Nepomuk for specialized data. Nepomuk can be queried in many different ways; the important part is to optimize your queries and make sure they run well on production systems, where the database may be very large.

    Architectural Overview

    If you're looking to get more involved with the Nepomuk development process, you will probably need to understand our basic architecture and where to find all the relevant code.

    Nepomuk Internals

    When you decide to dig even deeper.

    Miscellaneous

    Outdated links

    The following links provide good reads for getting used to the Nepomuk system and its APIs. They are slightly outdated, but still contain some useful material.