Development/Architecture/KDE3/Network Transparency

Revision as of 20:35, 29 June 2011 by Neverendingo (Talk | contribs) (Text replace - "<code cppqt3>" to "<syntaxhighlight lang="cpp-qt">")


KDE Architecture - KDE's IO architecture


In the age of the world wide web, it is of essential importance that desktop applications can access resources over the internet: they should be able to download files from a web server, write files to an ftp server or read mails from a web server. Often, the ability to access files regardless of their location is called network transparency.

In the past, different approaches to this goal were implemented. The old NFS file system is an attempt to implement network transparency on the level of the POSIX API. While this approach works quite well in local, closely coupled networks, it does not scale for resources to which access is unreliable and possibly slow. Here, asynchronicity is important. While you are waiting for your web browser to download a page, the user interface should not block. Also, the page rendering should not wait until the page is completely available, but should be updated regularly as data comes in.

In the KDE libraries, network transparency is implemented in the KIO API. The central concept of this architecture is an IO job. A job may copy or delete files, or perform similar operations. Once a job is started, it works in the background and does not block the application. Any communication from the job back to the application - like delivering data or progress information - is integrated with the Qt event loop.

Background operation is achieved by starting ioslaves to perform certain tasks. ioslaves are started as separate processes and are communicated with through UNIX domain sockets. In this way, no multi-threading is necessary and unstable slaves cannot crash the application that uses them.

File locations are expressed by the widely used URLs. But in KDE, URLs do not only expand the range of addressable files beyond the local file system. They also go in the opposite direction - e.g. you can browse into tar archives. This is achieved by nesting URLs. For example, a file in a tar archive on an http server could have the URL

== Using KIO ==

In most cases, jobs are created by calling functions in the KIO namespace. These functions take one or two URLs as arguments, and possibly other necessary parameters. When the job is finished, it emits the signal result(KIO::Job*). After this signal has been emitted, the job deletes itself. Thus, a typical use case will look like this:

<syntaxhighlight lang="cpp-qt">
void FooClass::makeDirectory()
{
    KIO::SimpleJob *job = KIO::mkdir(KURL("file:/home/bernd/kiodir"));
    connect( job, SIGNAL(result(KIO::Job*)),
             this, SLOT(mkdirResult(KIO::Job*)) );
}

void FooClass::mkdirResult(KIO::Job *job)
{
    if (job->error())
        job->showErrorDialog();
    else
        cout << "mkdir went fine" << endl;
}
</syntaxhighlight>

Depending on the type of the job, you may also connect to other signals.

Here is an overview of the available functions:

;KIO::mkdir(const KURL &url, int permission)
:Creates a directory, optionally with certain permissions.
;KIO::rmdir(const KURL &url)
:Removes a directory.
;KIO::chmod(const KURL &url, int permissions)
:Changes the permissions of a file.
;KIO::rename(const KURL &src, const KURL &dest, bool overwrite)
:Renames a file.
;KIO::symlink(const QString &target, const KURL &dest, bool overwrite, bool showProgressInfo)
:Creates a symbolic link.
;KIO::stat(const KURL &url, bool showProgressInfo)
:Finds out certain information about the file, such as size, modification time and permissions. The information can be obtained from KIO::StatJob::statResult() after the job has finished.
;KIO::get(const KURL &url, bool reload, bool showProgressInfo)
:Transfers data from a URL.
;KIO::put(const KURL &url, int permissions, bool overwrite, bool resume, bool showProgressInfo)
:Transfers data to a URL.
;KIO::http_post(const KURL &url, const QByteArray &data, bool showProgressInfo)
:Posts data. Special for HTTP.
;KIO::mimetype(const KURL &url, bool showProgressInfo)
:Tries to find the MIME type of the URL. The type can be obtained from KIO::MimetypeJob::mimetype() after the job has finished.
;KIO::file_copy(const KURL &src, const KURL &dest, int permissions, bool overwrite, bool resume, bool showProgressInfo)
:Copies a single file.
;KIO::file_move(const KURL &src, const KURL &dest, int permissions, bool overwrite, bool resume, bool showProgressInfo)
:Renames or moves a single file.
;KIO::file_delete(const KURL &url, bool showProgressInfo)
:Deletes a single file.
;KIO::listDir(const KURL &url, bool showProgressInfo)
:Lists the contents of a directory. Each time some new entries are known, the signal KIO::ListJob::entries() is emitted.
;KIO::listRecursive(const KURL &url, bool showProgressInfo)
:Similar to the listDir() function, but this one is recursive.
;KIO::copy(const KURL &src, const KURL &dest, bool showProgressInfo)
:Copies a file or directory. Directories are copied recursively.
;KIO::move(const KURL &src, const KURL &dest, bool showProgressInfo)
:Moves or renames a file or directory.
;KIO::del(const KURL &src, bool shred, bool showProgressInfo)
:Deletes a file or directory.

== Directory entries ==

Both the KIO::stat() and KIO::listDir() jobs return their results as a
UDSEntry and a UDSEntryList, respectively. The latter is defined as
QValueList&lt;UDSEntry&gt;. The acronym UDS stands for "Universal directory
service". The principle behind it is that a directory entry only carries the
information which an ioslave can provide, not more. For example, the http slave
does not provide any information about access permissions or file owners.
Instead of being a fixed structure, a UDSEntry is therefore a list of UDSAtoms.
Each atom provides a specific piece of information. It consists of a type stored
in m_uds and either an integer value in m_long or a string value in m_str,
depending on the type.

The following types are currently defined:

*UDS_SIZE (integer) - Size of the file.
*UDS_USER (string) - User owning the file.
*UDS_GROUP (string) - Group owning the file.
*UDS_NAME (string) - File name.
*UDS_ACCESS (integer) - Permission rights of the file, as e.g. stored by the libc function stat() in the st_mode field.
*UDS_FILE_TYPE (integer) - The file type, as e.g. stored by stat() in the st_mode field. Therefore you can use the usual libc macros like S_ISDIR to test this value. Note that the data provided by ioslaves corresponds to stat(), not lstat(), i.e. in case of symbolic links, the file type here is the type of the file pointed to by the link, not the link itself.
*UDS_LINK_DEST (string) - In case of a symbolic link, the name of the file pointed to.
*UDS_MODIFICATION_TIME (integer) - The time (as in the type time_t)  when the file was last modified, as e.g. stored by stat() in the st_mtime field.
*UDS_ACCESS_TIME (integer) - The time when the file was last accessed, as e.g. stored by stat() in the st_atime field.
*UDS_CREATION_TIME (integer) - The time when the file was created, as e.g. stored by stat() in the st_ctime field.
*UDS_URL (string) - Provides a URL of a file, if it is not simply the concatenation of directory URL and file name.
*UDS_MIME_TYPE (string) - MIME type of the file.
*UDS_GUESSED_MIME_TYPE (string) - MIME type of the file as guessed by the slave. The difference to the previous type is that the one provided here should not be taken as reliable (because determining it in a reliable way would be too expensive). For example, the KRun class explicitly checks the MIME type if it does not have reliable information.

Although the way of storing information about files in a UDSEntry is flexible
and practical from the ioslave point of view, it is a mess to use for the
application programmer. For example, in order to find out the MIME type of
the file, you have to iterate over all atoms and test whether m_uds is
UDS_MIME_TYPE. Fortunately, there is an API which is a lot easier to use:
the class KFileItem.
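
The atom iteration described above can be sketched in a self-contained way. The types below are simplified stand-ins for KIO::UDSAtom and KIO::UDSEntry (the real ones are built on QString and QValueList), and the helper name findString as well as the reduced enum are assumptions made purely for illustration:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for the KIO types; not the real declarations.
enum UdsType { UDS_SIZE, UDS_NAME, UDS_MIME_TYPE };

struct UDSAtom {
    UdsType m_uds;      // which piece of information this atom carries
    long m_long;        // used for integer-valued types
    std::string m_str;  // used for string-valued types
};

typedef std::vector<UDSAtom> UDSEntry;

// Hypothetical helper: returns true and fills 'value' if the entry
// contains an atom of the requested (string-valued) type.
bool findString(const UDSEntry &entry, UdsType type, std::string &value)
{
    for (UDSEntry::const_iterator it = entry.begin(); it != entry.end(); ++it) {
        if (it->m_uds == type) {
            value = it->m_str;
            return true;
        }
    }
    return false;  // the slave did not provide this information
}
```

A caller has to handle the case that the function returns false, since a slave is free to leave out any atom it cannot provide.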

== Synchronous usage ==

Often, the asynchronous API of KIO is too complex to use and therefore 
implementing full asynchronicity is not a priority. For example, in a program 
that can only handle one document file at a time, there is little that can be
done while the program is downloading a file anyway. For these simple cases, 
there is a much simpler API in the form of a set of static functions in
KIO::NetAccess. For example, in order to copy a file, use

<syntaxhighlight lang="cpp-qt">
KURL source, target;
source = ...;
target = ...;
KIO::NetAccess::copy(source, target);
</syntaxhighlight>

The function will return after the complete copying process has finished. Still,
this method provides a progress dialog, and it makes sure that the application
processes repaint events.

A particularly interesting combination of functions is download() in combination
with removeTempFile(). The former downloads a file from a given URL and stores it
in a temporary file with a unique name. The name is stored in the second argument.
''If'' the URL is local, the file is not downloaded, and instead the second
argument is set to the local file name. The function removeTempFile() deletes the
file given by its argument if the file is the result of a former download. 
If that is not the case, it does nothing. Thus, a very easy to use way of loading
files regardless of their location is the following code snippet:

<syntaxhighlight lang="cpp-qt">
KURL url;
url = ...;
QString tempFile;
if (KIO::NetAccess::download(url, tempFile)) {
    // load the file with the name tempFile
    KIO::NetAccess::removeTempFile(tempFile);
}
</syntaxhighlight>

== Meta data ==

As can be seen above, the interface to IO jobs is quite abstract and does not
consider any exchange of information between application and IO slave that
is protocol specific. This is not always appropriate. For example, you may give
certain parameters to the HTTP slave to control its caching behavior or
send a bunch of cookies with the request. For this need, the concept of meta
data has been introduced. When a job is created, you can configure it by adding
meta data to it. Each item of meta data consists of a key/value pair. For
example, in order to prevent the HTTP slave from loading a web page from its
cache, you can use:

<syntaxhighlight lang="cpp-qt">
void FooClass::reloadPage()
{
    KURL url("");
    KIO::TransferJob *job = KIO::get(url, true, false);
    job->addMetaData("cache", "reload");
}
</syntaxhighlight>

The same technique is used in the other direction, i.e. for communication from
the slave to the application. The method Job::queryMetaData() asks for the
value of a certain key delivered by the slave. For the HTTP slave, one such
example is the key "modified", which contains a (stringified representation of)
the date when the web page was last modified. An example of how to use this
is the following:

<syntaxhighlight lang="cpp-qt">
void FooClass::printModifiedDate()
{
    KURL url("");
    KIO::TransferJob *job = KIO::get(url, true, false);
    connect( job, SIGNAL(result(KIO::Job*)),
             this, SLOT(transferResult(KIO::Job*)) );
}

void FooClass::transferResult(KIO::Job *job)
{
    if (job->error())
        job->showErrorDialog();
    else {
        KIO::TransferJob *transferJob = static_cast<KIO::TransferJob*>(job);
        QString modified = transferJob->queryMetaData("modified");
        cout << "Last modified: " << modified << endl;
    }
}
</syntaxhighlight>

== Scheduling ==

When using the KIO API, you usually do not have to cope with the details of
starting IO slaves and communicating with them. The normal use case is to
start a job with some parameters and handle the signals the job emits.

Behind the curtains, the scenario is a lot more complicated. When you create a
job, it is put in a queue. When the application goes back to the event loop,
KIO allocates slave processes for the jobs in the queue.  For the first jobs
started, this is trivial: an IO slave for the appropriate protocol is started.
However, after the job (like a download from an http server) has finished, the
slave is not immediately killed. Instead, it is put in a pool of idle slaves and
killed after a certain time of inactivity (currently 3 minutes). If a new request
for the same protocol and host arrives, the slave is reused. The obvious 
advantage is that for a series of jobs for the same host, the cost for creating
new processes and possibly going through an authentication handshake is saved.

Of course, reusing is only possible when the existing slave has already finished
its previous job. When a new request arrives while an existing slave process is
still running, a new process must be started and used. In the API usage in the
examples above, there is no limit on creating new slave processes: if you
start a consecutive series of downloads for 20 different files, then KIO will
start 20 slave processes. This scheme of assigning slaves to jobs is called
''direct''. It is not always the most appropriate scheme, as it may need much
memory and put a high load on both the client and server machines.
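
The reuse rule from the two paragraphs above can be modelled in a few lines. This is only a toy model under simplifying assumptions, not the real KIO implementation: slaves are plain integer ids, the class name ToySlavePool is hypothetical, and the inactivity timeout is left out entirely.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy model of the idle slave pool in the "direct" scheme.
class ToySlavePool {
public:
    ToySlavePool() : m_nextId(1), m_started(0) {}

    // Returns the id of a slave for (protocol, host): an idle one if
    // available, otherwise a newly "started" one.
    int acquire(const std::string &protocol, const std::string &host)
    {
        Key key(protocol, host);
        std::multimap<Key, int>::iterator it = m_idle.find(key);
        if (it != m_idle.end()) {
            int id = it->second;
            m_idle.erase(it);   // the slave is busy again
            return id;
        }
        ++m_started;            // no idle slave: start a new process
        return m_nextId++;
    }

    // Puts a slave back into the idle pool after its job has finished.
    void release(const std::string &protocol, const std::string &host, int id)
    {
        m_idle.insert(std::make_pair(Key(protocol, host), id));
    }

    // Number of slave processes started so far.
    int started() const { return m_started; }

private:
    typedef std::pair<std::string, std::string> Key;
    std::multimap<Key, int> m_idle;
    int m_nextId;
    int m_started;
};
```

Two overlapping requests for the same host each get their own slave in this model, while a request made after release() reuses the idle one.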

So there is a different way. You can ''schedule'' jobs. If you do this, only
a limited number (currently 3) of slave processes for a protocol will be 
created. If you create more jobs than that, they are put in a queue and 
are processed when a slave process becomes idle. This is done as follows:

<syntaxhighlight lang="cpp-qt">
KURL url("");
KIO::TransferJob *job = KIO::get(url, true, false);
KIO::Scheduler::scheduleJob(job);
</syntaxhighlight>
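
To make the queueing behaviour of scheduled jobs concrete, here is a toy model. It is a sketch under stated assumptions and not the real KIO::Scheduler: jobs are represented only by their URL string, a single protocol is assumed, and the class name ToyScheduler and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Toy model of scheduled slave assignment for one protocol.
class ToyScheduler {
public:
    explicit ToyScheduler(std::size_t maxSlaves)
        : m_maxSlaves(maxSlaves), m_running(0)
    {
        assert(maxSlaves > 0);
    }

    // Queue a job and start it right away if a slave is free.
    void scheduleJob(const std::string &url)
    {
        m_queue.push_back(url);
        startQueuedJobs();
    }

    // Called when a running job finishes: its slave becomes idle and
    // picks up the next queued job, if any.
    void jobFinished()
    {
        assert(m_running > 0);
        --m_running;
        startQueuedJobs();
    }

    std::size_t running() const { return m_running; }
    std::size_t queued() const { return m_queue.size(); }

private:
    void startQueuedJobs()
    {
        while (m_running < m_maxSlaves && !m_queue.empty()) {
            m_queue.pop_front();   // hand the job to a slave
            ++m_running;
        }
    }

    std::size_t m_maxSlaves;
    std::size_t m_running;
    std::deque<std::string> m_queue;
};
```

With a limit of 3, scheduling five jobs in this model leaves three running and two waiting until jobFinished() frees a slave.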

A third possibility is ''connection oriented''. For example, for the IMAP 
slave, it does not make any sense to start multiple processes for the same
server. Only one IMAP connection at a time should be enforced. In this case,
the application must explicitly deal with the notion of a slave. It has to
allocate a slave for a certain connection and then assign all jobs which 
should go through the same connection to the same slave. This can again be
easily achieved by using the KIO::Scheduler:

<syntaxhighlight lang="cpp-qt">
KURL baseUrl("imap://");
KIO::Slave *slave = KIO::Scheduler::getConnectedSlave(baseUrl);

KIO::TransferJob *job1 = KIO::get(KURL(baseUrl, "/INBOX;UID=79374"));
KIO::Scheduler::assignJobToSlave(slave, job1);

KIO::TransferJob *job2 = KIO::get(KURL(baseUrl, "/INBOX;UID=86793"));
KIO::Scheduler::assignJobToSlave(slave, job2);

...

KIO::Scheduler::disconnectSlave(slave);
</syntaxhighlight>

You may only disconnect the slave after all jobs assigned to it are guaranteed
to be finished.

== Defining an ioslave ==

In the following we discuss how you can add a new ioslave to the system.
In analogy to services, new ioslaves are advertised to the system by
installing a little configuration file. The following
snippet installs the ftp protocol:

 protocoldir = $(kde_servicesdir)
 protocol_DATA = ftp.protocol
 EXTRA_DIST = $(mime_DATA)

The contents of the file ftp.protocol are as follows:


The "protocol" entry defines for which protocol this slave is responsible.
"exec" is (in contrast to what you might naively expect) the name of the library
that implements the slave. When the slave is supposed to start, the "kdeinit"
executable is started which in turn loads this library into its address space.
So in practice, you can think of the running slave as a separate process
although it is implemented as a library. The advantage of this mechanism is that
it saves a lot of memory and reduces the time needed by the runtime linker.

The "input" and "output" lines are not used currently.

The remaining lines in the <tt>.protocol</tt> file define which abilities the
slave has. In general, the features a slave must implement are much simpler than
the features the KIO API provides for the application. The reason for this is
that complex jobs are broken down into a series of subjobs. For example, in order
to list a directory recursively, one job will be started for the toplevel
directory. Then for each subdirectory reported back, new subjobs are started. A
scheduler in KIO makes sure that not too many jobs are active at the same time.
Similarly, in order to copy a file within a protocol that does not support
copying directly (like the ftp: protocol), KIO can read the source file and then
write the data to the destination file. For this to work, the <tt>.protocol</tt>
file must advertise the actions its slave supports.

Since slaves are loaded as shared libraries, but constitute standalone programs,
their code framework looks a bit different from normal shared library plugins.
The function which is called to start the slave is called kdemain(). This 
function does some initializations and then goes into an event loop and waits
for requests by the application using it. This looks as follows:

<syntaxhighlight lang="cpp-qt">
extern "C" { int kdemain(int argc, char **argv); }

int kdemain(int argc, char **argv)
{
    KInstance instance("kio_ftp");
    (void) KGlobal::locale();

    if (argc != 4) {
        fprintf(stderr, "Usage: kio_ftp protocol "
                        "domain-socket1 domain-socket2\n");
        exit(-1);
    }

    FtpSlave slave(argv[2], argv[3]);
    slave.dispatchLoop();   // enter the event loop and wait for requests
    return 0;
}
</syntaxhighlight>

== Implementing an ioslave ==

Slaves are implemented as subclasses of KIO::SlaveBase (FtpSlave in the above
example). Thus, the actions listed in the {{path|.protocol}} file correspond to
certain virtual functions in KIO::SlaveBase that the slave implementation must
reimplement. Here is a list of possible actions and the corresponding virtual
functions:

;reading - Reads data from a URL
:void get(const KURL &url)
;writing - Writes data to a URL and creates the file if it does not exist yet.
:void put(const KURL &url, int permissions, bool overwrite, bool resume)
;moving - Renames a file.
:void rename(const KURL &src, const KURL &dest, bool overwrite)
;deleting - Deletes a file or directory.
:void del(const KURL &url, bool isFile)
;listing - Lists the contents of a directory.
:void listDir(const KURL &url)
;makedir - Creates a directory.
:void mkdir(const KURL &url, int permissions)

Additionally, there are reimplementable functions not listed in the <tt>.protocol</tt>
file. For these operations, KIO automatically determines whether they are supported
or not (i.e. the default implementation returns an error).

;Delivers information about a file, similar to the C function stat().
:void stat(const KURL &url)
;Changes the access permissions of a file.
:void chmod(const KURL &url, int permissions)
;Determines the MIME type of a file.
:void mimetype(const KURL &url)
;Copies a file.
:void copy(const KURL &src, const KURL &dest, int permissions, bool overwrite)
;Creates a symbolic link.
:void symlink(const QString &target, const KURL &dest, bool overwrite)

All these implementations should end with one of two calls: If the operation
was successful, they should call <tt>finished()</tt>. If an error has occurred,
<tt>error()</tt> should be called with an error code as first argument and a
string in the second. Possible error codes are listed as enum KIO::Error. The
second argument is usually the URL in question. It is used e.g. in
KIO::Job::showErrorDialog() in order to parametrize the human-readable error
message.

For slaves that correspond to network protocols, it might be interesting to 
reimplement the method SlaveBase::setHost(). This is called to tell the slave
process about the host and port, and the user name and password to log in.
In general, meta data set by the application can be queried by SlaveBase::metaData().
You can check for the existence of meta data of a certain key with

== Communicating back to the application ==

Various actions implemented in a slave need some way to communicate data back
to the application using the slave process:

*get() sends blocks of data. This is done with data(), which takes a QByteArray as argument. Of course, you do not need to send all data at once. If you send a large file, call data() with smaller data blocks, so the application can process them. Call finished() when the transfer is finished.

*listDir() reports information about the entries of a directory. For this purpose, call listEntries() with a KIO::UDSEntryList as argument. Analogously to data(), you can call this several times. When you are finished, call listEntry() with the second argument set to true. You may also call totalSize() to report the total number of directory entries, if known.

*stat() reports information about a file like size, MIME type, etc. Such information is packaged in a KIO::UDSEntry, which is discussed above. Use statEntry() to send such an item to the application.

*mimetype() calls mimeType() with a string argument.

*get() and copy() may want to provide progress information. This is done with the methods totalSize(), processedSize(), speed(). The total size and processed size are reported as bytes, the speed as bytes per second.

*You can send arbitrary key/value pairs of meta data with setMetaData().
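
The block-wise delivery described for get() can be illustrated with a small standalone model. The class ToySlaveBase below is a hypothetical stand-in that only records what a real KIO::SlaveBase would send to the application, and std::string takes the place of QByteArray; the two-argument get() is likewise an assumption made to keep the sketch free of any real I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal stand-in for the part of KIO::SlaveBase used here.
class ToySlaveBase {
public:
    ToySlaveBase() : m_finished(false) {}
    virtual ~ToySlaveBase() {}

    const std::vector<std::string> &blocks() const { return m_blocks; }
    bool isFinished() const { return m_finished; }

protected:
    // Records one block of data "sent" to the application.
    void data(const std::string &block) { m_blocks.push_back(block); }
    // Marks the operation as successfully completed.
    void finished() { m_finished = true; }

private:
    std::vector<std::string> m_blocks;
    bool m_finished;
};

class ToyGetSlave : public ToySlaveBase {
public:
    // Sends 'content' in blocks of at most 'blockSize' bytes,
    // then signals the end of the transfer.
    void get(const std::string &content, std::size_t blockSize)
    {
        assert(blockSize > 0);
        for (std::size_t pos = 0; pos < content.size(); pos += blockSize)
            data(content.substr(pos, blockSize));
        finished();
    }
};
```

Sending small blocks lets the application process data as it arrives instead of waiting for the whole file.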

== Interacting with the user ==

Sometimes a slave has to interact with the user. Examples include informational
messages, authentication dialogs and confirmation dialogs when a file is about
to be overwritten.

*infoMessage() - This is for informational feedback, such as the message "Retrieving data from &lt;host&gt;" from the http slave, which is often displayed in the status bar of the program. On the application side, this method corresponds to the signal KIO::Job::infoMessage().
*warning() - Displays a warning in a message box with KMessageBox::information(). If a message box is still open from a former call of warning() from the same slave process, nothing happens.
*messageBox() - This is richer than the previous method. It allows opening a message box with text and caption and some buttons. See the enum SlaveBase::MessageBoxType for reference.
*openPassDlg() - Opens a dialog for the input of user name and password.

''Initial Author:'' Bernd Gehrmann

