Development/Tutorials/Programming Tutorial KDE 3/KHTML

From KDE TechBase

For HTML parsing, you have the following possibilities:

  • QXML
  • QDOM
  • perl
  • khtml

QXML and QDOM are XML parsers, so they need HTML pages that are well-formed XML, and very few HTML pages are. Perl is outside the scope of this site. This tutorial therefore chooses the khtml approach.
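
To make the XML-compliance problem concrete, here is a minimal sketch in standard C++ (no Qt or KDE involved). The helper name `tagsAreBalanced` is hypothetical, and the check is deliberately simplified: it only tests whether every opening tag is closed in the right order, which is one of the rules of well-formed XML. It is not how QXML or QDOM work internally.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy check (hypothetical helper, not part of Qt or KDE): returns true if
// every opening tag in `markup` is closed in the correct order.
// Comments, CDATA sections, doctype declarations and '>' characters inside
// attribute values are not handled.
bool tagsAreBalanced(const std::string &markup)
{
    std::vector<std::string> open;   // stack of currently open tag names
    std::size_t pos = 0;
    while ((pos = markup.find('<', pos)) != std::string::npos) {
        const std::size_t end = markup.find('>', pos);
        if (end == std::string::npos)
            return false;            // tag is never terminated
        std::string tag = markup.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (tag.empty())
            return false;            // "<>" is not a tag
        if (tag[tag.size() - 1] == '/')
            continue;                // self-closing, e.g. <br/>
        const bool closing = (tag[0] == '/');
        if (closing)
            tag.erase(0, 1);
        // Keep only the tag name, dropping any attributes.
        const std::size_t space = tag.find_first_of(" \t\r\n");
        if (space != std::string::npos)
            tag.erase(space);
        if (!closing) {
            open.push_back(tag);
        } else {
            if (open.empty() || open.back() != tag)
                return false;        // closing tag does not match the last opener
            open.pop_back();
        }
    }
    return open.empty();             // leftover open tags mean unbalanced markup
}
```

With this sketch, `tagsAreBalanced("<p>one<br>two</p>")` returns false because `<br>` is never closed, while the XML-style `<p>one<br/>two</p>` passes. Unclosed tags like this are common in HTML, which is why a strict XML parser is a poor fit.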

First step

Our first khtml program does nothing at all: <highlightSyntax language="cpp">

#include <qstring.h>
#include <kapplication.h>
#include <kaboutdata.h>
#include <kmessagebox.h>
#include <kcmdlineargs.h>
#include <dom/html_document.h>

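int main (int argc, char *argv[]) {

       // Describe the application and hand the command line to KDE
       // before the KApplication object is created.
       KAboutData aboutData( "test", "test",
       "1.0", "test", KAboutData::License_GPL,
       "(c) 2006" );
       KCmdLineArgs::init( argc, argv, &aboutData );
       KApplication khello;
       // Construct an empty HTML document; the temporary is discarded right away.
       DOM::HTMLDocument();

} </highlightSyntax> It can be compiled like this: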

g++ -I/usr/lib/qt3/include -I/opt/kde3/include \
-L/opt/kde3/lib -lkdeui -lkhtml -o khtml khtml.cpp

Showing tags

The next program is more advanced: it shows the first tags of an HTML file. <highlightSyntax language="cpp">

#include <kapplication.h>
#include <kaboutdata.h>
#include <kcmdlineargs.h>
#include <kdebug.h>
#include <dom/html_document.h>

int main (int argc, char *argv[]) {

 KAboutData aboutData( "test", "test",
 "1.0", "test", KAboutData::License_GPL,
 "(c) 2006" );
 KCmdLineArgs::init( argc, argv, &aboutData );
 KApplication khello;

 DOM::Document doc=DOM::Document();
 DOM::HTMLDocument htmldoc=DOM::HTMLDocument();  // not used further in this example
 DOM::DOMString tag("*");                         // "*" matches every element
 doc.loadXML("hello.htm");
 kdDebug() << "Does this doc have child elements ? " << doc.hasChildNodes() << endl;
 kdDebug() << "First child node name: " << doc.firstChild().nodeName().string() << endl;
 kdDebug() << "First grandchild node name: " << doc.firstChild().firstChild().nodeName().string() << endl;
 kdDebug() << "Count of elements in your doc " << doc.getElementsByTagName(tag).length()<< endl;
 // sizeof reports the size of the DOM::Document object itself, not of the loaded file.
 kdDebug() << "Size of your doc " << sizeof(doc) << endl;
 // doc is an object, not a pointer, so its members are accessed with '.'.
 kdDebug() << doc.toString().string() << endl;

} </highlightSyntax> You can use it, for example, with the following hello.htm:

<html>
<head>
<title>blah</title>
</head>
<body>
<b>fat</b>
<a href="http://www.de">denic</a>
</body>
</html>
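
To give a feel for what the wildcard lookup `getElementsByTagName("*")` counts, the following sketch tallies the opening tags of this sample file using only standard C++. The names `countElements` and `helloHtm` are hypothetical and khtml is not used at all, so the number the real program prints may differ depending on how khtml builds its tree.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts opening tags in `markup` by looking for '<'
// followed by a letter. Closing tags ("</b>"), comments and declarations
// start with a different character after '<' and are therefore skipped.
std::size_t countElements(const std::string &markup)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < markup.size(); ++i) {
        const char next = markup[i + 1];
        const bool letter = (next >= 'a' && next <= 'z')
                         || (next >= 'A' && next <= 'Z');
        if (markup[i] == '<' && letter)
            ++count;
    }
    return count;
}

// The hello.htm sample from this tutorial as a string literal.
const char *const helloHtm =
    "<html>\n"
    "<head>\n"
    "<title>blah</title>\n"
    "</head>\n"
    "<body>\n"
    "<b>fat</b>\n"
    "<a href=\"http://www.de\">denic</a>\n"
    "</body>\n"
    "</html>\n";
```

For the sample, `countElements(helloHtm)` yields 6: html, head, title, body, b and a.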