Development/Tutorials/Programming Tutorial KDE 3/KHTML

From KDE TechBase

For HTML parsing, you have the following possibilities:

  • QXML
  • QDOM
  • perl
  • khtml

QXML and QDOM require XML-compliant HTML pages, and very few HTML pages are XML-compliant. Perl is outside the scope of this site. This tutorial therefore takes the khtml approach. Our first khtml program does nothing at all: <highlightSyntax language="cpp">

#include <qstring.h>
#include <kapplication.h>
#include <kaboutdata.h>
#include <kmessagebox.h>
#include <kcmdlineargs.h>
#include <dom/html_document.h>

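int main (int argc, char *argv[]) {

       // Describe the application; KCmdLineArgs::init() needs this before KApplication is created.
       KAboutData aboutData( "test", "test",
       "1.0", "test", KAboutData::License_GPL,
       "(c) 2006" );
       KCmdLineArgs::init( argc, argv, &aboutData );
       KApplication khello;
       // Create an empty HTML document and discard it again.
       DOM::HTMLDocument();
       return 0;

} </highlightSyntax> It can be compiled with: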

g++ -I/usr/lib/qt3/include -I/opt/kde3/include \
-o khtml khtml.cpp -L/opt/kde3/lib -lkdeui -lkhtml

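The next program loads hello.htm into a DOM::Document and prints some information about its node tree: <highlightSyntax language="cpp">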

#include <qfile.h>
#include <qtextstream.h>
#include <kapplication.h>
#include <kaboutdata.h>
#include <kcmdlineargs.h>
#include <kdebug.h>
#include <dom/html_document.h>

int main (int argc, char *argv[]) {

 KAboutData aboutData( "test", "test",
 "1.0", "test", KAboutData::License_GPL,
 "(c) 2006" );
 KCmdLineArgs::init( argc, argv, &aboutData );
 KApplication khello;

 // loadXML() parses the markup text it is given, not a file name,
 // so read hello.htm into a string first.
 QFile file("hello.htm");
 if (!file.open(IO_ReadOnly)) {
  kdDebug() << "Cannot open hello.htm" << endl;
  return 1;
 }
 QTextStream stream(&file);

 DOM::Document doc=DOM::Document();
 DOM::DOMString tag("*");  // "*" matches every element
 doc.loadXML(DOM::DOMString(stream.read()));
 kdDebug() << "Does this doc have child elements ? " << doc.hasChildNodes() << endl;
 kdDebug() << "First child node name: " << doc.firstChild().nodeName().string() << endl;
 kdDebug() << "First grandchild node name: " << doc.firstChild().firstChild().nodeName().string() << endl;
 kdDebug() << "Count of elements in your doc " << doc.getElementsByTagName(tag).length()<< endl;
 // sizeof reports the size of the DOM::Document handle object, not of the loaded document.
 kdDebug() << "Size of your doc " << sizeof(doc) << endl;
 return 0;

} </highlightSyntax> You can run this, for example, with the following hello.htm:

<html>
<head>
<title>blah</title>
</head>
<body>
<b>fat</b>
<a href="http://www.de">denic</a>
</body>
</html>