Difference between revisions of "Development/Tutorials/Programming Tutorial KDE 3/KHTML"

(Showing tags)
Line 39:
 int main (int argc, char *argv[])
 {
-  KAboutData aboutData( "test", "test",
-  "1.0", "test", KAboutData::License_GPL,
-  "(c) 2006" );
-  KCmdLineArgs::init( argc, argv, &aboutData );
-  KApplication khello;
-  DOM::Document doc=DOM::Document();
-  DOM::HTMLDocument htmldoc=DOM::HTMLDocument();
-  DOM::DOMString tag("*");
-  doc.loadXML("hello.htm");
-  kdDebug() << "Does this doc have child elements ? " << doc.hasChildNodes() << endl;
-  kdDebug() << "First child node name: " << doc.firstChild().nodeName().string() << endl;
-  kdDebug() << "First grandchild node name: " << doc.firstChild().firstChild().nodeName().string() << endl;
-  kdDebug() << "Count of elements in your doc " << doc.getElementsByTagName(tag).length()<< endl;
-  kdDebug() << "Size of your doc " << sizeof(doc) << endl;
-  kdDebug() << doc.toString().string() << endl;
+        KAboutData aboutData( "test", "test",
+        "1.0", "test", KAboutData::License_GPL,
+        "(c) 2006" );
+        KCmdLineArgs::init( argc, argv, &aboutData );
+        KApplication khello;
+
+        DOM::Document* doc=new DOM::Document();
+        DOM::DOMString tag("*");
+        DOM::DOMString uri("<html><b>test</b></html>");
+        doc->loadXML(uri);
+        kdDebug() << "Does this doc have child elements ? " << doc->hasChildNodes() << endl;
+        kdDebug() << "First child node name: " << doc->firstChild().nodeName().string() << endl;
+        kdDebug() << "First grandchild node name: " << doc->firstChild().firstChild().nodeName().string() << endl;
+        kdDebug() << "Count of elements in your doc " << doc->getElementsByTagName(tag).length()<< endl;
+        for (int i=0; i<doc->getElementsByTagName(tag).length(); i++) kdDebug() << doc->getElementsByTagName(tag).item(i).nodeName().string() << endl;
+        kdDebug() << "Size of your doc " << sizeof(doc) << endl;
+        kdDebug() << doc->toString().string() << endl;
 }
 </highlightSyntax>
-You can use this e.g. with the following
-hello.htm:
-<pre>
-<html>
-<head>
-<title>blah</title>
-</head>
-<body>
-<b>fat</b>
-<a href="http://www.de">denic</a>
-</body>
-</html>
-</pre>
-
-You get an error because your file is not UTF-16 encoded. Here's how I proceed:
-scorpio:~/html # hexdump hello.htm
-0000000 3c00 000a
-0000003

Revision as of 20:29, 29 October 2006

For HTML parsing, you have the following possibilities:

  • QXML
  • QDOM
  • perl
  • khtml

QXML and QDOM require XML-compliant (well-formed) HTML pages, and very few HTML pages are XML-compliant. Perl is outside the scope of this site. This tutorial therefore chooses the khtml approach.
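
To see why a strict XML parser rejects typical HTML, consider a toy well-formedness check. The sketch below uses neither QXML nor QDOM; `isTagBalanced` is a hypothetical helper invented for this illustration. It tests only one of the rules an XML parser enforces, namely that every opening tag is closed in the right order, and it assumes there is no `>` inside attribute values and no CDATA section.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of Qt or KDE: returns true if every
// opening tag in `markup` is closed in the right order.
// Simplification: assumes no '>' inside attribute values and no CDATA.
bool isTagBalanced(const std::string& markup)
{
    std::vector<std::string> open;   // stack of currently open tag names
    std::string::size_type pos = 0;
    while ((pos = markup.find('<', pos)) != std::string::npos) {
        std::string::size_type end = markup.find('>', pos);
        if (end == std::string::npos)
            return false;            // '<' without a closing '>'
        std::string tag = markup.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (tag.empty())
            return false;            // "<>" is not a tag
        if (tag[0] == '!' || tag[0] == '?')
            continue;                // skip doctype and processing instructions
        if (tag[tag.size() - 1] == '/')
            continue;                // self-closing tag such as <br/>
        if (tag[0] == '/') {
            // closing tag: must match the most recently opened tag
            std::string name = tag.substr(1);
            if (open.empty() || open.back() != name)
                return false;
            open.pop_back();
        } else {
            // opening tag: the name ends at the first blank
            open.push_back(tag.substr(0, tag.find(' ')));
        }
    }
    return open.empty();
}
```

Calling `isTagBalanced("<html><body><p>one<br><p>two</body></html>")` returns false, because `<br>` and `<p>` are never closed. Browsers accept such markup, which is why a tolerant HTML parser is the more practical choice here.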

First step

Our first khtml program does nothing visible; it only constructs an empty DOM::HTMLDocument: <highlightSyntax language="cpp">

#include <qstring.h>
#include <kapplication.h>
#include <kaboutdata.h>
#include <kmessagebox.h>
#include <kcmdlineargs.h>
#include <dom/html_document.h>

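int main (int argc, char *argv[]) {

       // Describe the application; KCmdLineArgs must be initialized before KApplication is created
       KAboutData aboutData( "test", "test",
       "1.0", "test", KAboutData::License_GPL,
       "(c) 2006" );
       KCmdLineArgs::init( argc, argv, &aboutData );
       KApplication khello;
       // Construct an empty HTML document and discard it immediately
       DOM::HTMLDocument();
       return 0;

} </highlightSyntax> It can be compiled like this: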

g++ -I/usr/lib/qt3/include -I/opt/kde3/include \
-L/opt/kde3/lib -o khtml khtml.cpp -lkdeui -lkhtml

Showing tags

The next program is more advanced. It parses an HTML snippet held in a string and prints the names of its tags: <highlightSyntax language="cpp">

#include <kapplication.h>
#include <kaboutdata.h>
#include <kcmdlineargs.h>
#include <kdebug.h>
#include <dom/html_document.h>

int main (int argc, char *argv[]) {

       KAboutData aboutData( "test", "test",
       "1.0", "test", KAboutData::License_GPL,
       "(c) 2006" );
       KCmdLineArgs::init( argc, argv, &aboutData );
       KApplication khello;
       DOM::Document* doc=new DOM::Document();
       DOM::DOMString tag("*");   // "*" matches every element
       DOM::DOMString uri("<html><b>test</b></html>");   // the markup itself, not a URI
       doc->loadXML(uri);
       kdDebug() << "Does this doc have child elements ? " << doc->hasChildNodes() << endl;
       kdDebug() << "First child node name: " << doc->firstChild().nodeName().string() << endl;
       kdDebug() << "First grandchild node name: " << doc->firstChild().firstChild().nodeName().string() << endl;
       // Look the node list up once instead of on every loop iteration
       DOM::NodeList elements=doc->getElementsByTagName(tag);
       kdDebug() << "Count of elements in your doc " << elements.length() << endl;
       for (unsigned long i=0; i<elements.length(); i++)
              kdDebug() << elements.item(i).nodeName().string() << endl;
       // Note: sizeof(doc) is the size of the pointer, not of the parsed document
       kdDebug() << "Size of your doc " << sizeof(doc) << endl;
       kdDebug() << doc->toString().string() << endl;
       delete doc;
       return 0;

} </highlightSyntax>
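
The list returned by `getElementsByTagName("*")` can be pictured as a depth-first walk over the element tree in document order. The following sketch does not use KHTML at all; `Node`, `collectElementNames` and `makeSampleTree` are hypothetical names invented for this illustration, and text nodes are left out.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM element; not a KHTML class.
struct Node {
    std::string name;            // tag name, e.g. "html"
    std::vector<Node> children;  // child elements in document order
};

// Appends the name of `node` and of all its descendants to `out`
// in preorder (parent first, then children from left to right).
void collectElementNames(const Node& node, std::vector<std::string>& out)
{
    out.push_back(node.name);
    for (std::vector<Node>::size_type i = 0; i < node.children.size(); ++i)
        collectElementNames(node.children[i], out);
}

// Builds the element tree for <html><b>test</b></html>.
Node makeSampleTree()
{
    Node b;
    b.name = "b";
    Node html;
    html.name = "html";
    html.children.push_back(b);
    return html;
}
```

Passing the result of `makeSampleTree()` to `collectElementNames` fills the vector with `html` followed by `b`. This is only the order of the sketch's own walk; the names and casing that KHTML reports for a real document may differ.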


KDE® and the K Desktop Environment® logo are registered trademarks of KDE e.V.