
Policies/Binary Compatibility Issues With C++: Difference between revisions

From KDE TechBase
== Definition ==
 
A library is '''binary compatible''' if a program linked dynamically to a former version of the library continues running with newer versions of the library without the need to recompile.
 
If a program needs to be recompiled to run with a new version of the library but doesn't require any further modifications, the library is '''source compatible'''.
 
Binary compatibility saves a lot of trouble. It makes it much easier to distribute software for a certain platform. Without ensuring binary compatibility between releases, people will be forced to provide statically linked binaries. Static binaries are bad because they
* waste resources (especially memory)
* don't allow the program to benefit from bugfixes or extensions in the libraries
 
In the KDE project, we will provide binary compatibility within the life-span of a major release.
 
== The Do's and Don'ts ==
 
You can...
 
* add new non-virtual functions including signals and slots.
* add a new enum to a class.
* append new enumerators to an existing enum.
* reimplement virtual functions defined in one of the base classes '''if''' it is safe that programs linked with the prior version of the library call the implementation in the base class rather than the new one. ''This is tricky and might be dangerous. Think twice before doing it. Alternatively see below for a workaround.''
* change an inline function or make an inline function non-inline '''if''' it is safe that programs linked with the prior version of the library call the old implementation. ''This is tricky and might be dangerous. Think twice before doing it.'' Because of this, classes that are supposed to stay binary compatible should '''always''' have a non-inline destructor, even if it's empty; otherwise the compiler will automatically generate an empty inlined one.
* remove private non-virtual functions '''if''' they are not called by any inline functions.
* change the default arguments of a method. It requires recompilation to use the actual new default argument values, though.
* add new '''static''' data members.
* add new classes.
 
You cannot...
* add new virtual functions as this will change the layout of the virtual table and thus break subclasses. ''See below for some workarounds or ask on mailing lists.''
* change the order of virtual functions in the class declaration. This will just as well change the layout of the virtual table.
* change the signature of a function. This includes:
** changing any of the types of the arguments in the parameter list (instead, add a new method)
** changing the return type
** extending a function with another parameter, even if this parameter has a default argument
Suggestion: when adding new functions with the same name and different/extended argument lists, you may want to add a short note that the two functions shall be merged with a default argument in later versions of the library:
 
<code>
void functionname( int a );
void functionname( int a, int b ); //BCI: merge with int b = 0
</code>
 
* change the access rights of functions or data members, for example from <tt>private</tt> to <tt>public</tt>. With some compilers, this information may be part of the signature. If you need to make a private function protected or even public, you have to add a new function that calls the private one.
* add new data members to a class or change the order of data members in a class (doesn't apply to static ones).
* change the class hierarchy apart from adding new classes.
 
You should...
 
To make a class extensible in the future, you should follow these rules:
* add a d-pointer. ''See below''.
* add a non-inline virtual destructor even if the body is empty.
* reimplement <tt>event</tt> in widget classes, even if the body for the function is empty.
* make all constructors non-inline.
* write non-inline implementations of the copy constructor and assignment operator unless the class cannot be copied by value (e.g. classes inherited from QObject can't be)
 
== Techniques for Library Programmers ==
 
The biggest problem when writing libraries is that one cannot safely add data members, since this would change the size and layout of every class, struct, or array containing objects of the type, including subclasses.
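The layout problem can be illustrated with a minimal sketch in plain C++. The types <tt>PointV1</tt>, <tt>PointV2</tt> and <tt>Segment</tt> are hypothetical and stand for two versions of a library class and an application struct that embeds it by value. The offsets shown are what an application compiled against version 1 would have baked into its machine code.

```cpp
#include <cstddef>

// Version 1 of a hypothetical library class.
struct PointV1 {
    int x;
    int y;
};

// Version 2 adds a data member, so every object grows.
struct PointV2 {
    int x;
    int y;
    int z; // new member
};

// An application struct that embeds the library type by value.
template <typename Point>
struct Segment {
    Point from;
    Point to;
    int   color;
};

// The offset of 'color' depends on which library version was used to compile.
const std::size_t colorOffsetV1 = offsetof(Segment<PointV1>, color);
const std::size_t colorOffsetV2 = offsetof(Segment<PointV2>, color);
```

An application built against version 1 keeps using <tt>colorOffsetV1</tt>, while the new library code assumes the larger layout, so both sides disagree about where the members live.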
 
=== Bitflags ===
Bitflags are one exception. If you use bitflags for enums or bools, you can safely round up to at least the next byte minus 1. A class with members
 
<code>
uint m1 : 1;
uint m2 : 3;
uint m3 : 1;
</code>
could become
<code>
uint m1 : 1;
uint m2 : 3;
uint m3 : 1;
uint m4 : 2; // new member
</code>
without breaking binary compatibility. Please round up to a maximum of 7 bits (or 15 if the bitfield was already larger than 8). Using the very last bit may cause problems on some compilers.
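The size claim can be checked at compile time. The following is a minimal sketch, assuming <tt>unsigned int</tt> in place of Qt's <tt>uint</tt> typedef; <tt>FlagsV1</tt> and <tt>FlagsV2</tt> are hypothetical names for the class before and after the change. Bitfield packing is ultimately decided by the platform ABI, so the assertion documents an expectation rather than a language guarantee.

```cpp
// Layout before the change: 5 bits in use.
struct FlagsV1 {
    unsigned int m1 : 1;
    unsigned int m2 : 3;
    unsigned int m3 : 1;
};

// Layout after the change: 7 bits in use, still inside the same storage unit.
struct FlagsV2 {
    unsigned int m1 : 1;
    unsigned int m2 : 3;
    unsigned int m3 : 1;
    unsigned int m4 : 2; // new member
};

// Fails to compile if the new member made the object larger.
static_assert(sizeof(FlagsV1) == sizeof(FlagsV2),
              "adding m4 must not change the object size");
```

Keeping such a <tt>static_assert</tt> next to the class is one way to notice when a later addition spills into a new storage unit.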
 
=== Using a d-Pointer===
Bitflags and predefined reserved variables are nice, but far from sufficient. This is where the d-pointer technique comes into play. The name "d-pointer" stems from Trolltech's Arnt Gulbrandsen, who first introduced the technique into Qt, making it one of the first C++ GUI libraries to maintain binary compatibility even between bigger releases. The technique was quickly adopted as a general programming pattern for the KDE libraries by everyone who saw it. It's a great trick for adding new private data members to a class without breaking binary compatibility.
 
'''Remark:''' The d-pointer pattern has been described many times in computer science history under various names, e.g. as pimpl, as handle/body or as cheshire cat. Google helps in finding online papers for any of these; just add C++ to the search terms.
 
In your class definition for class Foo, define a forward declaration
<code>
class FooPrivate;
</code>
and the d-pointer in the private section:
<code>
private:
FooPrivate* d;
</code>
The FooPrivate class itself is defined purely in the class implementation file (usually *.cpp ), for example:
<code>
class FooPrivate {
public:
    FooPrivate()
        : m1(0), m2(0)
    {}
    int m1;
    int m2;
    QString s;
};
</code>
 
All you have to do now is to create the private data in your constructors or your init function with
<code>
d = new FooPrivate;
</code>
and to delete it again in your destructor with
<code>
delete d;
</code>
 
You may not want all member variables to live in the private data object, though. For very frequently used members, it's faster to put them directly in the class, since inline functions cannot access the d-pointer data. Also note that all data covered by the d-pointer is private by construction. For public or protected access, provide both a set and a get function. Example:
<code>
QString Foo::string() const
{
return d->s;
}
 
void Foo::setString( const QString& s )
{
d->s = s;
}
</code>
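Putting the pieces together, a complete d-pointer class might look like the following minimal sketch. It assumes <tt>std::string</tt> in place of <tt>QString</tt> so that it compiles without Qt, and it shows the header part and the implementation part in one block. The copy constructor and assignment operator are written out because a compiler-generated copy would only duplicate the pointer and lead to a double delete.

```cpp
#include <string>

// --- foo.h: the public header; its layout never changes ---
class FooPrivate; // forward declaration only, the definition stays hidden

class Foo {
public:
    Foo();
    Foo(const Foo& other);
    Foo& operator=(const Foo& other);
    virtual ~Foo(); // non-inline, even though the body is tiny

    std::string string() const;
    void setString(const std::string& s);

private:
    FooPrivate* d; // the only data member the header exposes
};

// --- foo.cpp: the implementation file ---
class FooPrivate {
public:
    FooPrivate() : m1(0), m2(0) {}
    int m1;
    int m2;
    std::string s; // new members can be added here in later versions
};

Foo::Foo() : d(new FooPrivate) {}

// Deep copy so that each Foo owns its own private data.
Foo::Foo(const Foo& other) : d(new FooPrivate(*other.d)) {}

Foo& Foo::operator=(const Foo& other)
{
    if (this != &other)
        *d = *other.d;
    return *this;
}

Foo::~Foo() { delete d; }

std::string Foo::string() const { return d->s; }

void Foo::setString(const std::string& s) { d->s = s; }
```

Because <tt>sizeof(Foo)</tt> depends only on the pointer, members can later be added to <tt>FooPrivate</tt> without affecting applications compiled against the old header.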
 
== Troubleshooting ==
 
=== Adding new data members to classes without d-pointer ===
 
If you don't have free bitflags, reserved variables and no d-pointer either, but you absolutely have to add a new private member variable, there are still some possibilities left. If your class inherits QObject, you can for example place the additional data in a special child and find it by traversing over the list of children. You can access the list of children with QObject::children(). However, a fancier and usually faster approach is to use a hashtable to store a mapping between your object and the extra data. For this purpose, Qt provides a pointer-based dictionary called QHash (or QPtrDict in Qt3).
 
The basic trick in your class implementation of class Foo is:
* Create a private data class FooPrivate.
* Create a static QHash<const Foo *, FooPrivate *>.
* Note that some compilers/linkers (almost all, unfortunately) do not manage to create static objects in shared libraries. They simply forget to call the constructor. Therefore you should use the <tt>Q_GLOBAL_STATIC</tt> macro to create and access the object:
 
<code>
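// BCI: Add a real d-pointer
// The typedef keeps the comma in the template arguments out of the macro call.
typedef QHash<const Foo *, FooPrivate *> FooPrivateHash;
Q_GLOBAL_STATIC(FooPrivateHash, d_func);
// Returns the private data for foo, creating it on first access.
static FooPrivate* d( const Foo* foo )
{
    FooPrivate* ret = d_func()->value( foo, 0 );
    if ( ! ret ) {
        ret = new FooPrivate;
        d_func()->insert( foo, ret );
    }
    return ret;
}
// Frees the private data for foo and drops its hash entry.
static void delete_d( const Foo* foo )
{
    FooPrivate* ret = d_func()->value( foo, 0 );
    delete ret;
    d_func()->remove( foo );
}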
</code>
 
* Now you can use the d-pointer in your class almost as simply as in the code before, just with a function call to d(this). For example:
 
<code>
d(this)->m1 = 5;
</code>
 
* Add a line to your destructor:
<code>
delete_d(this);
</code>
* Do not forget to add a BCI remark, so that the hack can be removed in the next version of the library.
* Do not forget to add a d-pointer to your next class.
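The steps above can also be sketched without Qt. This is a minimal illustration only: it swaps in <tt>std::unordered_map</tt> for <tt>QHash</tt> and a function-local static for <tt>Q_GLOBAL_STATIC</tt>, and the members <tt>id()</tt>, <tt>extra()</tt>, <tt>setExtra()</tt> and the helper <tt>liveExtraBlocks()</tt> are hypothetical names introduced for the example.

```cpp
#include <cstddef>
#include <unordered_map>

// Public class whose layout is frozen: no d-pointer and no spare fields.
class Foo {
public:
    Foo();
    ~Foo();
    int id() const { return m_id; }
    int extra() const;      // backed by data stored outside the object
    void setExtra(int v);
private:
    int m_id;
};

// --- implementation file ---
struct FooPrivate {
    FooPrivate() : m1(0) {}
    int m1;
};

typedef std::unordered_map<const Foo*, FooPrivate*> FooPrivateMap;

// Function-local static: constructed on first use, never before.
static FooPrivateMap& d_func()
{
    static FooPrivateMap map;
    return map;
}

// Returns the extra data for foo, creating it on first access.
static FooPrivate* d(const Foo* foo)
{
    FooPrivate*& slot = d_func()[foo];
    if (!slot)
        slot = new FooPrivate;
    return slot;
}

// Frees the extra data for foo, if any was ever created.
static void delete_d(const Foo* foo)
{
    FooPrivateMap::iterator it = d_func().find(foo);
    if (it != d_func().end()) {
        delete it->second;
        d_func().erase(it);
    }
}

Foo::Foo() : m_id(0) {}
Foo::~Foo() { delete_d(this); }
int Foo::extra() const { return d(this)->m1; }
void Foo::setExtra(int v) { d(this)->m1 = v; }

// Number of objects that currently own extra data (for diagnostics).
std::size_t liveExtraBlocks() { return d_func().size(); }
```

Note that this sketch is not thread-safe and does not carry the extra data over when a <tt>Foo</tt> is copied; both would need attention in real code.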
 
=== Adding a reimplemented virtual function ===
 
<p>As already explained, you can safely reimplement a virtual function defined
in one of the base classes only if it is safe for programs linked
with the prior version to call the implementation in the base class rather than
the new one. This is because the compiler sometimes calls virtual functions
directly, bypassing the virtual table, when it can determine which one to call.
For example, if you have
<table align="center"  width="80%"><tr><td BGCOLOR="#F0F0FF"><pre>
void C::foo()
{
B::foo();
}
</pre></td></tr></table>
then B::foo() is called directly. If class B inherits from class A which implements
foo() and B itself doesn't reimplement it, then C::foo() will in fact call A::foo().
If a newer version of the library adds B::foo(), C::foo() will call it only after
a recompilation.</p>
 
<p>Another, more common example: if you have
<table align="center"  width="80%"><tr><td BGCOLOR="#F0F0FF"><pre>
B b; // B derives from A
b.foo();
</pre></td></tr></table>
then the call to foo() will not use the virtual table. That means that
if B::foo() didn't exist in the library but now does, code that was
compiled with the earlier version will still call A::foo().</p>
 
<p>If you can't guarantee things will continue to work without a recompilation, move
functionality from A::foo() to a new protected function A::foo2() and use this code:
<table align="center"  width="80%"><tr><td BGCOLOR="#F0F0FF"><pre>
void A::foo()
{
if( B* b = dynamic_cast< B* >( this ))
b->B::foo(); // B:: is important
else
foo2();
}
void B::foo()
{
// added functionality
A::foo2(); // call base function with real functionality
}
</pre></td></tr></table>
All calls to A::foo() for objects of type B (or classes derived from it) will result in calling B::foo().
The only calls that will not work as expected are those that explicitly qualify the call as
A::foo(). B::foo() itself calls A::foo2() instead, and there should be no other places that do so.
</p>
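The workaround can be tried out with a minimal, self-contained sketch. The return values and the helper <tt>callLikeOldBinary()</tt> are hypothetical additions for illustration: the helper uses an explicitly qualified call to mimic an old binary in which the compiler bound the call to A::foo() directly.

```cpp
#include <string>

class A {
public:
    virtual ~A();
    virtual std::string foo();
protected:
    std::string foo2(); // holds the original body of A::foo()
};

class B : public A {
public:
    std::string foo() override; // added in the new library version
};

A::~A() {}

std::string A::foo()
{
    // Old binaries end up here even for B objects, so forward manually.
    if (B* b = dynamic_cast<B*>(this))
        return b->B::foo(); // B:: is important: no virtual dispatch
    return foo2();
}

std::string A::foo2() { return "A"; }

std::string B::foo()
{
    // Added functionality, then the base behaviour via foo2().
    // Calling A::foo() here instead would recurse forever.
    return "B+" + A::foo2();
}

// Mimics an old binary: a non-virtual call bound to A::foo().
std::string callLikeOldBinary(B& b) { return b.A::foo(); }
```

With this in place, both the virtual call <tt>b.foo()</tt> and the directly bound call reach B::foo(), while plain A objects still run the original code.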
 
=== Using a new class ===
 
<p>A relatively simple method of "extending" a class is to write a replacement
class that also includes the new functionality (and that may inherit from the old
class to reuse the code). This of course requires adapting and recompiling applications that use
the library, so it cannot be used to fix or extend the functionality of classes
that are used by applications compiled against an older version of the library. However,
especially with small and/or performance-critical classes, it may be simpler to write
them without making sure they will be easy to extend in the future and, if the need arises
later, to write a new replacement class that provides new features or better performance.</p>
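As a minimal sketch of this approach, the hypothetical classes below show an old class that is left untouched and a replacement that inherits from it to reuse the code. The names <tt>IntStack</tt>, <tt>IntStack2</tt> and <tt>top()</tt> are assumptions made for the example.

```cpp
#include <vector>

// Shipped in the old library version; left completely untouched.
class IntStack {
public:
    void push(int v);
    int pop();            // caller must check isEmpty() first
    bool isEmpty() const;
private:
    std::vector<int> items;
};

// Added in the new library version; applications opt in by switching types.
class IntStack2 : public IntStack {
public:
    int top();            // the new functionality
};

void IntStack::push(int v) { items.push_back(v); }

int IntStack::pop()
{
    int v = items.back();
    items.pop_back();
    return v;
}

bool IntStack::isEmpty() const { return items.empty(); }

// Built only from the old public interface, so IntStack stays unchanged.
int IntStack2::top()
{
    int v = pop();
    push(v);
    return v;
}
```

Old applications keep using <tt>IntStack</tt> exactly as before; only code that is modified to use <tt>IntStack2</tt> sees the new function.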
 
=== Adding new virtual functions to leaf classes ===
<p>
This technique is a special case of using a new class. It can help if there's a need to add
new virtual functions to a class that should stay binary compatible and there is no class
inheriting from it that should also stay binary compatible (i.e. all classes inheriting from it are
in applications). In such a case it's possible to add a new class, inheriting from the original one,
that adds them. Applications using the new functionality will of course have to be modified
to use the new class.
<table align="center"  width="80%"><tr><td BGCOLOR="#F0F0FF"><pre>
class A {
public:
virtual void foo();
};
class B : public A { // newly added class
public:
virtual void bar(); // newly added virtual function
};
void A::foo()
{
// here it's needed to call a new virtual function
if( B* this2 = dynamic_cast< B* >( this ))
this2->bar();
}
</pre></td></tr></table>
It is not possible to use this technique when there are other inherited classes that should also
stay binary compatible because they'd have to inherit from the new class.</p>
 
 
=== Using signals instead of virtual functions ===
<p>
Qt's signals and slots are invoked using a special virtual method that is created by the Q_OBJECT macro
and exists in every class inherited from QObject. Therefore adding new
signals and slots doesn't affect binary compatibility, and the signals/slots mechanism can be
used to emulate virtual functions.
<table align="center"  width="80%"><tr><td BGCOLOR="#F0F0FF"><pre>
class A : public QObject {
Q_OBJECT
public:
A();
virtual void foo();
signals:
void bar( int* ); // added new "virtual" function
protected slots:
void barslot( int* ); // implementation of the virtual function in A
};
 
A::A()
{
connect( this, SIGNAL( bar( int* )), this, SLOT( barslot( int* )));
}
 
void A::foo()
{
int ret;
emit bar( &ret );
}
 
void A::barslot( int* ret )
{
*ret = 10;
}
</pre></td></tr></table>
Function bar() will act like a virtual function; barslot() implements its actual functionality.
Since signals have a void return value, data must be returned using arguments. As
only one slot will be connected to the signal, returning data from the slot this way will
work without problems. Note that with Qt4 the connection type has to be
Qt::DirectConnection for this to work.</p>
<p>
If a derived class wants to reimplement the functionality of bar(), it has to provide
its own slot:
<table align="center"  width="80%"><tr><td BGCOLOR="#F0F0FF"><pre>
class B : public A {
Q_OBJECT
public:
B();
protected slots: // necessary to specify as a slot again
void barslot( int* ); // reimplemented functionality of bar()
};
 
B::B()
{
disconnect( this, SIGNAL( bar( int* )), this, SLOT( barslot( int* )));
connect( this, SIGNAL( bar( int* )), this, SLOT( barslot( int* )));
}
 
void B::barslot( int* ret )
{
*ret = 20;
}
</pre></td></tr></table>
Now B::barslot() will act like a virtual reimplementation of A::bar(). Note that it is necessary
to specify barslot() again as a slot in B and that the constructor must first
disconnect and then connect again; this disconnects A::barslot() and connects B::barslot()
instead.</p>
 
<p>Note: the same can be accomplished by implementing a virtual slot.</p>
 
<p align="right"> <small> <em>
Matthias Ettrich<br>
Lubos Lunak
</em></small></p>

Latest revision as of 18:24, 10 March 2016

This page is now on the community wiki.