Policies/Binary Compatibility Examples: Difference between revisions

From KDE TechBase
(Add a section on changing the type of global data)
m (Text replace - "</code>" to "</syntaxhighlight>")
Line 7: Line 7:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class KDECORE_EXPORT KUrl
class KDECORE_EXPORT KUrl
{
{
   // [...]
   // [...]
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class KUrl
class KUrl
{
{
   // [...]
   // [...]
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': the symbols for the class above are not added to the exported symbols list of the library, so other libraries and applications cannot see them.


== Change the class hierarchy ==
== Change the class hierarchy ==
Line 27: Line 29:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass
class MyClass: public BaseClass
{
{
   // [...]
   // [...]
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass, public OtherBaseClass
class MyClass: public BaseClass, public OtherBaseClass
{
{
   // [...]
   // [...]
};
};
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass1, public BaseClass2
class MyClass: public BaseClass1, public BaseClass2
{
{
   // [...]
   // [...]
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass2, public BaseClass1
class MyClass: public BaseClass2, public BaseClass1
{
{
   // [...]
   // [...]
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': the size and/or order of member data in the class changes, causing existing code to allocate too much or too little memory, or to read/write data at the wrong offsets.


== Change the template arguments of a template class ==
== Change the template arguments of a template class ==
Line 60: Line 64:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
template<typename T1>
template<typename T1>
class MyTemplateClass
class MyTemplateClass
Line 66: Line 70:
     // [...]
     // [...]
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
template<typename T1, typename T2 = void>
template<typename T1, typename T2 = void>
class MyTemplateClass
class MyTemplateClass
Line 73: Line 77:
     // [...]
     // [...]
};
};
</code>
</syntaxhighlight>
|}
|}
<syntaxhighlight lang="cpp-qt">
// GCC mangling before: _Z3foo15MyTemplateClassIiE
//              after:  _Z3foo15MyTemplateClassIivE
void foo(MyTemplateClass<int>);
</syntaxhighlight>
'''Reason''': the mangling of the functions related to this template type changes because its template expansion changes too. This can happen both for member functions (for example, the constructor) and for functions that take it as a parameter.


== Unexport a function ==
== Unexport a function ==
Line 82: Line 94:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
Q_CORE_EXPORT const char *qVersion();
Q_CORE_EXPORT const char *qVersion();
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
const char *qVersion();
const char *qVersion();
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
namespace KSocketFactory {
namespace KSocketFactory {
     KDECORE_EXPORT QTcpSocket *connectToHost(...);
     KDECORE_EXPORT QTcpSocket *connectToHost(...);
}
}
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
namespace KSocketFactory {
namespace KSocketFactory {
     QTcpSocket *connectToHost(...);
     QTcpSocket *connectToHost(...);
}
}
</code>
</syntaxhighlight>
|}
|}
'''Reason''': the symbols for the functions above are not added to the exported symbols list of the library, so other libraries and applications cannot see them.


== Inline a function ==
== Inline a function ==
Line 107: Line 121:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
int square(int n);
int square(int n);
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
inline int square(int n) { return n * n; }
inline int square(int n) { return n * n; }
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
int square(int n) { return n * n; }
int square(int n) { return n * n; }
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
inline int square(int n) { return n * n; }
inline int square(int n) { return n * n; }
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class Math
class Math
{
{
Line 130: Line 144:
int Math::square(int n)
int Math::square(int n)
{ return n * n; }
{ return n * n; }
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class Math
class Math
{
{
Line 140: Line 154:
inline int Math::square(int n)
inline int Math::square(int n)
{ return n * n; }
{ return n * n; }
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class Math
class Math
{
{
Line 151: Line 165:
int Math::square(int n)
int Math::square(int n)
{ return n * n; }
{ return n * n; }
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class Math
class Math
{
{
Line 158: Line 172:
     { return n * n; }
     { return n * n; }
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': when a function is declared inline and the compiler does inline it at its call point, the compiler does not have to emit an out-of-line copy. Existing code that was calling this function will therefore no longer be able to resolve it. Also, when compiling with GCC and <tt>-fvisibility-inlines-hidden</tt>, if the compiler does emit an out-of-line copy, it will be hidden (not added to the exported symbols table) and therefore not accessible from other libraries.


== Change the parameters of a function ==
== Change the parameters of a function ==
Line 167: Line 183:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingii
// MSVC mangling: ?doSomething@@YAXHH@Z
void doSomething(int i1, int i2);
void doSomething(int i1, int i2);
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingis
// MSVC mangling: ?doSomething@@YAXHF@Z
void doSomething(int i1, short i2);
void doSomething(int i1, short i2);
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingii
// MSVC mangling: ?doSomething@@YAXHH@Z
void doSomething(int i1, int i2);
void doSomething(int i1, int i2);
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingiii
// MSVC mangling: ?doSomething@@YAXHHH@Z
void doSomething(int i1, int i2, int i3 = 0);
void doSomething(int i1, int i2, int i3 = 0);
</code>
</syntaxhighlight>
|-
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingRi
// MSVC mangling: ?doSomething@@YAXAAH@Z
void doSomething(int &i1);
</syntaxhighlight>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingRKi
// MSVC mangling: ?doSomething@@YAXABH@Z
void doSomething(const int &i1);
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
void doSomething(int i1);
void doSomething(int i1);
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
void doSomething(const int i1); // breaks with MSVC and Sun CC
void doSomething(const int i1); // breaks with Sun CC
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingPc
// MSVC mangling: ?doSomething@@YAXPAD@Z (32-bit)
void doSomething(char *ptr);
void doSomething(char *ptr);
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z11doSomethingPKc
// MSVC mangling: ?doSomething@@YAXPBD@Z (32-bit)
void doSomething(const char *ptr);
void doSomething(const char *ptr);
</code>
</syntaxhighlight>
|}
|}
'''Reason''': changing the parameters of a function (adding new or changing existing) changes the mangled name of that function. The reason for that is that the C++ language allows overloading of functions with the same name but slightly different parameters.
The mangled name for the Sun CC example above is not listed here; that compiler enforces the constness of POD types in both the declaration and the implementation.


== Change the return type ==
== Change the return type ==
Line 202: Line 245:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z8positionv
// MSVC mangling: ?position@@YA_JXZ
qint64 position();
</syntaxhighlight>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z8positionv
// MSVC mangling: ?position@@YAHXZ
int position();
</syntaxhighlight>
|-
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YA?AVQByteArray@@XZ
QByteArray name();
</syntaxhighlight>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YA?AVQString@@XZ
QString name();
</syntaxhighlight>
|-
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YAPBDXZ
const char *name();
</syntaxhighlight>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YA?AVQString@@XZ
QString name();
</syntaxhighlight>
|-
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z12createDevicev
// MSVC mangling: ?createDevice@@YAPAVQTcpSocket@@XZ
QTcpSocket *createDevice();
QTcpSocket *createDevice();
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _Z12createDevicev (unchanged)
// MSVC mangling: ?createDevice@@YAPAVQIODevice@@XZ
QIODevice *createDevice();
QIODevice *createDevice();
</code>
</syntaxhighlight>
|-
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _ZNK10QByteArray2atEi
// MSVC mangling: ?at@QByteArray@@QBA?BDH@Z
const char QByteArray::at(int) const;
</syntaxhighlight>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _ZNK10QByteArray2atEi (unchanged)
// MSVC mangling: ?at@QByteArray@@QBADH@Z
char QByteArray::at(int) const;
</syntaxhighlight>
|-
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _ZN6QEvent17registerEventTypeEi
// MSVC mangling: ?registerEventType@QEvent@@QAAHH@Z
int QEvent::registerEventType(int)
</syntaxhighlight>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: _ZN6QEvent17registerEventTypeEi (unchanged)
// MSVC mangling: ?registerEventType@QEvent@@QAAXW4Type@V0@@@Z
QEvent::Type QEvent::registerEventType(int)
</syntaxhighlight>
|}
|}
'''Reason''': changing the return type changes the mangled name of the function in some compilers (GCC notably does not encode the return type). However, even if the mangling doesn't change, the convention for how the return value is passed back may change.
In the first example above, the return type changed from a 64-bit to a 32-bit integer, which means that on some architectures the upper half of the return register may contain garbage. In the second example, the return type changed from QByteArray to QString, which are two incompatible types.
In the third example, the return type changed from a pointer (a POD) to a QString. In this case, the caller usually needs to pass a hidden implicit first parameter that points to where the result should be constructed, and existing callers won't supply it. Existing code calling the function will more than likely crash, because the function dereferences an implicit QString* parameter that was never passed.
In the last example, the return type changed from one POD type (an int) to another (an enum), which is also carried by an int. The calling sequence is most likely the same in all compilers; however, on compilers that encode the return type, the mangled symbol name changed, meaning that calls will fail due to an unresolved symbol.


== Change the access rights ==
== Change the access rights ==
Line 216: Line 326:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
protected:
protected:
    // GCC mangling: _ZN7MyClass11doSomethingEv
    // MSVC mangling: ?doSomething@MyClass@@IAAXXZ
     void doSomething();
     void doSomething();
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
public:
public:
    // GCC mangling: _ZN7MyClass11doSomethingEv (unchanged)
    // MSVC mangling: ?doSomething@MyClass@@QAAXXZ
     void doSomething();
     void doSomething();
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': some compilers encode the protection type of a function in its mangled name.


== Change the CV-qualifiers of a member function ==
== Change the CV-qualifiers of a member function ==
Line 238: Line 354:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
public:
public:
    // GCC mangling: _ZNK7MyClass9somethingEv
    // MSVC mangling: ?something@MyClass@@QBAHXZ
     int something() const;
     int something() const;
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
public:
public:
    // GCC mangling: _ZN7MyClass9somethingEv
    // MSVC mangling: ?something@MyClass@@QAAHXZ
     int something();
     int something();
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': compilers encode the constness of a member function in its mangled name. They all do so because the C++ standard allows overloading member functions that differ only in constness.


== Change the type of global data ==
== Change the type of global data ==
Line 260: Line 382:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: data (undecorated)
// MSVC mangling: ?data@@3HA
// MSVC mangling: ?data@@3HA
int data = 42;
int data = 42;
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// GCC mangling: data (undecorated)
// MSVC mangling: ?data@@3FA
// MSVC mangling: ?data@@3FA
short data = 42;
short data = 42;
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
public:
public:
    // GCC mangling: _ZN7MyClass4dataE
     // MSVC mangling: ?data@MyClass@@2HA
     // MSVC mangling: ?data@MyClass@@2HA
     static int data;
     static int data;
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
public:
public:
    // GCC mangling: _ZN7MyClass4dataE
     // MSVC mangling: ?data@MyClass@@2FA
     // MSVC mangling: ?data@MyClass@@2FA
     static short data;
     static short data;
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': some compilers encode the type of the global data in its mangled name. Especially note that some compilers mangle even for simple data types that would be allowed in C, meaning the <tt>extern "C"</tt> qualifier makes a difference too.
Even if the mangling doesn't change, changing the type often changes the size of the data as well. That means code that was accessing the global data may access too many or too few bytes.


== Change the CV-qualifiers of global data ==
== Change the CV-qualifiers of global data ==
Line 293: Line 423:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// MSVC mangling: ?data@@3HA
// MSVC mangling: ?data@@3HA
int data = 42;
int data = 42;
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
// MSVC mangling: ?data@@3HB
// MSVC mangling: ?data@@3HB
const int data = 42;
const int data = 42;
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 309: Line 439:
     static int data;
     static int data;
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 317: Line 447:
     static const int data;
     static const int data;
};
};
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 325: Line 455:
     static int data;
     static int data;
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 333: Line 463:
     static const int data = 42;
     static const int data = 42;
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': some compilers encode the CV-qualifiers of the global data in its mangled name. Especially note that a static const value declared in the class itself can be considered for "inlining" -- that is, the compiler doesn't need to generate an external symbol for the value since all implementations are guaranteed to know it.
Even for compilers that don't encode the CV-qualifiers of global data, adding const may make the compiler place the variable in a read-only section of memory. Code that tries to write to it will probably crash.


== Add a virtual member function to a class without any ==
== Add a virtual member function to a class without any ==
Line 342: Line 476:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
struct Data
struct Data
{
{
     int i;
     int i;
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
struct Data
struct Data
{
{
Line 354: Line 488:
     virtual int j();
     virtual int j();
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': a class without any virtual members or bases is guaranteed to be exactly like a C structure, for compatibility with that language (that is a POD structure). On some compilers, structures/classes with bases that are POD themselves are also POD. However, as soon as there's one virtual base or virtual member function, the compiler is free to arrange the structure in a C++ manner, which usually means inserting a hidden pointer at the beginning or the end of the structure, pointing to the virtual table of that class. This changes the size and offset of the elements in the structure.


== Add new virtuals to a non-leaf class ==
== Add new virtuals to a non-leaf class ==
Line 363: Line 499:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 370: Line 506:
     virtual void foo();
     virtual void foo();
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 379: Line 515:
     virtual void bar();
     virtual void bar();
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': the addition of a new virtual function to a class that is non-leaf (that is, there is at least one class deriving from this class) changes the layout of the virtual table (the virtual table is basically a list of function pointers, pointing to the functions that are active in this class level). To accommodate the new virtual, the compiler must add a new entry to this table, but existing derived classes won't know about it and will not have the entry in their virtual tables.


== Change the order of the declaration of virtual functions ==
== Change the order of the declaration of virtual functions ==
Line 388: Line 526:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 396: Line 534:
     virtual void bar();
     virtual void bar();
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass
class MyClass
{
{
Line 405: Line 543:
     virtual void foo();
     virtual void foo();
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': the compiler places the pointers to the functions implementing the virtual functions in the order that they are declared in the class. By changing the order of the declaration, the order of the entries in the virtual table changes too.
Note: the order is inherited from the parent classes, so overriding a virtual will allocate the entry in the parent's order.


== Override a virtual that doesn't come from a primary base ==
== Override a virtual that doesn't come from a primary base ==
<code cppqt>
<syntaxhighlight lang="cpp-qt">
class PrimaryBase
class PrimaryBase
{
{
Line 420: Line 562:
{
{
public:
public:
     virtual ~PrimaryBase();
     virtual ~SecondaryBase();
     virtual void bar();
     virtual void bar();
};
};
</code>
</syntaxhighlight>
{|
{|
|-
|-
Line 429: Line 571:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public PrimaryBase, public SecondaryBase
class MyClass: public PrimaryBase, public SecondaryBase
{
{
Line 436: Line 578:
     void foo();
     void foo();
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public PrimaryBase, public SecondaryBase
class MyClass: public PrimaryBase, public SecondaryBase
{
{
Line 445: Line 587:
     void bar();
     void bar();
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': this is a tricky case. When dealing with multiple-inheritance of classes with virtual tables, the compiler must create multiple virtual tables to guarantee polymorphism works (that is, when your MyClass object is stored in a PrimaryBase or SecondaryBase pointer). The virtual table for the primary base is shared with the class's own virtual table, because they have the same layout at the beginning. However, if you override a virtual coming from a non-primary base, it is the same as adding a new virtual, since that primary base did not have the virtual by that name.
'''Note''': this applies to any case of multiple-inheritance, even if it's not a direct base. In the example above, if we had MyOtherClass deriving from MyClass, the same restriction would apply.


== Override a virtual with a covariant return with different top address ==
== Override a virtual with a covariant return with different top address ==
<code cppqt>
<syntaxhighlight lang="cpp-qt">
struct Data1 { int i; };
struct Data1 { int i; };
class BaseClass
class BaseClass
Line 460: Line 606:
struct Complex1: Data0, Data1 { };
struct Complex1: Data0, Data1 { };
struct Complex2: virtual Data1 { };
struct Complex2: virtual Data1 { };
</code>
</syntaxhighlight>
{|
{|
|-
|-
Line 466: Line 612:
! After
! After
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass
class MyClass: public BaseClass
{
{
public:
public:
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass
class MyClass: public BaseClass
{
{
Line 478: Line 624:
     Complex1 *get();
     Complex1 *get();
};
};
</code>
</syntaxhighlight>
|-
|-
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass
class MyClass: public BaseClass
{
{
public:
public:
};
};
</code>
</syntaxhighlight>
| <code cppqt>
| <syntaxhighlight lang="cpp-qt">
class MyClass: public BaseClass
class MyClass: public BaseClass
{
{
Line 492: Line 638:
     Complex2 *get();
     Complex2 *get();
};
};
</code>
</syntaxhighlight>
|}
|}
'''Reason''': this is another tricky case, like the above one and also for the same reason: the compiler must add a second entry to the virtual table, just as if a new virtual function had been added, which changes the layout of the virtual table and breaks derived classes.
Covariant calls happen when the function overriding a virtual from a parent class returns a class different from the parent (this is allowed by the C++ standard, so the code above is perfectly valid and calling <tt>p->get()</tt> with p of type <tt>BaseClass</tt> will call <tt>MyClass::get</tt>). If the more-derived type doesn't have the same top-address such as <tt>Complex1</tt> and <tt>Complex2</tt> above, when compared to <tt>Data1</tt>, then the compiler needs to generate a stub function (usually called a "thunk") to adjust the value of the pointer returned. It places the address to that thunk in the entry corresponding to the parent's virtual function in the virtual table. However, it also adds a new entry for calls made which return the new top-address.
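
The "different top address" condition can be checked directly. The sketch below is only an illustration: the definition of <tt>Data0</tt> is an assumption (the source elides it), and the helper names <tt>topAddressDiffers</tt> and <tt>demo</tt> are hypothetical. It shows that converting a <tt>Complex1*</tt> to <tt>Data1*</tt> changes the pointer value on common ABIs, which is the adjustment a covariant-return thunk would have to perform.

```cpp
#include <cassert>

struct Data0 { int i0; };            // assumed definition; elided in the source
struct Data1 { int i; };
struct Complex1 : Data0, Data1 { };  // Data1 is the second base here

// True when a Complex1* and the Data1* obtained from it hold different addresses.
bool topAddressDiffers()
{
    Complex1 c;
    Complex1 *derived = &c;
    Data1 *base = derived;           // implicit conversion adjusts the pointer
    return static_cast<void *>(base) != static_cast<void *>(derived);
}

void demo()
{
    // Expected on mainstream ABIs, where Data0 is laid out first.
    assert(topAddressDiffers());
}
```

The exact layout is ABI-defined rather than mandated by the C++ standard, so the result should be read as typical rather than guaranteed.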

Revision as of 20:57, 29 June 2011

This page is meant as examples of things you cannot do in C++ when maintaining binary compatibility.

Unexport or remove a class

Before After
class KDECORE_EXPORT KUrl
{
   // [...]
};
class KUrl
{
   // [...]
};

Reason: the symbols for the class above are not added to the exported symbols list of the library, so other libraries and applications cannot see them.

Change the class hierarchy

Before After
class MyClass: public BaseClass
{
   // [...]
};
class MyClass: public BaseClass, public OtherBaseClass
{
   // [...]
};
class MyClass: public BaseClass1, public BaseClass2
{
   // [...]
};
class MyClass: public BaseClass2, public BaseClass1
{
   // [...]
};

Reason: the size and/or order of member data in the class changes, causing existing code to allocate too much or too little memory, or to read/write data at the wrong offsets.
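
A minimal sketch of the offset change, assuming two base classes with one data member each. The names MyClassBefore, MyClassAfter, baseOffset and demo are hypothetical, introduced only so both layouts can coexist in one program; the member types are also an assumption.

```cpp
#include <cassert>
#include <cstddef>

struct BaseClass1 { int a; };
struct BaseClass2 { double b; };

// Hypothetical stand-ins for MyClass before and after the change.
struct MyClassBefore : BaseClass1, BaseClass2 { };
struct MyClassAfter  : BaseClass2, BaseClass1 { };

// Byte offset of the Base subobject within a Derived object.
template <typename Base, typename Derived>
std::ptrdiff_t baseOffset(const Derived &object)
{
    const Base *base = &object;  // derived-to-base conversion applies the offset
    return reinterpret_cast<const char *>(base)
         - reinterpret_cast<const char *>(&object);
}

void demo()
{
    MyClassBefore before{};
    MyClassAfter after{};
    // On common ABIs the bases are laid out in declaration order,
    // so swapping them moves BaseClass1 to a different offset.
    assert(baseOffset<BaseClass1>(before) != baseOffset<BaseClass1>(after));
}
```

Code compiled against the old header has the old offset baked in, so it would read or write BaseClass1's members in the wrong place once the library switches to the new layout.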

Change the template arguments of a template class

Before After
template<typename T1>
class MyTemplateClass
{
    // [...]
};
template<typename T1, typename T2 = void>
class MyTemplateClass
{
    // [...]
};
// GCC mangling before: _Z3foo15MyTemplateClassIiE
//              after:  _Z3foo15MyTemplateClassIivE
void foo(MyTemplateClass<int>);

Reason: the mangling of the functions related to this template type changes because its template expansion changes too. This can happen both for member functions (for example, the constructor) and for functions that take it as a parameter.
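
The following sketch illustrates why the extra argument ends up in the symbol: once the parameter is added, MyTemplateClass<int> is just a shorter spelling of MyTemplateClass<int, void>. The helper names defaultArgumentIsPartOfType and demo are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

template<typename T1, typename T2 = void>
class MyTemplateClass { };

// True when MyTemplateClass<int> names the same type as MyTemplateClass<int, void>.
constexpr bool defaultArgumentIsPartOfType()
{
    return std::is_same<MyTemplateClass<int>, MyTemplateClass<int, void>>::value;
}

void demo()
{
    assert(defaultArgumentIsPartOfType());
}
```

Because the defaulted argument is part of the type, it is also part of every mangled name that mentions the type, even though the source code of callers does not change.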

Unexport a function

Before After
Q_CORE_EXPORT const char *qVersion();
const char *qVersion();
namespace KSocketFactory {
    KDECORE_EXPORT QTcpSocket *connectToHost(...);
}
namespace KSocketFactory {
    QTcpSocket *connectToHost(...);
}

Reason: the symbols for the functions above are not added to the exported symbols list of the library, so other libraries and applications cannot see them.

Inline a function

Before After
int square(int n);
inline int square(int n) { return n * n; }
int square(int n) { return n * n; }
inline int square(int n) { return n * n; }
class Math
{
    int square(int n);
};

// the following could be in a .cpp
int Math::square(int n)
{ return n * n; }
class Math
{
    int square(int n);
};

// the following could be in a .cpp
inline int Math::square(int n)
{ return n * n; }
class Math
{
    int square(int n);
};

// the following could be in a .cpp
int Math::square(int n)
{ return n * n; }
class Math
{
    int square(int n)
    { return n * n; }
};

Reason: when a function is declared inline and the compiler does inline it at its call point, the compiler does not have to emit an out-of-line copy. Existing code that was calling this function will therefore no longer be able to resolve it. Also, when compiling with GCC and -fvisibility-inlines-hidden, if the compiler does emit an out-of-line copy, it will be hidden (not added to the exported symbols table) and therefore not accessible from other libraries.

Change the parameters of a function

Before After
// GCC mangling: _Z11doSomethingii
// MSVC mangling: ?doSomething@@YAXHH@Z
void doSomething(int i1, int i2);
// GCC mangling: _Z11doSomethingis
// MSVC mangling: ?doSomething@@YAXHF@Z
void doSomething(int i1, short i2);
// GCC mangling: _Z11doSomethingii
// MSVC mangling: ?doSomething@@YAXHH@Z
void doSomething(int i1, int i2);
// GCC mangling: _Z11doSomethingiii
// MSVC mangling: ?doSomething@@YAXHHH@Z
void doSomething(int i1, int i2, int i3 = 0);
// GCC mangling: _Z11doSomethingRi
// MSVC mangling: ?doSomething@@YAXAAH@Z
void doSomething(int &i1);
// GCC mangling: _Z11doSomethingRKi
// MSVC mangling: ?doSomething@@YAXABH@Z
void doSomething(const int &i1);
void doSomething(int i1);
void doSomething(const int i1); // breaks with Sun CC
// GCC mangling: _Z11doSomethingPc
// MSVC mangling: ?doSomething@@YAXPAD@Z (32-bit)
void doSomething(char *ptr);
// GCC mangling: _Z11doSomethingPKc
// MSVC mangling: ?doSomething@@YAXPBD@Z (32-bit)
void doSomething(const char *ptr);

Reason: changing the parameters of a function (adding new or changing existing) changes the mangled name of that function. The reason for that is that the C++ language allows overloading of functions with the same name but slightly different parameters.
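
A small sketch of the overloading rule behind this. The int return values are an illustration-only change so the two overloads can be told apart (the functions in the table return void), and demo is a hypothetical helper.

```cpp
#include <cassert>

// Two distinct functions sharing one name; each needs its own symbol.
int doSomething(int, int)   { return 1; }
int doSomething(int, short) { return 2; }

void demo()
{
    short s = 2;
    assert(doSomething(1, 2) == 1);  // exact match for (int, int)
    assert(doSomething(1, s) == 2);  // exact match for (int, short)
}
```

Since both functions can exist in the same library at once, the linker has to be able to distinguish them, which is why the parameter types are encoded in the symbol name.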

The mangled name for the Sun CC example above is not listed here; that compiler enforces the constness of POD types in both the declaration and the implementation.

Change the return type

Before After
// GCC mangling: _Z8positionv
// MSVC mangling: ?position@@YA_JXZ
qint64 position();
// GCC mangling: _Z8positionv
// MSVC mangling: ?position@@YAHXZ
int position();
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YA?AVQByteArray@@XZ
QByteArray name();
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YA?AVQString@@XZ
QString name();
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YAPBDXZ
const char *name();
// GCC mangling: _Z4namev
// MSVC mangling: ?name@@YA?AVQString@@XZ
QString name();
// GCC mangling: _Z12createDevicev
// MSVC mangling: ?createDevice@@YAPAVQTcpSocket@@XZ
QTcpSocket *createDevice();
// GCC mangling: _Z12createDevicev (unchanged)
// MSVC mangling: ?createDevice@@YAPAVQIODevice@@XZ
QIODevice *createDevice();
// GCC mangling: _ZNK10QByteArray2atEi
// MSVC mangling: ?at@QByteArray@@QBA?BDH@Z
const char QByteArray::at(int) const;
// GCC mangling: _ZNK10QByteArray2atEi (unchanged)
// MSVC mangling: ?at@QByteArray@@QBADH@Z
char QByteArray::at(int) const;
// GCC mangling: _ZN6QEvent17registerEventTypeEi
// MSVC mangling: ?registerEventType@QEvent@@QAAHH@Z
int QEvent::registerEventType(int)
// GCC mangling: _ZN6QEvent17registerEventTypeEi (unchanged)
// MSVC mangling: ?registerEventType@QEvent@@QAAXW4Type@V0@@@Z
QEvent::Type QEvent::registerEventType(int)

Reason: changing the return type changes the mangled name of the function in some compilers (GCC notably does not encode the return type). However, even if the mangling doesn't change, the convention for how the return value is passed back may change.

In the first example above, the return type changed from a 64-bit to a 32-bit integer, which means that on some architectures the upper half of the return register may contain garbage. In the second example, the return type changed from QByteArray to QString, which are two incompatible types.

In the third example, the return type changed from a pointer (a POD) to a QString. In this case, the caller usually needs to pass a hidden implicit first parameter that points to where the result should be constructed, and existing callers won't supply it. Existing code calling the function will more than likely crash, because the function dereferences an implicit QString* parameter that was never passed.

In the last example, the return type changed from one POD type (an int) to another (an enum), which is also carried by an int. The calling sequence is most likely the same in all compilers; however, on compilers that encode the return type, the mangled symbol name changed, meaning that calls will fail due to an unresolved symbol.

Change the access rights

Before After
class MyClass
{
protected:
    // GCC mangling: _ZN7MyClass11doSomethingEv
    // MSVC mangling: ?doSomething@MyClass@@IAAXXZ
    void doSomething();
};
class MyClass
{
public:
    // GCC mangling: _ZN7MyClass11doSomethingEv (unchanged)
    // MSVC mangling: ?doSomething@MyClass@@QAAXXZ
    void doSomething();
};

Reason: some compilers encode the protection type of a function in its mangled name.

Change the CV-qualifiers of a member function

Before After
class MyClass
{
public:
    // GCC mangling: _ZNK7MyClass9somethingEv
    // MSVC mangling: ?something@MyClass@@QBAHXZ
    int something() const;
};
class MyClass
{
public:
    // GCC mangling: _ZN7MyClass9somethingEv
    // MSVC mangling: ?something@MyClass@@QAAHXZ
    int something();
};

Reason: compilers encode the constness of a member function in its mangled name. They all do so because the C++ standard allows overloading member functions that differ only in constness.
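
A minimal sketch of such an overload pair. The inline bodies and return values are assumptions added for illustration, and demo is a hypothetical helper.

```cpp
#include <cassert>

class MyClass
{
public:
    int something() const { return 1; }  // chosen for const objects
    int something()       { return 2; }  // chosen for non-const objects
};

void demo()
{
    MyClass object;
    const MyClass &constView = object;
    assert(object.something() == 2);
    assert(constView.something() == 1);
}
```

Both members can coexist in one class, so each needs a distinct symbol; removing or adding const therefore produces a different symbol from the one existing binaries look up.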

Change the type of global data

Before:

// GCC mangling: data (undecorated)
// MSVC mangling: ?data@@3HA
int data = 42;

After:

// GCC mangling: data (undecorated)
// MSVC mangling: ?data@@3FA
short data = 42;

Before:

class MyClass
{
public:
    // GCC mangling: _ZN7MyClass4dataE
    // MSVC mangling: ?data@MyClass@@2HA
    static int data;
};

After:

class MyClass
{
public:
    // GCC mangling: _ZN7MyClass4dataE
    // MSVC mangling: ?data@MyClass@@2FA
    static short data;
};

Reason: some compilers encode the type of the global data in its mangled name. Note in particular that some compilers mangle the names of even simple data types that would be allowed in C, which means that adding or removing the extern "C" qualifier makes a difference too.

Even if the mangling doesn't change, changing the type often changes the size of the data as well. That means code that was accessing the global data may access too many or too few bytes.
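The size mismatch can be illustrated without touching real global symbols. The sketch below is a model under stated assumptions: NewDataSegment, neighbor, readAsOldClient and oldClientSeesWrongValue are hypothetical, and fixed-width types stand in for int and short so that the sizes are known.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of the library's data segment after the change:
// "data" shrank to 16 bits and another variable now sits right behind it.
struct NewDataSegment
{
    std::int16_t data = 42;
    std::int16_t neighbor = 7;
};

// What a client compiled against the old header effectively does: it
// reads four bytes starting at the address of "data".
std::int32_t readAsOldClient(const NewDataSegment &segment)
{
    static_assert(sizeof(NewDataSegment) == sizeof(std::int32_t),
                  "the sketch assumes no padding between the two members");
    std::int32_t value = 0;
    std::memcpy(&value, &segment, sizeof(value));
    return value;
}

// Returns true when the old client no longer reads 42.
bool oldClientSeesWrongValue()
{
    NewDataSegment segment;
    assert(segment.data == 42); // the library itself still sees 42
    return readAsOldClient(segment) != 42;
}
```

The exact value the old client reads depends on byte order, but on either byte order it also picks up the bytes of the neighboring variable.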

Change the CV-qualifiers of global data

Before:

// MSVC mangling: ?data@@3HA
int data = 42;

After:

// MSVC mangling: ?data@@3HB
const int data = 42;

Before:

class MyClass
{
public:
    // MSVC mangling: ?data@MyClass@@2HA
    static int data;
};

After:

class MyClass
{
public:
    // MSVC mangling: ?data@MyClass@@2HB
    static const int data;
};

Before:

class MyClass
{
public:
    static int data;
};

After:

class MyClass
{
public:
    // the compiler won't even create a symbol
    static const int data = 42;
};

Reason: some compilers encode the CV-qualifiers of the global data in its mangled name. Especially note that a static const value declared in the class itself can be considered for "inlining" -- that is, the compiler doesn't need to generate an external symbol for the value since all implementations are guaranteed to know it.

Even for compilers that don't encode the CV-qualifiers of global data, adding const may make the compiler place the variable in a read-only section of memory. Code that tries to write to it will probably crash.
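The "inlining" of an in-class static const value can be observed by using it where only a compile-time constant is allowed. This is a minimal sketch: table, tableLength and checkInlinedConstant are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <cstddef>

class MyClass
{
public:
    // Initialized in the class body: every translation unit that includes
    // this header knows the value at compile time.
    static const int data = 42;
};

// An array bound must be a constant expression, so the compiler takes the
// value from the header here instead of loading it from an exported symbol.
int table[MyClass::data];

std::size_t tableLength()
{
    return sizeof(table) / sizeof(table[0]);
}

void checkInlinedConstant()
{
    assert(tableLength() == 42);
}
```

Client code compiled this way keeps using 42 even if a later library version changes the value, since nothing is looked up at run time.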

Add a virtual member function to a class without any

Before:

struct Data
{
    int i;
};

After:

struct Data
{
    int i;
    virtual int j();
};

Reason: a class without any virtual members or bases is guaranteed to be laid out exactly like a C structure, for compatibility with that language (that is, it is a POD structure). On some compilers, structures/classes whose bases are themselves POD are also POD. However, as soon as there's one virtual base or virtual member function, the compiler is free to arrange the structure in a C++ manner, which usually means inserting a hidden pointer at the beginning or the end of the structure, pointing to the virtual table of that class. This changes the size of the structure and the offsets of its elements.
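The layout change can be checked with sizeof and the standard type traits. In this sketch, DataBefore and DataAfter are hypothetical names for the two versions of Data, and j() is given an inline body only so that the example links on its own.

```cpp
#include <cassert>
#include <type_traits>

struct DataBefore
{
    int i;
};

struct DataAfter
{
    int i;
    virtual int j() { return i; }
};

// Returns true when adding the virtual function made the structure larger.
bool layoutChanged()
{
    static_assert(std::is_standard_layout<DataBefore>::value,
                  "the original structure is C-compatible");
    static_assert(!std::is_standard_layout<DataAfter>::value,
                  "a virtual function removes the C-compatible layout");
    static_assert(std::is_polymorphic<DataAfter>::value,
                  "the new structure carries virtual dispatch information");
    assert(sizeof(DataBefore) == sizeof(int));
    // The hidden virtual table pointer usually accounts for the growth.
    return sizeof(DataAfter) > sizeof(DataBefore);
}
```

On common ABIs layoutChanged() returns true. How much the structure grows, and where the hidden pointer is placed, is up to the compiler.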

Add new virtuals to a non-leaf class

Before:

class MyClass
{
public:
    virtual ~MyClass();
    virtual void foo();
};

After:

class MyClass
{
public:
    virtual ~MyClass();
    virtual void foo();
    virtual void bar();
};

Reason: the addition of a new virtual function to a class that is non-leaf (that is, there is at least one class deriving from this class) changes the layout of the virtual table (the virtual table is basically a list of function pointers, pointing to the functions that are active in this class level). To accommodate the new virtual, the compiler must add a new entry to this table, but existing derived classes won't know about it and will not have the entry in their virtual tables.
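The missing entry can be pictured with a hand-rolled table of function pointers. This is a deliberately simplified model, not how any compiler actually stores virtual tables: Slot, oldDerivedTable, kBarSlot and hasSlot are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One table entry: a pointer to the function that implements the slot.
using Slot = int (*)();

int destructorSlot() { return 0; }
int fooSlot() { return 1; }

// Table emitted for a derived class that was compiled against the OLD
// MyClass header: it has room for ~MyClass() and foo() only.
std::vector<Slot> oldDerivedTable()
{
    return {destructorSlot, fooSlot};
}

// Index the NEW library uses for bar(), appended after foo().
constexpr std::size_t kBarSlot = 2;

// True when the table really contains the slot a caller wants to use.
bool hasSlot(const std::vector<Slot> &table, std::size_t index)
{
    return index < table.size();
}

void checkMissingSlot()
{
    assert(hasSlot(oldDerivedTable(), 1));         // foo() is present
    assert(!hasSlot(oldDerivedTable(), kBarSlot)); // bar() is not
}
```

A real virtual call performs no such bounds check: the new library would simply read whatever lies behind the old table and jump to it.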

Change the order of the declaration of virtual functions

Before:

class MyClass
{
public:
    virtual ~MyClass();
    virtual void foo();
    virtual void bar();
};

After:

class MyClass
{
public:
    virtual ~MyClass();
    virtual void bar();
    virtual void foo();
};

Reason: the compiler places the pointers to the functions implementing the virtual functions in the order that they are declared in the class. By changing the order of the declaration, the order of the entries in the virtual table changes too.

Note: the order is inherited from the parent classes, so overriding a virtual will allocate the entry in the parent's order.
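The effect of reordering can be modelled with two small tables of function pointers. Again this is only a sketch with hypothetical names (Slot, oldTable, newTable, kFooSlot, callFooAsOldClient), and the destructor entry is left out for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

using Slot = std::string (*)();

std::string foo() { return "foo"; }
std::string bar() { return "bar"; }

// Slot order generated from the old and from the new declaration order.
const std::array<Slot, 2> oldTable = {{foo, bar}};
const std::array<Slot, 2> newTable = {{bar, foo}};

// Slot index of foo() that old client code baked in at compile time.
constexpr std::size_t kFooSlot = 0;

// An old client calling foo(): it jumps through the index it knows.
std::string callFooAsOldClient(const std::array<Slot, 2> &table)
{
    return table[kFooSlot]();
}

void checkReorderedSlots()
{
    assert(callFooAsOldClient(oldTable) == "foo");
    assert(callFooAsOldClient(newTable) == "bar"); // the wrong function runs
}
```

Nothing fails at link time in this situation, because the symbols are unchanged. The old client silently calls a different function.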

Override a virtual that doesn't come from a primary base

class PrimaryBase
{
public:
    virtual ~PrimaryBase();
    virtual void foo();
};

class SecondaryBase
{
public:
    virtual ~SecondaryBase();
    virtual void bar();
};
Before:

class MyClass: public PrimaryBase, public SecondaryBase
{
public:
    ~MyClass();
    void foo();
};

After:

class MyClass: public PrimaryBase, public SecondaryBase
{
public:
    ~MyClass();
    void foo();
    void bar();
};

Reason: this is a tricky case. When dealing with multiple-inheritance of classes with virtual tables, the compiler must create multiple virtual tables to guarantee polymorphism works (that is, when your MyClass object is stored in a PrimaryBase or SecondaryBase pointer). The virtual table for the primary base is shared with the class's own virtual table, because they have the same layout at the beginning. However, if you override a virtual coming from a non-primary base, it is the same as adding a new virtual, since that primary base did not have the virtual by that name.

Note: this applies to any case of multiple-inheritance, even if it's not a direct base. In the example above, if we had MyOtherClass deriving from MyClass, the same restriction would apply.
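That the two bases are treated differently can be seen from the addresses of the base subobjects. The sketch below gives the functions empty inline bodies so that it links on its own, and the helpers primarySharesAddress, secondaryIsOffset and checkBaseLayout are hypothetical. The result reflects what common ABIs do and is not guaranteed by the C++ standard.

```cpp
#include <cassert>

class PrimaryBase
{
public:
    virtual ~PrimaryBase() {}
    virtual void foo() {}
};

class SecondaryBase
{
public:
    virtual ~SecondaryBase() {}
    virtual void bar() {}
};

class MyClass : public PrimaryBase, public SecondaryBase
{
public:
    void foo() override {}
};

// The primary base starts where the object starts, which is what lets
// the two share one virtual table.
bool primarySharesAddress(MyClass &object)
{
    return static_cast<void *>(static_cast<PrimaryBase *>(&object))
        == static_cast<void *>(&object);
}

// The secondary base lives at an offset and needs its own virtual table.
bool secondaryIsOffset(MyClass &object)
{
    return static_cast<void *>(static_cast<SecondaryBase *>(&object))
        != static_cast<void *>(&object);
}

void checkBaseLayout()
{
    MyClass object;
    assert(primarySharesAddress(object));
    assert(secondaryIsOffset(object));
}
```

Because the SecondaryBase part sits at an offset, calls arriving through a SecondaryBase pointer need the this pointer adjusted before MyClass code can run.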

Override a virtual with a covariant return with different top address

struct Data1 { int i; };
class BaseClass
{
public:
    virtual Data1 *get();
};

struct Data0 { int i; };
struct Complex1: Data0, Data1 { };
struct Complex2: virtual Data1 { };
Before:

class MyClass: public BaseClass
{
public:
};

After:

class MyClass: public BaseClass
{
public:
    Complex1 *get();
};

Before:

class MyClass: public BaseClass
{
public:
};

After:

class MyClass: public BaseClass
{
public:
    Complex2 *get();
};

Reason: this is another tricky case, like the above one and also for the same reason: the compiler must add a second entry to the virtual table, just as if a new virtual function had been added, which changes the layout of the virtual table and breaks derived classes.

Covariant calls happen when the function overriding a virtual from a parent class returns a type different from the one the parent returns (this is allowed by the C++ standard, so the code above is perfectly valid, and calling p->get() with p of type BaseClass * will call MyClass::get). If the more-derived type does not have the same top address as the parent's return type, as is the case for Complex1 and Complex2 above when compared to Data1, then the compiler needs to generate a stub function (usually called a "thunk") to adjust the value of the pointer returned. It places the address of that thunk in the entry corresponding to the parent's virtual function in the virtual table. However, it also adds a new entry for calls that return the new top address.
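The pointer adjustment can be observed by calling get() through both static types. This is a minimal sketch of the Complex1 case only: the value member, the inline function bodies and the helpers getThroughBase, getThroughDerived and thunkAdjustsPointer are hypothetical additions made so that the example is self-contained.

```cpp
#include <cassert>

struct Data1 { int i; };
struct Data0 { int i; };
struct Complex1 : Data0, Data1 { };

class BaseClass
{
public:
    virtual ~BaseClass() {}
    virtual Data1 *get() { return nullptr; }
};

class MyClass : public BaseClass
{
public:
    // Covariant override: returns a more-derived pointer type.
    Complex1 *get() override { return &value; }
    Complex1 value{};
};

// Caller that only knows BaseClass: it receives an adjusted Data1 pointer.
Data1 *getThroughBase(BaseClass &object) { return object.get(); }

// Caller that knows MyClass: it receives the Complex1 pointer itself.
Complex1 *getThroughDerived(MyClass &object) { return object.get(); }

// Returns true when the two callers receive different raw addresses.
bool thunkAdjustsPointer(MyClass &object)
{
    Data1 *viaBase = getThroughBase(object);
    Complex1 *viaDerived = getThroughDerived(object);
    // Both results refer to the same Data1 subobject.
    assert(viaBase == static_cast<Data1 *>(viaDerived));
    // The raw addresses differ when Data0 is laid out first, as on common ABIs.
    return static_cast<void *>(viaBase) != static_cast<void *>(viaDerived);
}
```

The two entry points return different raw addresses for the same object, which is why a single virtual table entry cannot serve both kinds of caller.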