Listening to a podcast about the internals of GCC I’ve learnt that, in order to support object oriented languages in a common AST (abstract syntax tree), the GCC does polymorphism in a quite exquisite way.

There is a page that describes how to do function polymorphism in C but not structure polymorphism as it happens on GCC, by means of a union, so I decided that was a good post to write…

Unions

Like structs, you can create a list of things together in an union but, unlike structs, the things share the same space. So, if you create a struct of an int and a double, the size of the structure is the sum of both sizes. In an union, its size is the size of the biggest element and all elements are in the same area of memory, accessed from the first byte of the union.

Its usage is somewhat limited and can be quite dangerous, so you won’t find many C programs and rarely find any C++ programs using it. One of the uses is to (unsafely) convert numbers (double, long, int) to their byte representation by accessing as an array of chars. But the use we’ll see now is how to entangle several structures together to achieve real polymorphism.

Polymorphism

In object oriented polymorphism, you can have a list of different objects sharing the same common interface being accessed by their interface’s structure. But in C you don’t have classes and you can’t build structure inheritance, so to achieve the same effect you need to put them all in the same box, but at the same time defining a generic interface to access their members.

So, if you define your structures like:

struct Interface {
    int foo;
    void (*bar)();
};
struct OneClass {
    int foo;
    void (*bar)();
};
struct TwoClass {
    int foo;
    void (*bar)();
};

and implement the methods (here represented by function pointers) like:

void one_class_bar () {
    printf("OneClass.Bar()\n");
}
void two_class_bar () {
    printf("TwoClass.Bar()\n");
}

and associate the functions created to the objects (you could use a Factory for that), you have three different classes, still not connected. The next step is to connect them via the union:

typedef union {
    struct Interface i;
    struct OneClass o;
    struct TwoClass t;
} Object;

and you have just created the generic object that could hold both OneClass and TwoClass and be accessed via the Interface. Later on, when reading from a list of Objects, you can access through the interface (if you build your classes with parameters in the same order) and it’ll call the correct method (or use the correct variable):

Object list[2];
/* Setting */
list[0] = (Object) one;
list[1] = (Object) two;
/* Using */
list[0].i.bar(list[0]);
list[1].i.bar(list[1]);

Note that when iterating the list, it access the Object via the Interface (list[0].i) and not via OneClass or TwoClass. Although the result would be the same (as they share the same portion in memory, thus would execute the same method), it’s conceptually correct and compatible with object oriented polymorphisms.

The code above produces the following output:

$ ./a.out
OneClass.Bar()
TwoClass.Bar()

You can get the whole example here. I haven’t checked the GCC code but I believe that they’ve done it in a much better and more stable way, of course, but the idea is probably be the same.

Disclaimer: This is just a proof-of-concept. It’s not nice, they (GCC programmers) were not proud (at least in the podcast) of using it and I’d not recommend anyone to use that in production.

One Reply to “Object Orientation in C: Structure polymorphism”

  1. It’s nicer to put stuff that’s common to all classes in its own structure. Something like struct common { int foo; void(*bar)(); }. Then OneClass could be declared like struct OneClass { struct common common; int baz; /* more stuff that’s specific to OneClass */ }.

Leave a Reply

Your email address will not be published. Required fields are marked *