C Preprocessor Trick For Implementing Similar Data Types

A C preprocessor trick for handling many similar structured data types, such as packets in a communication protocol, is described and justified.

17 January 2000

Introduction

The C preprocessor trick I describe in this paper is not my own invention. I first learned it when I joined the HiBase project in 1997. At first it slightly appalled me, since I'd learned from experience that doing anything fancy with the C preprocessor was likely to cause problems later, and anyway it obfuscated the code. However, I learned to appreciate the trick, and have found it to be an efficient way of avoiding memory allocation, initialization, and other problems in some programs. It is not a panacea; indeed, its applicability is limited to certain kinds of tasks, but for those it is quite useful.

The trick was developed in HiBase by Kenneth Oksanen, who tells me he based it on similar constructs used in GCC, but developed and refined the idea further. I do not know whether other people have used the trick in this form.

The problem

In some kinds of programs there are many similar and related data types. A good example is the large number of data types representing packets in a communication protocol. They might look like the following:

struct AuthPacket {
    char *username;
    char *password;
};

struct AckPacket {
    int status;
    char *error_text;
};

These data structures represent the unpacked versions of the binary packets that are transmitted over the communications channel. The unpacked versions are easier to use, but there need to be functions for packing and unpacking such structures, and also for allocating (or at least initializing) and freeing them. A function for producing debugging dumps would be nice as well.

Writing such functions is a tedious and error-prone task. Even the initial task of writing them is bad, but then they all need to be kept in synchronization: if a new field is added to a packet, it needs to be initialized, freed, packed, unpacked, and dumped. Forgetting even one is likely to cause a bug.
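To make the burden concrete, here is a minimal sketch of what hand-written support for just the AuthPacket structure might look like. The function names auth_init, auth_free, and auth_dump are hypothetical and chosen only for this illustration; packing and unpacking are left out to keep it short.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct AuthPacket {
    char *username;
    char *password;
};

/* Every function below has to name every field by hand. */

void auth_init(struct AuthPacket *p) {
    assert(p != NULL);
    p->username = NULL;
    p->password = NULL;
}

void auth_free(struct AuthPacket *p) {
    assert(p != NULL);
    free(p->username);
    free(p->password);
    p->username = NULL;
    p->password = NULL;
}

/* Print a debugging dump to f and return the number of fields printed. */
int auth_dump(FILE *f, const struct AuthPacket *p) {
    assert(f != NULL && p != NULL);
    fprintf(f, "username: %s\n", p->username ? p->username : "(null)");
    fprintf(f, "password: %s\n", p->password ? p->password : "(null)");
    return 2;
}
```

Adding a third field to the structure would mean editing all three functions, plus the packing and unpacking functions not shown here, and the same again for every other packet type.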

The solution

The trick is to use the preprocessor to define a new language to describe the structure of the packets:

PACKET(Auth,
    STRING(username)
    STRING(password)
)

PACKET(Ack,
    INTEGER(status)
    STRING(error_text)
)

The descriptions are put in a header file, say, packet-desc.h. The header file is then included in suitable places, with the PACKET, STRING, and INTEGER macros defined suitably. For example, to define the actual data types, one could do the following:

struct Packet {
    int type;
    #define INTEGER(name) int name;
    #define STRING(name) char *name;
    #define PACKET(name, fields) \
    struct name { fields } name;
    #include "packet-desc.h"
};

The preprocessor will turn it into code equivalent to this:

struct Packet {
    int type;
    struct Auth {
        char *username;
        char *password;
    } Auth;
    struct Ack {
        int status;
        char *error_text;
    } Ack;
};

The header file can be used in a similar way to implement the various functions. Additionally, one could easily define an enumerated type for giving the type of packets, and then functions for converting the enumerated values to strings and back, for I/O. See the appendix for actual code.
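The enumerated type and the string conversions mentioned above can be sketched in a single file. To keep the sketch self-contained, the packet descriptions live in a macro, PACKET_DESC, which stands in for including the header file; that substitution, the extra PacketTypeCount enumerator, and the function names packet_type_name and packet_type_from_name are assumptions made for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the description header, so the sketch fits in one file. */
#define PACKET_DESC \
    PACKET(Auth, STRING(username) STRING(password)) \
    PACKET(Ack, INTEGER(status) STRING(error_text))

/* The field macros expand to nothing in all three passes below. */
#define INTEGER(name)
#define STRING(name)

/* Pass 1: one enumerator per packet, followed by a count. */
#define PACKET(name, fields) name,
typedef enum { PACKET_DESC PacketTypeCount } PacketType;
#undef PACKET

/* Pass 2: enumerated value to string. */
const char *packet_type_name(PacketType type) {
#define PACKET(name, fields) if (type == name) return #name;
    PACKET_DESC
#undef PACKET
    return "unknown";
}

/* Pass 3: string to enumerated value, or -1 if the name is not known. */
int packet_type_from_name(const char *s) {
    assert(s != NULL);
#define PACKET(name, fields) if (strcmp(s, #name) == 0) return name;
    PACKET_DESC
#undef PACKET
    return -1;
}

#undef INTEGER
#undef STRING
```

With these definitions, packet_type_name(Ack) yields the string "Ack" and packet_type_from_name("Auth") yields the enumerator Auth. A new packet added to the description is picked up by all three passes without further edits.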

Discussion

The problems the trick solves would be better solved with a high-level language. One does not, however, always have the option of picking the language freely.

Even for C, the trick is suspicious: it uses the preprocessor to define new syntax. C programmers have learned through bitter experience that inventing new syntax confuses maintenance programmers, and therefore causes bugs. I think, however, that the benefits in this case outweigh the potential problems, since the amount of repetitive code saved is so great.

Fancy preprocessor tricks also tend to find bugs in preprocessors, and cause headaches when the code is ported. As far as preprocessor tricks go, this one is fairly benign and simple. It does have its pitfalls: one needs to be careful with commas in macro arguments (they need to be within parentheses), and careful with undefining the macros used by the header file, so that the wrong definition isn't used when the header is included in the next place.
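Both pitfalls can be seen in a small sketch that builds a table of field names for each packet. As an assumption for this illustration, the descriptions are held in a macro, PACKET_DESC, rather than in a separate header, and the table names and the count_fields helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the description header (an assumption for this sketch). */
#define PACKET_DESC \
    PACKET(Auth, STRING(username) STRING(password)) \
    PACKET(Ack, INTEGER(status) STRING(error_text))

/*
 * Fields are juxtaposed, not comma-separated: writing
 * PACKET(Auth, STRING(username), STRING(password)) would hand PACKET
 * three arguments instead of two. Commas produced by expanding a field
 * macro are harmless here, because PACKET's arguments have already been
 * split by then and PACKET does not pass them on to another macro.
 */
#define INTEGER(name) #name,
#define STRING(name) #name,
#define PACKET(name, fields) \
    static const char *const name##_fields[] = { fields NULL };
PACKET_DESC
/* Undefine right away so the next use site starts from a clean slate. */
#undef PACKET
#undef INTEGER
#undef STRING

/* Count the entries of a NULL-terminated field-name table. */
size_t count_fields(const char *const *fields) {
    size_t n = 0;

    assert(fields != NULL);
    while (fields[n] != NULL)
        n++;
    return n;
}
```

Here Auth_fields holds "username" and "password" followed by NULL, so count_fields(Auth_fields) returns 2. A field whose description genuinely needs a comma would have to wrap that part in parentheses.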

Complete example source code

This appendix contains the complete source code to a program that implements the example used in the text. It consists of two files: the header file cpp-trick.h, which defines the two packets, and the source file cpp-trick.c, which uses the header file to implement the functions.

cpp-trick.h

#if !defined(PACKET) || !defined(INTEGER) || !defined(STRING)
#error Not all necessary macros were defined.
#endif

PACKET(Auth,
        STRING(username)
        STRING(password)
)

PACKET(Ack,
        INTEGER(status)
        STRING(error_text)
)

#undef PACKET
#undef INTEGER
#undef STRING

cpp-trick.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void panic(const char *msg) {
        fprintf(stderr, "%s\n", msg);
        exit(EXIT_FAILURE);
}

void *xmalloc(size_t size) {
        void *p;

        p = malloc(size);
        if (p == NULL)
                panic("Out of memory");
        return p;
}

char *xstrdup(const char *str) {
        char *copy;

        copy = xmalloc(strlen(str) + 1);
        strcpy(copy, str);
        return copy;
}

int read_integer(int *p) {
        if (fread(p, sizeof(int), 1, stdin) == 0)
                return 0;
        return 1;
}

char *read_string(void) {
        int len;
        char *str;

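        /* Reject a missing or negative length. */
        if (read_integer(&len) == 0 || len < 0)
                return NULL;
        str = xmalloc(len + 1);
        /* fread returns 0 when size is 0, so skip it for empty strings. */
        if (len > 0 && fread(str, len, 1, stdin) != 1) {
                free(str);
                return NULL;
        }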
        str[len] = '\0';
        return str;
}

void write_integer(int i) {
        if (fwrite(&i, sizeof(i), 1, stdout) != 1)
                panic("Write error.");
}

void write_string(char *str) {
        int len;

        len = (int) strlen(str);
        write_integer(len);
        /* fwrite returns 0 when size is 0, so skip it for empty strings. */
        if (len > 0 && fwrite(str, len, 1, stdout) != 1)
                panic("Write error.");
}

typedef enum {
        #define INTEGER(name)
        #define STRING(name)
        #define PACKET(name, fields) name,
        #include "cpp-trick.h"
} PacketType;

const char *type_name(PacketType type) {
        #define INTEGER(name)
        #define STRING(name)
        #define PACKET(name, fields) \
                if (type == name) return #name;
        #include "cpp-trick.h"
        return "unknown";
}


typedef struct Packet {
        PacketType type;

        #define INTEGER(name) int name;
        #define STRING(name) char *name;
        #define PACKET(name, fields) struct name { fields } name;
        #include "cpp-trick.h"
} Packet;


Packet *packet_create(PacketType type) {
        Packet *packet;

        packet = xmalloc(sizeof(Packet));
        packet->type = type;

        #define INTEGER(name) p->name = 0;
        #define STRING(name) p->name = NULL;
        #define PACKET(name, fields) \
                { struct name *p = &packet->name; fields }
        #include "cpp-trick.h"

        return packet;
}


void packet_destroy(Packet *packet) {
        #define INTEGER(name)
        #define STRING(name) free(p->name); p->name = NULL;
        #define PACKET(name, fields) \
                { struct name *p = &packet->name; fields }
        #include "cpp-trick.h"
        free(packet);
}

void packet_dump(Packet *packet) {
        printf("Dumping packet %p:\n", (void *) packet);
        printf("  Type: %s\n", type_name(packet->type));
        #define INTEGER(name) printf("  %s: %d\n", #name, p->name);
        #define STRING(name) printf("  %s: %s\n", #name, p->name ? p->name : "(null)");
        #define PACKET(name, fields) \
                if (packet->type == name) \
                        { struct name *p = &packet->name; fields }
        #include "cpp-trick.h"
        printf("End of dump.\n");
}


Packet *packet_read(void) {
        Packet *packet;
        int type;

        if (read_integer(&type) == 0)
                return NULL;
        packet = packet_create(type);
        #define INTEGER(name) read_integer(&p->name);
        #define STRING(name) p->name = read_string();
        #define PACKET(name, fields) \
                if (type == name) { struct name *p = &packet->name; fields }
        #include "cpp-trick.h"

        return packet;
}


void packet_write(Packet *packet) {
        write_integer(packet->type);
        #define INTEGER(name) write_integer(p->name);
        #define STRING(name) write_string(p->name);
        #define PACKET(name, fields) \
                if (packet->type == name) \
                        { struct name *p = &packet->name; fields }
        #include "cpp-trick.h"
}


int main(int argc, char **argv) {
        Packet *packet;

        if (argc != 2) {
                fprintf(stderr, "Usage: %s [read|write]\n", argv[0]);
                return EXIT_FAILURE;
        }

        if (strcmp(argv[1], "read") == 0) {
                while ((packet = packet_read()) != NULL) {
                        packet_dump(packet);
                        packet_destroy(packet);
                }
        } else {
                packet = packet_create(Auth);
                packet->Auth.username = xstrdup("guest");
                packet->Auth.password = xstrdup("demo");
                packet_write(packet);
                packet_destroy(packet);

                packet = packet_create(Ack);
                packet->Ack.status = 123;
                packet->Ack.error_text = xstrdup("Access denied.");
                packet_write(packet);
                packet_destroy(packet);
        }

        return 0;
}