Converting code to data

An array and a code snippet interpreting it can save a lot of repetitive code.

13 September 1996

One of the nicer tricks in a programmer's bag is converting code to data. A series of repetitive statements can often be replaced with a simple array and a loop that interprets it. C's array initializers make this trick especially useful.

Consider the problem of printing enums. It is often better to print the name than the numeric value. C has no built-in way to do this, so it must be done manually. Assume that the enum type has been defined in the following way.

enum e { foo, bar };

We want a function to convert an enum e to a string.

const char *e2str(enum e e_val);

A typical way of implementing this is with a switch statement.

const char *e2str(enum e e_val) {
    switch (e_val) {
    case foo: return "foo";
    case bar: return "bar";
    }
    assert(0);      /* not reached for a valid enum e */
    return NULL;
}

The switch method is a good one. Because the compiler rejects two case labels with the same value, it guarantees that all enum labels have different values. A good compiler will even warn about enum labels that do not have a case.

e2str could be written using arrays instead. The straightforward method uses an array indexed by the enum value. This restricts the enum labels to relatively dense ranges of small non-negative values, and requires that array elements and enum labels are defined in the same order.
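
The straightforward method might look like the following sketch. It assumes the enum e defined earlier; the names e_names and e2str_indexed are made up for this illustration, and returning NULL for an out-of-range value is one possible choice, not the only one.

```c
#include <stddef.h>

enum e { foo, bar };

/* Names listed in the same order as the enum labels: foo == 0, bar == 1. */
static const char *const e_names[] = { "foo", "bar" };

/* Hypothetical name. Returns NULL for values outside the table. */
const char *e2str_indexed(enum e e_val) {
    size_t n = sizeof(e_names) / sizeof(e_names[0]);
    size_t i = (size_t) e_val;   /* a negative value becomes a huge index */
    if (i >= n)
        return NULL;
    return e_names[i];
}
```

If someone inserts a new label between foo and bar without updating e_names in the same place, every later name is silently wrong, which is the ordering restriction mentioned above.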

A better way is to use an array of structs.

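#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t, NULL */
#include <string.h>   /* strcmp, used by str2e */

/* One row per enum label; the order of the rows does not matter. */
static const struct {
    enum e e_val;
    const char *str;
} tab[] = {
    { foo, "foo" },
    { bar, "bar" },
};
static const size_t ntab = sizeof(tab) / sizeof(tab[0]);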

const char *e2str(enum e e_val) {
    size_t i;
    for (i = 0; i < ntab; ++i)
        if (tab[i].e_val == e_val)
            return tab[i].str;
    assert(0);
    return NULL;
}

This version of e2str does not allow the compile-time checks that the switch version does. It is, however, simpler to maintain. If we need str2e, we can re-use the array (which is why it was defined outside e2str).

enum e str2e(const char *str) {
    size_t i;
    for (i = 0; i < ntab; ++i)
        if (strcmp(tab[i].str, str) == 0)
            return tab[i].e_val;
    assert(0);      /* unknown name */
    return foo;     /* only reached when assertions are disabled */
}

If both conversions are needed, then the array version is superior to the switch version. If the enum definition changes, only the one array needs to be changed, not both conversion functions.
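
Some of the checking that the switch version gets from the compiler can be recovered at run time. The following is only a sketch: the struct tag e_name and the function table_is_unique are hypothetical names, and the table is passed as a parameter so that the check can be tried on more than one table.

```c
#include <stddef.h>
#include <string.h>

enum e { foo, bar };

/* Hypothetical named version of the table row used above. */
struct e_name {
    enum e e_val;
    const char *str;
};

/* Returns 1 if no enum value and no name appears twice in t, else 0. */
int table_is_unique(const struct e_name *t, size_t n) {
    size_t i, j;
    for (i = 0; i < n; ++i)
        for (j = i + 1; j < n; ++j)
            if (t[i].e_val == t[j].e_val ||
                strcmp(t[i].str, t[j].str) == 0)
                return 0;
    return 1;
}
```

A program could call such a check once at startup, for example inside an assert, so that a badly edited table is caught early.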

This array trick can be used in many mapping situations: command names to pointers to functions that implement them, RGB triplets to color names, pointers to comparison functions to special-case versions of qsort, etc.
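
As an illustration of the first of these, here is a minimal sketch of a command table. The commands, their return values, and the names cmd_help, cmd_quit, cmdtab and run_command are all invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical command implementations; each returns a status code. */
static int cmd_help(void) { return 1; }
static int cmd_quit(void) { return 2; }

/* The table is the data: one row per command. */
static const struct {
    const char *name;
    int (*func)(void);
} cmdtab[] = {
    { "help", cmd_help },
    { "quit", cmd_quit },
};
static const size_t ncmdtab = sizeof(cmdtab) / sizeof(cmdtab[0]);

/* Look the name up and call the matching function; -1 means unknown. */
int run_command(const char *name) {
    size_t i;
    for (i = 0; i < ncmdtab; ++i)
        if (strcmp(cmdtab[i].name, name) == 0)
            return cmdtab[i].func();
    return -1;
}
```

Adding a command means adding one row to cmdtab; the lookup loop does not change.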