(╮.□.)╮︶💻

A brain teaser for C programmers.

Can you see what’s wrong with this code?

#include <stdlib.h>
#include <stdio.h>

struct Foo {
    char* ptr;
    int foo;
    char data[];
};

static char* fill_data(char* buf, size_t size) {
    // dummy implementation
    for (size_t i = 0; i < size; ++i) {
        buf[i] = (char) i;
    }
    return buf + size / 2;
}

static struct Foo* foo_create(size_t size) {
    struct Foo* result = (struct Foo*) malloc(sizeof(struct Foo) + size);
    if (!result) {
        return NULL;
    }
    *result = (struct Foo) {
        .ptr = fill_data(result->data, size)
    };
    return result;
}

int main() {
    struct Foo* f = foo_create(128);
    return (int) (f == NULL);
}

What is going on here?

foo_create creates a struct Foo on the heap and initializes it by assigning a compound literal. It fills the data flexible array member (FAM) using fill_data function, and sets ptr to a pointer to somewhere within data buffer.

Using a FAM instead of a pointer there allows for creating the complete struct on heap in a single contiguous chunk of memory - with just a single allocation (struct Foo) instead of two (struct Foo + second one for data).

`struct Foo` with pointer to a separately allocated buffer vs one with inline storage

How is it wrong?

On some platforms, first bytes of data end up being set to zero. For example, on x86_64 data ends up holding 00 00 00 00 04 05 06 07 ... instead of 00 01 02 03 04 05 06 07 ....

If you want to figure out yourself how does this happen - pause here.

Deconstructing the initializer

Let’s look closely at the order of operations happening in those lines:

    *result = (struct Foo) {
        .ptr = fill_data(result->data, size)
    };

What happens is:

fill_data is called. It initializes result->data to 00 01 02 03 04 05 06 07 ....
A temporary struct Foo is constructed. Its ptr is set to the return value of fill_data, rest of the fields are zero-initialized. Note that for the purposes of assignment, Foo behaves as if it had no data field.
The value of the temporary object is assigned to *result.

This might sound reasonable - we fill data first, then all the other fields. The assignment doesn’t touch data after all, so everything is great.

Or does it? insert Vsauce music

Struct padding in C

Let’s consider struct Bar that looks just like struct Foo, but has no FAM:

struct Bar {
    char* ptr;
    int foo;
};

The alignment of struct Bar is the same as the biggest alignment of its members. In case of x86_64, it’s max(alignof(char*), alignof(int), alignof(char)) = alignof(char*) = 8. Because of that, the compiler needs to ensure sizeof(struct Bar) is a multiple of 8, because otherwise struct Bar[] might contain misaligned structs. To achieve that, extra unused padding bytes are inserted at the end of a struct, making struct Bar actually look like this:

struct Bar {
    char* ptr;
    int foo;
    char __padding[4];
};

What is the value of those padding bytes? It’s unspecified. Compilers can do whatever they please. So what happens to padding bytes when a designated initializer is used? Well, the standard requires any fields missing from the initializer to be set to zero, so it’s not uncommon for compilers to do something equivalent to memset(&var, 0, sizeof(T)) followed by actual assignments. And because sizeof(T) includes padding, it also gets set to all zeros.

What happens to the padding bytes when a structure is copied by-value? The C standard says only that structure assignment need not copy any padding bits. But that doesn’t mean it must not copy them, and compilers (at least GCC and Clang) tend to make struct assignment equivalent to memcpy(&dst, &src, sizeof(T)).

Struct padding + FAMs

How does struct padding combine with FAMs though? In struct Foo definition, data member immediately follows foo, and the alignment of its type (char) does not require adding any padding. If it was a fixed-size array, there will be no need for padding - and it turns out FAMs are no different. In other words, offsetof(struct Foo, data) < sizeof(struct Foo).

Can you see the problem now?

The code fills in FAM first, and then handles struct assignment. Compilers are free to copy over the padding bytes or not, and GCC/Clang do just that - overwriting first 4 bytes of the FAM with zeros.

The code compiles with no warnings on both GCC and Clang with -Wall -Wextra -pedantic. One might think that copying a struct containing a FAM is a bad enough idea to warrant a warning, but it might have legitimate uses I guess?

Conclusion

The buggy code is something I personally wrote at one point, and was lucky enough that the CI ran tests on both 32-bit and 64-bit architectures. In the real case, it happened to show up only on 32-bit builds, because of some other fields in the struct.

Having to debug this served as a painful reminder to avoid writing “clever” code when possible. The more “clever” the code is, the trickier and harder to understand the bugs hidden in it. And you can be sure that whatever you write has bugs in it.

C, being a low-level language, is full of pitfalls. Compilers get a lot of leeway in what they can do with your code, so keep it simple and obvious. Make sure to write tests, and run them on your C code on different compilers and architectures - and you’ll be less likely to do these kinds of mistakes.