Header Hygiene

Quick note on one approach for maintaining fast compile times, inspired by Sebastian Aaltonen’s tweet here.

This is something that we used to do at Rare when I was there ages ago. It’s been a long time so I may be misrepresenting it, and it wasn’t rigidly followed even then, but I always thought it was a good idea.

The key point is to split the header for a .cpp file into two headers. One that contains just the public types for that module, and one that contains the functions. For a module called foo you’d have foo.h for the main API of that module, and foo_types.h for the forward declares of all the types in foo. Then you follow these rules:

  1. foo_types.h cannot include anything. It simply forward declares the types owned by the foo module, and that’s it. The one exception is “value types”, structures you’re primarily expected to pass by value. Their full definitions also go in the foo_types.h header.
  2. foo.h can only include *_types.h headers. It cannot include any other headers. Typically it will at least include its own foo_types.h header, but it will often also include the *_types.h headers of modules it depends on.

This breaks the include chain without too much hassle on the part of downstream clients. The owner of foo is responsible for maintaining the foo_types.h header, so there’s never any risk of maintaining N duplicates of forward declarations for types: they’re all in one place. And since headers can only include *_types.h headers, there’s no risk of exploding includes. *_types.h headers can never include anything else, so the recursion will go at most one level deep and stop.
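The two rules are mechanical enough that a script could check them. Below is a minimal sketch of such a check, assuming header names are plain strings and that the `_types.h` suffix is the only thing that marks a types header. The function names `include_allowed` and `find_violations` are made up for illustration and don't come from any existing tool.

```cpp
#include <string>
#include <vector>

// True if `name` ends with `suffix`.
static bool has_suffix(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical check for a single #include line found in `header`.
// Rule 1: a *_types.h header may not include anything.
// Rule 2: any other header may only include *_types.h headers.
bool include_allowed(const std::string& header, const std::string& included) {
    if (has_suffix(header, "_types.h")) {
        return false;
    }
    return has_suffix(included, "_types.h");
}

// Collects the includes of `header` that break either rule.
std::vector<std::string> find_violations(const std::string& header,
                                         const std::vector<std::string>& includes) {
    std::vector<std::string> violations;
    for (const std::string& included : includes) {
        if (!include_allowed(header, included)) {
            violations.push_back(included);
        }
    }
    return violations;
}
```

Passing `"socket.h"` with an include list that contains `"message.h"` would flag just that entry. The sketch only looks at headers, since the rules as stated say nothing about what a .cpp file may include.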

Here’s an example:

// socket_types.h
...
struct socket;
...

// message_types.h
...
struct message;
...

// error_code_types.h
...
// value type, not a forward declare
struct error_code {
    ...
};
...

// socket.h

#include "socket_types.h"
#include "message_types.h"
#include "error_code_types.h"

error_code send_message(socket* s, message* m);
...

A user of our socket library merely has to include socket.h and they’ll have everything they need to call it, including the forward declare for message and the definition of error_code. You’ll never end up with exploding header includes, because you only ever get the header for the library you asked for, plus the *_types.h headers it needs, and that’s it.

One downside is that you have to explicitly include the headers for every single thing you need. E.g. a user of the socket library who wants to utilize any operations on the message type will have to also include the message.h header. Whether this is actually a downside or a feature is up to you.
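To make that concrete, here is a rough single-file sketch of what a client translation unit sees. Everything beyond the declarations in the example above is an assumption for illustration: the `message_create` and `message_destroy` functions, the `send_text` helper, the `int value` field in `error_code`, and the stand-in struct bodies at the bottom are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// ---- what #include "socket.h" provides (via its *_types.h includes) ----
struct socket;                      // forward declare only
struct message;                     // forward declare only
struct error_code { int value; };   // value type, fully defined (field assumed)
error_code send_message(socket* s, message* m);

// ---- what #include "message.h" adds (hypothetical API) ----
message* message_create(const char* text);
void message_destroy(message* m);

// ---- client code ----
// Compiles while `message` is still incomplete, because it only passes
// pointers around. Without the message.h declarations there would be no way
// to build the message in the first place.
error_code send_text(socket* s, const char* text) {
    message* m = message_create(text);
    error_code result = send_message(s, m);
    message_destroy(m);
    return result;
}

// ---- stand-ins for message.cpp and socket.cpp, so the sketch links ----
struct message { std::size_t length; };
struct socket { std::size_t bytes_sent; };

message* message_create(const char* text) {
    assert(text != nullptr);
    return new message{std::strlen(text)};
}

void message_destroy(message* m) { delete m; }

error_code send_message(socket* s, message* m) {
    if (s == nullptr || m == nullptr) {
        return error_code{1};   // nonzero means failure in this sketch
    }
    s->bytes_sent += m->length;
    return error_code{0};
}
```

The client function never needs the layout of `socket` or `message`, so the forward declares from socket.h are enough to call `send_message`. Creating or inspecting a message is a different story, which is why message.h has to be included explicitly.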

This is basically a slightly more ergonomic version of Our Machinery’s stricter system. It has the same benefits in that it avoids header explosion, but it avoids proliferation of forward declares (since they’re all in one place owned by the module that also implements the types). One downside of this approach compared to Our Machinery’s variant is that the *_types.h headers are monolithic. So if the message library had hundreds of types in it, but socket only needed struct message, it will nonetheless pull in all the other forward declares in message_types.h too. IME this isn’t a huge problem - forward declares don’t cost that much, and the key thing is eliminating the recursive explosion of header includes, not micro-optimizing the size of an individual header. You could always split the headers (have several *_types.h headers) if you really need to.

This whole system works best with C, where headers tend not to include any definitions in the first place. It kinda breaks with C++/OOP though, unless you start leaning heavily on the PIMPL idiom to avoid taking dependencies on class implementations.
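For reference, a minimal sketch of how PIMPL could keep a C++ class header within the two rules. The `renderer` and `texture` names are hypothetical, and the file boundaries are marked with comments so the whole thing fits in one listing. A raw owning pointer is used rather than `std::unique_ptr` so that the header would not need `<memory>`, which rule 2 would seem to forbid.

```cpp
#include <cassert>

// ---- texture_types.h (hypothetical) ----
class texture;   // forward declare only

// ---- renderer.h: would include only renderer_types.h and texture_types.h ----
class renderer {
public:
    renderer();
    ~renderer();
    renderer(const renderer&) = delete;             // the raw pointer is owned
    renderer& operator=(const renderer&) = delete;

    void bind(texture* t);
    int bound_count() const;

private:
    struct impl;    // defined in renderer.cpp only
    impl* state_;   // a pointer to an incomplete type is fine in a header
};

// ---- renderer.cpp: free to include whatever the implementation needs ----
struct renderer::impl {
    int bound = 0;
    texture* last = nullptr;
};

renderer::renderer() : state_(new impl) {}
renderer::~renderer() { delete state_; }

void renderer::bind(texture* t) {
    assert(t != nullptr);
    state_->last = t;
    ++state_->bound;
}

int renderer::bound_count() const { return state_->bound; }

// ---- stand-in for the real definition that would live in texture.h ----
class texture {};
```

All of the members that would otherwise drag other headers into renderer.h live in `renderer::impl`, so the class definition itself depends on nothing but forward declares. The cost is an extra allocation and an indirection on every call, which is part of why this feels heavier in C++ than in C.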
