How to quickly navigate an unfamiliar makefile

The other day, I was working with an unfamiliar build and I needed to get familiar with it in a hurry. In this case, I was dealing with a makefile generated by the Perl utility h2xs, but the trick I’ll show you here works any time you need to find your way around a new build system, whether it’s something you just downloaded or an internal project you just transferred to.

What I wanted to do was add a few object files to the link command. Here’s the build log; the link command is the one that begins with gcc -shared:

gcc -c  -I. -D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g   -DVERSION=\"0.01\" -DXS_VERSION=\"0.01\" -fPIC "-I/usr/lib/perl/5.10/CORE"   mylib.c
rm -f blib/arch/auto/mylib/
gcc  -shared -O2 -g -L/usr/local/lib mylib.o   -o blib/arch/auto/mylib/   \
chmod 755 blib/arch/auto/mylib/

Should be easy, right? I just needed to find that command in the makefile and make my changes. Wrong. Read on to see how annotation helped solve this problem.


Subbuilds: build avoidance done right

I’ve heard it said that the best programmer is a lazy programmer. I’ve always taken that to mean that the best programmers avoid unnecessary work, by working smarter and not harder; and that they focus on building only those features that are really required now, not allowing speculative work to distract them.

I wouldn’t presume to call myself a great programmer, but I definitely hate doing unnecessary work. That’s why the concept of build avoidance is so intriguing. If you’ve spent any time on the build speed problem, you’ve probably come across this term. Unfortunately, it’s been conflated with the single technique implemented by tools like ccache and ClearCase winkins. I say “unfortunately” for two reasons: first, those tools don’t really work all that well, at least not for individual developers; and second, the technique they employ is not really build avoidance at all, but rather object reuse. Because the term has been co-opted and associated with such lackluster results, many people have become dismissive of build avoidance.

Subbuilds are a more literal, and more effective, approach to build avoidance: reduce build time by building only the stuff required for your active component. Don’t waste time building the stuff that’s not related to what you’re working on now. It seems so obvious I’m almost embarrassed to be explaining it. But the payoff is anything but embarrassing. On my project, after making changes to one of the prerequisite libraries for the application I’m working on, a regular incremental build takes 10 minutes; a subbuild incremental takes just 77 seconds.

Not bad! Read on for more about how subbuilds work and how you can get SparkBuild, a free gmake- and NMAKE-compatible build tool, so you can try subbuilds yourself.
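
To make the idea concrete, here is a minimal Python sketch of the selection step: given the component you are working on, walk its prerequisites and build only those. The component names and the DEPS table are made up for illustration, and this is not a description of how SparkBuild implements subbuilds; it only shows why unrelated components drop out of the build.

```python
from typing import Dict, List, Set

# Hypothetical component graph: each component lists the components it
# depends on directly. All names here are invented for illustration.
DEPS: Dict[str, List[str]] = {
    "app": ["libnet", "libutil"],
    "libnet": ["libutil"],
    "libutil": [],
    "docs": [],
    "tests": ["app"],
}


def subbuild_order(target: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return the target and its transitive prerequisites, prerequisites first."""
    order: List[str] = []
    seen: Set[str] = set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        for prereq in deps.get(name, []):
            visit(prereq)
        order.append(name)  # appended only after all of its prerequisites

    visit(target)
    return order


if __name__ == "__main__":
    # Only the components that "app" needs are listed; "docs" and "tests"
    # are never visited.
    print(subbuild_order("app", DEPS))
```

Working on app in this toy graph means building libutil, libnet and app, and nothing else.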

Annocat: e pluribus unum

ElectricAccelerator annotation files are a fantastic way to get a grip on your build behavior and performance, but what if your Build (capital B) spans more than one invocation of emake? Annotation gives you a good look inside any single invocation, but there’s no way to get an overview of the entire process. You can’t just catenate the annotation files from subsequent emake runs — the result won’t be well-formed XML, and the timing information for jobs in each subsection of the build will reflect time from the start of that subsection, not from the start of the logical build. Plus, you run the risk of having overlapping job identifiers in different subsections. What you need is a specialized version of cat that is annotation-aware. In this article I’ll introduce annocat, a simple Perl script I wrote for just this purpose, and I’ll explain how it works.
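
The three problems above (well-formedness, timing offsets, and colliding job identifiers) can be sketched in a few lines of Python. This is not annocat itself, which is a Perl script, and the XML layout used here is a simplified assumption: a build root containing job elements, each with an id attribute and a timing child carrying invoked and completed times in seconds. The sketch also assumes the emake runs happened back to back, so each segment is shifted by the end time of the segments before it.

```python
import xml.etree.ElementTree as ET
from typing import List


def merge_annotations(documents: List[str]) -> str:
    """Merge simplified annotation documents into one well-formed document."""
    merged = ET.Element("build")  # a single root keeps the result well-formed
    offset = 0.0                  # where the current segment starts in the logical build
    for index, text in enumerate(documents):
        root = ET.fromstring(text)
        latest = 0.0
        for job in root.findall("job"):
            # Prefix ids with the segment number so ids from different runs cannot collide.
            job.set("id", f"S{index}-{job.get('id')}")
            for timing in job.findall("timing"):
                invoked = float(timing.get("invoked"))
                completed = float(timing.get("completed"))
                latest = max(latest, completed)
                # Re-base times so they count from the start of the logical build.
                timing.set("invoked", f"{invoked + offset:.3f}")
                timing.set("completed", f"{completed + offset:.3f}")
            merged.append(job)
        offset += latest  # assumption: the next run starts when this one ends
    return ET.tostring(merged, encoding="unicode")
```

A real tool would also have to cope with whatever else the annotation files contain; the point here is only the shape of the fix.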

Data Mining ElectricAccelerator Annotation: Bill of Materials

ElectricAccelerator annotation files contain a gold mine of information about your build, such as the dependencies between jobs in the build, the time required to run each job, the exact command-line and environment used to invoke each command in the build, and even every file read and written by each job in the build. Many people have correctly speculated that they could use the file access data in annotation to create a bill of materials for the build, similar to so-called “configuration records” in ClearCase. In this post, we’ll look at how we can do that using the annolib library.
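
As a rough illustration of the idea, here is a Python sketch that does not use annolib at all. It assumes a simplified annotation layout in which each job contains op elements with a type attribute (read, create or modify) and a file attribute; the element names, the attribute names and the sample file names are assumptions made for the example. Files that are read but never written by the build are treated as its inputs.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Assumed operation types that count as writing a file.
WRITE_TYPES = {"create", "modify"}


def bill_of_materials(annotation_xml: str) -> Dict[str, List[str]]:
    """Split the files touched by a build into inputs and outputs."""
    root = ET.fromstring(annotation_xml)
    read, written = set(), set()
    for op in root.iter("op"):
        path = op.get("file")
        if path is None:
            continue
        if op.get("type") == "read":
            read.add(path)
        elif op.get("type") in WRITE_TYPES:
            written.add(path)
    # Inputs are files the build consumed but did not produce itself.
    return {"inputs": sorted(read - written), "outputs": sorted(written)}


SAMPLE = """
<build>
  <job id="J1"><op type="read" file="main.c"/><op type="create" file="main.o"/></job>
  <job id="J2"><op type="read" file="main.o"/><op type="create" file="app"/></job>
</build>
"""

if __name__ == "__main__":
    print(bill_of_materials(SAMPLE))
```

In the sample, main.o is read by the second job but produced by the first, so it shows up as an output rather than an input.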

Untangling Parallel Build Logs

I spend most of my time with ElectricAccelerator working on the “big” features — performance, scalability, fault-tolerance. It’s easy to forget that there are a ton of “little” features that can themselves make a big difference in the value of the system. Case in point: the build log. If you have any experience with parallel build systems, you know what a mess the build log becomes because you have any number of parallel commands all dumping output to a single logfile simultaneously. The output from each command gets interleaved with the output from other commands. Worse, the error messages get jumbled up too, so it becomes difficult to tell which commands are producing the errors.
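
The underlying fix is easy to sketch: keep each command’s output in its own buffer and emit the buffers whole instead of letting lines interleave. The Python below is a toy version of that idea, not a description of how ElectricAccelerator does it. It assumes every output line arrives tagged with the job that produced it, which a plain parallel make log does not give you; the job names and messages are invented.

```python
from typing import Dict, Iterable, List, Tuple


def untangle(events: Iterable[Tuple[str, str]]) -> List[str]:
    """Group interleaved (job, line) pairs so each job's output is contiguous."""
    buffers: Dict[str, List[str]] = {}
    for job, line in events:
        buffers.setdefault(job, []).append(line)
    # Dicts preserve insertion order, so jobs appear in order of first output.
    return [line for lines in buffers.values() for line in lines]


if __name__ == "__main__":
    interleaved = [
        ("a.o", "gcc -c a.c"),
        ("b.o", "gcc -c b.c"),
        ("a.o", "a.c:3: warning: unused variable"),
        ("b.o", "b.c:9: error: missing semicolon"),
    ]
    for line in untangle(interleaved):
        print(line)
```

With the output grouped this way, each warning or error sits directly under the command that produced it.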

I was reminded of this issue when it popped up again on the GNU make-help mailing list. Take a look at this recent post:
