A second look at SCons performance

UPDATE: In response to comments here and elsewhere, I’ve done another series of SCons builds using the tips on the SCons ‘GoFastButton’ wiki page. You can view the results here.


A few months ago, I took a look at the scalability of SCons, a popular Python-based build tool. The results were disappointing, to say the least. That post stirred up a lot of comments, both here and in other forums. Several people pointed out that a comparison with other build tools would be helpful. Some suggested that SCons’ forte is really incremental builds, rather than the full builds I used for my test. I think those are valid points, so I decided to revisit this topic. This time around, I’ve got head-to-head comparisons between SCons and GNU make, the venerable old workhorse of build tools, as I use each tool to perform full, incremental, and clean builds. Read on for the gory details — and lots of graphs. Spoiler alert: SCons still looks pretty bad.

The setup

As before, my test system is a dual 2.4 GHz Intel Xeon, with 2 GB RAM. But since the previous set of tests, my system has been upgraded to RHEL4; there’s a new version of Python; and there’s even a new version of SCons — the big 2.0. So the setup is now:

  • Red Hat Enterprise Linux 4 (update 8, kernel version 2.6.9-89.ELsmp)
  • Dual 2.4 GHz Intel Xeon, with hyperthreading enabled
  • 2 GB RAM
  • Python 2.7
  • SCons v2.0.0.final.0.r5023
  • GNU make 3.81

As before, the test build consists of a bunch of compiles and links: N C files, each with a unique associated header, spread across N/500 directories (to ensure there are no filesystem scalability effects), are compiled, then bundled into a standard archive library. Every 20th object is linked into an executable along with the archive. The build tree is generated using a Perl script, which generates both SConstruct files and Makefiles for building. One difference between the two builds: when using GMake, I added the -MMD flag to gcc to generate additional dependency information; each timed full build was then preceded by a full build (to generate all the dependencies) and a clean build (to nuke all the generated output). I felt that this gave a more realistic comparison to SCons, which employs elaborate dependency-analysis logic to ensure accurate incremental builds.
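To make the shape of the test build concrete, here is a minimal sketch of the kind of build logic the generator emits for one directory. This is a hypothetical illustration, not the generator’s actual output; the file names, the library name ‘core’, and the flags are all made up for the example:

```python
# Hypothetical SConstruct fragment for one generated directory.
# Each directory compiles its C files, bundles the objects into an
# archive library, and links every 20th object into an executable.
env = Environment()

sources = ['f%05d.c' % i for i in range(500)]   # 500 C files per directory
objects = [env.Object(src) for src in sources]

# All objects are bundled into a standard archive library.
lib = env.Library('core', objects)

# Every 20th object is linked into an executable along with the archive.
for i in range(0, len(objects), 20):
    env.Program('prog%05d' % i, [objects[i], lib])
```

The equivalent generated Makefile expresses the same rules, plus the -MMD dependency files mentioned above.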

Round 1: Full builds

Again, my interest is primarily full, from scratch builds:

As expected, SCons’ performance is pretty miserable: by the time we get to a build with several thousand source files, the build time is already over 15 minutes, and with the same O(n²) growth we saw last time, the times race past one hour, then two, finally hitting nearly 5 hours for a build containing just 50,000 source files.

GNU make shows a much more sedate, linear growth — in fact it neatly keeps pace with a simple shell script that does all the same compiles and links as the regular build, but without any of the overhead of a build tool. Yes, I know that a shell script is no substitute for a proper build tool. It serves only to give us an idea of how long it takes to do the actual work in the build — a lower bound on the build time. No build tool running serially can run the build any faster than this (obviously we can do better, by a constant factor, if we use parallel build features).

Using that lower bound, we can compute the amount of overhead introduced by the build tool. This is all the time that the tool spends doing things other than actually running the commands needed to execute the build: parsing SConstruct files or makefiles, building the dependency graph, traversing that graph, computing command-lines, etc. Viewing this overhead as a percentage of the total build time gives us an easier-to-digest comparison between SCons and GMake:
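As a quick sketch, here is the arithmetic behind that comparison. The numbers below are illustrative stand-ins, not exact measurements read off the graphs:

```python
def overhead_pct(total_time, lower_bound):
    """Overhead is all build time NOT spent running actual commands:
    parsing build files, building and traversing the dependency graph,
    computing command lines, and so on."""
    return 100.0 * (total_time - lower_bound) / total_time

# Illustrative numbers for a ~2,000-file build: suppose the shell-script
# lower bound is 60s, SCons takes 130s, and GMake takes 62s.
scons_overhead = overhead_pct(130.0, 60.0)   # ~54% of the build is overhead
gmake_overhead = overhead_pct(62.0, 60.0)    # ~3% of the build is overhead
print(round(scons_overhead, 1), round(gmake_overhead, 1))
```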

Even with relatively few files — around 2,000 — over 50% of the total build time is wasted by overhead with SCons. In comparison, GMake has barely any overhead on a build of that size. At the other end of the range, SCons overhead accounts for nearly 90% of the total build time. GMake overhead has increased as a percentage of the total time too, but only to a modest 20% — still significantly less than SCons. In fact, with 50,000 files, GMake overhead is less than half of SCons overhead with just 2,000 files!

OK, just a couple more things to touch on here before we look at incremental build times. First, you probably noticed the sudden hook in the graph of SCons full build times, between 45,000 and 50,000 files. That means that at that level, on my system, some other factor has kicked in to influence the times. I believe this is because the system has started to page heavily, as SCons’ memory footprint has grown to consume all available RAM. Second, if you compare the SCons times in this article with those in the previous article, you’ll see that this time around the SCons times are a bit better — about 30% on average. Keep in mind that there are several differences between this setup and the previous one: a new OS, a new version of Python, a new version of SCons, even a new version of the compiler and other tools used during the build. I did try to pin down exactly which factors contributed to this improvement by testing various combinations of versions of Python and SCons (for example, Python 2.6.2 with SCons 2.0, or Python 2.7 with SCons 1.2.0); none of these tests produced any significant change in performance, so I presume that the difference is most likely due to the new operating system and tools. I chose not to pursue the matter further than that.

Round 2: Incremental builds

A lot of people claimed that full builds “don’t really matter; developers only really do incremental builds”. So let’s take a look at incremental build performance:

For these builds, I ran a full build, then patched one C file and its associated header. This caused a rebuild of one object file, followed by a rebuild of the archive and the executable that the object feeds into. The actual time to execute just those commands is a paltry one-tenth of a second, so with either tool, the incremental time is dominated by the overhead added by the tool itself. Even so, it’s obvious that SCons adds considerably more overhead than GMake. Even with a small build containing just 2,000 files, SCons burns about 35 seconds to do one-tenth of a second of work. GMake does the same build in about 3 seconds. For the 50,000 file build, SCons ran for about 25 minutes, again just to do one-tenth of a second of actual work; GMake ran for about 9 minutes.

One thing I find especially interesting about this graph is that supposedly the problem with full builds is that they force SCons to constantly rescan the dependency graph. That shouldn’t be necessary in an incremental build, and yet we still see what looks like O(n²) growth in build time.

Round 3: Clean builds

The last build comparison was a clean build: scons -c versus gmake clean:

At last we have found a build variant where SCons performance is in the same ballpark as GMake! Of course, according to the SCons docs, you’ll never need to do a clean build with SCons (because with SCons your dependencies are perfectly accurate), so maybe this data point is not actually interesting to SCons users.

Sudden death overtime: Memory usage

One last comparison: memory usage. This metric is of particular interest because it puts a hard limit on the maximum size of build that the tool can handle. I’ve learned the importance of memory efficiency the hard way — long nights in “firefighting” mode to improve memory usage to accommodate this or that customer’s enormous build. I’d hoped that the new versions of Python and SCons used in this test would prove beneficial for SCons’ memory footprint. Unfortunately, the opposite is true: memory usage is about 8% worse now than it was the last time I ran these tests, and as you can see here, SCons uses about 4.5 times as much memory as GMake:

With 50,000 files, SCons uses just a hair less than 2 GB of memory, enough to cause my test system (with 2 GB of RAM) to start swapping. In comparison, GMake needs just 440 MB of memory for 50,000 files. At that rate, GMake won’t start to thrash my system until the build has grown to more than 225,000 files. Unfortunately, it seems likely that there’s not much the SCons developers can do to fix this: because SCons is implemented in Python, they are at the mercy of the Python runtime implementation. There’s just not enough control over low-level details like memory allocation when you’re using an interpreted language.
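The 225,000-file figure is a straight linear extrapolation from the measured GMake footprint; a quick sketch of the arithmetic (assuming memory use scales linearly with file count):

```python
# GMake needs ~440 MB for 50,000 files, so derive a per-file rate
# and extrapolate to the point where it exhausts physical RAM.
mb_per_file = 440.0 / 50000          # ~0.0088 MB per file
ram_mb = 2 * 1024                    # 2 GB test machine
max_files = ram_mb / mb_per_file
print(int(max_files))                # roughly 232,000 files before swapping
```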

Conclusions

We can clearly see now that it is not simply “the nature of the beast” that SCons performance is so bad. GMake scales much more gracefully, both in terms of elapsed time, and in memory usage, on a variety of scenarios: full, one-touch incremental, and clean builds. From the discussions that the previous post sparked, we know that the primary problem is an inefficient O(n2) algorithm in the SCons implementation. It seems that SCons is academically interesting, but ultimately not practical for any non-trivial build (as some of my customers are now finding out the hard way).

A lot of SCons fans, in defense of SCons, say things like “Well, the SCons developers made a conscious decision to focus on correctness rather than performance.” But what good is a correct system that is so slow it’s unusable? For me, the choice is a no-brainer: my time is too precious to waste on a slow build tool.

21 Responses to “A second look at SCons performance”

  1. Diwaker Says:

    Eric, thanks for the analysis. Can you share your test harness? I’m curious to see how waf would perform on your benchmark.

  2. Josh Says:

    I think most people use SCons for its correctness guarantee and not its speed.

    I would like to see how SCons performs on incremental builds when you use the --interactive option. I would also be interested to see how changing SCons’ file-change identification method from md5 to timestamp affects performance.

  3. Saphyx Says:

    How would SCons compare to CMake, in a similar test harness? (e.g. KDE chose SCons by committee, then CMake by action.) I guess CMake would be somewhere in between SCons and GMake, but it isn’t obvious (to me at least) how it would position itself.

    I might even be interested in chipping in.

  4. Sohail Says:

    +1 Josh

  5. Gary Says:

    I’d also be interested to know how this test is affected by the techniques in http://www.scons.org/wiki/GoFastButton, all of which can give significant speedups compared to the default setup.

  6. Anon Says:

    But which build system to go for? Makefiles are tortuous to debug when they go wrong. Surely there has to be some better way (without switching to another language)? waf? Jam? CMake? rake? ant?

  7. Kai Backman Says:

    I’d be interested in seeing data on Jam in this context. It tends to beat out almost all other systems in terms of raw speed.

  8. Eric Melski Says:

    @Josh: on correctness-vs-performance, are you willing to pay an extra 4 hours on a full build for a small additional guarantee of correctness? As for --interactive, I tried it out, but I honestly don’t see how you expect that to help here. There are thousands of targets in the build; surely you don’t think I’m going to type “build f00001.o” for every one of those?

  9. Eric Melski Says:

    @Gary: I’m rerunning the scons builds with --max-drift=1 --implicit-deps-unchanged as recommended on the SCons GoFastButton page, hopefully I’ll be able to put that up in an update tomorrow or early next week.

  10. Eric Melski Says:

    @Anon: makefile debugging has come a long way in the past few years. There’s remake, a GNU make debugger; and SparkBuild annotation files.

  11. Eric Melski Says:

    @Diwaker: I’m working on cleaning up the build generator, it’s not quite in a state that I’m comfortable sharing yet. Should be ready soon.

  12. Eric Melski Says:

    @Saphyx: I haven’t used CMake, so it would be tough for me to modify the test harness to generate cmake files. Once I publish the generator, maybe you can see about adding CMake support.

  13. Eric Melski Says:

    @Kai: I haven’t used Jam, but like I told Saphyx, once I publish the generator script maybe you’d be willing to extend it to support Jam.

  14. Anon Says:

    Eric:
    Thanks for the reply (PS: only by reading your other blog entries did I realise you actually sell an improved make!)

  15. Matt Doar Says:

    Yes, SCons is slower than gmake. I’d still choose SCons for medium sized projects because it is more likely to remain correct over time, IMO.

  16. Eric Melski Says:

    @Diwaker: I finally got the test harness up here. I invoked it as “./genscons.pl -d N -l 2 -f 500”, where d varied from 3 to 99; the script generates a tree with N subdirectories, and (N + 1) * 500 total C files.

  17. Eric Melski Says:

    @Gary: I finally posted the results of my tests using the GoFast settings: The last word on SCons performance.

  18. Top posts for September 2010 « blog.melski.net Says:

    […] A second look at SCons performance: 18% of page views […]

  19. Electric Says:

    Hopefully SCons will improve when it comes to speed. Correctness is what’s important, but it’s understandable that producing accurate results takes time, which leads to slower performance.

  20. Eric Melski Says:

    @Electric: that’s a false dichotomy. It’s possible to have both high performance and correctness; see ElectricAccelerator, for example.

  21. Make syntax is the worst… except for all the alternatives « blog.melski.net Says:

    […] series of comparisons between SCons and GNU make sparked a lot of discussion, not just about SCons and gmake, […]

