Improving quality in Gentoo

I recently posted about making Gentoo a better tool. A good tool is one that doesn’t break, so we need to raise our quality to a more reliable level. I’m going to mention a few ideas to start this discussion, which I hope the rest of our community will join.

First, we have essentially no code review. About the only time any code in Gentoo is reviewed is before and during a developer’s training, with the notable exception of the requirement to post eclasses to the gentoo-dev list. More code review ought to improve quality, sharpen our ability to justify code in words, and build a stronger community of contributors.

How do we increase code review? One idea is to require reviewer approval prior to committing, but this isn’t the best answer for Gentoo. We’ve always been a pretty open community. Developers aren’t prohibited by ACL from committing anywhere in our ebuild repository, so I don’t think they would accept additional requirements that increased the burden of contributing.

Instead, let’s create a gentoo-commits mailing list or RSS feed(s), with full diffs. We could use it in several ways:

  • Each team should use it internally to review all commits to its packages.
  • Mentors should continue to follow their mentees’ commits well after they’re granted commit access (6 months minimum, and I recommend forever).
  • Mentees should also review their mentors’ commits, first to learn and later to review.
  • Every developer should have at least one reviewer and review at least one other developer. This should be formal and documented to ensure it’s happening.

These uses will require that the commit diffs be easily filterable by both committer and files affected. RSS feeds could be made available based on developer or herd, and e-mail lists could contain the information in e-mail subjects or headers.
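To make the filtering requirement concrete, here is a minimal sketch of selecting commits by committer and by affected path. The `Commit` record and the `filter_commits` helper are hypothetical names invented for illustration; they do not describe any existing Gentoo tool, and a real feed would parse this information out of the commit e-mails or the repository log.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Commit:
    """Hypothetical record for one entry in a gentoo-commits feed."""
    committer: str
    files: List[str]
    subject: str = ""


def filter_commits(
    commits: Iterable[Commit],
    committer: Optional[str] = None,
    path_prefix: Optional[str] = None,
) -> List[Commit]:
    """Keep commits matching the committer and touching a path prefix.

    A filter left as None is not applied, so calling this with no
    filters returns every commit.
    """
    selected = []
    for commit in commits:
        if committer is not None and commit.committer != committer:
            continue
        if path_prefix is not None and not any(
            name.startswith(path_prefix) for name in commit.files
        ):
            continue
        selected.append(commit)
    return selected
```

A mentor could call `filter_commits(feed, committer="mentee")`, while a team could pass a `path_prefix` such as its package category. A per-developer or per-herd RSS feed would amount to the same filter applied on the server side.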

Second, we should improve our unit testing, where the units are individual packages. This can be both automated and performed by developers and arch testers. Although a number of packages have a useful, working test suite, most lack one. For these packages, we should attempt to provide something automatable in src_test() even when a test suite is absent. Failing that, we should print out a checklist in src_test() of tests to perform before stabilizing a package. There should never be an empty src_test().
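As a rough illustration of how the "never an empty src_test()" rule could be checked automatically, the sketch below scans the text of an ebuild and reports whether it defines src_test() and whether the body does anything. The function name `src_test_status` and the three status strings are assumptions made for this example. It is a naive text check: it assumes the function's closing brace sits at the start of its own line, and it cannot tell whether an ebuild without src_test() is covered by default test behaviour.

```python
import re

# Matches "src_test() {" at the start of a line, up to a closing brace
# that starts its own line (an assumption about ebuild formatting).
SRC_TEST_RE = re.compile(
    r"^src_test\(\)\s*\{(?P<body>.*?)^\}", re.DOTALL | re.MULTILINE
)


def src_test_status(ebuild_text: str) -> str:
    """Return "missing", "empty", or "present" for an ebuild's src_test()."""
    match = SRC_TEST_RE.search(ebuild_text)
    if match is None:
        return "missing"
    meaningful = [
        line
        for line in match.group("body").splitlines()
        # Ignore blank lines, comments, and the shell no-op ":".
        if line.strip() and not line.strip().startswith("#") and line.strip() != ":"
    ]
    return "present" if meaningful else "empty"
```

A check like this could run over the repository and list packages whose status is "missing" or "empty", giving maintainers a starting list of where a test or a printed checklist is still needed.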

Another package-level testing approach is to create solid, automated tinderboxes. This remains unrealistic for our entire database of 10,000+ packages, but we should at least get this going for our “system” set and perhaps for some of the most common sets of packages for servers and desktops. Exactly how to set this up remains a question, since there’s a lot of tinderbox code floating around. Bonsaikitten has some almost-working code based on swegener’s work; Catalyst has some tinderboxing capability; or we could look into using Mozilla’s tinderbox.

Third, we should improve our integration testing, at the level of the entire repository. Our main source of testing here will be our users, because they have far more combinations of build options and hardware than we can reproduce on Gentoo infrastructure. But how can we take advantage of this testing to improve our quality? By creating an additional, time-lagged set of rsync mirrors with additional QA checks, we could allow users who want to test the latest and greatest software to help those who want stable and solid software.

We already have keywords for ~arch and arch, but they’re still too mixed. A problem in ~arch ebuilds can break the entire tree for all users. They really need a stronger separation. Perhaps the separate repositories should be ~arch versus stable. But another way to do it is to add a delay to the second set of repos, anywhere from 24 hours to a week. This delay allows us time to encounter major problems in the fast-sync repos, fix them, and carry the fixes over to the slow-sync repo. But we’ll need a way to make this really easy to do. It feels like branching with periodic merges, along with cherry picks of major bugfixes, is the right way to do this. Unfortunately, CVS sucks at this. We may need to migrate to a more capable version-control system before this option becomes realistic. In addition to the user testing, we could add a tinderbox into the slow-sync repos to require that they build with the most common configurations.
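The delay-plus-cherry-pick idea can be sketched as a simple promotion rule: a change moves from the fast-sync repo to the slow-sync repo once it has aged past the delay, or immediately if it is a major bugfix, and never while it has a known breakage open against it. The `Change` record, its field names, and `eligible_for_slow_sync` are hypothetical and only illustrate the policy; how breakage would actually be tracked is left open.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass
class Change:
    """Hypothetical record for one change in the fast-sync repo."""
    change_id: str
    committed_at: datetime
    is_urgent_fix: bool = False      # major bugfix to cherry-pick early
    has_open_breakage: bool = False  # a problem was reported and not yet fixed


def eligible_for_slow_sync(
    changes: Iterable[Change],
    now: datetime,
    delay: timedelta = timedelta(hours=24),
) -> List[str]:
    """Return IDs of changes that may be carried over to the slow-sync repo."""
    ready = []
    for change in changes:
        if change.has_open_breakage:
            continue  # hold back anything known to be broken
        aged = now - change.committed_at >= delay
        if aged or change.is_urgent_fix:
            ready.append(change.change_id)
    return ready
```

The `delay` parameter covers the range discussed above, from `timedelta(hours=24)` to `timedelta(days=7)`. A tinderbox run on the slow-sync side would be a further gate applied to the IDs this function returns.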

To sum up, I want to increase code review, unit testing, and integration testing. These three things will strengthen Gentoo’s quality, reputation, and community.

Democracy as education in OSS projects

Reading the PressThink blog, I came across a couple of quotes that apply well to open-source projects:

I kept thinking about a famous passage from Christopher Lasch, the great social critic and historian who died in 1994, before the rise of the Web. In The Revolt of the Elites, he said we learn more from argument than from information, not because opinions are weightier than facts, but because to argue for your ideas (in public) puts those ideas at risk. And that is how we learn. …

Lasch in his book:

If we insist on argument as the essence of education, we will defend democracy not as the most efficient but as the most educational form of government, one that extends the circle of debate as widely as possible and thus forces all citizens to articulate their views, to put their views at risk, and to cultivate the virtues of eloquence, clarity of thought and expression, and sound judgment… small communities are the classic locus of democracy, not because they are “self-contained,” however, but simply because they allow everyone to take part in public debates. Instead of dismissing direct democracy as irrelevant to modern conditions, we need to re-create it on a large scale.