It’s been a while …

As that famous song [YouTube, last.fm] says, it’s been a while. Since last I blogged, that is. Lots of stuff going on in my world, although I haven’t been spending enough time on Gentoo lately.

I’ve joined the Web 2.0 trend, using Google Reader and saving my bookmarks on del.icio.us via the wonderful Firefox plugin. Next thing you know, I’ll be reading Digg or another equally trendy Slashdot replacement. The only thing like that I read now is the superb LWN. I just added the Planet Conary feed (thanks ferringb!), because I think there’s a lot Gentoo can learn from rPath, since it’s got a similar base.

My Gentoo activity is probably best illustrated via the CIA commit stats — only 9 commits this week and 41 this month. A large part of my drop in commit activity lately is thanks to Joshua Baergen (Josh_B on IRC), who’s really started to take over X maintenance with double my commits this month, mostly in preparation for X.Org 7.2 as well as the new input-hotplug work for X.Org 7.3.

In Gentoo, we plan to show you a mixture of 7.2 and 7.3. What we try to do is mix and match the latest individual X component releases wherever they’re compatible, regardless of which “official release” they come from. So you may already have a number of input-hotplug components, and the only changes you’d need to make are the server and drivers. This mirrors what you saw with 7.0 and 7.1, where the server and drivers stayed back on 7.0 waiting for Nvidia and ATI while all the other components jumped to 7.1.

I’d like to publicly thank Diego Pettenò (look, I got the accent right!) for his contributions to XCB, both in my overlay and upstream. On that note, I encourage anyone using my overlay to send me patches for anything that doesn’t work. There’s no reason a personal overlay should only hold commits from that person.

In the past month, I’ve gotten in touch with two new, exciting ventures using Gentoo. Engine Yard is a Ruby on Rails deployment provider that allows you to purchase virtual clusters, and SiCortex is an innovative HPC cluster creator that uses Gentoo on clusters with 5,800 nodes. Check out the videos on the Engine Yard site; they’ve got one specifically about their use of Gentoo.

I’ve also taken on the job of creating a monthly newsletter for the OSEL, which aims to get more students involved in open source at OSU and liaise with the academic side of the university, while the LUG interfaces with the local community and the OSL connects with the broader, outside community. This is really exciting for me because I’ve got a significant journalism background [PDF] (and no, that contact information is no longer accurate), but I haven’t had a chance to use it for a couple of years. I’ll share the first issue with all of you once I finish it.

My birthday is (also) coming

Yes, Diego, you’re not the only one with a birthday in November. =) Mine will be this Saturday. I’m not as modest as you, though: here are links to my Amazon wishlist and PayPal to prove it! Consider PayPal my “Saving for a Mac Mini / PS3 / nice LCD” account. My Amazon wishlist contains a number of books ranging from less than $10 to more than $100, many of which would help in Linux work, others with grad school, and a few for (gasp!) entertainment.

If you appreciate what I’ve done for Gentoo in the past 3 1/2 years, feel free to give something back.

Current projects

For anyone who’s interested, here are the projects I’ve got going right now. Many of them could use some help, so take a look and let me know if you’re interested in any. Roughly in my order of interest:

  • Add the Sugar desktop environment for OLPC — it’s in my overlay, but it segfaults on startup of sugar-emulator somewhere in sugar-shell code. Try it out and see whether you can come up with a fix.
  • Port LTSP to Gentoo — pioto, straaken and perhaps another person or two are working with me on this. This involves changes to the client-building plugins, init scripts, and adding some ebuilds. Also, probably creating Seeds for the client and server.
  • Get the rest of the system-config-* GUI tools from Red Hat working: some remain masked, and I’d appreciate testing and fixes for those.
  • Add virt-manager into the main tree from my overlay — haven’t got a Xen instance to test it with yet. If anyone would like to test this and let me know, that’d be great.
  • Fix our X init scripts to be more like upstream intended, then fix upstream to be current. havner is taking the lead on this, and I look forward to seeing his work.
  • Add some new science packages, including KiNG and friends from the Richardson lab, CCP4MG, CCTBX and more.
  • The infamous bug #44132 — make multiple MPI implementations simultaneously installable.
  • Resume my occasional series of blog posts on Gentoo in enterprise, embedded, cluster and other environments. One post I want to make is how to use the Gentoo installer’s CLI frontend to make large, automated installations easy.

And of course these are on top of the usual ongoing maintenance of X, science packages and cluster packages.

Web 2.0 blog software

I brought this up in #redmonk, and sog suggested I stick it here and ask the LazyWeb for ideas. So here it is, with added capitalization and punctuation (no extra charge!).

There needs to be Web 2.0 blog software. The default blog page would show either just topics or perhaps topics and the first paragraph. Click, and you get a simple expansion or compression on the same page, instead of redirecting to a new one. This turns a blog’s homepage into effectively a featureful blog reader, especially when coupled with the possibility of people logging in so you can track what they’ve read. Every time I see one of those annoying 1-paragraph things that trails off and I have to click to read the full post, I think about it.

Academic papers in Linux

We’re beginning to put an academic paper together, and of course I’d like to do this using open-source software if I can. My PI (principal investigator, the head of the lab) uses Word, so whatever ends up getting used needs some capability to at least export to .doc or .rtf. A critical aspect of any solid academic paper is citing your references in a bibliography. OpenOffice.org does have some basic bibliography capabilities, but that’s what they are: basic. Work is underway to fix that, but it’s not expected to get anywhere for a year or so.

After some research, I’ve come across a few promising packages:

Zotero: A Firefox 2.0 extension, public beta started less than 2 weeks ago. No integration with word processors yet, but you can copy and paste a formatted bibliography across, and export and import the actual data. “It lives right where you do your work — in the web browser itself.” As a result, adding references from online searches such as PubMed is as simple as a single click. Every other package needs explicit support added for online searches.
Bibus: Uses OpenOffice.org’s Python functionality, also integrates with MS Word. The build system sucks — it should use distutils, but instead it’s got some custom Makefile and weird shell scripts and configuration files. Its functionality comes highly recommended, however. Will do PubMed and eTBLAST queries.
Pybliographer: The development version (1.3) integrates into OO.o and LyX. The 1.3-series GUI is alpha-quality and just had its first release. Will do PubMed, Web of Science and CrossRef queries.
bibutils: Command-line filters to convert between a variety of formats, including EndNote (which is currently in use under MS Word). Also handles RIS and BibTeX, which provides for OO.o import and export as well.
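
For the bibutils route, here is a rough Python sketch of chaining two of its filters to get from an EndNote export to BibTeX. bibutils converts through an intermediate XML format, so one filter reads the source format and a second writes the target. The wrapper function names and the file names are hypothetical, and it is worth checking that the filter names below match the bibutils version you have installed.

```python
import subprocess

# bibutils filters that read a format and emit the intermediate XML.
TO_XML = {"endnote": "end2xml", "ris": "ris2xml", "bibtex": "bib2xml"}
# bibutils filters that read the intermediate XML and emit a format.
FROM_XML = {"endnote": "xml2end", "ris": "xml2ris", "bibtex": "xml2bib"}


def pipeline_for(source_format, target_format):
    """Return the two filter names needed for a conversion (hypothetical helper)."""
    return [TO_XML[source_format], FROM_XML[target_format]]


def convert(in_path, out_path, source_format, target_format):
    """Pipe in_path through the two filters and write the result to out_path."""
    reader, writer = pipeline_for(source_format, target_format)
    with open(out_path, "wb") as out:
        # First filter reads the file and writes XML to a pipe.
        to_xml = subprocess.Popen([reader, in_path], stdout=subprocess.PIPE)
        # Second filter reads that XML on stdin and writes the target format.
        from_xml = subprocess.Popen([writer], stdin=to_xml.stdout, stdout=out)
        to_xml.stdout.close()  # let the reader see a closed pipe if the writer exits
        from_xml.wait()
        to_xml.wait()
    if to_xml.returncode or from_xml.returncode:
        raise RuntimeError("bibutils conversion failed")
```

Calling `convert("refs.enw", "refs.bib", "endnote", "bibtex")` would then be the equivalent of running the two filters in a shell pipeline.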

Update: As of now, bibutils and Bibus are both available in Portage. Try ’em out.

So much work for such a little thing

I finished my graph figure around 6 p.m. today. Since then, I’ve probably spent an hour pondering it. Two weeks or more went into working on the information in and presentation of that one little figure with three graphs in it. It’s really humbling when you think about how small the material is, and how much effort it took to produce. When it’s printed, it will be something like 6 cm wide and 12 cm tall.

If you really have an urge to see it, drop me a line and I’ll let you know once the paper’s been submitted and accepted. Till then, I’m gonna keep it on the D-L. =)

[Gentoo] Why fitting a line to points is weird

Yesterday was an enlightening day. At work, I’m trying to compare two protein structures — one is a higher-resolution version of the other. I’m plotting a certain characteristic of each against the other and finding the slope of the line. Great news! There’s a really cool trend between pairs of structures. My boss asked me to switch the X and Y axes around, since one typically puts the newer data on the Y, and the data to which it’s being compared on the X.

So I did, and guess what? The slopes don’t match! That’s right, switching the axes around doesn’t necessarily give you the reciprocal slope when you’re doing a linear regression. Why, you ask? Because the technique only minimizes the residuals in the Y direction, so swapping the axes changes which errors get minimized. Suddenly my really cool trend isn’t a trend at all, and all the slopes are equal within error.
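
To make the asymmetry concrete, here is a minimal Python sketch with made-up numbers (not the protein data). The helper names are only for illustration. It fits Y on X, then X on Y, and compares the results: the two slopes multiply to r² rather than to 1, so they are reciprocals only when the points fall exactly on a line.

```python
def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Slope of the least-squares line y = a + b*x (vertical residuals only)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def r_squared(x, y):
    """Squared Pearson correlation; symmetric in x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


# Made-up illustrative values, not real measurements.
low_res = [1.0, 2.0, 3.0, 4.0, 5.0]
high_res = [1.2, 1.9, 3.4, 3.6, 5.3]

slope_yx = ols_slope(low_res, high_res)  # high_res regressed on low_res
slope_xy = ols_slope(high_res, low_res)  # axes swapped

print("Y-on-X slope:              ", slope_yx)
print("reciprocal of X-on-Y slope:", 1 / slope_xy)
print("product of the two slopes: ", slope_yx * slope_xy)
print("r squared:                 ", r_squared(low_res, high_res))
```

The noisier the data, the smaller r² gets and the further the two fits drift apart.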

What we’ve decided to do is put the more accurate data on the X axis. Ordinary linear regression treats all of the error as being in Y, so this puts the larger error where the fit actually accounts for it.

If any of you know much statistics, I’d like to hear a more accurate, better way to come up with a slope that’s robust to flipping the axes (perhaps by minimizing both X and Y distances or residuals?). This method needs to already be implemented in some open-source program and fairly trivial to learn.
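
One candidate that seems to fit this description is Deming regression, which minimizes residuals in both X and Y; with equal error variances it reduces to orthogonal (total least squares) regression. The sketch below is an illustration under assumptions, not a recommendation from a statistician: the data are made up, and `delta` (the ratio of Y error variance to X error variance) is something you would have to estimate yourself. For a library implementation, SciPy’s `scipy.odr` module covers orthogonal distance regression.

```python
import math


def deming_slope(x, y, delta=1.0):
    """Deming regression slope.

    delta is the assumed ratio (variance of Y errors) / (variance of X errors).
    delta=1.0 gives orthogonal (total least squares) regression.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    diff = syy - delta * sxx
    return (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)


# Made-up illustrative values, not real measurements.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.4, 3.6, 5.3]

# Assume (hypothetically) that Y errors have four times the variance of X errors.
forward = deming_slope(x, y, delta=4.0)
# Swapping the axes also inverts the error-variance ratio.
backward = deming_slope(y, x, delta=1 / 4.0)

print("slope:                      ", forward)
print("reciprocal of swapped slope:", 1 / backward)
```

Unlike the ordinary fit, the forward slope and the reciprocal of the swapped slope agree here, as long as the variance ratio is inverted along with the axes.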

[Gentoo] Focusing Gentoo without forking it

Mark Shuttleworth posted (thanks to Steve O’Grady for the link) about how Ubuntu focuses on a few specific areas, while Debian is a more general plateau. One can trivially draw the parallel between Gentoo and Debian, so his points are equally applicable to us. Most Gentoo developers draw the most pleasure from working near the bleeding edge, not from trying to backport fixes and fix old stable software.

Perhaps this is because it requires more creativity and less monotony. I certainly feel more challenged and fulfilled by packaging new software (such as the system-config-* utilities I did last weekend) than by fixing some simple bugs for random stable packages.

But this raises some new questions: Can Gentoo develop specific “peaks” in conflicting areas, without forcing new subdistributions to form that focus on them? If so, how? Stuart Herbert and I threw around some ideas shortly after I started a discussion about whether democracy works for Gentoo, and our lack of goals.

Stuart’s idea, which I like, is preparing specific “releases” for certain vertical markets. Yeah, I said “vertical markets.” WTH is that? Just a given group of people using Gentoo for a certain purpose, such as a LAMP stack, an HPC cluster or a development workstation. One could create a LiveCD with an installer image tailored to, and preconfigured for, a LAMP server. The key here, as Stuart pointed out in our discussion, is making things “just work,” not just installing the packages and leaving the user to set everything up. But we’d need more than just the LiveCD, because clearly people want to maintain the installation. Perhaps adding a series of profiles for these vertical markets could do the trick. Some developers have already tested this concept with GNAP, the Gentoo Network APpliance, but not in a formalized way that pushes into a number of different areas.

[Gentoo] Lots of new GUI config tools

For ages, I’ve thought Gentoo badly lacks GUI tools for configuring things like X, sound, etc. Finally I got sick of it, so this weekend I packaged most of Red Hat’s tools. It really bothers me that every distro writes its own tools from scratch when they’re mostly portable, so I decided to do something about it.

I did most of the work on Friday, and then some cleanup through the rest of the weekend when I wasn’t busy with friends and such. The vast majority simply need a small patch to migrate them to the Gentoo way: chkconfig to rc-update, /etc/rc.d/init.d to /etc/init.d, /sbin/service to /etc/init.d, and the occasional difference in file locations (/etc/ to /etc/bind/, and some PAM modules for authconfig).
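
To give a feel for what those small patches change, here is a hedged Python sketch of the same substitutions expressed as regex rewrites. This is not the actual patch set; the rule list and the function name are hypothetical, and mapping chkconfig’s “on”/“off” to the default runlevel is my assumption for illustration.

```python
import re

# Hypothetical rewrite rules mirroring the Red Hat -> Gentoo changes above.
RULES = [
    # Init script directory.
    (re.compile(r"/etc/rc\.d/init\.d\b"), "/etc/init.d"),
    # "/sbin/service NAME ACTION" becomes a direct init script call.
    (re.compile(r"/sbin/service\s+(\w[\w.-]*)\s+(\w+)"), r"/etc/init.d/\1 \2"),
    # Enabling/disabling a service; the "default" runlevel is an assumption.
    (re.compile(r"\bchkconfig\s+(\w[\w.-]*)\s+on\b"), r"rc-update add \1 default"),
    (re.compile(r"\bchkconfig\s+(\w[\w.-]*)\s+off\b"), r"rc-update del \1 default"),
]


def gentooize(text):
    """Apply each rewrite rule in order and return the converted text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(gentooize("/sbin/service named restart"))
print(gentooize("chkconfig smb on"))
print(gentooize("/etc/rc.d/init.d/httpd status"))
```

The real tools need per-package judgment (file locations and PAM modules in particular), which is why they ended up as individual patches rather than one blanket rewrite.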

Here’s the list:
authconfig (system-config-authentication)
firstboot
hwbrowser
system-config-bind
system-config-date
system-config-display
system-config-httpd
system-config-keyboard
system-config-language
system-config-lvm
system-config-nfs
system-config-printer
system-config-samba
system-config-soundcard
system-config-users

There are a couple more I may add, such as system-config-cluster and system-config-netboot.

They’re all hard-masked right now, but give them a shot. Please submit patches if something doesn’t work, rather than just whining about it. After all, they are masked.

I have some hopes that I can substitute these tools into our LiveCD builds for the current ones. Some experiments today showed that they can come up with at least the basics of autoconfiguration, and I know they have the capability to write config files as well. Anyone got an idea for how to figure out when to use binary drivers?