As a follow-up to my previous post: the readahead doesn’t help at all. I spent a fair bit of time optimizing which files got read, using some grep+ldd shell scripting, but saw no clear improvement.
Apparently the time spent doing the readahead is about the same as the time it saves. Conclusion? I guess my disk I/O is not the rate-limiting step.
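
For anyone curious what the "which files to read" step can look like, here is a rough sketch of the idea in Python rather than shell. It is not the script I actually used: the function name `library_paths` is made up for illustration, and it assumes the usual glibc-style `ldd` output, where each resolved library appears as an absolute path followed by a load address in parentheses.

```python
import re
from typing import List

# Matches the resolved absolute path in ldd lines such as
#   libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f...)
#   /lib64/ld-linux-x86-64.so.2 (0x00007f...)
# Assumption: glibc-style ldd output; other platforms may format it differently.
_LDD_PATH = re.compile(r"(?:=>\s*)?(/\S+)\s+\(0x[0-9a-fA-F]+\)")


def library_paths(ldd_output: str) -> List[str]:
    """Return the unique library paths named in ldd output, in order.

    Lines without an absolute path (the vdso entry, "not found" entries)
    are skipped, since there is no file to read ahead for them.
    """
    paths: List[str] = []
    for line in ldd_output.splitlines():
        match = _LDD_PATH.search(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths
```

Feeding it the captured output of `ldd` for each binary started at boot gives a candidate list of files for the readahead to pull in. In my case, of course, trimming that list made no measurable difference.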