Opportunities for Gentoo

When I’ve wanted to play in some new areas lately, it’s been a real frustration because Gentoo hasn’t had a complete set of packages ready in any of them. I feel like these are some opportunities for Gentoo to be awesome and gain access to new sets of users (or at least avoid chasing away existing users who want better tools):

  • Data science. Package Hadoop. Package streaming options like Storm. How about related tools like Flume? RabbitMQ is in Gentoo, though. I’ve heard anecdotally that a well-optimized Hadoop-on-Gentoo installation showed double-digit performance increases over the usual Hadoop distributions (i.e., not Linux distributions, but companies specializing in providing Hadoop solutions). Just heard from Tim Harder (radhermit) that he’s got some packages in progress for a lot of this, which is great news.
  • DevOps. This is an area where Gentoo historically did pretty well, in part because our own infrastructure team and the group at the Open Source Lab have run tools like CFEngine and Puppet. But we’re lagging behind the times. We don’t have Jenkins or Travis. Seriously? Although we’ve got Vagrant packaged, for example, we don’t have Veewee. We could be integrating the creation of Vagrant boxes into our release-engineering process.
  • Relatedly: Monitoring. Look for some of the increasingly popular open-source tools today, things like Graphite, StatsD, Logstash, Lumberjack, ElasticSearch, Kibana, Sensu, Tasseo, Descartes, Riemann. None of those are there.
  • Cloud. Public cloud and on-premise IaaS/PaaS. How about IaaS: OpenStack, CloudStack, Eucalyptus, or OpenNebula? Not there, although some work is happening for OpenStack according to Matthew Thode (prometheanfire). How about a PaaS like Cloud Foundry or OpenShift? Nope. None of the Netflix open-source tools are there. On the public side, things are a bit better — we’ve got lots of AWS tools packaged, even stretching to things like Boto. We could be integrating the creation of AWS images into our release engineering to ensure AWS users always have a recent, official Gentoo image.
  • NoSQL. We’ve got a pretty decent set here with some holes. We’ve got Redis, Mongo, and CouchDB, not to mention Memcached, but how about graph databases like Neo4j, or other key-value stores like Riak, Cassandra, or Voldemort?
  • Android development. Gentoo is perfect as a development environment. We should be pushing it hard for mobile development, especially Android given its Linux base. There are a couple of halfhearted wiki pages, but that does not an effort make. If the SDKs and related packages are there, the docs need to be there too.

Where does Gentoo shine? As a platform for developers, as a platform for flexibility, as a platform to eke every last drop of performance out of a system. All of the above use cases are relevant to at least one of those areas.

I’m writing this post because I would love it if anyone else who wants to help Gentoo be more awesome would chip in with packaging in these specific areas. Let me know!

Update: Michael Stahnke suggested I point to some resources on Gentoo packaging, for anyone interested, so take a look at the Gentoo Development Guide. The Developer Handbook contains some further details on policy as well as info on how to get commit access by becoming a Gentoo developer.

16 thoughts on “Opportunities for Gentoo”

  1. I am a developer (including lots of mobile at the moment) and abandoned Gentoo for Ubuntu many years ago. Part of the problem was the drama at the time, but the biggest contributor was not marking packages as stable. I ended up having to manually update that file to say it was okay to use unstable versions and the list kept growing. The Gnome release stability was lagging by over a year, and I eventually got fed up with the babysitting. I fully understand why not marking things as stable in a timely manner happens – it is far easier to leave things at the status quo than take the risk (and ensure testing etc).

    Due to the stunts Canonical is pulling I’d really like to move away from Ubuntu so I checked in on Gentoo again. A piece of open source I make has a version from 10 months ago marked as stable (there have been 6 releases since, and all are better/more stable). As far as I can tell no version of Gnome 3 is marked as stable. And who knows what is going on with SQLite 3.

    I did try Arch which has terrible documentation compared to Gentoo. However the experience with AUR was the breaking point – I wanted Google Chrome and it required far too much hoop jumping (compare with Debian/Ubuntu/Fedora/Suse). Debian testing was stale (eg only had Gnome 3.4). Fedora refused to install into an existing partition I had made.

    I should also point out that I have multiple machines (beefy workstations, puny old ones, headless servers, laptops), and also work with other developers so issues with any distro get magnified multiple times over.

    There is some combination of freshness, convenience and breadth where Ubuntu does a remarkable job of hitting my sweet spot – YMMV. As a developer they have worked out just right.

    Things to make Gentoo work better for me include addressing the stability/freshness issues and making administration of multiple machines easier (eg propagating make.conf changes and installed packages).

  2. These days we have two great options, Gentoo and Arch Linux, both with pros and cons. I prefer Gentoo for servers and Arch for the desktop. My big problem is time: Gentoo is based on sources and beautiful flags, so it has to compile everything. Arch is quick to upgrade and use, and many similar files are found in both. Arch has systemd, and systemd is awesome. Gentoo uses System V init. I like both. I recommend both.

  3. I remember one thing: Tahoe-LAFS needs patching to work on Gentoo, but only a very small patch (from very friendly people).

  4. Maybe you should add existing bug reports to your blog entry, too. This reduces the effort generated by creating ebuilds from scratch.

  5. @Roger: “I did try Arch which has terrible documentation compared to Gentoo.”

    I have used both Arch and Gentoo, and I prefer the latter as it provides greater control over the system. However, the Arch documentation, in my view, is the best you can find among the different Linux distributions; it covers almost every topic in deep detail. Unfortunately, I cannot say the same for the Gentoo documentation, which lacks detail and covers only the main topics.

    1. @Giuseppe: I had two problems with the Arch documentation. The first was that it turns out you should use the “unofficial” installation documentation for your first install not the “official” documentation (they are links right next to each other and you’d never guess this). Gentoo only has one set of installation documentation.

      The second was that the Arch doc did describe things in detail, but would keep going over multiple ways of doing the same thing, and multiple tools for the same functionality. I don’t need to be told about 8 different disk partitioning tools – just the recommended one to use (the others can be a footnote).

      The doc was also out of sync. Arch doesn’t install ifconfig if you follow the install guide, and the systemd/udev it ships renames network interfaces so you won’t find eth0 on the system. It took quite a while to work out what the heck was going on!

  6. Packaging Java/Clojure stuff is quite a nightmare. If you’re interested in Hadoop packaging and specifically the Cloudera distributions, I’d like to point out that I’ve started packaging the whole ecosystem. Stay tuned on the ultrabug overlay (layman), maybe I should talk to radhermit.

    I’ve also looked at Storm packaging but I’m not comfortable enough with Clojure et al. to make a good decision yet (I packaged Leiningen, though).

  7. Great list of items, especially regarding DevOps and Development (which should apply to more than just Android). I’ve tried setting up Harmattan/Meego development on Gentoo and it has been somewhat of a PITA. Hopefully Sailfish will be (even) better than previous iterations!

  8. I would love to see some of those cloud tools in portage.
    There’s a recent-ish ebuild for OpenNebula 3.4.1 available in Gentoo’s bugzilla. See bug 344969. I haven’t tested it myself, but I’m currently in the process of setting up a few servers for VMs and looking into giving it a shot.

  9. Absolutely with you on the arch unstable comment. This stuff is iffy.

    Either get an anonymized mirror statistic feedback system with automatic ‘most people are using vX’ going, or just install latest by default. Let people manage their own problems by rolling back.

    Gentoo philosophy: “If the tool forces the user to do things a particular way, then the tool is working against, rather than for, the user. We have all experienced situations where tools seem to be imposing their respective wills on us. This is backwards, and contrary to the Gentoo philosophy.”
