The last cluster admin around here had a “diskless” cluster setup booting from floppy, using Etherboot to boot directly into a kernel.
For increased flexibility and ease of administration, I’ve modified that. First, instead of using http://rom-o-matic.net/ to create the floppies, I build them manually from Etherboot source. That allows me to have one floppy for all the different network cards in the cluster, rather than one floppy per card type.
Also, I’m booting from Etherboot into the PXELINUX bootloader, which gives me more flexibility than shoving a kernel straight into the computer’s face. PXELINUX not only lets me select kernels, boot options, etc. as a standard bootloader would, but it also allows a separate configuration per MAC address, so I can customize on a per-node level.
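To make the per-MAC part concrete, here is a rough Python sketch of the order in which PXELINUX looks for a config file under pxelinux.cfg/: first a name built from the MAC address (prefixed with the ARP hardware type, 01 for Ethernet), then the IP address in uppercase hex, shortened one digit at a time, and finally default. The function name and the sample addresses are made up for illustration, and the client-UUID lookup that newer PXELINUX versions try first is left out.

```python
import ipaddress

def pxelinux_config_candidates(mac, ip):
    """Return the config file names PXELINUX would try, in order.

    Illustrative helper only; it is not part of PXELINUX itself.
    """
    # MAC-based name: "01-" (Ethernet) plus lowercase hex pairs joined by dashes.
    mac_name = "01-" + mac.lower().replace(":", "-")
    # IP-based names: 8 uppercase hex digits, then 7, 6, ... down to 1.
    ip_hex = f"{int(ipaddress.IPv4Address(ip)):08X}"
    ip_names = [ip_hex[:n] for n in range(8, 0, -1)]
    return [mac_name] + ip_names + ["default"]
```

For a hypothetical node with MAC 00:11:22:AA:BB:CC at 192.168.1.11, the first candidate is 01-00-11-22-aa-bb-cc, followed by C0A8010B, C0A8010, and so on down to default.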
What I’ve done is create a master PXELINUX config file with variables that get replaced on a per-type, then per-node level when I run a small shell script called propogate_pxelinux_changes. Each MAC symlinks to its type, and the types are derived from the master. That keeps administration simple by minimizing the files that change.
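The real script is shell, but the idea fits in a few lines of Python. This is only a sketch under assumptions: the template contents, the placeholder names ($kernel, $append), the type names, and the MAC addresses are all invented for illustration, and the per-node substitution pass is left out so that only the per-type rendering and the MAC symlinks are shown.

```python
import os
from string import Template

# Hypothetical master template; $kernel and $append are illustrative placeholders.
MASTER = Template(
    "DEFAULT linux\n"
    "LABEL linux\n"
    "  KERNEL $kernel\n"
    "  APPEND $append\n"
)

# Hypothetical node types and the values substituted for each.
TYPES = {
    "compute": {"kernel": "vmlinuz-compute", "append": "root=/dev/nfs ip=dhcp"},
    "storage": {"kernel": "vmlinuz-storage", "append": "root=/dev/nfs ip=dhcp"},
}

# Hypothetical MAC address -> type assignments.
NODES = {
    "00:11:22:aa:bb:01": "compute",
    "00:11:22:aa:bb:02": "storage",
}

def propagate(cfg_dir):
    """Render one config per type, then point each MAC's file at its type."""
    for type_name, values in TYPES.items():
        with open(os.path.join(cfg_dir, type_name), "w") as f:
            f.write(MASTER.substitute(values))
    for mac, type_name in NODES.items():
        # PXELINUX naming: "01-" plus the MAC in lowercase, dash-separated.
        link = os.path.join(cfg_dir, "01-" + mac.lower().replace(":", "-"))
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link from an earlier run
        os.symlink(type_name, link)  # relative link within cfg_dir
```

Calling propagate() on the pxelinux.cfg directory after editing the master regenerates the handful of type files; the symlinks only change when a node changes type.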
As for DNS and DHCP, dnsmasq worked like a charm! After some stupidity on my part with setting up the routing from my server at 192.168.0.1 to the nodes at 192.168.1.0/24, things are fantastic. It was far easier to configure than BIND and ISC DHCP would have been. As with PXELINUX, I can set each node’s IP and hostname based on its MAC address, all in one location, rather than keeping separate records on each node somehow or a pile of separate configuration files on the master.
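The "one location" is a list of dhcp-host entries in the dnsmasq configuration, one per node, in the form dhcp-host=MAC,hostname,IP. As a sketch of how such a list could be generated and sanity-checked, the Python below builds those lines from a node table and refuses addresses outside 192.168.1.0/24. The function name, MAC addresses, and hostnames are hypothetical; only the subnet comes from my setup.

```python
import ipaddress

# The node subnet from the setup described above.
NODE_NET = ipaddress.ip_network("192.168.1.0/24")

# Hypothetical node table: (MAC, hostname, IP).
NODES = [
    ("00:11:22:aa:bb:01", "node01", "192.168.1.11"),
    ("00:11:22:aa:bb:02", "node02", "192.168.1.12"),
]

def dhcp_host_lines(nodes, network=NODE_NET):
    """Build dnsmasq dhcp-host lines, rejecting IPs outside the node subnet."""
    lines = []
    for mac, hostname, ip in nodes:
        if ipaddress.ip_address(ip) not in network:
            raise ValueError(f"{hostname}: {ip} is not in {network}")
        lines.append(f"dhcp-host={mac.lower()},{hostname},{ip}")
    return lines
```

Printing "\n".join(dhcp_host_lines(NODES)) gives lines that can be pasted into the dnsmasq config; dnsmasq then answers both DHCP and DNS for those hostnames.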