I often get asked how to automate host naming and/or DNS records using Chef. In fact, there was an individual in IRC today who asked some variation of the same question I always get:
Currently I set hostnames on my nodes by looking up the ip from ifconfig and doing a reverse DNS lookup on that IP. It turns out this is painful since my upstream never sets the rDNS correctly without nagging. I’m thinking of just building a “nodes” data bag with IP -> hostname mapping. Is there a better way to do this?
Often I’ve answered people by telling them that if they’re using configuration management, the names of their hosts are completely irrelevant. They shouldn’t bother adding them to DNS or even care what their names are. I thought I’d explain my rationale and why host names are largely unnecessary if you’re using a configuration management tool like Chef.
What are Hostnames For?
Let’s take a step back to the 1990s, the days when FreeBSD 2.2.6 was state-of-the-art, Linux was still running a.out binaries, and we wore flannel with our ripped jeans (I’m told that grunge fashion is on its way back, by the way). Ah, nostalgia. Why did we set hostnames on our servers back then? So we didn’t need to remember what IP the VAX 4000 had when logging into it, of course! Back then, we didn’t have that many systems, so we SAs often chose some theme for our handful of machines: cartoon characters, animals, Greek gods, whatever.
Later on, we got more machines, so we started running out of Greek gods to name our machines, and anyway, of the 35 production database servers, can anyone remember whether apollo or athena is the Oracle master? Nope — and maybe we caused outages because of this (those were actual hostnames that got mixed up during one). So we started just naming things generically — maybe web01, web02.qa, etc. or db01.prod, db01.stg, or if you really want to claw your eyes out, some catch-all naming convention like dfwlnxghp1tsp003. Anyway, with this kind of naming system, hostnames started to move beyond IP-to-name association and began to store details about the operating system, location, environment, etc. of the system (“the third TeamSite Linux production server in the Dallas-Fort Worth data center” for those who still have eyeballs). Later on, as this started to become really absurd, we took all the hostnames and details about the machines and shoved them into some asset management system so that we could see that dfwlnxghp1tsp003 has 4 gigabytes of memory but dfwlnxghp1tsp004 only has 2.
Along Came The Cloud
This was all standard practice for a while until the cloud came along in the form of Amazon EC2. In the cloud, we don’t get to choose what IP addresses our machines have; we’re at the mercy of the cloud provider (or OpenStack supervisory processes, etc.). Sure, we can still name our machines if we want, but since they can go away anytime, do we really need to? Hmm… we still need to know which machines perform which function, so the answer is yes, right? Not if we’re using configuration management.
Remember all the information we stuck into the host naming convention that caused our machines to be named dfwlnxghp1tsp003? And then the knowledge we stuck into our asset management system to reflect all the other information we couldn’t fit into the hostname? It’s all in our configuration management system now. So we no longer need to know that system X is in the Dallas-Fort Worth data center; we can simply set an attribute on that record in the config management system, e.g. node.set['location'] = 'dfw'. (Even better if we have figured out how to programmatically determine these attributes, e.g. with a plugin that sets such an attribute upon inspecting the system state like the baseboard management controller, but I’m getting ahead of myself.) At this point, it doesn’t matter whether our box is called dfwlnxghp1tsp003 or ip-10-39-49-225.ec2.internal. Just ask the configuration management system what that box is, what its role is, and what environment it’s in, and our answers will be forthcoming.
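To make that concrete, here is a minimal sketch in plain Ruby of what "asking the system" looks like compared with decoding a hostname. Everything in it is hypothetical: the NODES hash stands in for records a configuration management server would hold, and the describe helper and attribute names are made up for illustration. It is not Chef's API.

```ruby
# Hypothetical node records, standing in for what a configuration
# management server would store. This is NOT Chef's actual API.
NODES = {
  'ip-10-39-49-225.ec2.internal' => {
    'location' => 'dfw',
    'os' => 'linux',
    'role' => 'teamsite',
    'environment' => 'production',
    'memory_gb' => 4
  }
}.freeze

# Look up everything we used to cram into the hostname (and the asset
# database) from the node's attributes instead.
def describe(hostname, nodes = NODES)
  attrs = nodes.fetch(hostname) # raises KeyError for an unknown node
  "#{attrs['role']} (#{attrs['os']}) in #{attrs['location']}, " \
    "#{attrs['environment']}, #{attrs['memory_gb']} GB"
end

puts describe('ip-10-39-49-225.ec2.internal')
# prints: teamsite (linux) in dfw, production, 4 GB
```

The hostname here is just an opaque key. All the meaning lives in the attributes, so nothing breaks if the box is renamed or replaced.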
What if we need to log into these servers and we don’t know their hostnames?
Oh no! It’s 3 a.m. and there’s an emergency in the production environment. I would like to look at the logs of all the Java application servers simultaneously, but you told me I didn’t need to name my hosts, so how do I know which boxes to log into?
Any decent configuration management tool should also have some way for you to query its records and remotely execute commands against the matching machines. Chef has “knife ssh”, which, given a search query, will open parallel SSH sessions to machines that match the criteria and execute a command on them. So to satisfy the foregoing requirement, you might do something like:
knife ssh 'role:app_server AND chef_environment:production' \
  'tail -f /var/log/tomcat5/catalina.out'
I still don’t know the names of the production Java application servers, and I don’t care.
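The idea behind that kind of command is simply "filter the node records, then connect to whatever matches". Here is a rough sketch of the filtering half in plain Ruby, assuming a hypothetical in-memory INVENTORY and a made-up search helper. It is not how knife or the Chef Server search index is implemented, and it only prints the addresses it would connect to.

```ruby
# Hypothetical inventory records; a real setup would query the
# configuration management server rather than a hard-coded list.
Node = Struct.new(:name, :role, :environment, :ipaddress)

INVENTORY = [
  Node.new('ip-10-39-49-225.ec2.internal', 'app_server', 'production', '10.39.49.225'),
  Node.new('ip-10-39-50-12.ec2.internal',  'app_server', 'staging',    '10.39.50.12'),
  Node.new('ip-10-39-51-7.ec2.internal',   'database',   'production', '10.39.51.7'),
  Node.new('ip-10-39-49-230.ec2.internal', 'app_server', 'production', '10.39.49.230')
].freeze

# Return the nodes whose fields match every key/value pair in criteria,
# i.e. the criteria are ANDed together.
def search(inventory, criteria)
  inventory.select do |node|
    criteria.all? { |field, wanted| node[field] == wanted }
  end
end

targets = search(INVENTORY, { role: 'app_server', environment: 'production' })
targets.each { |node| puts "would ssh to #{node.ipaddress}" }
```

Notice that the names never appear in the query or in the output we act on. Role and environment pick the machines, and the address is all we need to reach them.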
I haz a data center? All this cloud business doesn’t apply to me?!
Sure it does. Even if you’re racking-and-stacking machines, you still don’t need to name them anything meaningful. Name them something random and put those records in DNS, because it’s good practice to have forward and reverse DNS records, but then completely forget what they’re named. Don’t think about it again, because you have configuration management.
Benefits of not having manually-created DNS records
If you’re lucky enough to be using a cloud-type solution where your DNS records are autogenerated for you by your provider, then just use those records. Not having to register each host in DNS yourself removes a whole maintenance burden, and it also lets you create and destroy entire stacks of machines (e.g. using Amazon CloudFormation) with one less thing to worry about.
Hopefully I’ve sufficiently made the case for not bothering with DNS when using configuration management. Of course, it requires your configuration management solution of choice to have one single, real-time source of truth, like the Chef Server database & search index, but if you have one of those, it basically acts like a CMDB. At that point, managing DNS is just unnecessary overhead.
For other things you don’t need to bother with when using CM with disposable infrastructure, watch Mike Fiedler’s talk at ChefConf 2013 titled “Treating Your Infrastructure Like Garbage”.