Earlier this year, we were forced to decommission our Cacti installation after the server it was hosted on suffered a catastrophic failure (it literally melted down). The server was an ancient Compaq ProLiant DL320 with an older HP SmartArray RAID controller, so we had no feasible way of recovering either the RRDs or the MySQL database from it.
Nevertheless, we figured our trending needs would be met by another trending solution whose name I will withhold. It does the job of monitoring devices over SNMP just fine, but it cannot collect data from external scripts. We need that ability to monitor things such as the thread states on our Apache servers. Consequently, we have decided to rebuild an instance of Cacti for these needs.
Cacti is a tool that I love and hate. On one hand, it is the natural evolution from MRTG: it is extremely powerful and, thanks to its templating abilities, scales to monitoring many objects. However, with the power comes a great deal of complexity. It’s very difficult to wrap one’s head around the terminology used in the product (data sources, data templates, host templates, graph templates and so on) unless one already has a deep understanding of RRDTool. Worse, there is no easy page of “Cacti recipes” for the user who is already monitoring all the default parameters of a given object and needs to venture further afield.
The Cacti forums are full of very knowledgeable people and contain many examples, but these are poorly organized. Someone has tried to create a compendium of the best Cacti scripts and templates, but it is at best a disorganized collection. What this cries out for is a Cacti Cookbook Wiki.
Nevertheless, as I said, once you climb the learning curve (and what a steep curve it is; I still haven’t climbed it completely), there are some extremely interesting things you can do with Cacti. We are planning to implement the successor to Apachestats to monitor the thread states on our webserver farm, and I will likely be crafting many templated data queries and the like to monitor other aspects of our infrastructure. It’ll be a long road, with much gnashing of teeth, but I hope it’s worth it!
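To give a flavour of the kind of external script I mean, here is a minimal sketch in Python. It is not the Apachestats successor, and it is not what we run. It assumes the web server exposes Apache's mod_status machine-readable output, which includes a "Scoreboard:" line with one character per worker slot. The function names and the short state labels (waiting, sending and so on) are my own inventions for illustration. The output is the space-separated name:value form that Cacti script data input methods accept for multiple fields.

```python
from collections import Counter

# Scoreboard characters as documented by Apache mod_status.
# The short labels on the right are made up for this example.
STATE_NAMES = {
    "_": "waiting",
    "S": "starting",
    "R": "reading",
    "W": "sending",
    "K": "keepalive",
    "D": "dns",
    "C": "closing",
    "L": "logging",
    "G": "finishing",
    "I": "idle_cleanup",
    ".": "open",
}


def count_thread_states(status_text):
    """Count worker states from the text of a mod_status '?auto' page."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            break
    else:
        raise ValueError("no Scoreboard line found")
    counts = Counter(board)
    # Report every known state, including those with a count of zero,
    # so the set of output fields stays the same from poll to poll.
    return {name: counts.get(ch, 0) for ch, name in STATE_NAMES.items()}


def format_for_cacti(states):
    """Render the counts as space-separated name:value pairs."""
    return " ".join(f"{name}:{value}" for name, value in states.items())


if __name__ == "__main__":
    # Hard-coded sample; a real script would fetch the status page
    # from the host that Cacti passes in as an argument.
    sample = "BusyWorkers: 3\nIdleWorkers: 2\nScoreboard: __W.KW..\n"
    print(format_for_cacti(count_thread_states(sample)))
```

In Cacti, each of those names would then be declared as an output field on the data input method, and this is exactly the sort of wiring that a cookbook page would make far less painful.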