Server Monitoring with Cacti + ServerStats

I have been using Cacti, an RRDTool-based graphing solution, to monitor my servers/VPS over the last year or two (and before that, MRTG). My needs are simple: I just need graphs of several server metrics, and Cacti fits the bill perfectly when it is well supplied with data. Cacti is installed on one of my servers, and it pulls stats from my other servers all around the globe at 5-minute intervals.

Data collection from the actual servers is another story. Usually I need:

  • Number of bytes transmitted and received on all network interfaces except local.
  • Current load average over 1 minute, 5 minutes and 15 minutes.
  • Current memory usage.

Net-SNMP is often used on the servers together with Cacti to supply the data for the graphs, and it is available on many operating systems. It uses SNMP over UDP, and with proper (and often extensive) configuration, you can retrieve a lot of information about the server from a remote SNMP client.

However, over the years I have had some issues with Net-SNMP running on my servers, especially on VPSes with a small amount of memory.

  1. It is too complicated for what I need. I think the default configuration already covers everything I need, but a large number of packages have to be installed or built just to get a basic snmpd running.
  2. It uses too much memory. On a typical 32-bit Linux box, snmpd takes around 4MB of resident memory when it starts up. Not a lot, but I would rather give that to the MySQL query cache.
  3. It leaks memory, badly. Every version of Net-SNMP I have tried over the last year or so, from 5.2 to 5.4, leaks memory. I have to restart snmpd once a week, as the resident memory can grow to 50MB. The situation is bad on 32-bit boxes and worse on 64-bit boxes.

So I decided to stop using Net-SNMP to collect stats on the servers. Since I do not need all the flexibility of SNMP, and Cacti has a nice Data Input Methods mechanism that lets you write your own data collection agents, I decided to write a simple replacement myself.

Result? serverstats (version 0.1, 6KB tarball). Sorry about the uncreative name :) It is a server stats collection program written in around 170 lines of C (i.e. tiny) and invoked from xinetd. It works only on Linux, as it reads extensively from files under the /proc directory.
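To give a feel for what reading from /proc involves, here is a rough Python sketch of the three kinds of parsing listed earlier. It is not the serverstats C source, and the function names are made up for illustration. It assumes the usual Linux layouts of /proc/loadavg, /proc/net/dev and /proc/meminfo.

```python
# Illustrative only: hypothetical helpers, not taken from the serverstats tarball.

def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_net_dev(text):
    """Sum received and transmitted bytes over all interfaces except loopback."""
    rx_total = tx_total = 0
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, counters = line.split(":", 1)
        if name.strip() == "lo":
            continue  # skip the local interface
        fields = counters.split()
        rx_total += int(fields[0])  # first counter: bytes received
        tx_total += int(fields[8])  # ninth counter: bytes transmitted
    return rx_total, tx_total


def parse_meminfo(text, keys=("MemFree", "Buffers", "Cached")):
    """Return the selected /proc/meminfo values in kilobytes."""
    wanted = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in keys:
            wanted[name] = int(rest.split()[0])
    return wanted


def read_proc(path):
    """Read a /proc file as text (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a Linux box, `parse_loadavg(read_proc("/proc/loadavg"))` would return the three load averages. The parsers take plain text so they can be tried on sample input anywhere.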

Requirements

On Linux boxes to be monitored, serverstats needs to be compiled, installed and invoked from xinetd. You’ll need

  • gcc (3.3/3.4/4.x tested)
  • pcre (6.x/7.x tested)
  • xinetd (2.3.13/2.3.14 tested)

Sorry, I could have written my own tokeniser, but I was lazy and used pcre :)

On the box where Cacti is installed, you’ll need a working Cacti installation (web server, PHP and MySQL) and netcat.

Installation

On the Linux boxes to be monitored:

  1. Download the serverstats tarball and extract all the files.
  2. make && make install

It should install both /usr/local/bin/serverstats and /etc/xinetd.d/serverstats. Try to run serverstats from the command line to see whether it gives back useful data.

$ /usr/local/bin/serverstats
loadavg-1:0.07 loadavg-5:0.06 loadavg-15:0.07 net-rx:26740961501 net-tx:46202108165 mem-memfree:39816 mem-buffers:8240 mem-cached:67044

It should give back your load averages, RX and TX in bytes on all interfaces except local, and finally the free/buffers/cached memory in kilobytes. The next step is to have it invoked from xinetd. Check /etc/xinetd.d/serverstats and change “disable” to “no”. It listens on TCP port 9087 by default, but feel free to bind it to whatever port suits you. Make sure your xinetd accepts connections from the server running Cacti. Now reload xinetd to pick up the new configuration.
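The space-separated name:value line printed by serverstats is easy to consume from a script, which is handy for checking a box by hand before involving Cacti. Here is a minimal Python sketch of that. `parse_stats` is a hypothetical helper and is not part of the tarball.

```python
def parse_stats(line):
    """Split 'name:value name:value ...' into a dict of floats."""
    stats = {}
    for token in line.split():
        name, sep, value = token.rpartition(":")
        if not sep:
            raise ValueError("expected name:value, got %r" % token)
        stats[name] = float(value)
    return stats


# The sample output shown earlier in this post.
sample = ("loadavg-1:0.07 loadavg-5:0.06 loadavg-15:0.07 "
          "net-rx:26740961501 net-tx:46202108165 "
          "mem-memfree:39816 mem-buffers:8240 mem-cached:67044")
stats = parse_stats(sample)
print(sorted(stats))
```

Every value is kept as a float for simplicity. The byte counters in the sample are small enough to be represented exactly.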

On the box running Cacti:

  1. Install a web server, PHP and MySQL, and make sure Cacti is working. Check the Cacti manual for details.
  2. Make sure netcat is installed.
  3. Log into Cacti, click on “Import Templates”, and import the file “cacti_host_template_serverstats_host.xml” which can be found in the serverstats tarball.

The template contains a host template that includes 3 graphs — load average, network RX/TX and memory usage. It also imports a Data Input Method called “ServerStats Query” that calls “nc -w 5 <host> 9087” to query the remote servers. Change the port number if serverstats listens on a different port.
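If you want to check from the Cacti box that a remote serverstats answers, without going through Cacti, something like the following Python sketch does roughly what the nc call does: connect, read until the server closes the connection, and give up after a timeout. The function name is made up, and the default port and 5-second timeout simply mirror the values above.

```python
import socket


def query_serverstats(host, port=9087, timeout=5.0):
    """Connect to host:port, read everything until EOF, return it as text."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break  # the server closed the connection
            chunks.append(data)
    return b"".join(chunks).decode("ascii", "replace").strip()
```

Calling `query_serverstats("vps1.example.com")` (a placeholder hostname) should return the same single line you get when running serverstats locally on that box.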

Monitor Servers

In Cacti, click on “Devices”, and then “Add”. Fill in a name and hostname, and under Host Template choose “ServerStats Host”. After the new device has been added, you will be presented with options to create graphs. Create the graphs you want (loadavg, network or memory usage), and you have just created a device where data is collected by serverstats.

You can now add the graphs to Graph Tree. Here are some sample graphs monitoring the VPS I have with GPLHost.

Load Average

Network Traffic

Memory Usage

OpenVZ Memory Usage

On an OpenVZ VPS, serverstats will use privvmpages to determine the amount of free memory. However, as /proc/user_beancounters can only be read by root, you might wish to modify your xinetd configuration file to execute serverstats as root rather than nobody. If it cannot read /proc/user_beancounters, it will fall back to /proc/meminfo, which might not contain the correct memory usage.
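As an illustration of the privvmpages idea, here is a small Python sketch that pulls the privvmpages row out of /proc/user_beancounters content. It rests on several assumptions that may not match the C code: the usual column order (held, maxheld, barrier, limit, failcnt), free memory computed as barrier minus held, and 4KB pages. The function name is hypothetical.

```python
def privvmpages_free_kb(text, page_kb=4):
    """Estimate free memory in KB as (barrier - held) pages of privvmpages.

    Assumes rows of: resource held maxheld barrier limit failcnt.
    Returns None if no privvmpages row is found.
    """
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].endswith(":"):
            tokens = tokens[1:]  # drop a leading "uid:" or "Version:" field
        if len(tokens) >= 6 and tokens[0] == "privvmpages":
            held, barrier = int(tokens[1]), int(tokens[3])
            return (barrier - held) * page_kb
    return None
```

Returning `None` leaves room for the caller to fall back to /proc/meminfo, in the same spirit as described above. On containers with no effective barrier the number will be very large and not meaningful.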

Security

It is also a good idea to block access to the port opened by xinetd so that only your Cacti server can connect and collect the data. You can either use an xinetd configuration directive (only_from, for example) or a firewall (iptables) to block unwanted access.
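For the xinetd route, a service entry along these lines would restrict access to a single address. This is only a sketch using standard xinetd attributes. The file shipped in the tarball may well look different, and 192.0.2.10 is a placeholder for the address of your Cacti server.

```
# Sketch of /etc/xinetd.d/serverstats; compare with the file installed by "make install".
service serverstats
{
    type        = UNLISTED          # the port is not listed in /etc/services
    socket_type = stream
    protocol    = tcp
    port        = 9087
    wait        = no
    user        = nobody            # use root if /proc/user_beancounters must be read
    server      = /usr/local/bin/serverstats
    only_from   = 192.0.2.10        # placeholder: your Cacti server
    disable     = no
}
```

Reload xinetd after editing the file, then confirm that a connection from any other host is refused.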

Comments

Gravatar

Neat Scotty, looks handy. I have found net-snmp to be flaky myself.

Gravatar

Have you ever heard of munin?

Gravatar

Sure. Too bloated for me :)

There are many open source agents out there for data collection, but I just need something simple and light that takes as little memory as possible on my VPS.

Gravatar

Too bloated… can you give me details on how you came to this assumption?

In any case, there are other reasons for utilising existing technologies rather than “yet another …”, and there are also valid reasons for creating new ones, but I don’t see how this project has an extensive gain over something like munin, considering they both produce similar graphs using similar/same libs, they both pull from proc, etc.

Munin has the added benefit of monitoring lots of servers from one location and just running the node portion of code everywhere, which would make it less bloated, wouldn’t it?

Gravatar

“Munin has the added benefit of monitoring lots of servers from one location.”

So does Cacti, and so do most Net-SNMP-based monitoring tools. However, when you have a tiny VPS with as little as 64MB RAM, and you don’t want anything resident or anything coded in a scripting language (where you need to factor in the cost of running the VM), then you definitely want something less bloated running on the nodes.

“there are also valid reasons for creating new ones, but I don’t see how this project has an extensive gain over something like munin”

Neither do I.

However, it is not a project. It’s a half-a-day hack scratching my own itches on server monitoring, and it is not meant to compete with any existing projects out there that are meant to be all things to all men. Scratching my own itches: would that be a valid enough reason to do something myself? :)

Consider the code here released into the public domain. I am just writing a blog post on how I resolved my server monitoring/graphing issues, so take the code if you like :)

Gravatar

How much RAM is your solution chewing through exactly?

root 2187 0.0 0.9 6232 4780 ? Ss Oct26 0:10 /usr/sbin/munin-node

sshd chews more resources than a munin-node… (and munin itself runs from cron)…

Gravatar

You might need to check your configuration if your sshd chews more memory than 4.7MB VmRss :)

As for my hack, each invocation is around 600KB RSS, but I imagine the majority is libc and libpcre, as the code is only around 170 lines of C. It’s invoked from xinetd, which I already run on all my servers.

Gravatar

I liked your solution but I’m getting an error from Cacti while importing the template:

Error: XML: Hash version does not exist.

Thanks

Francisco

Gravatar

It is usually caused by importing a Cacti template created with a newer version into an older version. I used Cacti 0.8.6j to create the template export XML. What version of Cacti are you running?

Gravatar

Hi, I have this problem. My version of Cacti is 0.8.6h and I need to import the template.

How can I resolve this problem?
