The gmetad process on my Ganglia server has been a bit finicky lately. Periodically it segfaults, which prevents new metrics from making their way into the RRD databases it manages:
[14745149.528104] gmetad[24286]: segfault at 0 ip 00007fb498c413c1 sp 00007fb48db40358 error 4 in libc-2.17.so[7fb498ade000+1b7000]
Luckily, the gmetad service runs under systemd, which provides a Restart directive to revive failed processes. You can take advantage of this nifty feature by adding “Restart=always” to the [Service] section of your unit files:
$ cat /usr/lib/systemd/system/gmetad.service
[Unit]
Description=Ganglia Meta Daemon
After=network.target
[Service]
Restart=always
ExecStart=/usr/sbin/gmetad -d 1
[Install]
WantedBy=multi-user.target
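One caveat: /usr/lib/systemd/system is where packages install their unit files, so a package update can overwrite a hand-edited gmetad.service. A drop-in under /etc/systemd/system/gmetad.service.d/ should survive updates. Here is a minimal sketch of how I would set that up. The function name write_restart_dropin, the file name restart.conf and the 5 second RestartSec are my own choices for illustration, not anything Ganglia ships:

```shell
# Write a drop-in that layers Restart= settings on top of the packaged unit.
# write_restart_dropin and restart.conf are hypothetical names for this sketch.
write_restart_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    # RestartSec=5 is an assumed value: wait 5 seconds before each restart.
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
Restart=always
RestartSec=5
EOF
}

# On a real host (as root), something like:
#   write_restart_dropin /etc/systemd/system/gmetad.service.d
#   systemctl daemon-reload
#   systemctl show gmetad -p Restart
```

systemd only picks up the drop-in after a daemon-reload, and it merges the settings with the packaged unit rather than replacing it. RestartSec is optional; it just adds a short pause between a crash and the restart.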
Now each time gmetad pukes, systemd will automatically restart it. Hopefully I will get some time in the next few weeks to go through the core file to see why it keeps segfaulting. Until then, this band-aid should work rather nicely.