It hurts to admit it, but this is just another preview and not something that is ready to be released to the general public! But don’t think that I took it easy: I spent a lot of time cleaning up and restructuring the code. I worked on new features and some fundamental issues, and then I thought: instead of going for 3.4, which is nearing its end of life, why not aim for 4.0, which will become the next LTS?
In hindsight, I am not upset with this decision, but unfortunately it added a lot more complexity to the job and that is why it is not done! This is where you could come in: I am looking for help on the frontend part. Please read on if you are interested and let me know!
If you don’t know what Action Simulator is and why it is great, consider reading the introductory post first.
Zabbix network discovery is a way to scan a network range and automatically start monitoring the discovered devices and services. Network discovery can look for basic services like HTTP, IMAP or SMTP. It can also scan the specified network range for SNMP, Zabbix agent or ICMP ping responses.
Constant network scanning is not desirable, so discovery is usually set to run infrequently: anywhere from once every few hours to once every few days. If you have added a new Zabbix action for network discovery that links a fresh template, there is no way to see when the discovery rule will run next, and no way to force it to run sooner.
About a year ago, I was inspired by Raymond Kuiper’s Zen and the Art of Zabbix Template Design, and decided to start applying its practices to my own environment. If you’re not familiar with it, I would strongly recommend looking at the linked page, as well as his video presentation on the subject. The idea is to create small “Task” templates for very specialized purposes and group those into more generalized “Duty” templates. The “Duty” templates are in turn grouped into even more generalized “Role” templates, which describe the overall function of the server, and those finally nest inside a “Profile” template that is applied to the monitored server. The process creates an extremely modular system that is very easy to manage. One of the key concepts is to rely on discovery as much as possible to maintain this modularity.
One of the first places I put this into practice was monitoring Windows services. I had envisioned several task templates for monitoring Windows services: one for DFS (Distributed File System) services, one for SQL services, one for Skype For Business services, etc. The only functional difference between them would be that the filters used in each discovery rule would pick just the services for that particular task.
In practice, doing just this created a problem: while Zabbix does have a very nice Windows service.discovery item, it can only exist once per server. It was easy to create the Tplt::Task::Win::Services::DFS and Tplt::Task::Win::Services::SQL “Task” templates. However, trying to link both of them into the same parent “Duty” template Tplt::Duty::MSSQLServer gave an error like this:
- Discovery rule "service.discovery" already exists on ...
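To make the intended per-task filtering concrete, here is a minimal Python sketch of the idea. It is not how Zabbix evaluates filters internally. The sample service list and the two regular expressions are assumptions made for illustration; only the template names and the {#SERVICE.NAME} macro come from Zabbix and the text above.

```python
import re

# Sample discovery rows in the low-level discovery style.
# The service list is illustrative, not real discovery output.
DISCOVERED = [
    {"{#SERVICE.NAME}": "MSSQLSERVER"},
    {"{#SERVICE.NAME}": "SQLSERVERAGENT"},
    {"{#SERVICE.NAME}": "DFSR"},
    {"{#SERVICE.NAME}": "Spooler"},
]

# Hypothetical filters: one regular expression per "Task" template.
TASK_FILTERS = {
    "Tplt::Task::Win::Services::DFS": r"^DFS",
    "Tplt::Task::Win::Services::SQL": r"SQL",
}


def services_for_task(task, discovered=DISCOVERED):
    """Return the service names that the given task's filter would keep."""
    pattern = re.compile(TASK_FILTERS[task], re.IGNORECASE)
    return [
        row["{#SERVICE.NAME}"]
        for row in discovered
        if pattern.search(row["{#SERVICE.NAME}"])
    ]
```

Each task sees the same discovery data and differs only in its filter, which is why the templates were expected to be near-identical copies of one another.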
One common question we often hear on IRC is how to make hosts depend on their proxy. The problem is that when a proxy is unable to send data to the server for a long enough time, the nodata() triggers on hosts monitored by the proxy start to fire.
A possible end result is illustrated perfectly by quoting one of the users in #zabbix @ irc.freenode.net:
<someuser> lets say one of my large proxies falls behind. I can receive like 2k mails.
Obviously, receiving such a quantity of notifications doesn’t help you correct the issue.
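The effect of such a dependency can be sketched in a few lines of Python. This only models the logic and is not Zabbix's trigger engine; the function name, the data structures and the single shared `max_age` threshold are all assumptions made for the example.

```python
def notifications(hosts, last_data, proxy_last_seen, now, max_age):
    """Return the notifications to send, collapsing hosts behind a stale proxy.

    hosts           maps host name  -> name of the proxy monitoring it
    last_data       maps host name  -> timestamp of the last received value
    proxy_last_seen maps proxy name -> timestamp of the last proxy contact
    """
    # A proxy is considered stale when it has been silent longer than max_age.
    stale_proxies = {
        proxy for proxy, seen in proxy_last_seen.items() if now - seen > max_age
    }
    # One message per stale proxy instead of one per host behind it.
    messages = [
        f"proxy {proxy} has not reported for over {max_age}s"
        for proxy in sorted(stale_proxies)
    ]
    # Hosts only alert on missing data when their proxy itself is healthy.
    for host, proxy in sorted(hosts.items()):
        if now - last_data[host] > max_age and proxy not in stale_proxies:
            messages.append(f"no data from host {host}")
    return messages
```

With two thousand silent hosts behind one stale proxy, this model produces a single message about the proxy rather than two thousand messages about the hosts.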
Zabbix agent is easy to extend for data collection with a feature called userparameters. We figured out how they work in the article Create your own items – extend the agent with userparameters. Unfortunately, userparameters sometimes don’t work right away, and the cause is not always obvious. In this article we will explore the most common issues in more detail and learn to use simple methods to debug Zabbix agent userparameters.
Zabbix supports Linux agent auto-registration and it’s a well documented process in the Zabbix manual. However, it’s not really straightforward to mass provision Zabbix agents on hundreds of servers if you want to have Zabbix agent <-> Zabbix server communication encrypted. At least not without some form of additional scripted step in your installation process. For large deployments I usually use Ansible, although this article just briefly covers that part and I’ll mostly focus on how to make sure your agents get registered for encrypted communication.
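As a rough idea of what such a scripted step might produce, the Python sketch below generates a pre-shared key and the matching agent configuration lines for one host. It assumes PSK-based encryption, which this excerpt does not specify (certificates are the other option). The function name, the PSK file path and the choice to reuse the hostname as the PSK identity are hypothetical; the configuration parameter names are standard Zabbix agent parameters.

```python
import secrets


def agent_tls_config(hostname, server, metadata, psk_file="/etc/zabbix/agent.psk"):
    """Return (psk, config_text) for one agent using PSK encryption.

    The psk_file path is a hypothetical location; the caller is expected to
    write the returned key there with restrictive permissions.
    """
    # 32 random bytes rendered as 64 hexadecimal digits (a 256-bit key).
    psk = secrets.token_hex(32)
    lines = [
        f"Hostname={hostname}",
        f"ServerActive={server}",    # active checks are what trigger auto-registration
        f"HostMetadata={metadata}",  # matched by the auto-registration action
        "TLSConnect=psk",
        "TLSAccept=psk",
        f"TLSPSKIdentity={hostname}",  # assumption: identity reuses the hostname
        f"TLSPSKFile={psk_file}",
    ]
    return psk, "\n".join(lines) + "\n"
```

A provisioning tool such as Ansible could then template the same lines into the agent configuration on each server and store the key in the file named by TLSPSKFile.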
Zabbix supports many different ways of monitoring, including agentless, SNMP and IPMI. Zabbix also provides a monitoring agent, which has a great set of built-in items for monitoring disk space, processes, memory usage and many other things.
While the list of built-in items is growing with each release, there will always be something else we will want to monitor. Luckily, Zabbix agent is very easy to extend with new items by using a feature called userparameters. A userparameter is a command that the agent runs, and whose output the agent returns as the item value.
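As an illustration, here is a minimal sketch of a script that could sit behind a userparameter. It prints one number, the count of regular files in a directory. The script, its path and the item key below are hypothetical and chosen only for the example.

```python
#!/usr/bin/env python3
"""Print the number of regular files in a directory as a single value."""
import os
import sys


def count_files(path):
    """Return the number of regular files directly inside path."""
    return sum(1 for entry in os.scandir(path) if entry.is_file())


if __name__ == "__main__" and len(sys.argv) > 1:
    # The agent passes the item key parameter as the first argument.
    print(count_files(sys.argv[1]))
```

Assuming the script is saved as /usr/local/bin/count_files.py, a line such as `UserParameter=custom.dir.files[*],/usr/local/bin/count_files.py $1` in the agent configuration would expose it as an item, so that a key like custom.dir.files[/var/spool/myapp] returns the file count for that directory.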
Applications – software applications – mean a specific thing in IT. Usually, that’s a user-oriented piece of software. A web browser, word processor, game. Lately, mobile applications have somewhat lowered the bar, even down to I Am Rich applications.
And then there’s Zabbix with its own definition for applications. So what are applications in Zabbix?
If Zabbix keeps on surprising you with its notifications, you might want to try the Action Simulator! The Action Simulator is a community patch that helps you to figure out whether your actions really do as you intend. It first came out for Zabbix 2.0 in 2013 and was downloaded by hundreds of users from all around the world.
The following article gives a brief introduction to the Action Simulator and explains the challenges of developing it for Zabbix 3.2.
Testing new Zabbix items, triggers, actions, etc. is always easier on a separate test instance, which is why we have a few test Zabbix servers. These test servers are usually behind our firewall, but a few weeks ago we found that one test instance wasn’t. To make things even worse, it had the default admin credentials. This wasn’t a big issue, because it was isolated from the rest of our hosts, but what happened on that server was interesting.
We found out that the server was compromised because it was using 100% CPU. The process using all the CPU was one we had never seen before and none of us had ever configured, and of course it was run by the zabbix user. We killed it instantly, and after some digging around we found out that the executable was an agent for a data mining service on which you can rent computing power to run tasks.