Concepts

Architecture overview

The system has a three-tier architecture: information is collected by monitoring agents (either our own high-performance agents or SNMP agents) and delivered to the monitoring server for processing and storage. Network administrators can access the collected data using the cross-platform Management Console (Rich Console), the Web Interface (Web Console), or the Management Console for Android. The Rich and Web consoles have almost the same functionality and the same user interface.

_images/architecture_scheme.png

Objects

All network infrastructure monitored by NetXMS is represented inside the monitoring system as a set of objects. Each object represents one physical or logical entity (like a host or network interface), or a group of them. Objects are organized into a hierarchical structure. Each object has its own access rights. Access rights are applied hierarchically to all children of an object. For example, if a user is granted the Read access right on a Container, then the user has the Read right on all objects that this Container contains. Every object has a set of attributes; some of them exist for all objects (like id, name, or status), while others depend on the object class – for example, only Node objects have the SNMP community string attribute. There are default attributes and custom attributes defined either by the user or by an external application via the NetXMS API.
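The hierarchical application of access rights can be illustrated with a minimal sketch. The `MonitoredObject` class, its fields, and the single-parent model are assumptions made for illustration only; they are not the NetXMS API, and the real server may resolve rights differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class MonitoredObject:
    """Hypothetical stand-in for a monitored object (not the NetXMS API)."""
    name: str
    parent: Optional["MonitoredObject"] = None  # assumption: one parent only
    # Rights granted directly on this object: user name -> set of rights
    acl: Dict[str, Set[str]] = field(default_factory=dict)

    def effective_rights(self, user: str) -> Set[str]:
        """Union of rights granted on this object and on all its ancestors."""
        rights: Set[str] = set()
        node: Optional[MonitoredObject] = self
        while node is not None:
            rights |= node.acl.get(user, set())
            node = node.parent
        return rights


# Read is granted on the container only; the child node inherits it.
container = MonitoredObject("Branch Office", acl={"alice": {"Read"}})
router = MonitoredObject("router-1", parent=container)

print(router.effective_rights("alice"))  # {'Read'}
print(router.effective_rights("bob"))    # set()
```

Walking up the parent chain keeps the grant in one place (the Container), so adding a new child object requires no extra access configuration.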

NetXMS has six top-level objects – Entire Network, Service Root (named “Infrastructure Services” after system installation), Template Root, Network Map Root, Dashboard Root and Business Service Root. Each of these objects serves as the abstract root of its own object tree. All top-level objects have only one editable attribute – name.

Object Class Description Valid Child Objects
Entire Network Abstract object representing the root of the IP topology tree. All zone and subnet objects are located under it. The system can have only one object of this class.
  • Zone (if zoning enabled)
  • Subnet (if zoning disabled)
Zone Object representing group of (usually interconnected) IP networks without overlapping addresses. Contains appropriate subnet objects.
  • Subnet
Subnet Object representing IP subnet. Typically objects of this class are created automatically by the system to reflect system’s knowledge of IP topology. The system places Node objects inside an appropriate Subnet object based on an interface configuration. Subnet objects have only one editable attribute - Name.
  • Node
Node Object representing physical host or network device (such as a router or network switch). These objects can be created either manually by administrator or automatically during network discovery process. They have a lot of attributes controlling all aspects of interaction between NetXMS server and managed node. For example, the attributes specify what data must be collected, how node status must be checked, which protocol versions to use, etc. Node objects contain one or more interface objects. The system creates interface objects automatically during configuration polls.
  • Interface
  • Access point
  • Network Service
  • VPN Connector
Interface Interface objects represent network interfaces of managed computers and devices. These objects are created automatically by the system during configuration polls or can be created manually by the user.
Access point Object representing a wireless network access point. A node can have several access points, e.g. 2.4 GHz and 5 GHz, or in the case of thin wireless access points managed by a central controller. These objects are created automatically by the system.
Network Service Object representing a network service running on a node (like HTTP or SSH) that is accessible over the network (via TCP/IP). Network Service objects are always created manually. Currently, the system works with the following protocols: HTTP, POP3, SMTP, Telnet, SSH, and the Custom protocol type.
VPN Connector Object representing a VPN tunnel endpoint. Such objects can be created to add VPN tunnels to the network topology known to the NetXMS server. VPN Connector objects are created manually. If a VPN connection linking two different networks is open between two firewalls that are added to the system as objects, a user can create a VPN Connector object on each of the firewall objects and link one to the other. The network topology will then show that those two networks are connected, and the system will take this into account during problem analysis and event correlation.
Service Root Abstract object representing root of your infrastructure service tree. System can have only one object of this class. After system installation it is named “Infrastructure Services”.
  • Cluster
  • Chassis
  • Condition
  • Container
  • Node
  • Sensor
  • Subnet
  • Rack
Container Grouping object which can contain any type of object that Service Root can contain. With the help of container objects you can build an object tree that represents the logical hierarchy of IT services in your organization.
  • Cluster
  • Chassis
  • Condition
  • Container
  • Node
  • Sensor
  • Subnet
  • Rack
Cluster Object representing cluster consisting of two or more nodes. See Cluster monitoring for more information.
  • Node
Rack Object representing a rack. It has the same purpose as a container, but allows you to configure a visual representation of the equipment installed in the rack.
  • Node
  • Chassis
Chassis Object representing a chassis, e.g. a blade server enclosure. Chassis can be configured as a part of a rack.
  • Node
Condition Object representing a complex condition – like “cpu on node1 is overloaded and node2 is down for more than 10 minutes”. Conditions may represent more complicated status checks because each condition can have a script attached. The interval for evaluating condition status is configured in Server Configuration Variables as ConditionPollingInterval, with a default value of 60 seconds.
Template Root Abstract object representing root of your template tree.
  • Template
  • Template Group
Template Group Grouping object which can contain templates or other template groups.
  • Template
  • Template Group
Template Data collection template. See Data collection section for more information about templates.
  • Mobile Device
  • Node
Network Map Root Abstract object representing root of your network map tree.
  • Network Map
  • Network Map Group
Network Map Group Grouping object which can contain network maps or other network map groups.
  • Network Map
  • Network Map Group
Network Map Network map.  
Dashboard Root Abstract object representing root of your dashboard tree.
  • Dashboard
Dashboard Dashboard. Can contain other dashboards.
  • Dashboard
Business Service Root Abstract object representing root of your business service tree. System can have only one object of this class.
  • Business Service
Business Service Object representing single business service. Can contain other business services, node links, or service checks.
  • Business Service
  • Node Link
  • Service Check
Node Link Link between node object and business service. Used to simplify creation of node-related service checks.
  • Service Check
Service Check Object used to check business service state. One business service can contain multiple checks.  
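The "Valid Child Objects" column above can be read as a lookup table. The following is a minimal sketch that encodes part of it as a dictionary and checks whether one object class may be placed under another. The `VALID_CHILDREN` mapping and `can_contain` function are hypothetical names; only a subset of the classes is included, and the server's actual validation logic is not shown in this document.

```python
# Subset of the table above: parent class -> allowed child classes.
_SERVICE_CHILDREN = {
    "Cluster", "Chassis", "Condition", "Container",
    "Node", "Sensor", "Subnet", "Rack",
}

VALID_CHILDREN = {
    "Zone": {"Subnet"},
    "Subnet": {"Node"},
    "Node": {"Interface", "Access point", "Network Service", "VPN Connector"},
    "Service Root": set(_SERVICE_CHILDREN),
    "Container": set(_SERVICE_CHILDREN),
    "Cluster": {"Node"},
    "Rack": {"Node", "Chassis"},
    "Chassis": {"Node"},
}


def can_contain(parent_class: str, child_class: str) -> bool:
    """Return True if the table lists child_class under parent_class."""
    return child_class in VALID_CHILDREN.get(parent_class, set())


print(can_contain("Rack", "Chassis"))       # True
print(can_contain("Cluster", "Container"))  # False
```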

Object status

Each object has a status. The status of an object is calculated based on polling results, the status of underlying objects, associated alarms, and status DCIs. For some object classes, like Report or Template, status is irrelevant; the status of such objects is always Normal. An object’s status can be one of the following:

Nr. Status Description
0 Normal Object is in normal state.
1 Warning Warning(s) exist for the object.
2 Minor Minor problem(s) exist for the object.
3 Major Major problem(s) exist for the object.
4 Critical Critical problem(s) exist for the object.
5 Unknown Object’s status is unknown to the management server.
6 Unmanaged Object is set to “unmanaged” state.
7 Disabled Object is administratively disabled (only applicable to interface objects).
8 Testing Object is in testing state (only applicable to interface objects).
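The numeric codes in the table map naturally onto an enumeration. The sketch below also shows one possible way to derive a parent's status from its children by taking the most severe value. That propagation rule is an assumption for illustration; the server's real calculation also takes polling results, alarms, and status DCIs into account, as described above.

```python
from enum import IntEnum
from typing import Iterable


class ObjectStatus(IntEnum):
    """Status codes as listed in the table above."""
    NORMAL = 0
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4
    UNKNOWN = 5
    UNMANAGED = 6
    DISABLED = 7
    TESTING = 8


def most_critical(statuses: Iterable[ObjectStatus]) -> ObjectStatus:
    """Worst severity among NORMAL..CRITICAL, or UNKNOWN if there is none.

    Assumption: codes above CRITICAL carry no severity and are ignored.
    """
    severities = [s for s in statuses if s <= ObjectStatus.CRITICAL]
    return max(severities, default=ObjectStatus.UNKNOWN)


children = [ObjectStatus.NORMAL, ObjectStatus.MAJOR, ObjectStatus.UNMANAGED]
print(most_critical(children).name)  # MAJOR
```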

Unmanaged status

Objects can be unmanaged. In this status the object is not polled, DCIs are not collected, and no data about the object is updated. This status can be used to store data about an object that is temporarily or permanently unavailable or not managed.

Maintenance mode

This is a special status, which is why it is not included in the status list above. It prevents event processing for a specific node. While in maintenance mode, the node is still polled and DCI data is still collected, but no events are generated.
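The difference between collecting data and generating events can be sketched as follows. The `NodeState` class, the `record_sample` function, and the 90% threshold are hypothetical and serve only to illustrate the behaviour described above: samples are always stored, but events are suppressed while the node is in maintenance mode.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeState:
    """Hypothetical per-node state used only for this illustration."""
    name: str
    in_maintenance: bool = False
    samples: List[float] = field(default_factory=list)
    events: List[str] = field(default_factory=list)


def record_sample(node: NodeState, cpu_usage: float,
                  threshold: float = 90.0) -> None:
    """Store the sample; raise an event only outside maintenance mode."""
    node.samples.append(cpu_usage)  # data collection continues regardless
    if cpu_usage > threshold and not node.in_maintenance:
        node.events.append(f"CPU usage {cpu_usage}% exceeds {threshold}%")


node = NodeState("server-1", in_maintenance=True)
record_sample(node, 95.0)
print(node.samples)  # [95.0]
print(node.events)   # []
```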

Event Processing

NetXMS is an event-based monitoring system. Events can come from different sources: polling processes (status, configuration, discovery, and data collection), SNMP traps, and external applications directly via the client library. All events are forwarded to the NetXMS Event Queue and processed by the NetXMS Event Processor one by one, according to the processing rules defined in the Event Processing Policy. As a result of event processing, preconfigured actions can be executed and/or the event can be shown as an alarm.

Usually an alarm represents something that needs the attention of network administrators or network control center operators, for example low free disk space on a server. NetXMS provides one centralized location, the Alarm Browser, where alarms are visible. You can configure which events are considered important enough to show up as alarms.

_images/event_flow.png

Event flow inside the monitoring system
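The flow described in this section (events enter a queue, are taken one by one, and are matched against policy rules that may create alarms) can be sketched roughly as below. The `Event`, `Rule`, and `process_queue` names, the event names, and the `stop_processing` flag are all assumptions for illustration and do not reflect the server's internal data structures.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Event:
    name: str      # hypothetical event name
    source: str    # object that produced the event
    message: str


@dataclass
class Rule:
    matches: Callable[[Event], bool]
    create_alarm: bool = False
    stop_processing: bool = False  # assumption: a rule may end evaluation


def process_queue(queue: Deque[Event], policy: List[Rule]) -> List[str]:
    """Take events one by one and apply the policy rules in order."""
    alarms: List[str] = []
    while queue:
        event = queue.popleft()
        for rule in policy:
            if not rule.matches(event):
                continue
            if rule.create_alarm:
                alarms.append(f"{event.source}: {event.message}")
            if rule.stop_processing:
                break
    return alarms


policy = [Rule(matches=lambda e: e.name == "DISK_SPACE_LOW", create_alarm=True)]
queue = deque([
    Event("DISK_SPACE_LOW", "server-1", "Low free disk space on /var"),
    Event("INTERFACE_UP", "router-1", "Interface eth0 is up"),
])
alarms = process_queue(queue, policy)
print(alarms)  # ['server-1: Low free disk space on /var']
```

Only the first event matches a rule that creates an alarm; the second is processed and discarded without any visible result.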

Polling

For some types of objects, the NetXMS server starts gathering status and configuration information as soon as they are added to the system. These object types are: nodes, access points, conditions, clusters, business services, and zones (if a zone has more than one proxy, a proxy health check is performed). This process is called polling. There are multiple polling types, usually performed at different intervals:

Type Purpose
Status Determine current status of an object
ICMP Ping nodes and gather response time statistics
Configuration Determine current configuration of an object (list of interfaces, supported protocols, etc.)
Topology Gather information related to network topology
Routing Gather information about IP routing
Instance Discovery Verify all DCIs created by instance discovery process
Network Discovery Search for new nodes by polling known nodes for information about neighbor IP addresses
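Because each polling type runs at its own interval, a scheduler has to decide which polls are due at a given moment. The sketch below shows one simple way to do that. The interval values and the `POLL_INTERVALS` and `due_polls` names are hypothetical; actual intervals are part of the server configuration and are not stated here.

```python
from typing import Dict, List

# Hypothetical intervals in seconds, for illustration only.
POLL_INTERVALS: Dict[str, int] = {
    "status": 60,
    "configuration": 3600,
    "topology": 1800,
}


def due_polls(last_run: Dict[str, float], now: float) -> List[str]:
    """Return the poll types whose interval has elapsed since the last run.

    A poll type missing from last_run has never run and is always due.
    """
    return sorted(
        poll for poll, interval in POLL_INTERVALS.items()
        if now - last_run.get(poll, float("-inf")) >= interval
    )


last_run = {"status": 0.0, "configuration": 0.0, "topology": 0.0}
print(due_polls(last_run, now=120.0))  # ['status']
print(due_polls({}, now=0.0))  # ['configuration', 'status', 'topology']
```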

Data Collection

From each node NetXMS can collect one or more metrics, which can be single-value (“CPU.Usage”), list (“FileSystem.MountPoints”), or table (“FileSystem.Volumes”). When a new data sample is collected, its value is checked against the configured thresholds. This documentation uses the term Data Collection Item (DCI) to describe the configuration of a metric’s collection schedule, retention, and thresholds.
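A DCI, as defined above, bundles a metric with its schedule, retention, and thresholds. The following sketch models that bundle and the threshold check applied to each new sample. The class and field names (`DataCollectionItem`, `polling_interval`, `retention_days`) and the set of comparison operators are assumptions; the real DCI configuration has more options than shown.

```python
import operator
from dataclasses import dataclass, field
from typing import List

# Comparison operators supported by this sketch.
_OPS = {"<": operator.lt, "<=": operator.le,
        ">": operator.gt, ">=": operator.ge}


@dataclass(frozen=True)
class Threshold:
    op: str       # one of the keys of _OPS
    value: float

    def is_violated(self, sample: float) -> bool:
        return _OPS[self.op](sample, self.value)


@dataclass
class DataCollectionItem:
    """Hypothetical model of a DCI: metric plus schedule, retention, thresholds."""
    metric: str
    polling_interval: int   # seconds between samples
    retention_days: int     # how long collected values are kept
    thresholds: List[Threshold] = field(default_factory=list)

    def check(self, sample: float) -> List[Threshold]:
        """Return every configured threshold that the sample violates."""
        return [t for t in self.thresholds if t.is_violated(sample)]


dci = DataCollectionItem("CPU.Usage", polling_interval=60, retention_days=30,
                         thresholds=[Threshold(">", 90.0)])
print(dci.check(95.0))  # [Threshold(op='>', value=90.0)]
print(dci.check(10.0))  # []
```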

Metrics can be collected from multiple data sources:

Source Description
Internal Metrics internal to the server (server statistics, etc.)
NetXMS Agent Data is collected from the NetXMS agent, which should be installed on the target node. The server collects data from the agent on a schedule.
SNMP SNMP transport is used. The server collects data on a schedule.
Push Values are pushed by an external system (using nxpush or API).
SM-CLP  
Windows Performance counters  
Check Point SNMP  
Script Value is generated by an NXSL script. The script should be stored in the Script Library.

Discovery

Network discovery

NetXMS can detect new devices and servers on the network and automatically create node objects for them. Two modes are available – passive and active.

In passive mode the server uses only non-intrusive methods, querying ARP and routing tables from known nodes. Tables from the server running NetXMS are used as the seed for passive discovery.

In active mode, in addition to the passive scan methods, configured address ranges are periodically scanned using ICMP echo requests.
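As an illustration of the active mode, the sketch below expands a configured address range into the list of addresses that would still need to be probed, skipping addresses that are already known. It uses only the standard `ipaddress` module; the `scan_targets` function is a hypothetical helper, and sending the actual ICMP echo requests is deliberately left out.

```python
import ipaddress
from typing import List, Set


def scan_targets(cidr: str, known: Set[str]) -> List[str]:
    """List host addresses in the range that are not yet known nodes."""
    network = ipaddress.ip_network(cidr)
    return [str(host) for host in network.hosts() if str(host) not in known]


# 192.0.2.0/30 has two usable host addresses: .1 and .2
print(scan_targets("192.0.2.0/30", known={"192.0.2.1"}))  # ['192.0.2.2']
```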

Instance discovery

NetXMS can create parameters for Data Collection Items automatically. Instance discovery collects information about node instances such as disk mount points or device lists and automatically creates or removes DCIs based on the obtained data.
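Creating and removing DCIs from discovered instances amounts to comparing two sets: the instances that already have a DCI and the instances found by the latest discovery run. A minimal sketch, assuming a hypothetical `reconcile_instances` helper and mount points as the instance type:

```python
from typing import List, Set, Tuple


def reconcile_instances(existing: Set[str],
                        discovered: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (instances needing a new DCI, instances whose DCI is obsolete)."""
    to_create = sorted(discovered - existing)
    to_remove = sorted(existing - discovered)
    return to_create, to_remove


existing = {"/", "/home", "/mnt/usb"}
discovered = {"/", "/home", "/var"}
to_create, to_remove = reconcile_instances(existing, discovered)
print(to_create)  # ['/var']
print(to_remove)  # ['/mnt/usb']
```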

Security

All communications are authenticated and encrypted using AES-256, AES-128, or Blowfish. As an additional security measure, the administrator can restrict the list of allowed ciphers.

Agents authenticate incoming connections using an IP whitelist and an optional preshared key.

User passwords (if the internal database is used) are hashed with salt using SHA-256.
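The general idea of salted SHA-256 hashing can be sketched with the Python standard library. The salt length, the order in which salt and password are combined, and the function names are assumptions for illustration; this is not the storage format used by the NetXMS server.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str,
                  salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(8)  # assumption: 8-byte random salt
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("s3cret")
print(verify_password("s3cret", salt, stored))  # True
print(verify_password("wrong", salt, stored))   # False
```

The salt ensures that two users with the same password end up with different stored digests.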

All shared secrets and passwords stored in the system can be obfuscated to prevent snooping.