Why is the node status not critical but unknown when the node is offline

Started by justrest, February 16, 2023, 11:02:52 AM


justrest

When I manually disconnect a switch, the system quickly detects the SNMP and ICMP errors, but the node status then becomes UNKNOWN and polling of that node appears to stop. The SYS_NODE_DOWN event is never generated, so the node status never becomes CRITICAL. Under what circumstances does a node's status become UNKNOWN, and how can I solve this problem? Thanks a lot!

justrest

NetXMS Server Remote Console V4.3.1 Ready
Enter "help" for command list

netxmsd: show stat
Objects............: 28053
Monitored nodes....: 557
Collectible DCIs...: 783
Active alarms......: 0
Uptime.............: 0 days,  0:07:58

netxmsd: show q
Data collector                  : 0
DCI cache loader                : 0
Template updater                : 0
Database writer                  : 0
Database writer (IData)          : 0
Database writer (raw DCI values) : 17
Event processor                  : 0
Event log writer                : 0
Poller                          : 0
Node discovery poller            : 0
Syslog processor                : 0
Syslog writer                    : 0
Scheduler                        : 0
Windows event processor          : 0
Windows event writer            : 0

netxmsd: sh thr
MAIN
  Threads.............. 24 (24/768)
  Load average......... 0.00 0.01 0.00
  Current load......... 0%
  Usage................ 3%
  Active requests...... 0
  Scheduled requests... 1
  Total requests....... 27
  Thread starts........ 0
  Thread stops......... 0
  Wait time EMA........ 0 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 0 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

AGENT
  Threads.............. 32 (32/256)
  Load average......... 0.00 0.00 0.00
  Current load......... 0%
  Usage................ 12%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 2475
  Thread starts........ 0
  Thread stops......... 0
  Wait time EMA........ 0 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 0 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

NPE
  Threads.............. 1 (1/1024)
  Load average......... 0.00 0.00 0.00
  Current load......... 0%
  Usage................ 0%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 0
  Thread starts........ 0
  Thread stops......... 0
  Wait time EMA........ 0 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 0 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

DATACOLL
  Threads.............. 30 (30/750)
  Load average......... 0.05 0.10 0.04
  Current load......... 0%
  Usage................ 4%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 6136
  Thread starts........ 0
  Thread stops......... 0
  Wait time EMA........ 0 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 0 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

CLIENT
  Threads.............. 16 (16/2048)
  Load average......... 0.00 0.00 0.00
  Current load......... 0%
  Usage................ 0%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 2213
  Thread starts........ 0
  Thread stops......... 0
  Wait time EMA........ 0 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 1 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

SYNCER
  Threads.............. 11 (10/100)
  Load average......... 0.00 0.00 0.00
  Current load......... 0%
  Usage................ 11%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 2646
  Thread starts........ 1
  Thread stops......... 0
  Wait time EMA........ 83 ms
  Wait time SMA........ 39 ms
  Wait time SD......... 24 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

DISCOVERY
  Threads.............. 192 (24/192)
  Load average......... 415.48 1596.52 946.05
  Current load......... 0%
  Usage................ 100%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 5511
  Thread starts........ 168
  Thread stops......... 0
  Wait time EMA........ 242677 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 0 ms
  Queue size EMA....... 904
  Queue size SMA....... 0
  Queue size SD........ 0

POLLERS
  Threads.............. 63 (30/750)
  Load average......... 2.76 36.67 29.92
  Current load......... 0%
  Usage................ 8%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 10282
  Thread starts........ 33
  Thread stops......... 0
  Wait time EMA........ 104 ms
  Wait time SMA........ 84 ms
  Wait time SD......... 226 ms
  Queue size EMA....... 26
  Queue size SMA....... 0
  Queue size SD........ 0

SCHEDULER
  Threads.............. 3 (3/192)
  Load average......... 0.00 0.00 0.00
  Current load......... 0%
  Usage................ 1%
  Active requests...... 0
  Scheduled requests... 1
  Total requests....... 2
  Thread starts........ 0
  Thread stops......... 0
  Wait time EMA........ 0 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 0 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

MOBILE
  Threads.............. 4 (4/256)
  Load average......... 0.00 0.00 0.00
  Current load......... 0%
  Usage................ 1%
  Active requests...... 0
  Scheduled requests... 0
  Total requests....... 0
  Thread starts........ 0
  Thread stops......... 0
  Wait time EMA........ 0 ms
  Wait time SMA........ 0 ms
  Wait time SD......... 0 ms
  Queue size EMA....... 0
  Queue size SMA....... 0
  Queue size SD........ 0

netxmsd:  sh watc
Thread                                          Interval Status
----------------------------------------------------------------------------
Item Poller                                      10      Running
Syncer Thread                                    30      Sleeping
Poll Manager                                    5        Sleeping
Ad hoc scheduler                                5        Sleeping
Recurrent scheduler                              5        Sleeping

netxmsd: sh watchdog
Thread                                          Interval Status
----------------------------------------------------------------------------
Item Poller                                      10      Running
Syncer Thread                                    30      Sleeping
Poll Manager                                    5        Sleeping
Ad hoc scheduler                                5        Sleeping
Recurrent scheduler                              5        Sleeping

netxmsd:  sh poller
Type | Object ID | Object name                    | Status
-----+-----------+--------------------------------+--------------------------
STAT |      7292 | guangyuwang                    | child poll
ICMP |      7292 | guangyuwang                    | awaiting execution
STAT |    24835 | 5HaoXianCT                    | check SNMP
....

...
netxmsd:  show dbcp
0 database connections in use

netxmsd: show dbst
SQL query counters:
  Total .......... 571758
  SELECT ......... 309204
  Non-SELECT ..... 262554
  Long running ... 0
  Failed ......... 0
Background writer requests:
  DCI data ....... 6842
  DCI raw data ... 6842
  Others ......... 2031
netxmsd: show flags
Flags: 0x200425A001385CF1
  AF_DAEMON                              = 1
  AF_USE_SYSLOG                          = 0
  AF_PASSIVE_NETWORK_DISCOVERY          = 0
  AF_ACTIVE_NETWORK_DISCOVERY            = 0
  AF_ENABLE_8021X_STATUS_POLL            = 1
  AF_DELETE_EMPTY_SUBNETS                = 1
  AF_ENABLE_SNMP_TRAPD                  = 1
  AF_ENABLE_ZONING                      = 1
  AF_SYNC_NODE_NAMES_WITH_DNS            = 0
  AF_CHECK_TRUSTED_OBJECTS              = 0
  AF_ENABLE_NXSL_CONTAINER_FUNCTIONS    = 1
  AF_USE_FQDN_FOR_NODE_NAMES            = 1
  AF_APPLY_TO_DISABLED_DCI_FROM_TEMPLATE = 1
  AF_DEBUG_CONSOLE_DISABLED              = 0
  AF_AUTOBIND_ON_CONF_POLL              = 1
  AF_WRITE_FULL_DUMP                    = 1
  AF_RESOLVE_NODE_NAMES                  = 1
  AF_CATCH_EXCEPTIONS                    = 1
  AF_HELPDESK_LINK_ACTIVE                = 0
  AF_DB_LOCKED                          = 1
  AF_DB_CONNECTION_LOST                  = 0
  AF_NO_NETWORK_CONNECTIVITY            = 0
  AF_EVENT_STORM_DETECTED                = 0
  AF_SNMP_TRAP_DISCOVERY                = 0
  AF_TRAPS_FROM_UNMANAGED_NODES          = 0
  AF_PERFDATA_STORAGE_DRIVER_LOADED      = 0
  AF_BACKGROUND_LOG_WRITER              = 0
  AF_CASE_INSENSITIVE_LOGINS            = 0
  AF_TRAP_SOURCES_IN_ALL_ZONES          = 0
  AF_SYSLOG_DISCOVERY                    = 0
  AF_CACHE_DB_ON_STARTUP                = 1
  AF_ENABLE_NXSL_FILE_IO_FUNCTIONS      = 0
  AF_DB_SUPPORTS_MERGE                  = 1
  AF_PARALLEL_NETWORK_DISCOVERY          = 1
  AF_SINGLE_TABLE_PERF_DATA              = 0
  AF_MERGE_DUPLICATE_NODES              = 1
  AF_SYSTEMD_DAEMON                      = 0
  AF_USE_SYSTEMD_JOURNAL                = 0
  AF_COLLECT_ICMP_STATISTICS            = 1
  AF_LOG_IN_JSON_FORMAT                  = 0
  AF_LOG_TO_STDOUT                      = 0
  AF_DBWRITER_HK_INTERLOCK              = 0
  AF_LOG_ALL_SNMP_TRAPS                  = 0
  AF_ALLOW_TRAP_VARBIND_CONVERSION      = 1
  AF_TSDB_DROP_CHUNKS_V2                = 0
  AF_DISABLE_AGENT_PROBE                = 0
  AF_DISABLE_ETHERNETIP_PROBE            = 0
  AF_DISABLE_SNMP_V1_PROBE              = 0
  AF_DISABLE_SNMP_V2_PROBE              = 0
  AF_DISABLE_SNMP_V3_PROBE              = 0
  AF_DISABLE_SSH_PROBE                  = 0
  AF_SERVER_INITIALIZED                  = 1
  AF_SHUTDOWN                            = 0

Filipp Sudanov

NetXMS has built-in topology-based event correlation. If we know from the topology that a node is behind a switch and that switch goes down, a SYS_NODE_UNREACHABLE event should be generated for that node. SYS_NODE_DOWN is not generated, because with no communication to the node we cannot say for sure whether it is down or up.
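
To illustrate the idea, here is a minimal sketch of that kind of decision. It is not the NetXMS implementation. The function name `classify_failed_node`, the `parent` map, and the `responding` set are all hypothetical, and the topology is assumed to be a simple tree in which each node has at most one upstream device on the path to the server.

```python
def classify_failed_node(node, parent, responding):
    """Pick an event name for a node that stopped answering polls.

    parent     -- hypothetical map: node -> its upstream device (None at the root)
    responding -- hypothetical set of nodes that currently answer polls
    """
    visited = {node}
    hop = parent.get(node)
    # Walk upstream toward the monitoring server.
    while hop is not None and hop not in visited:
        if hop not in responding:
            # Something on the path is down, so the node's real state is unknown.
            return "SYS_NODE_UNREACHABLE"
        visited.add(hop)
        hop = parent.get(hop)
    # Every upstream device answers, so the failure is the node's own.
    return "SYS_NODE_DOWN"


# Illustrative topology: server -> core -> switch1 -> pc1
parent = {"pc1": "switch1", "switch1": "core", "core": None}
responding = {"core"}  # switch1 and pc1 no longer answer

print(classify_failed_node("switch1", parent, responding))
print(classify_failed_node("pc1", parent, responding))
```

In this sketch the switch itself is classified as down because everything upstream of it still answers, while the node behind it is only classified as unreachable.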