Website monitoring using network service

Started by Guillaume, April 10, 2015, 12:04:38 AM

Guillaume

Hello everyone!

I was wondering if anyone had a solution to monitor a website using a network service. I am currently using a single-server setup to monitor a bunch of other servers (server and agent running on the same machine) and I am using the web interface to manage the configurations. Everything is version 2.0-M2.

The problem I am encountering is that the status of my network service never updates. I keep getting the same value on the "Last Values" page (currently the only value I have is 5), and when I do a manual poll (right-click the object, Poll --> Status) I get a not-so-helpful error:
Unable to check service status due to agent or communication error

I tried everything that I could find on the forums regarding how to set up the network service, and so far I am using this (right-click the service, Properties, Network Service menu):
Service Type: HTTP
Port: 80
Request: http://intranet.website.com:/testDB.php
Response: ^HTTP/1\.[01] 200.*OK*
Poller node: <default>
Requires poll count: 5

**The requested page (testDB.php) is coded to only print "OK" and nothing else. When I do a curl request from the monitoring server to the target page, I get an appropriate response so the page is definitely reachable.
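
For reference, the manual curl check described above can be approximated in Python. This is only a sketch of what a port-check style HTTP probe might do (send a plain GET, then match the raw response against a regular expression); it is not the portcheck subagent's actual implementation. The names check_http and response_matches and the 5-second timeout are illustrative assumptions.

```python
import re
import socket


def response_matches(raw_response: bytes, pattern: str) -> bool:
    """Return True if the raw HTTP response matches the regular expression."""
    # latin-1 maps every byte to a character, so decoding never fails.
    text = raw_response.decode("latin-1")
    return re.search(pattern, text) is not None


def check_http(host: str, port: int, path: str, pattern: str,
               timeout: float = 5.0) -> bool:
    """Send a minimal HTTP/1.0 GET and test the response against pattern."""
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return response_matches(b"".join(chunks), pattern)
```

Run from the monitoring server, a call such as check_http("intranet.website.com", 80, "/testDB.php", r"^HTTP/1\.[01] 200.*OK.*") should return True when the page is reachable and answers with status 200, which separates "the page is unreachable" from "the agent never sent the request".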

I enabled the portcheck SubAgent in /etc/nxagentd.conf:
SubAgent = /usr/local/lib/netxms/portcheck.nsm

I also made sure that 127.0.0.1, the IP address of the server and its DNS name were listed after "MasterServers" in the same configuration file.

**EDIT: I triple checked my apache virtual_access_log and the page testDB.php has not been requested today, except by our other monitoring server using nagios. I began the setup of the network service this morning. :'(

I am totally lost on how to make this network service work.... :(

My question, then, is really simple: What should I modify in my configuration to be able to use a network service in order to monitor a webpage?

Thanks for any constructive feedback/comments/solutions!

EDIT FOR SOLUTION:
I found that some problems were in the way of a proper working solution, and to troubleshoot them I had to go through these steps:

  • Set up the agent to run in full debug mode (add a line at the end of nxagentd.conf --> DebugLevel = 9)
  • Enable the portcheck subagent in the agent's configuration (SubAgent = /usr/local/lib/netxms/portcheck.nsm)
  • Enable Subagent autoload in the agent's configuration (EnableSubagentAutoload = yes)
  • Make sure that the secret stated in the agent's config file is the same as in the console (Configuration --> Server Configuration: AgentDefaultSharedSecret key)
  • Disable authentication and encryption in the agent's config file (RequireAuthentication = no and RequireEncryption = no)
  • Disable the use of the shared secret in the properties of the node that has an agent (in my case it was only localhost). In the properties of the node (in the console), there is a dropdown in the "Communications" section called "Authentication method". I set it to "NONE".
Now the network services are properly stating a status of "0" in the Last Values of my node. Thanks a lot to everyone that helped with this issue!
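
The checklist above can be partly automated. The sketch below parses an agent configuration as flat "key = value" lines and flags the problems this thread ran into. It rests on several assumptions: that "#" starts a comment, that an absent RequireAuthentication means "no", and that the secret is stored under a key named SharedSecret (the thread only says "the secret stated in the agent's config file"). The function names are hypothetical.

```python
def parse_agent_config(text):
    """Collect key = value pairs; a key may repeat (e.g. SubAgent)."""
    settings = {}
    for raw_line in text.splitlines():
        # Naive comment stripping: a value containing '#' would be cut short.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings.setdefault(key.strip().lower(), []).append(value.strip())
    return settings


def last_value(settings, key, default=""):
    """Return the last value given for key, or default if the key is absent."""
    values = settings.get(key.lower())
    return values[-1] if values else default


def audit(settings):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    subagents = settings.get("subagent", [])
    if not any(v.endswith("portcheck.nsm") for v in subagents):
        problems.append("portcheck subagent is not listed")
    if not settings.get("masterservers"):
        problems.append("MasterServers is not set")
    # Assumption: authentication is off unless explicitly set to "yes".
    requires_auth = last_value(settings, "RequireAuthentication", "no").lower() == "yes"
    # Assumption: the secret lives under a key named SharedSecret.
    if requires_auth and not last_value(settings, "SharedSecret"):
        problems.append("RequireAuthentication = yes but no SharedSecret is set")
    return problems


SAMPLE = """
MasterServers = 127.0.0.1
SubAgent = /usr/local/lib/netxms/portcheck.nsm
EnableSubagentAutoload = yes
RequireAuthentication = no
RequireEncryption = no
DebugLevel = 9
"""

if __name__ == "__main__":
    print(audit(parse_agent_config(SAMPLE)))
```

The check cannot see the node properties in the console, so the "Authentication method" dropdown still has to be compared by hand.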

djex81

Hi Guillaume,

Did you read this forum post? https://www.netxms.org/forum/configuration/web-site-monitoring/

Try this. Set the Response to use ^HTTP/1\.[01] 200.*

There might be an issue with it finding the OK on the testDB.php response.

Victor Kirhenshtein

Hi,

your request should be

intranet.website.com:/testDB.php

(without http:// prefix). Also, there is a typo in response regexp, likely it should be

^HTTP/1\.[01] 200.*OK.*

(note the dot before second asterisk) - otherwise it will match anything with just letter O.
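
The difference between the two expressions can be seen with a quick test. This sketch uses Python's re module as a stand-in, so details of the agent's own regex engine may differ; the sample status lines are made up for illustration.

```python
import re

# Original pattern: "K*" means "zero or more K", so a lone "O" is enough.
LOOSE = re.compile(r"^HTTP/1\.[01] 200.*OK*")
# Corrected pattern: a literal "OK", followed by anything.
STRICT = re.compile(r"^HTTP/1\.[01] 200.*OK.*")


def classify(status_line):
    """Return (matches_loose, matches_strict) for one response line."""
    return (LOOSE.search(status_line) is not None,
            STRICT.search(status_line) is not None)


if __name__ == "__main__":
    for sample in ("HTTP/1.1 200 OK",          # matches both patterns
                   "HTTP/1.0 200 Other",       # hypothetical: has "O" but no "OK"
                   "HTTP/1.1 404 Not Found"):  # matches neither
        print(sample, classify(sample))
```

Only the second sample tells the two patterns apart: the loose one accepts it because of the capital "O" in "Other", the strict one rejects it.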

Best regards,
Victor

Guillaume

Hi,

First of all thank you all for your prompt response! The community seems pretty friendly around here!

@djex81: I had already read that post, but thanks for pointing it out; it was my main source of information while troubleshooting this issue before posting here.

I made the suggested corrections (removing "http://" and adding the dot before the last asterisk) and tried to poll the service again to no avail. I also tried restarting both the server and the agent but it did not change the problem. I also manually added the file "/var/netxms/registry.dat" and made sure the agent had the permission to write in it (I had a warning about it before in the agent's log).

I then tried removing the "OK.*" part at the end but I still get "Unable to check service status due to agent or communication error" and a value of 5 in the "Last Values" of the server. Apache's virtual_access_log also does not receive any request originating from the monitoring server.

Anything else that I could check to help debug this issue?  :-\

Victor Kirhenshtein

Hi,

Do you have the portcheck subagent loaded on the agent running on the management server? Can you please share the nxagentd.conf from your management server?

Best regards,
Victor

Guillaume

Hello again!

I can indeed link the config of the agent. I noticed in the process of linking the file that the "EnableSubagentAutoload" was not set to yes. I modified this switch but so far to no avail.

To answer your question more specifically:
- I enabled the port check subagent with the switch:
"SubAgent = /usr/local/lib/netxms/portcheck.nsm"
- I made sure to add the server's address in the config (as stated in the original post)

Tatjana Dubrovica

Do you have this line in agent log when server starts: "[15-Apr-2015 23:39:31.379] [INFO ] Subagent "portcheck.nsm" loaded successfully"? Debug level 6.

Is <default> selected as a Poller Node in network service preferences?

Guillaume

Hello,

I do have that line in the log when I start the agent:
[16-Apr-2015 09:56:12.699] Log file opened
[16-Apr-2015 09:56:12.699] [INFO ] Additional configs was loaded from /etc/nxagentd.conf.d
[16-Apr-2015 09:56:12.699] [INFO ] Debug level set to 0
[16-Apr-2015 09:56:13.706] [INFO ] Subagent "linux.nsm" loaded successfully
[16-Apr-2015 09:56:13.789] [INFO ] Subagent "/usr/local/lib/netxms/portcheck.nsm" loaded successfully
[16-Apr-2015 09:56:14.790] [INFO ] Listening on socket 0.0.0.0:4700
[16-Apr-2015 09:56:15.791] [INFO ] NetXMS Agent started

Do I have to explicitly set the debug level to 6 in order to get something else?

EDIT: I added the "DebugLevel = 6" to my nxagentd.conf and here is the startup sequence:

[16-Apr-2015 10:01:25.838] Log file opened
[16-Apr-2015 10:01:25.839] [INFO ] Additional configs was loaded from /etc/nxagentd.conf.d
[16-Apr-2015 10:01:25.839] [INFO ] Debug level set to 6
[16-Apr-2015 10:01:25.839] [DEBUG] Data directory: /var/netxms
[16-Apr-2015 10:01:25.839] [DEBUG] Subagent API initialized
[16-Apr-2015 10:01:25.839] [DEBUG] Validating ciphers
[16-Apr-2015 10:01:25.840] [DEBUG]    AES-256 enabled
[16-Apr-2015 10:01:25.840] [DEBUG]    Blowfish-256 enabled
[16-Apr-2015 10:01:25.840] [DEBUG]    IDEA enabled
[16-Apr-2015 10:01:25.840] [DEBUG]    3DES enabled
[16-Apr-2015 10:01:25.840] [DEBUG]    AES-128 enabled
[16-Apr-2015 10:01:25.840] [DEBUG]    Blowfish-128 enabled
[16-Apr-2015 10:01:25.840] [DEBUG] Crypto library initialized
[16-Apr-2015 10:01:26.847] [DEBUG] Linux: using /sys/block to distinguish devices from partitions
[16-Apr-2015 10:01:26.848] [INFO ] Subagent "linux.nsm" loaded successfully
[16-Apr-2015 10:01:26.848] [DEBUG] Debug callback set for DB library
[16-Apr-2015 10:01:26.853] [INFO ] Subagent "/usr/local/lib/netxms/portcheck.nsm" loaded successfully
[16-Apr-2015 10:01:26.865] [DEBUG] Unable to parse /proc/drbd, DRBD data collector will not start
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=fd0 isRealDevice=1)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=sda isRealDevice=1)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=sda1 isRealDevice=0)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=sda2 isRealDevice=0)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=sr0 isRealDevice=1)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=dm-0 isRealDevice=1)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=dm-1 isRealDevice=1)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=dm-2 isRealDevice=1)
[16-Apr-2015 10:01:26.866] [DEBUG] ParseIoStat(): new device added (name=dm-3 isRealDevice=1)
[16-Apr-2015 10:01:27.860] [DEBUG] External parameters providers poller thread will not start
[16-Apr-2015 10:01:27.860] [DEBUG] Session agent connector disabled
[16-Apr-2015 10:01:27.861] [DEBUG] Trying to bind on 0.0.0.0:4700
[16-Apr-2015 10:01:27.861] [INFO ] Listening on socket 0.0.0.0:4700
[16-Apr-2015 10:01:28.866] [INFO ] NetXMS Agent started

I wonder what "[16-Apr-2015 10:01:26.865] [DEBUG] Unable to parse /proc/drbd, DRBD data collector will not start" means and what impact it has on the proper working of the agent?

Also, to answer your second question, the poller node is set to <default> in the web interface of the network service's properties.

Victor Kirhenshtein

Hi,

message about DRBD is normal if you don't have DRBD configured. It will not impact agent operations.

Configuration looks correct. Could you please run agent with debug level 9, run status poll on node with network service and share agent's log?

Best regards,
Victor

Guillaume

There you go

[17-Apr-2015 13:12:48.603] Log file opened
[17-Apr-2015 13:12:48.603] [INFO ] Additional configs was loaded from /etc/nxagentd.conf.d
[17-Apr-2015 13:12:48.603] [INFO ] Debug level set to 9
[17-Apr-2015 13:12:48.604] [DEBUG] Data directory: /var/netxms
[17-Apr-2015 13:12:48.604] [DEBUG] Subagent API initialized
[17-Apr-2015 13:12:48.604] [DEBUG] Validating ciphers
[17-Apr-2015 13:12:48.605] [DEBUG]    AES-256 enabled
[17-Apr-2015 13:12:48.605] [DEBUG]    Blowfish-256 enabled
[17-Apr-2015 13:12:48.605] [DEBUG]    IDEA enabled
[17-Apr-2015 13:12:48.605] [DEBUG]    3DES enabled
[17-Apr-2015 13:12:48.605] [DEBUG]    AES-128 enabled
[17-Apr-2015 13:12:48.605] [DEBUG]    Blowfish-128 enabled
[17-Apr-2015 13:12:48.605] [DEBUG] Crypto library initialized
[17-Apr-2015 13:12:49.611] [DEBUG] Linux: using /sys/block to distinguish devices from partitions
[17-Apr-2015 13:12:49.612] [INFO ] Subagent "linux.nsm" loaded successfully
[17-Apr-2015 13:12:49.612] [DEBUG] Debug callback set for DB library
[17-Apr-2015 13:12:49.617] [INFO ] Subagent "/usr/local/lib/netxms/portcheck.nsm" loaded successfully
[17-Apr-2015 13:12:49.630] [DEBUG] Unable to parse /proc/drbd, DRBD data collector will not start
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=fd0 isRealDevice=1)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=sda isRealDevice=1)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=sda1 isRealDevice=0)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=sda2 isRealDevice=0)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=sr0 isRealDevice=1)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=dm-0 isRealDevice=1)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=dm-1 isRealDevice=1)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=dm-2 isRealDevice=1)
[17-Apr-2015 13:12:49.631] [DEBUG] ParseIoStat(): new device added (name=dm-3 isRealDevice=1)
[17-Apr-2015 13:12:50.625] [DEBUG] External parameters providers poller thread will not start
[17-Apr-2015 13:12:50.625] [DEBUG] Session agent connector disabled
[17-Apr-2015 13:12:50.627] [DEBUG] Trying to bind on 0.0.0.0:4700
[17-Apr-2015 13:12:50.627] [INFO ] Listening on socket 0.0.0.0:4700
[17-Apr-2015 13:12:51.637] [INFO ] NetXMS Agent started
[17-Apr-2015 13:13:53.890] [DEBUG] Incoming connection from 192.168.108.22
[17-Apr-2015 13:13:53.890] [DEBUG] Connection from 192.168.108.22 accepted
[17-Apr-2015 13:13:53.891] [DEBUG] [session:0] Message dump:
  ** 00B60020000000100000000000000000
  ** code=0x00B6 (CMD_GET_NXCP_CAPS) flags=0x0020 id=0 size=16 numFields=0

[17-Apr-2015 13:13:53.891] [DEBUG] [session:0] Received control message CMD_GET_NXCP_CAPS
[17-Apr-2015 13:13:53.891] [DEBUG] [session:0] Sending message CMD_NXCP_CAPS (size 16)
[17-Apr-2015 13:13:53.893] [DEBUG] [session:0] Message dump:
  ** 00030000000000100000000100000000
  ** code=0x0003 (CMD_KEEPALIVE) flags=0x0000 id=1 size=16 numFields=0

[17-Apr-2015 13:13:53.893] [DEBUG] [session:0] Received message CMD_KEEPALIVE
[17-Apr-2015 13:13:53.893] [DEBUG] [session:0] Authentication required
[17-Apr-2015 13:13:53.893] [DEBUG] [session:0] Sending message CMD_REQUEST_COMPLETED (size 32)
[17-Apr-2015 13:13:53.894] [DEBUG] [session:0] Session with 192.168.108.22 closed
[17-Apr-2015 13:13:53.894] [DEBUG] Incoming connection from 192.168.108.22
[17-Apr-2015 13:13:53.894] [DEBUG] Connection from 192.168.108.22 accepted
[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Message dump:
  ** 00B60020000000100000000000000000
  ** code=0x00B6 (CMD_GET_NXCP_CAPS) flags=0x0020 id=0 size=16 numFields=0

[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Received control message CMD_GET_NXCP_CAPS
[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Sending message CMD_NXCP_CAPS (size 16)
[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Message dump:
  ** 00400000000000380000000200000002
  ** 00000010030000030000001104000000
  ** 000000141441D6E35A5F72FD21CCF23A
  ** 5D07BB65C05565F6
  ** code=0x0040 (CMD_AUTHENTICATE) flags=0x0000 id=2 size=56 numFields=2
  ** [    16] INT16  3
  ** [    17] BINARY len=20

[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Received message CMD_AUTHENTICATE
[17-Apr-2015 13:13:53.895] [WARN ] Authentication failed for peer 192.168.108.22, method: SHA1
[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Sending message CMD_REQUEST_COMPLETED (size 32)
[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Session with 192.168.108.22 closed


There seems to be some kind of authentication problem between the server and the agent... I will investigate!

If you have any other insights about what the problem might be, your comments are greatly appreciated
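
With debug level 9 the log gets long, so it can help to pull out only the authentication failures. The sketch below matches the exact WARN line format shown in the log above; the function name find_auth_failures is illustrative, and other agent versions may word the message differently.

```python
import re

# Matches lines such as:
# [17-Apr-2015 13:13:53.895] [WARN ] Authentication failed for peer 192.168.108.22, method: SHA1
AUTH_FAILURE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[WARN \] Authentication failed for peer "
    r"(?P<peer>[^,\s]+), method: (?P<method>\S+)"
)


def find_auth_failures(log_text):
    """Return one dict (timestamp, peer, method) per failed authentication."""
    failures = []
    for line in log_text.splitlines():
        match = AUTH_FAILURE.match(line)
        if match:
            failures.append(match.groupdict())
    return failures


SAMPLE_LOG = """\
[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Received message CMD_AUTHENTICATE
[17-Apr-2015 13:13:53.895] [WARN ] Authentication failed for peer 192.168.108.22, method: SHA1
[17-Apr-2015 13:13:53.895] [DEBUG] [session:0] Session with 192.168.108.22 closed
"""

if __name__ == "__main__":
    for failure in find_auth_failures(SAMPLE_LOG):
        print(failure["timestamp"], failure["peer"], failure["method"])
```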

Victor Kirhenshtein

You have set RequireAuthentication to yes in agent config. Did you set correct shared secret in node properties? Also, try to turn off authentication in agent config and in node properties.

Best regards,
Victor

Guillaume

Funny thing, the shared secret was wrong... I fixed it and now I can see some stuff going on. Here is a request on the configured network service:

[17-Apr-2015 13:24:32.123] [DEBUG] [session:1] Received message CMD_KEEPALIVE
[17-Apr-2015 13:24:32.123] [DEBUG] [session:1] Sending message CMD_REQUEST_COMPLETED (size 32)
[17-Apr-2015 13:24:32.124] [DEBUG] [session:1] Message dump:
  ** 00730000000000C80000000300000006
  ** 0000000800000000C0A8791B00000000
  ** 00000082030000050000008403000050
  ** 00000083030000060000008501000000
  ** 0000003E0063006C00690065006E0074
  ** 002E006D0074006900630061006E0061
  ** 00640061002E00630061003A002F0074
  ** 00650073007400440042002E00700068
  ** 00700000000000000000008601000000
  ** 0000002E005E0048005400540050002F
  ** 0031005C002E005B00300031005D0020
  ** 003200300030002E002A004F004B002E
  ** 002A000000000000
  ** code=0x0073 (CMD_CHECK_NETWORK_SERVICE) flags=0x0000 id=3 size=200 numFields=6
  ** [     8] INT32  -1062700773
  ** [   130] INT16  5
  ** [   132] INT16  80
  ** [   131] INT16  6
  ** [   133] STRING "client.mticanada.ca:/testDB.php"
  ** [   134] STRING "^HTTP/1\.[01] 200.*OK.*"

[17-Apr-2015 13:24:32.124] [DEBUG] [session:1] Received message CMD_CHECK_NETWORK_SERVICE
[17-Apr-2015 13:24:32.138] [DEBUG] [session:1] Sending message CMD_REQUEST_COMPLETED (size 64)
[17-Apr-2015 13:24:32.139] [DEBUG] [session:1] Session with 192.168.108.22 closed
[17-Apr-2015 13:24:32.139] [DEBUG] Incoming connection from 192.168.108.22
[17-Apr-2015 13:24:32.139] [DEBUG] Connection from 192.168.108.22 accepted
[17-Apr-2015 13:24:32.139] [DEBUG] [session:1] Message dump:
  ** 00B60020000000100000000000000000
  ** code=0x00B6 (CMD_GET_NXCP_CAPS) flags=0x0020 id=0 size=16 numFields=0


I will try again without the authentication both in the agent and in node
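
The hex dump of CMD_CHECK_NETWORK_SERVICE above can be decoded by hand to confirm what the server actually asked the agent to check. The sketch below is based only on what the log itself shows: the header layout (code, flags, size, id, numFields as big-endian integers in the first 16 bytes) is inferred from the log's own summary line, and the strings appear to be two bytes per character, big-endian. Reading field 8 as an IPv4 address is an assumption, and none of this comes from protocol documentation.

```python
import ipaddress
import struct

# First 16 bytes of the dump.
HEADER_HEX = "00730000000000C80000000300000006"

# Payload of field 133 (the request string), copied from the dump
# without its 4-byte length prefix (0000003E = 62 bytes).
REQUEST_HEX = (
    "0063006C00690065006E0074"
    "002E006D0074006900630061006E0061"
    "00640061002E00630061003A002F0074"
    "00650073007400440042002E00700068"
    "0070"
)


def parse_header(hex_text):
    """Split the 16-byte header into its five big-endian fields."""
    code, flags, size, msg_id, num_fields = struct.unpack(
        ">HHIII", bytes.fromhex(hex_text))
    return {"code": code, "flags": flags, "size": size,
            "id": msg_id, "num_fields": num_fields}


def decode_string(hex_text):
    """Decode a string payload stored as two bytes per character."""
    return bytes.fromhex(hex_text).decode("utf-16-be")


def int32_as_ipv4(value):
    """Reinterpret a signed 32-bit integer as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value & 0xFFFFFFFF))


if __name__ == "__main__":
    print(parse_header(HEADER_HEX))
    print(decode_string(REQUEST_HEX))
    print(int32_as_ipv4(-1062700773))  # field 8 from the dump
```

The decoded header agrees with the summary line in the log (code 0x0073, size 200, id 3, 6 fields), and the decoded request is the host:/path string without the http:// prefix, so the corrected request did reach the agent.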

Victor Kirhenshtein

Communication is OK now; you won't see any difference with or without authentication. Do you see any difference in the status poll? Also, which version are you using?

Best regards,
Victor

Guillaume

Every authentication mechanism is disabled (Shared secret is no longer relevant) and I get some cryptic stuff in the log:

[17-Apr-2015 13:33:09.232] [DEBUG] Incoming connection from 192.168.108.22
[17-Apr-2015 13:33:09.233] [DEBUG] Connection from 192.168.108.22 accepted
[17-Apr-2015 13:33:09.233] [DEBUG] [session:0] Message dump:
  ** 00B60020000000100000000000000000
  ** code=0x00B6 (CMD_GET_NXCP_CAPS) flags=0x0020 id=0 size=16 numFields=0

[17-Apr-2015 13:33:09.233] [DEBUG] [session:0] Received control message CMD_GET_NXCP_CAPS
[17-Apr-2015 13:33:09.233] [DEBUG] [session:0] Sending message CMD_NXCP_CAPS (size 16)
[17-Apr-2015 13:33:09.233] [DEBUG] [session:0] Message dump:
  ** 00400000000000380000000F00000002
  ** 00000010030000030000001104000000
  ** 0000001459FB535A1BDE48131CB5A9FB
  ** C0BFCD4CABEBC931
  ** code=0x0040 (CMD_AUTHENTICATE) flags=0x0000 id=15 size=56 numFields=2
  ** [    16] INT16  3
  ** [    17] BINARY len=20

[17-Apr-2015 13:33:09.233] [DEBUG] [session:0] Received message CMD_AUTHENTICATE
[17-Apr-2015 13:33:09.233] [DEBUG] [session:0] Sending message CMD_REQUEST_COMPLETED (size 32)
[17-Apr-2015 13:33:09.234] [DEBUG] [session:0] Session with 192.168.108.22 closed
[17-Apr-2015 13:33:28.338] [DEBUG] Incoming connection from 192.168.108.22
[17-Apr-2015 13:33:28.338] [DEBUG] Connection from 192.168.108.22 accepted
[17-Apr-2015 13:33:28.338] [DEBUG] [session:0] Message dump:
  ** 00B60020000000100000000000000000
  ** code=0x00B6 (CMD_GET_NXCP_CAPS) flags=0x0020 id=0 size=16 numFields=0

[17-Apr-2015 13:33:28.338] [DEBUG] [session:0] Received control message CMD_GET_NXCP_CAPS
[17-Apr-2015 13:33:28.338] [DEBUG] [session:0] Sending message CMD_NXCP_CAPS (size 16)
[17-Apr-2015 13:33:28.338] [DEBUG] [session:0] Message dump:
  ** 00400000000000380000001100000002
  ** 00000010030000030000001104000000
  ** 0000001459FB535A1BDE48131CB5A9FB
  ** C0BFCD4CABEBC931
  ** code=0x0040 (CMD_AUTHENTICATE) flags=0x0000 id=17 size=56 numFields=2
  ** [    16] INT16  3
  ** [    17] BINARY len=20

[17-Apr-2015 13:33:28.338] [DEBUG] [session:0] Received message CMD_AUTHENTICATE
[17-Apr-2015 13:33:28.338] [DEBUG] [session:0] Sending message CMD_REQUEST_COMPLETED (size 32)
[17-Apr-2015 13:33:28.339] [DEBUG] [session:0] Session with 192.168.108.22 closed


I attached the full agent log (start, loading of modules, poll, etc) just in case I am missing something because at this point I am not sure what to look for in these logs :P

EDIT: This is the M2 version of "everything" (server, agent and console). Also, the poll is still showing a communication problem with the agent and last value is still at 5

Victor Kirhenshtein

Authentication settings in the agent config and in the node properties should match: either enable authentication on both sides or disable it on both sides. From this log it seems that you turned off authentication on the agent but left it enabled in the node properties.

Best regards,
Victor
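
The rule in the reply above can be written down as a one-line check. This sketch encodes only that rule of thumb, not the agent's actual behavior; the function name is hypothetical, "NONE" is the dropdown value mentioned in the first post, and "SHA1" is the method that appears in the agent log.

```python
def auth_settings_consistent(agent_requires_auth: bool, node_auth_method: str) -> bool:
    """True when both sides agree on whether authentication is used."""
    node_uses_auth = node_auth_method.strip().upper() != "NONE"
    return agent_requires_auth == node_uses_auth


if __name__ == "__main__":
    # The combination from the last log: agent has authentication off,
    # node properties still send a SHA1-based CMD_AUTHENTICATE.
    print(auth_settings_consistent(False, "SHA1"))
    # The working setup from the first post: off on both sides.
    print(auth_settings_consistent(False, "NONE"))
```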