Topics - Guillaume

#1
General Support / On Call Schedule
June 10, 2015, 10:02:38 PM
Hello everyone!

I am continuing NetXMS's implementation in our network and I am now facing a big question:

How can I set up On-call alerts, just like Nagios did?

What I want to achieve is:
-Globally set up a user (target e-mail and SMS phone number) who will receive every alert for a given period of time
-Switch between 2 users (the 2 SysAdmins) within a bi-weekly schedule
-Make the email and sms alerts "repeatable" until acknowledged (I could not find the option "repeat event" in the events configuration...)

What I have read that would be possible to do so far with NetXMS:
-Set up 2 DCIs for every DCI that I have so far, with a schedule for each user, so only the DCIs associated with a given user's schedule will fire events to send an email
-Make a script that will receive every alert (SMS and e-mail) and dispatch them to the concerned user

Is there some other solution that I am missing? Because the first one is kind of not efficient and the second one is basically creating an external tool to maintain...
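To make the second option more concrete, here is a minimal sketch of the rotation logic such a dispatch script would need. The contact details, the rotation start date and the function name are all hypothetical, and the sketch does not cover how NetXMS would hand the alert to the script.

```python
from datetime import date

# Hypothetical contact details for the two SysAdmins.
ADMINS = [
    {"name": "admin-a", "email": "admin-a@example.com", "sms": "+15550000001"},
    {"name": "admin-b", "email": "admin-b@example.com", "sms": "+15550000002"},
]

# Assumed first day of the first on-call period.
ROTATION_START = date(2015, 6, 1)
PERIOD_DAYS = 14  # bi-weekly switch


def on_call(day, admins=ADMINS, start=ROTATION_START, period_days=PERIOD_DAYS):
    """Return the admin who is on call on the given day."""
    elapsed = (day - start).days
    if elapsed < 0:
        raise ValueError("day is before the rotation start")
    # Each block of period_days belongs to the next admin in the list.
    return admins[(elapsed // period_days) % len(admins)]
```

The script would call on_call(date.today()) and forward the e-mail or SMS to the returned contact. Repeating the alert until it is acknowledged is not handled here.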

Thanks for any input/suggestions
#2
General Support / [SOLVED] LDAP synchronisation
May 14, 2015, 05:48:53 PM
Hello!

I'm trying to integrate NetXMS with my LDAP environment and things are looking a little ugly so far...

My first issue is with the debugging. I started my netxms server in a screen session with the command "/usr/local/bin/netxmsd" in order to get access to the console. I am then presented with a prompt in which I can type some commands. Since I want to debug the ldap synchronization mechanism, I type "ldapsync" as instructed in the manual (HERE). After the console is done presenting me with various errors (discussed further down this post), I am no longer presented with a command prompt. I have to kill the server instance and restart it in the same fashion in order to get one more try at debugging, which is very annoying.

First question: Is this a bug? How can I attach/re-attach to the console and get a prompt without killing the server instance and then starting it locally?

In the various debug messages produced by the ldapsync command, I noticed a problem in my NetXMS LDAP synchronisation parameters. The exact error is:
[14-May-2015 10:18:18.750] [DEBUG] Found dn: cn=wildfly,ou=groups,dc=myCompany,dc=ca
[14-May-2015 10:18:18.750] [DEBUG] LDAPConnection::fillLists(): cn=wildfly,ou=groups,dc=myCompany,dc=ca is not a user nor a group
[14-May-2015 10:20:51.965] [DEBUG] LDAPConnection::fillLists(): Found dn: cn=steeve,ou=users,dc=myCompany,dc=ca
[14-May-2015 10:20:51.965] [DEBUG] LDAPConnection::fillLists(): cn=steeve,ou=users,dc=myCompany,dc=ca is not a user nor a group


Which means that the various parameters I entered in my LDAP config are wrong. What I can gather from the logs and the various documentation we have internally is that a group is identified by "ou=groups" and a user is identified by "ou=users", whereas an object name is "cn=objectName".
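As an illustration of that reading of the directory layout, the sketch below splits a DN from the log into its components and picks out the "ou" container. The function names are hypothetical, the split is naive (it does not handle escaped commas), and it only shows how the DNs are built. It does not show what NetXMS compares LdapUserClass and LdapGroupClass against.

```python
def split_dn(dn):
    """Split a DN into (attribute, value) pairs, in order. Naive: no escaped commas."""
    pairs = []
    for part in dn.split(","):
        key, _, value = part.strip().partition("=")
        pairs.append((key.lower(), value))
    return pairs


def container_of(dn):
    """Return the value of the first 'ou' component, or None if there is none."""
    for key, value in split_dn(dn):
        if key == "ou":
            return value
    return None
```

With the entries from the log, container_of gives "groups" for the wildfly DN and "users" for the steeve DN.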

Here is what I have so far in the config:
LdapConnectionString:   ldap://olserver1.myCompany.priv:389
LdapGroupClass:   groups
LdapMappingDescription:   description
LdapMappingFullName:   full name
LdapMappingName:   cn
LdapPageSize:   1000
LdapSearchBase:   dc=myCompany,dc=ca
LdapSearchFilter:   (objectClass=*)
LdapSyncInterval:   5
LdapSyncUser:   cn=admin,dc=myCompany,dc=ca
LdapSyncUserPassword:   myPaSsWoRd
LdapUserClass:   users
LdapUserDeleteAction:   0

I am not "that" familiar with the LDAP syntax, and even less with NetXMS's so I would like to know if anyone could help with those 2 issues:
1) The console prompt disappearing after invoking "ldapsync"
2) The configuration mismatch

Thank you all for your dedication to this project, I am sure that we can make it great by poking around issues like this one!
#3
General Support / [SOLVED] Nagios integration
May 11, 2015, 05:25:57 PM
Hello everyone!

I am trying to make the switch from Nagios to NetXMS and things have not been easy. Since the network service part of NetXMS is kind of on and off, I was wondering if it would be possible to use NetXMS's scripting interface to interact with the already-installed nrpe plugin on every server?

The command to interact with such a plugin is quite simple, something like:

/usr/lib64/nagios/plugins/check_nrpe -H webserver -c check_disk_custom -a 20 10 /

Where the syntax is
-H: target host
-c: the custom command to run on the remote server
-a: a list of arguments

In my example, the plugin checks for free disk space on "webserver" on the disk "/" with a 20% and a 10% threshold.
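For illustration, a small wrapper around that command could look like the sketch below. The function names are hypothetical. The exit-code mapping is the standard Nagios plugin convention, and how the result would be fed into NetXMS is left open.

```python
import subprocess

# Path taken from the example command above.
CHECK_NRPE = "/usr/lib64/nagios/plugins/check_nrpe"

# Standard Nagios plugin exit codes.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def build_command(host, command, args=()):
    """Build the check_nrpe argument list for -H, -c and optional -a."""
    cmd = [CHECK_NRPE, "-H", host, "-c", command]
    if args:
        cmd.append("-a")
        cmd.extend(str(a) for a in args)
    return cmd


def run_check(host, command, args=(), timeout=30):
    """Run check_nrpe and return (status_name, first_line_of_output)."""
    result = subprocess.run(
        build_command(host, command, args),
        capture_output=True, text=True, timeout=timeout,
    )
    status = STATUS_NAMES.get(result.returncode, "UNKNOWN")
    lines = result.stdout.strip().splitlines()
    return status, (lines[0] if lines else "")
```

The example above would then be run_check("webserver", "check_disk_custom", [20, 10, "/"]).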

Would such a scripted solution be acceptable and, if so, where/how can I code it?

Thank you!
#4
Hello again netxms community!

After my first problem, I thought that I had fixed everything and that I was able to monitor network services.

I was wrong.

My previously working services are now showing with a value of "4" in the last value panel and a nice red "x" is showing next to the service's name... Needless to say, I am able to access the web page and get the needed information from the monitoring server:
prompt$ curl -Is http://host.com/testDB.php | head  -n 1
HTTP/1.1 200 OK



The only difference between the state of my setup now versus what it was at the time of my first topic is the software version change. I am now using the M4 version of netxms.

Here is the relevant part of the nxagentd.log file, where we can see that a connection is made between localhost and the agent. Then the log shows the proper shared secret and the proper request to send:
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Session with 192.168.108.22 closed
[05-May-2015 13:35:26.207] [DEBUG] Incoming connection from 192.168.108.22
[05-May-2015 13:35:26.207] [DEBUG] Connection from 192.168.108.22 accepted
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Message dump:
  ** 00B60020000000100000000000000000
  ** code=0x00B6 (CMD_GET_NXCP_CAPS) flags=0x0020 id=0 size=16 numFields=0

[05-May-2015 13:35:26.207] [DEBUG] [session:1] Received control message CMD_GET_NXCP_CAPS
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Sending message CMD_NXCP_CAPS (size 16)
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Message dump:
  ** 00400000000000400000000100000002
  ** 00000010030000010000001101000000
  ** 000000180053005E0064006700370044
  ** 00430073006F004C0069004500000000
  ** code=0x0040 (CMD_AUTHENTICATE) flags=0x0000 id=1 size=64 numFields=2
  ** [    16] INT16    1
  ** [    17] STRING   "secret^secret"

[05-May-2015 13:35:26.207] [DEBUG] [session:1] Received message CMD_AUTHENTICATE
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Sending message CMD_REQUEST_COMPLETED (size 32)
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Message dump:
  ** 01340000000000180000000200000001
  ** 0000009703010001
  ** code=0x0134 (CMD_ENABLE_IPV6) flags=0x0000 id=2 size=24 numFields=1
  ** [   151] INT16    1

[05-May-2015 13:35:26.207] [DEBUG] [session:1] Received message CMD_ENABLE_IPV6
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Sending message CMD_REQUEST_COMPLETED (size 32)
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Message dump:
  ** 00730000000000D80000000300000006
  ** 00000008060000000000000000000000
  ** 00000000000000000020000000000000
  ** 00000082030000050000008403000050
  ** 00000083030000060000008501000000
  ** 000000400069006E007400720061006E
  ** 00650074002E007300610076006F0075
  ** 00720061002E0063006F006D003A002F
  ** 007400650073007400440042002E0070
  ** 00680070000000000000008601000000
  ** 0000002E005E0048005400540050002F
  ** 0031005C002E005B00300031005D0020
  ** 003200300030002E002A004F004B002E
  ** 002A000000000000
  ** code=0x0073 (CMD_CHECK_NETWORK_SERVICE) flags=0x0000 id=3 size=216 numFields=6
  ** [     8] INETADDR 0.0.0.0
  ** [   130] INT16    5
  ** [   132] INT16    80
  ** [   131] INT16    6
  ** [   133] STRING   "host.com:/testDB.php"
  ** [   134] STRING   "^HTTP/1\.[01] 200.*OK.*"

[05-May-2015 13:35:26.207] [DEBUG] [session:1] Received message CMD_CHECK_NETWORK_SERVICE
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Sending message CMD_REQUEST_COMPLETED (size 64)
[05-May-2015 13:35:26.207] [DEBUG] [session:1] Session with 192.168.108.22 closed
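For anyone reading the dumps above, the first hex line of each message lines up with the decoded summary line (code, flags, size, id, numFields). Here is a minimal sketch of that decoding. The field layout is inferred only from the dumps in this log, not from the protocol documentation, and the function name is hypothetical.

```python
import struct


def parse_header(hex_line):
    """Decode the first 16 bytes of a dumped message.

    Layout assumed from the log: 2-byte code, 2-byte flags,
    4-byte size, 4-byte id, 4-byte field count, all big-endian.
    """
    raw = bytes.fromhex(hex_line)[:16]
    code, flags, size, msg_id, num_fields = struct.unpack(">HHIII", raw)
    return {"code": code, "flags": flags, "size": size,
            "id": msg_id, "numFields": num_fields}
```

Feeding it the first dump line, 00B60020000000100000000000000000, gives code 0x00B6, flags 0x0020, size 16, id 0 and numFields 0, which matches the summary printed by the agent.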


Is there anything that I could check/provide to get the monitoring of web services up and running once and for all?

Thanks for any input!

**I have replaced the real secret and hostname of the target web service for confidentiality reasons
#5
Hello everyone!

I was wondering if anyone had a solution to monitor a website using a network service. I am currently using a single-server setup to monitor a bunch of other servers (server and agent running on same machine) and I am using the web interface to manage the configurations. Everything is in 2.0-M2 version.

The problem I am encountering is that the status of my network service never updates... I keep getting the same value in the "Last Values" page (currently only value I have is 5) and when I do a manual polling (right click on object, Poll --> Status) I get a not-so-helpful error:
Unable to check service status due to agent or communication error

I tried everything that I could find on the forums regarding how to set up the network service, and so far I am using this (Right click on the service, Properties, Network Service menu):
Service Type: HTTP
Port: 80
Request: http://intranet.website.com:/testDB.php
Response: ^HTTP/1\.[01] 200.*OK*
Poller node: <default>
Requires poll count: 5

**The requested page (testDB.php) is coded to only print "OK" and nothing else. When I do a curl request from the monitoring server to the target page, I get an appropriate response so the page is definitely reachable.
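As a quick sanity check of the Response pattern on its own, the sketch below tests it against a few status lines with Python's re module. This assumes the agent's regex engine treats the pattern the same way Python does, which may not be exactly true. The function name is hypothetical.

```python
import re

# Response pattern exactly as entered in the Network Service properties.
RESPONSE_PATTERN = r"^HTTP/1\.[01] 200.*OK*"


def matches(status_line, pattern=RESPONSE_PATTERN):
    """Return True if the status line matches the response pattern."""
    return re.search(pattern, status_line) is not None
```

Note that the trailing "OK*" means "an O followed by zero or more K", so the pattern also accepts a line such as "HTTP/1.1 200 O". It still matches the "HTTP/1.1 200 OK" line returned by curl, so the pattern itself does not look like the cause of the error.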

I enabled the SubAgent portcheck in /etc/nxagent.conf:
SubAgent = /usr/local/lib/netxms/portcheck.nsm

I also made sure that 127.0.0.1, the IP address of the server and its DNS name were listed after "MasterServers" in the same configuration file.

**EDIT: I triple checked my apache virtual_access_log and the page testDB.php has not been requested today, except by our other monitoring server using nagios. I began the setup of the network service this morning. :'(

I am totally lost on how to make this network service work.... :(

My question, then, is really simple: What should I modify in my configuration to be able to use a network service in order to monitor a webpage?

Thanks for any constructive feedback/comments/solutions!

EDIT FOR SOLUTION:
I found that some problems were in the way of a proper working solution, and to troubleshoot them I had to go through these steps:

  • Set up the agent to run in full debug mode (add a line at the end of nxagentd.conf --> DebugLevel = 9)
  • Enable the portcheck subagent in the agent's configuration (SubAgent = /usr/local/lib/netxms/portcheck.nsm)
  • Enable Subagent autoload in the agent's configuration (EnableSubagentAutoload = yes)
  • Make sure that the secret stated in the agent's config file is the same as in the console (Configuration --> Server Configuration: AgentDefaultSharedSecret key)
  • Disable authentication and encryption in the agent's config file (RequireAuthentication = no and RequireEncryption = no)
  • Disable the use of the shared secret in the properties of the node that has an agent (in my case it was only localhost). In the properties of the node (in the console), there is a dropdown in the "Communications" section called "Authentication method". I set it to "NONE".
Now the network services are properly stating a status of "0" in the Last Values of my node. Thanks a lot to everyone that helped with this issue!
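
Collected in one place, the agent-side settings from the steps above would look roughly like this in the agent's configuration file. Only the lines named in the steps are shown; the shared secret entry and the MasterServers line are left out, and the node's "Authentication method" is changed in the console, not in this file.

```
DebugLevel = 9
SubAgent = /usr/local/lib/netxms/portcheck.nsm
EnableSubagentAutoload = yes
RequireAuthentication = no
RequireEncryption = no
```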