This post helps you troubleshoot the most common NCM issues: jobs that do not run or get stuck, failures while downloading device configurations, and common errors seen while running NCM jobs, using the NCM logs.
Please follow the steps and recommendations carefully, and let us know in the comments section whether this post helped to address your issue.
Your checklist
NCM version, system hardware, and latest hotfix (HF) installed
Disable Session Trace
Clear Temp folder
Clear pending reboot
Disable Config Archive
Increase CLI timeout
Reduce the number of configs retained
1-NCM System & Hardware Sockets
Make sure you are on NCM 7.9 or have upgraded to the latest released NCM 8.0 version.
Please audit your environment and make sure no bottleneck on your side is causing the issue.
How to check the server hardware using Orion Platform diagnostics
Open Task Manager on the Orion server > Performance tab > CPU, and make sure at least 4 sockets are available.
Make sure you are on the recommended hardware.
For more details, please see my Thwack post below.
https://thwack.solarwinds.com/docs/DOC-190027
Check the hotfixes (HF) on the Customer Portal and make sure you have the latest HF installed.
https://customerportal.solarwinds.com/HotFixes
2-Disable Session Trace
If you run jobs for a large network, keeping Session Trace on is not recommended: it consumes CPU and memory on the system and slows down config downloads. Disable it and keep the trace folder clear.
To disable Session Tracing:
- Open the Orion Web Console.
- Go to:
7.6 and older: Settings > All Settings > NCM Settings > Advanced Settings
7.7 and newer: Settings > All Settings > Product Specific Settings > CLI Settings
- Clear the Enable Session Tracing check box.
- Go to the trace log location:
7.6 and older: C:\ProgramData\SolarWinds\Logs\Orion\NCM\Session-Trace
7.7 and newer: C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
- Delete the trace files.
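If you have many trace files, deleting them can be scripted. The following is only a minimal Python sketch, assuming Python is available on the Orion server; clear_trace_files is a hypothetical helper (not part of NCM) and the folder is the 7.7-and-newer default from the steps above. It defaults to a dry run so you can review the list before anything is removed.

```python
from pathlib import Path

# Assumed default for NCM 7.7 and newer, taken from the steps above.
TRACE_DIR = Path(r"C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace")


def clear_trace_files(trace_dir: Path, dry_run: bool = True) -> list[Path]:
    """Return the files in trace_dir; delete them unless dry_run is True."""
    if not trace_dir.is_dir():
        return []
    files = sorted(p for p in trace_dir.iterdir() if p.is_file())
    if not dry_run:
        for path in files:
            path.unlink()
    return files


if __name__ == "__main__":
    # Dry run: only list what would be deleted.
    for path in clear_trace_files(TRACE_DIR, dry_run=True):
        print("would delete", path)
```

Call clear_trace_files(TRACE_DIR, dry_run=False) once you are happy with the list.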
3-Clear Temp folder
Check disk space on the NCM drives as well as on the SQL server where the NCM DB is stored.
Clear the Windows Temp directory on all polling engines in NPM:
- Log in to the Orion server hosting the Main Polling Engine.
- Stop all Orion services.
- Disable the Antivirus software running on the system.
- Navigate to:
C:\windows\Temp (including the SolarWinds folder)
- Delete all files in the Temp folder.
- Restart all Orion services.
- Restart the system.
- Repeat step 1 through step 7 for all servers hosting an Additional Polling Engine (APE).
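To see whether disk space or an oversized Temp folder is a likely factor before you clean up, a quick check can help. This is a rough Python sketch, assuming Python is installed on the server; folder_size_bytes and free_space_percent are hypothetical helper names, and the Temp path is the one from the steps above.

```python
import shutil
from pathlib import Path


def folder_size_bytes(folder: Path) -> int:
    """Sum the sizes of all regular files under folder (recursively)."""
    total = 0
    for path in folder.rglob("*"):
        try:
            if path.is_file():
                total += path.stat().st_size
        except OSError:
            # Files that are locked or removed while scanning are skipped.
            continue
    return total


def free_space_percent(location: str) -> float:
    """Percentage of free space on the volume that contains location."""
    usage = shutil.disk_usage(location)
    return 100.0 * usage.free / usage.total


if __name__ == "__main__":
    temp = Path(r"C:\Windows\Temp")  # path from the steps above
    drive = "C:\\"
    print(f"Temp folder size: {folder_size_bytes(temp) / 1e6:.1f} MB")
    print(f"Free space on {drive} {free_space_percent(drive):.1f}%")
```

Run it on each polling engine and on the SQL server drive that holds the NCM DB.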
4- Reboot the NCM server (recommended: a pending reboot from Windows updates can cause issues with NCM jobs)
5- Disable Config Archive
Settings > All Settings > Config Settings (disable Config Archive)
6- Increase CLI timeout
Go to Settings > All Settings > CLI Settings, clear the Session Trace check box, and increase the timeout values a little.
7-Reduce the number of configs retained in NCM
Offloading extra load from the NCM DB also helps NCM jobs run faster.
Configs > Jobs > edit the Default Purge Configs Job
Follow the wizard, and under Add Job Specific Details select Delete all configs except for the last 10 days.
Most common issues with NCM jobs
A scheduled job is not running at all, is stuck at 99% or 100%, or shows "Running 100% - Now Post Processing". What should I try after the steps above?
Turn on the NCM job logs first.
If logging is already enabled, take a backup of the log folder and delete all the old files from it.
Settings > NCM Settings > Advanced Settings > (Enable Scheduled Jobs)
Create a new NCM test job, add only one node (from the main poller), and run the job manually. Does it fail?
Now check the same job on its schedule after 10 minutes.
What result do you get? A failure?
If it succeeds, add 10 more nodes to the same job and run it again, and so on up to 50 nodes; then test with up to 100 nodes in the same job.
If you have multiple polling engines, create a separate job for each APE and test in the same way.
On the Orion server, go to the following location and check the log files.
C:\ProgramData\Application Data\SolarWinds\Logs\Orion\NCM\Logging
Check the log files and see whether they contain any errors.
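If the folder holds many log files, a small script can pull out the lines that mention an error. This is a minimal Python sketch, not an official tool: find_error_lines is a hypothetical helper, and it assumes the logs are plain-text files ending in .log and that failures contain the word "error".

```python
from pathlib import Path


def find_error_lines(log_dir: Path, keyword: str = "error") -> dict[str, list[str]]:
    """Map each log file name to its lines containing keyword (case-insensitive)."""
    needle = keyword.lower()
    hits: dict[str, list[str]] = {}
    for log_file in sorted(log_dir.glob("*.log")):
        # errors="replace" keeps the scan going if a file has odd encoding.
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            matches = [line.rstrip("\n") for line in handle if needle in line.lower()]
        if matches:
            hits[log_file.name] = matches
    return hits


if __name__ == "__main__":
    # Folder from the step above; adjust if your logs live elsewhere.
    folder = Path(r"C:\ProgramData\Application Data\SolarWinds\Logs\Orion\NCM\Logging")
    if folder.is_dir():
        for name, lines in find_error_lines(folder).items():
            print(name, len(lines), "matching lines")
```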
Open Support Ticket
Tips and Tricks on opening a Support Ticket with SolarWinds
The NCM inventory job is taking too much time. Which nodes are taking a long time, and where can I see real-time inventory job logs?
Settings > NCM Settings > Advanced Settings > (Enable Scheduled Jobs)
Enable the logging.
Now go to this location:
C:\ProgramData\Solarwinds\Logs\Orion\NCM\Logging
You will find two types of logs. General.log contains the live activity of the inventory job.
The job logs are in the same folder and show how many nodes the inventory job still has to process:
20-6-2019 09:28:46 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 9
20-6-2019 09:30:24 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 8
20-6-2019 09:30:24 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 7
20-6-2019 09:32:15 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 6
20-6-2019 09:32:22 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 5
20-6-2019 09:33:41 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 4
20-6-2019 09:33:41 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 3
20-6-2019 09:33:41 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 2
20-6-2019 09:35:32 - Jobs - ConfigMgmtJob.InventoryComplete on 10.xx.xx.xx Remaining nodes: 1
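To spot slow nodes in a long job log, you can parse lines like the ones above and look at the time between completions. This is a minimal Python sketch that assumes the log format shown here and that the date is day-month-year; parse_progress and gaps_in_seconds are hypothetical helper names. Because nodes may be inventoried in parallel, a long gap is only a hint, not an exact per-node duration.

```python
import re
from datetime import datetime

# Matches the "InventoryComplete" lines shown in the job log sample.
LINE_RE = re.compile(
    r"^(?P<ts>\d{1,2}-\d{1,2}-\d{4} \d{2}:\d{2}:\d{2}) - Jobs - "
    r"ConfigMgmtJob\.InventoryComplete on (?P<node>\S+) "
    r"Remaining nodes: (?P<left>\d+)"
)


def parse_progress(lines):
    """Return (timestamp, node, remaining) tuples for inventory-complete lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            # Assumption: dates are written day-month-year.
            ts = datetime.strptime(match["ts"], "%d-%m-%Y %H:%M:%S")
            events.append((ts, match["node"], int(match["left"])))
    return events


def gaps_in_seconds(events):
    """Seconds between each completion and the one logged before it."""
    return [
        (node, (ts - prev_ts).total_seconds())
        for (prev_ts, _, _), (ts, node, _) in zip(events, events[1:])
    ]
```

Feed it the lines of the job log (for example with open(path).readlines()) and sort the result by the gap to see which completions took longest.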
Please note: review the options under Inventory Settings for your inventory job and select only what you really need.
Try not to poll route tables; uncheck them, because nodes can have very large tables and this can delay completion of the inventory job.
How can I troubleshoot and isolate the inventory job issue?
Create a brand-new NCM inventory job and add only 2 to 5 Cisco nodes to it.
Select only the ARP option to poll.
Run the inventory job, then check the General log and job logs here:
C:\ProgramData\Solarwinds\Logs\Orion\NCM\Logging
Now add more nodes to the same job and check the results.
Keep checking CPU and memory on the Orion NCM server to make sure the job does not create a resource constraint there.
If you still have issues with the job, please
Open Support Ticket
Tips and Tricks on opening a Support Ticket with SolarWinds
A few nodes fail to download config files with a (Connection Refused) or (Connection Timeout) error. What should I check?
Pick a single node and work through the steps below to confirm that the node in question has no connectivity issues.
(If you have multiple pollers, make sure you work on the polling engine the node is assigned to.)
Checking NCM Profile
Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile > "Test". Is the test successful?
Checking NCM Profile with SSH Auto
Also in the Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile
Select SSHAuto >
"Test". Is the test successful?
If the connection fails (make sure you RDP to the same poller the node is assigned to):
Please try the "ConnectionTester.exe" tool from the NCM server and let me know whether it fails as well.
C:\Program Files (x86)\SolarWinds\Orion\NCM\Tools
ConnectionTester.exe
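Alongside ConnectionTester.exe, a plain TCP check from the same poller can tell you whether the device port is refusing the connection or simply not answering. This is a minimal Python sketch, assuming Python is available on the polling engine; check_port is a hypothetical helper, and ports 22 (SSH) and 23 (Telnet) are the standard defaults, which your devices may not use. It only tests TCP reachability, not the login itself.

```python
import socket


def check_port(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Return 'open', 'refused', 'timeout', or 'error: ...' for a TCP connect."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing is listening (or an ACL rejects us).
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all: routing, firewall, or the device is down.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    node = "192.0.2.10"  # placeholder address, replace with the affected node
    for port in (22, 23):
        print(node, port, check_port(node, port))
```

A "refused" result usually points at the device or an access list, while "timeout" points at the path between the poller and the device.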
If you can connect to the node without any issue and NCM still fails to download the configuration,
or
you can connect to the device using PuTTY / SSH or the NCM Connection Tester but the Connection Profile "Test" in NCM fails,
then we have to check the Session Trace for the node.
Enable Session Trace
- Open the Orion Web Console.
- Go to:
7.6 and older: Settings > All Settings > NCM Settings > Advanced Settings
7.7 and newer: Settings > All Settings > Product Specific Settings > CLI Settings
- Select the Enable Session Tracing check box.
- Go to the trace log location:
7.6 and older: C:\ProgramData\SolarWinds\Logs\Orion\NCM\Session-Trace
7.7 and newer: C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
- Delete any old trace files.
Now run the test below and check the Session Trace log file for the error (the error is listed at the bottom of the log file).
Checking NCM Profile
Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile > "Test"
Checking NCM Profile with SSH Auto
Also in the Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile
Select SSHAuto >
"Test"
If you cannot work out from the error why it is failing, you can search for or post the issue on Thwack, or open a support ticket and provide us with the Session Trace file.
Please do not forget to ZIP the Session Trace file.
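Zipping can be done from Windows Explorer, but if you collect traces often it can be scripted. This is a minimal Python sketch using the standard zipfile module; zip_trace_files is a hypothetical helper and the archive name in the example is only a placeholder.

```python
import zipfile
from pathlib import Path


def zip_trace_files(trace_dir: Path, archive: Path) -> list[str]:
    """Write every file in trace_dir into archive; return the names added."""
    names = []
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for path in sorted(trace_dir.iterdir()):
            # Skip sub-folders and the archive itself if it lives in trace_dir.
            if path.is_file() and path.resolve() != archive.resolve():
                bundle.write(path, arcname=path.name)
                names.append(path.name)
    return names


if __name__ == "__main__":
    trace_dir = Path(r"C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace")
    if trace_dir.is_dir():
        # Placeholder output path; pick any folder you can attach files from.
        print(zip_trace_files(trace_dir, Path(r"C:\Temp\session-trace.zip")))
```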
Open Support Ticket
Tips and Tricks on opening a Support Ticket with SolarWinds
Nodes fail to download config files with the error message "connectivity issues, discarding configuration", or "show running" on a Cisco switch returns "% Invalid input detected at '^' marker." What should I check?
Please follow the steps above, and also make sure enable mode is set up for the device up to the # enable level.
Still have the same issue?
In this case we have to check the Session Trace for the node.
Enable Session Trace
- Open the Orion Web Console.
- Go to:
7.6 and older: Settings > All Settings > NCM Settings > Advanced Settings
7.7 and newer: Settings > All Settings > Product Specific Settings > CLI Settings
- Select the Enable Session Tracing check box.
- Go to the trace log location:
7.6 and older: C:\ProgramData\SolarWinds\Logs\Orion\NCM\Session-Trace
7.7 and newer: C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
- Delete any old trace files.
Now run the Connection Profile "Test" described in the previous section and check the Session Trace log file for the error (the error is listed at the bottom of the log file).
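Since the error sits at the bottom of the trace, it can be quicker to print only the last lines of the newest trace file. This is a minimal Python sketch, assuming the trace files are plain text; newest_file and tail_lines are hypothetical helper names.

```python
from collections import deque
from pathlib import Path


def newest_file(folder: Path):
    """Return the most recently modified file in folder, or None if empty."""
    files = [p for p in folder.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime) if files else None


def tail_lines(path: Path, count: int = 20) -> list[str]:
    """Return the last count lines of a text file."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the final lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    trace_dir = Path(r"C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace")
    latest = newest_file(trace_dir) if trace_dir.is_dir() else None
    if latest is not None:
        print("\n".join(tail_lines(latest)))
```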
If you cannot work out from the error why it is failing, you can search for or post the issue on Thwack, or open a support ticket and provide us with the Session Trace file.
Please do not forget to ZIP the Session Trace file.
Open Support Ticket
Tips and Tricks on opening a Support Ticket with SolarWinds
Problems downloading configurations from F5 devices?
Please use the built-in NCM device template for F5 devices.
It is called “F5 Big IP”. Before using it, you need to do some pre-configuration on the device.
Basically, you need to log in to the device web interface and configure the user to log in directly to the Advanced shell right after connecting, instead of tmsh as it is now. Details are there:
With this configuration, full NCM support for the device is enabled and you will even be able to download UCS and SCF configs. You should not have any further issues with the F5 devices.
Nightly Network Inventory failed with some errors
20-6-2019 14:39:59 - Inventory - 10.11.255.253 InventoryNode.SNMPReply: Error: Timeout
20-6-2019 14:39:59 - Inventory - ---------------------------ActiveNodes.InventoryFailure xxxxxxxxxxxxxxxx-192.168.1.1 Failure reason: Does not respond to SNMP queries using 'SNMP_COMMUNITY' Remaining nodes: 40
20-6-2019 14:39:59 - Inventory - 10.11.232.88 InventoryNode.SNMPReply: proceed reply for ,{A57292D6-8797-4CF5-9CE9-FD7DAD71217B},NCM_Entity_Physical,AssetID
20-6-2019 14:39:59 - Inventory - 10.11.232.88 InventoryNode.SNMPReply: OID: 1.3.6.1.2.1.47.1.1.1.1.15.6461 NCM_Entity_Physical:AssetID.6461=
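When the inventory log is long, it helps to list only the nodes that failed and why. This is a minimal Python sketch that assumes the two failure line formats shown above (the SNMPReply "Error:" line and the InventoryFailure line); summarize_failures is a hypothetical helper, and other failure formats would need their own patterns.

```python
import re

# "<node> InventoryNode.SNMPReply: Error: <text>" lines.
SNMP_ERROR_RE = re.compile(
    r"- Inventory - (?P<node>\S+) InventoryNode\.SNMPReply: Error: (?P<error>.+)$"
)
# "ActiveNodes.InventoryFailure <node> Failure reason: <text> Remaining nodes: N" lines.
FAILURE_RE = re.compile(
    r"ActiveNodes\.InventoryFailure (?P<node>\S+) "
    r"Failure reason: (?P<reason>.+?) Remaining nodes: (?P<left>\d+)"
)


def summarize_failures(lines):
    """Map each failing node to the list of reasons found in the log lines."""
    summary = {}
    for line in lines:
        line = line.strip()
        snmp = SNMP_ERROR_RE.search(line)
        if snmp:
            summary.setdefault(snmp["node"], []).append("SNMP error: " + snmp["error"])
            continue
        failure = FAILURE_RE.search(line)
        if failure:
            summary.setdefault(failure["node"], []).append(failure["reason"])
    return summary
```

Nodes reported as not responding to SNMP are worth checking for community string and reachability from the poller before you rerun the job.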
The inventory job shows the logs below instead:
Job Engine: NCM
1-6-2019 23:00:09 : Started Nightly Network Inventory : JobDescription_10024d4c-9963-41d3-9f9a-fe271ab81f19
Please disable this job, create a brand-new job, and add only one node (for example, a Cisco node).
Deselect any Inventory Settings options that are not applicable.
Run the new inventory job and check whether you still get the same error.
Add more nodes to the same job and keep checking the logs.
If you still have the same issue after following all the steps above, please
Open Support Ticket
Tips and Tricks on opening a Support Ticket with SolarWinds
NCM logs and data locations
Where can I see NCM job activity in detail?
You can find the logs in the following location, where you can check and track NCM job activity and any errors.
C:\ProgramData\SolarWinds\Logs\Orion\NCM
NcmBusinessLayerPlugin
NCM.Collector.Jobs
Default location for CLI and Session Trace Logs
C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
C:\ProgramData\SolarWinds\Logs\Orion\CLI
Default location for NCM ASA polling logs
C:\ProgramData\SolarWinds\Logs\Orion\ASA
Default location for NCM vulnerability data
C:\ProgramData\SolarWinds\NCM\Vuln
Default location for config archive
C:\ProgramData\SolarWinds\NCM\Config-Archive
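If you are not sure which of these folders exist on a given server (for example on an APE, or when data was moved to another drive), a quick check can save time. This is a minimal Python sketch; missing_locations is a hypothetical helper and the paths are simply the defaults listed above.

```python
from pathlib import Path

# Default locations listed above; adjust if your data lives on another drive.
NCM_LOCATIONS = {
    "NCM logs": r"C:\ProgramData\SolarWinds\Logs\Orion\NCM",
    "CLI logs": r"C:\ProgramData\SolarWinds\Logs\Orion\CLI",
    "Session trace": r"C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace",
    "ASA polling": r"C:\ProgramData\SolarWinds\Logs\Orion\ASA",
    "Vulnerability data": r"C:\ProgramData\SolarWinds\NCM\Vuln",
    "Config archive": r"C:\ProgramData\SolarWinds\NCM\Config-Archive",
}


def missing_locations(locations: dict[str, str]) -> list[str]:
    """Return the labels whose folder does not exist on this server."""
    return [label for label, folder in locations.items() if not Path(folder).is_dir()]


if __name__ == "__main__":
    for label in missing_locations(NCM_LOCATIONS):
        print("not found:", label, NCM_LOCATIONS[label])
```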
I will add more details and case studies to this guide. Please feel free to share your feedback and I will include it.
Related links
Troubleshooting NCM performance for jobs /devices downloading configs failure
Troubleshot NCM RealTime Change Notification RTN logs / email issues