Channel: THWACK: Document List - Network Performance Monitor

Troubleshooting NCM job performance and config download failures


I will try to address the most common issues with NCM jobs that you can troubleshoot yourself. Please check these areas of the application before you open a support ticket with SolarWinds.

 

 

Your checklist

NCM System Hardware

Disable Session Trace

Clear Temp Folder

Clear pending reboot

Disable Config Archive

Increase CLI Timeout

Reduce the number of configs retained

 

 

1- NCM System Hardware Sockets

Please audit your environment and make sure there is no bottleneck on your side that is causing the issue.

How to check the server hardware using Orion Platform diagnostics

Open Task Manager on the Orion server > Performance tab > CPU, and make sure at least 4 sockets are available.

Make sure you are on the recommended hardware.

For more details, please see my THWACK post below.

https://thwack.solarwinds.com/docs/DOC-190027

 

 

2- Disable Session Trace

In some cases, if you are running jobs for a large network, keeping Session Trace on is not recommended: it consumes CPU and memory on the system and also slows config download progress. For the same reason, keep the trace folder clear.

To disable Session Tracing:

  1. Open the Orion Web Console.
  2. Go to:
    7.6 and older: Settings > All Settings > NCM Settings > Advanced Settings
    7.7 and newer: Settings > All Settings > Product Specific Settings > CLI Settings
  3. Clear the Enable Session Tracing check box.
  4. Go to the trace log location:
    7.6 and older: C:\ProgramData\SolarWinds\Logs\Orion\NCM\Session-Trace
    7.7 and newer: C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
  5. Delete the trace files.
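
If the trace folder fills up regularly, step 5 can be scripted. The following is only a minimal sketch: the path is the 7.7-and-newer location given above, while the `clear_trace_files` helper and its `dry_run` flag are hypothetical names introduced here for illustration. It lists the files by default and only deletes them when `dry_run=False` is passed.

```python
from pathlib import Path

# Trace folder for NCM 7.7 and newer, as given in the steps above.
TRACE_DIR = Path(r"C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace")


def clear_trace_files(folder: Path, dry_run: bool = True) -> list:
    """Delete the files directly inside `folder` and return their paths.

    With dry_run=True nothing is deleted; the files are only listed.
    A missing folder yields an empty list.
    """
    if not folder.is_dir():
        return []
    files = sorted(p for p in folder.iterdir() if p.is_file())
    if not dry_run:
        for path in files:
            path.unlink()
    return files


if __name__ == "__main__":
    # Dry run: show what would be removed without touching anything.
    for path in clear_trace_files(TRACE_DIR, dry_run=True):
        print("would delete:", path)
```

Run it once as a dry run, review the list, and only then call it with `dry_run=False`.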

 

 

 

3- Clear Temp Folder

Check disk space on the NCM drives as well as on the SQL server where the NCM database is stored.

Clear the Windows Temp directory for all polling engines in NPM

  1. Log in to the Orion server hosting the Main Polling Engine.
  2. Stop all Orion services.
  3. Disable the Antivirus software running on the system.
  4. Navigate to:
    C:\windows\Temp (including the SolarWinds folder)
  5. Delete all files in the Temp folder.
  6. Restart all Orion services.
  7. Restart the system.
  8. Repeat step 1 through step 7 for all servers hosting an Additional Polling Engine (APE).
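
The disk space check mentioned at the start of this section can be done quickly from a script on each server. This is a rough sketch using only the Python standard library; the `free_space_gib` and `low_on_space` helpers and any threshold you pass in are assumptions for illustration, not SolarWinds recommendations.

```python
import shutil


def free_space_gib(path: str) -> float:
    """Return the free space on the drive containing `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def low_on_space(path: str, min_free_gib: float) -> bool:
    """True when the drive holding `path` has less than `min_free_gib` free."""
    return free_space_gib(path) < min_free_gib
```

For example, `low_on_space(r"C:\Windows\Temp", 10)` would flag the system drive when less than 10 GiB is free; pick a threshold that suits your environment.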

 

4- Reboot the NCM Server (recommended: a pending reboot after a Windows update can cause issues with NCM jobs)

5- Disable Config Archive

Settings > All Settings > Config Settings (disable Config Archive)

 

6- Increase CLI Timeout

Go to Settings > All Settings > CLI Settings, clear the Session Trace check box, and increase the timeout values slightly.

 

7- Reduce the number of configs retained in NCM

Taking extra load off the NCM database also helps NCM jobs run faster.

Configs > Jobs > edit the Default Purge Configs Job
Follow the wizard and, on Add Job Specific Details, select Delete all configs except for the last 10 days.

 

 

Most common issues in the NCM Jobs area

 

A scheduled job is not running or is stuck at 99% or 100%. What should I try after the steps above?

Turn on NCM job logs first.

If they are already enabled, back up the folder and delete all the old files from it.

Settings > NCM Settings > Advanced Settings > (Enable Scheduled Jobs)

Create a new NCM test job, add only one node, and run the job manually. Does it fail?

Now schedule the same job to run 10 minutes later and check it.

What result do you get? A failure?

 

If it succeeds, add 10 more nodes to the same job and run it again, and so on up to 50 nodes, then test with up to 100 nodes in the same job.

 

On the Orion server, go to the following location and check the log files.

C:\ProgramData\Application Data\SolarWinds\Logs\Orion\NCM\Logging

Check the log files for any errors.
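
With many log files in the folder, a small script can narrow down where to look. The sketch below is an assumption-laden illustration: it assumes the logs are plain text files with a `.log` extension and that failures contain the word "error", neither of which is confirmed by this guide. The `find_error_lines` helper is a hypothetical name.

```python
from pathlib import Path


def find_error_lines(log_dir: Path, keyword: str = "error") -> list:
    """Return (file name, line number, text) for lines containing `keyword`.

    The match is case-insensitive. Only *.log files directly inside
    `log_dir` are scanned (the extension is an assumption).
    """
    hits = []
    if not log_dir.is_dir():
        return hits
    for log_file in sorted(log_dir.glob("*.log")):
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if keyword.lower() in line.lower():
                    hits.append((log_file.name, number, line.rstrip()))
    return hits
```

Point `log_dir` at the logging folder listed above and print the returned tuples to see which file and line to open first.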

 

Open Support Ticket

Tips and Tricks on opening a Support Ticket with SolarWinds

 

I have a few nodes failing to download config files with a (Connection Refused) or (Connection Timeout) error. What should I check?

Pick a single node and work through it to make sure the node in question actually has no connectivity issues.

Checking NCM Profile
In the Orion Web Console, go to the affected node > Edit Node > NCM Properties, check the Connection Profile, and click "Test". Is it successful?

Checking NCM Profile with SSH Auto

Also, in the Orion Web Console, go to the affected node > Edit Node > NCM Properties and check the Connection Profile.

Select SSHAuto.

Click "Test". Is it successful?

 

If you have a connection failure

Please try the "ConnectionTester.exe" tool from the NCM server and see whether it fails as well.

C:\Program Files (x86)\SolarWinds\Orion\NCM\Tools

ConnectionTester.exe
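
As a rough cross-check alongside ConnectionTester.exe (not a replacement for it), a plain TCP probe run from the NCM server can show whether the device port is refusing connections or simply not answering. This is a minimal sketch; `probe_port` is a hypothetical helper, and it only tests that the port accepts a TCP connection, not that a login would succeed.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing is listening (or a firewall rejected it).
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer within `timeout` seconds, often a filtered port or routing issue.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or unreachable network.
        return "error: {}".format(exc)
```

Calling `probe_port(node_ip, 22)` checks SSH and `probe_port(node_ip, 23)` checks Telnet, where `node_ip` is the address of the affected node.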

 

If you can connect to the node without any issue but NCM still fails to download the configuration

Or

You can connect to the device using PuTTY/SSH or the NCM Connection Tester, but the NCM Connection Profile "Test" fails

In this case, check the Session Trace for the node.

Enable Session Trace

  1. Open the Orion Web Console.
  2. Go to:
    7.6 and older: Settings > All Settings > NCM Settings > Advanced Settings
    7.7 and newer: Settings > All Settings > Product Specific Settings > CLI Settings
  3. Select the Enable Session Tracing check box.
  4. Go to the trace log location:
    7.6 and older: C:\ProgramData\SolarWinds\Logs\Orion\NCM\Session-Trace
    7.7 and newer: C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
  5. Delete the trace files.

 

 

Now run the tests below and check the Session Trace log file for the error (the error is listed at the bottom of the log file).

Checking NCM Profile
In the Orion Web Console, go to the affected node > Edit Node > NCM Properties, check the Connection Profile, and click "Test".

Checking NCM Profile with SSH Auto

Also, in the Orion Web Console, go to the affected node > Edit Node > NCM Properties and check the Connection Profile.

Select SSHAuto.

Click "Test".

 

If you cannot work out from the error why it is failing, you can either search for or post the issue on THWACK, or open a support ticket and provide us with the Session Trace file.

Please do not forget to ZIP the Session Trace file.
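
If you prefer to script the zipping, the following sketch bundles every file in a trace folder into a single archive using the Python standard library. The `zip_trace_files` helper and the archive name you choose are hypothetical; the trace folder is whichever Session-Trace location applies to your NCM version.

```python
import zipfile
from pathlib import Path


def zip_trace_files(folder: Path, archive: Path) -> list:
    """Write every file directly inside `folder` into `archive`.

    Returns the names added. The archive itself is skipped if it
    happens to live in the same folder.
    """
    names = []
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for path in sorted(folder.iterdir()):
            if path.is_file() and path.resolve() != archive.resolve():
                bundle.write(path, arcname=path.name)
                names.append(path.name)
    return names
```

Attach the resulting archive to the support ticket or THWACK post instead of the raw trace files.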

Open Support Ticket

Tips and Tricks on opening a Support Ticket with SolarWinds

 

 

 

Nightly Network Inventory failed with some errors

 

 

Job Engine: NCM

1-6-2019 23:00:09 : Started Nightly Network Inventory : JobDescription_10024d4c-9963-41d3-9f9a-fe271ab81f19

 

Please disable this job, create a brand-new job, and add only one node (for example, a Cisco node).

Deselect any Inventory Settings options that are not applicable.

Run this new inventory job now and check whether you still get the same error.

 

Add more nodes to the same job and keep checking the logs.

 

If you still have the same issue after following all the steps above, please

Open Support Ticket

Tips and Tricks on opening a Support Ticket with SolarWinds

 

 

 

NCM Logs and data locations

 

Where can I see NCM job activity in detail?

 

You can find the logs in the following location, where you can check and track NCM job activity and any errors.

C:\ProgramData\SolarWinds\Logs\Orion\NCM

NcmBusinessLayerPlugin

NCM.Collector.Jobs

 

 

Default location for CLI and Session Trace Logs

C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace

C:\ProgramData\SolarWinds\Logs\Orion\CLI

 

Default Location for NCM ASA Polling

C:\ProgramData\SolarWinds\Logs\Orion\ASA

 

Default location for NCM vulnerability data

C:\ProgramData\SolarWinds\NCM\Vuln   

 

Default location for config archive

C:\ProgramData\SolarWinds\NCM\Config-Archive

 

 

I will add more details and case studies to this guide. Please feel free to send me your feedback and I will include it.

