Event Management is an application built on the ServiceNow platform and part of the IT Operations Management (ITOM) suite. It collects events from a variety of monitoring systems in one place, then filters and groups them to produce actionable alerts and incidents. This gives organizations a consolidated, service-oriented view of IT operations in ServiceNow.
Event Management integrates with many monitoring tools that organizations often already have in place, such as:
- HP Operations Manager,
- HP OMi,
- IBM Netcool/OMNIbus,
- Microsoft SCOM (also metric collection in Operational Metrics),
- Nagios XI,
- OP5 (new),
- Opsview (new),
- PRTG (new),
- SolarWinds NPM and SAM,
- VMware Hyperic,
- VMware vRealize Operations,
- Amazon Web Services,
- BMC TrueSight,
- Oracle Enterprise Manager,
- REST (ideal for custom integrations),
- Web Services,
- SNMP Traps (very useful for custom integrations),
- Email (integration method of last resort),
- Datadog (new).
How Event Management works
Event Management uses a MID Server to collect events from the infrastructure through connectors to third-party monitoring tools. An event is a notification from a configuration item (CI) or a cloud resource that the IT team should be aware of. Events are categorized by severity:
- Critical: Immediate action is required. The resource is either not functional or critical problems are imminent.
- Major: Major functionality is severely impaired or performance has degraded.
- Minor: Partial, non-critical loss of functionality or performance degradation occurred.
- Warning: Attention is required, even though the resource is still functional.
- Info: An alert is created. The resource is still functional.
- Clear: No action is required. An alert is not created from this event. Existing alerts are closed.
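The severity levels above can be modeled as a small enumeration. This is a minimal sketch for illustration only: the numeric codes and the helper names (`creates_alert`, `closes_existing_alerts`, `most_severe`) are assumptions, not values or functions taken from ServiceNow, so check the codes your instance actually uses.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Event severities from the list above.

    The numeric codes are an assumption made for this sketch.
    """
    CLEAR = 0
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4
    INFO = 5


def creates_alert(severity: Severity) -> bool:
    """Every severity except Clear results in an alert."""
    return severity is not Severity.CLEAR


def closes_existing_alerts(severity: Severity) -> bool:
    """A Clear event closes alerts that are already open."""
    return severity is Severity.CLEAR


def most_severe(severities: list[Severity]) -> Severity:
    """Pick the most urgent severity, treating Clear as the least urgent."""
    ranked = [s for s in severities if s is not Severity.CLEAR]
    # With the codes assumed above, a lower number means a more urgent event.
    return min(ranked) if ranked else Severity.CLEAR
```

For example, `most_severe([Severity.WARNING, Severity.CLEAR, Severity.MAJOR])` returns `Severity.MAJOR`.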
Events can be collected in two ways:
The MID Server can use connectors to pull events from third-party monitoring tools, or the monitoring tools can push events to the instance. When events are pushed, a listener must be configured either on the MID Server or directly on the instance as a web service. Monitoring tools can also send events to the instance by email.
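To make the push model more concrete, here is a rough sketch of how a monitoring tool might prepare an event before sending it to a listener. The field names, the `records` wrapper, and the function `build_event_payload` are assumptions chosen for illustration; the URL, authentication, and the HTTP call itself are deliberately left out because they depend on how the listener is configured.

```python
import json
from datetime import datetime, timezone


def build_event_payload(source, node, resource, severity, description, when=None):
    """Serialize one event as a JSON document ready to be pushed.

    All field names here are illustrative assumptions, not a confirmed schema.
    """
    when = when or datetime.now(timezone.utc)
    event = {
        "source": source,            # monitoring tool that raised the event
        "node": node,                # host or device the event refers to
        "resource": resource,        # component on the node, e.g. a disk
        "severity": str(severity),   # severity code, sent as a string
        "description": description,
        "time_of_event": when.strftime("%Y-%m-%d %H:%M:%S"),
    }
    # Wrap the event in a list so several events could be sent in one request.
    return json.dumps({"records": [event]})


payload = build_event_payload(
    source="Nagios XI",
    node="web-01",
    resource="disk /var",
    severity=2,
    description="Disk usage above 90%",
    when=datetime(2024, 1, 15, 8, 30, 0, tzinfo=timezone.utc),
)
```

The resulting string would then be sent with an HTTP POST to whichever listener the instance or MID Server exposes.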
Event Management Monitoring
In addition to events, metrics are also collected. A metric is a measure of an operating characteristic of a device over a period of time, such as CPU or memory usage. The platform collects events and metrics to help the IT team monitor health, prevent outages, and resolve issues quickly with minimal impact on operations.
Collected events can trigger an alert to notify the IT team about an issue. This is done by defining event rules that map event fields to alert fields and bind a specific CI to the alert. Event Management uses Operational Intelligence to filter out certain alerts by applying thresholds, group alerts, and identify the primary alert.
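Conceptually, an event rule is a mapping from event fields to alert fields plus a way to find the affected CI. The sketch below illustrates that idea in plain Python; the `EventRule` class, its attributes, and the sample field names are hypothetical and do not reflect how rules are stored or evaluated on a real instance.

```python
from dataclasses import dataclass, field


@dataclass
class EventRule:
    """Hypothetical event rule: field mapping plus CI binding."""
    source: str                                      # rule applies to events from this source
    field_map: dict                                  # event field name -> alert field name
    ci_by_node: dict = field(default_factory=dict)   # node name -> CI name

    def matches(self, event: dict) -> bool:
        """A rule applies only to events from its configured source."""
        return event.get("source") == self.source

    def to_alert(self, event: dict) -> dict:
        """Copy mapped fields into a new alert and bind the CI, if known."""
        alert = {
            alert_field: event[event_field]
            for event_field, alert_field in self.field_map.items()
            if event_field in event
        }
        # The CI stays None when the node is not known to the rule.
        alert["ci"] = self.ci_by_node.get(event.get("node"))
        return alert


rule = EventRule(
    source="Nagios XI",
    field_map={"description": "short_description", "severity": "severity"},
    ci_by_node={"web-01": "Web Server 01"},
)
event = {"source": "Nagios XI", "node": "web-01",
         "severity": "2", "description": "Disk usage above 90%"}
alert = rule.to_alert(event) if rule.matches(event) else None
```

Here `alert` carries the mapped description and severity and is bound to the CI named `Web Server 01`.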
Finally, Event Management uses alert rules to initiate remediation actions such as creating an incident, launching a remediation workflow, or recommending a knowledge article. The application calculates the impact on CIs, while Operational Intelligence normalizes metrics and identifies anomalies, which are sent to the instance for further processing.
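The way alert rules select a remediation action can be pictured as a list of conditions checked against each alert. The following is only a sketch of that idea: the rule structure, the action names, the function `choose_remediation`, and the first-match-wins ordering are all assumptions made for illustration.

```python
def choose_remediation(alert: dict, rules: list):
    """Return the action of the first rule whose conditions all match the alert."""
    for rule in rules:
        if all(alert.get(key) == value for key, value in rule["when"].items()):
            return rule["action"]
    return None  # no rule matched, so no automatic remediation


# Hypothetical rules, checked in order from most to least specific.
ALERT_RULES = [
    {"when": {"severity": "critical", "type": "disk"}, "action": "launch_remediation_workflow"},
    {"when": {"severity": "critical"}, "action": "create_incident"},
    {"when": {"severity": "warning"}, "action": "recommend_knowledge_article"},
]

action = choose_remediation({"severity": "critical", "type": "cpu"}, ALERT_RULES)
```

In this example the alert does not satisfy the disk-specific rule, so `action` is `"create_incident"`.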
Event Management Dashboard and how it’s used
The Event Management dashboard is the main place to monitor events. It can display business services or alert groups:
- The size of a tile represents the priority of a service or group,
- The color indicates the highest-severity alert in the service or group,
- Alerts related to a particular service or group are visible at the bottom of the dashboard.
Double-clicking a service on the dashboard opens its topology map, which shows the impact tree and relationships, with related alerts listed below. The impact history scroll bar lets you move back in time to view historical data.
The Event Management overview is another way to see the health status of the infrastructure.
Anomaly Events & Operational Intelligence
Operational Intelligence is also used for collecting performance metrics from the system. It uses event rules to bind CIs to the metrics. Metric data is normalized by applying a statistical model, which establishes the normal range of values for each metric.
All metric values are checked against the model, and values outside the range are identified as anomalies and sent to the instance as anomaly events. On the instance, event rules create anomaly alerts from the anomaly events.
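The idea of a normal range can be illustrated with a deliberately simple stand-in model: the mean of past values plus or minus a few standard deviations. This is not the statistical model Operational Intelligence uses; the functions `normal_range` and `find_anomalies` and the multiplier `k` are assumptions chosen to show the principle.

```python
from statistics import mean, stdev


def normal_range(history, k=3.0):
    """Estimate the normal range as mean +/- k standard deviations.

    Requires at least two historical values.
    """
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def find_anomalies(history, new_values, k=3.0):
    """Return the new values that fall outside the normal range."""
    low, high = normal_range(history, k)
    return [value for value in new_values if value < low or value > high]


# CPU usage in percent; the sample numbers are made up for this example.
cpu_history = [40, 42, 41, 39, 43, 40, 41, 42]
anomalies = find_anomalies(cpu_history, [41, 44, 95, 12])
```

With this history the normal range is roughly 37 to 45 percent, so `anomalies` is `[95, 12]`. Each of those values would correspond to an anomaly event sent to the instance.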
An anomaly alert can trigger the creation of an incident, launch a remediation workflow, or recommend a knowledge article.
Anomalies can be reviewed in the Anomaly Map module, which gives an overview of anomalies for individual CIs. The chart shows the anomaly history, and the color of each tile represents the highest anomaly score for the selected CI at that time.
Metric Explorer is a very useful module to review the metrics and, what’s more, to build. It lets you view metrics on a single pane, correlate multiple metrics, or see anomaly scores and anomaly bounds on the chart.
Benefits of Event Management & Operational Intelligence
Event Management together with Operational Intelligence helps shared services organizations centralize IT operations by delivering a single pane of glass for business service health, and solve problems much faster. The benefits are:
- Reduction of MTTR (Mean Time To Repair),
- Identification of root cause,
- Increase in the value of existing tools,
- Improvement in service availability.
If you need any support in the area of IT Operations Management in your organization, we’re here to help!