Data Ingestion

Overview

A healthy Configuration Management Database (CMDB) depends on accurate, automated, and consistent data ingestion. ServiceNow provides multiple methods for populating the CMDB, ranging from native discovery tools to modern ETL frameworks and legacy import processes.

This guide outlines the primary ingestion methods and highlights best practices for maintaining data integrity using the Identification and Reconciliation Engine (IRE).

CMDB Ingestion Methods

1. ServiceNow Discovery

ServiceNow Discovery is an agentless, IP-based solution that automatically identifies infrastructure components within a network. This type of discovery is also referred to as horizontal discovery because it focuses on discovering devices and their relationships across the infrastructure layer.

Capabilities

Identifies devices such as routers, switches, and servers
Discovers relationships and dependencies between configuration items (CIs)

Discovery Phases

Scan
Identifies active IP addresses via network probing.
Classification
Determines the device type via patterns (e.g., Windows, Linux, network device).
Identification
Checks whether the CI already exists in the CMDB. If it does, the record is updated; if not, a new CI is created.
Exploration
Collects detailed attributes and relationship data by running more targeted discovery patterns for the device type, such as WMI for Windows or SNMP for network devices.
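
The four phases can be read as a pipeline. The sketch below is a plain-JavaScript simulation, not ServiceNow code: the `network` and `cmdb` objects, the class names, and every function name are hypothetical stand-ins chosen for illustration, and the matching rule (serial number only) is a simplifying assumption.

```javascript
// Simulated network: IP address -> facts a probe might return (invented data).
const network = {
  "10.0.0.5": { os: "Linux", serial: "SN-100", name: "web01", ramGb: 16 },
  "10.0.0.9": { os: "Windows", serial: "SN-200", name: "app01", ramGb: 32 },
};

// CMDB contents before the run: one server is already known.
const cmdb = [{ serial: "SN-100", name: "web01", className: "linux_server" }];

// Phase 1, Scan: keep only the addresses that respond.
function scan(ipRange) {
  return ipRange.filter((ip) => ip in network);
}

// Phase 2, Classification: decide the device type.
function classify(ip) {
  return network[ip].os === "Windows" ? "windows_server" : "linux_server";
}

// Phase 3, Identification: reuse an existing CI or create a new one.
function identify(ip, className) {
  const facts = network[ip];
  let ci = cmdb.find((c) => c.serial === facts.serial);
  const created = !ci;
  if (created) {
    ci = { serial: facts.serial, name: facts.name, className };
    cmdb.push(ci);
  }
  return { ci, created };
}

// Phase 4, Exploration: collect detailed attributes onto the CI.
function explore(ip, ci) {
  ci.ramGb = network[ip].ramGb;
  return ci;
}

function runDiscovery(ipRange) {
  return scan(ipRange).map((ip) => {
    const { ci, created } = identify(ip, classify(ip));
    explore(ip, ci);
    return { ip, action: created ? "inserted" : "updated" };
  });
}

const results = runDiscovery(["10.0.0.5", "10.0.0.7", "10.0.0.9"]);
console.log(results);
```

In this run the unused address 10.0.0.7 is dropped at the Scan phase, the known server is updated, and the unknown one is inserted.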

2. Service Mapping

Service Mapping provides a top-down view of business services, mapping how infrastructure components support specific applications or services.

Builds service maps dynamically
Updates relationships as the environment changes
Improves visibility into service dependencies and impact analysis

3. Agent Client Collector (ACC)

The Agent Client Collector (ACC) uses an installed agent to provide detailed, real-time visibility into endpoint systems.

Key Benefits

Collects software usage data for Software Asset Management (SAM)
Captures detailed hardware metrics (CPU, RAM, disk)
Provides configuration data such as OS patches and network settings

4. Service Graph Connectors

Service Graph Connectors are pre-built integrations that ingest data from external systems such as Azure, SCCM/MECM, AWS, and Tanium.

Standardized ingestion aligned with CMDB data models and CSDM
Maintains consistency across multiple data sources
Reduces the need for custom integration logic

IntegrationHub ETL (IH-ETL)

IntegrationHub ETL (IH-ETL) is a modern framework for importing and transforming external data into ServiceNow, with a strong focus on CMDB ingestion.

Key Features

Data Transformation
Normalizes incoming data to align with the ServiceNow data model
Automation
Reduces manual intervention and ensures consistent ingestion
IRE Integration
Natively leverages the Identification and Reconciliation Engine to prevent duplicate records
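
As a rough illustration of the kind of normalization meant here, the sketch below cleans a raw record in plain JavaScript. It does not use IH-ETL itself: the function `normalizeRecord`, the input keys, and the cleanup rules are assumptions chosen for the example.

```javascript
// Hypothetical cleanup rules applied before a record reaches the CMDB.
function normalizeRecord(raw) {
  // Serial numbers: trim whitespace and compare case-insensitively.
  const serial = (raw.serial || "").trim().toUpperCase();
  // Hostnames: lowercase and drop the domain suffix.
  const host = (raw.hostname || "").trim().toLowerCase().split(".")[0];
  return {
    serial_number: serial || null, // null marks a value that was not supplied
    name: host || null,
  };
}

console.log(normalizeRecord({ serial: " sn-100 ", hostname: "WEB01.corp.example.com" }));
```

Normalizing before identification matters because two spellings of the same serial number or hostname would otherwise look like two different CIs.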

Import Sets & Transform Maps

Import Sets are the legacy method for importing external data into ServiceNow.

Standard Import Process

Data Staging
Data is loaded into an Import Set table from sources such as CSV, Excel, or APIs
Field Mapping
Fields from the import table are mapped to target table fields
Transform Execution
Data is processed and inserted into the target table using mappings and scripts
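
The three steps above can be sketched in plain JavaScript. This is a simulation of the idea rather than platform code: the staging rows, the `fieldMap` and `fieldScripts` objects, and the `transformRow` function are all hypothetical.

```javascript
// Step 1, Data Staging: rows as they might sit in an Import Set table (invented data).
const stagingRows = [
  { u_hostname: "web01", u_serial: "SN-100", u_ram_mb: "16384" },
  { u_hostname: "app01", u_serial: "SN-200", u_ram_mb: "32768" },
];

// Step 2, Field Mapping: staging column -> target field.
const fieldMap = {
  u_hostname: "name",
  u_serial: "serial_number",
  u_ram_mb: "ram",
};

// Optional per-field conversions, standing in for transform scripts.
const fieldScripts = {
  ram: (value) => Number(value),
};

// Step 3, Transform Execution: apply the map and scripts to one staged row.
function transformRow(row) {
  const target = {};
  for (const [sourceField, targetField] of Object.entries(fieldMap)) {
    const convert = fieldScripts[targetField] || ((v) => v);
    target[targetField] = convert(row[sourceField]);
  }
  return target;
}

const targetTable = stagingRows.map(transformRow);
console.log(targetTable);
```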

IH-ETL vs. Legacy Transform Maps

Feature	IntegrationHub ETL	Legacy Transform Maps
Ease of Use	Guided and user-friendly	Requires manual mapping and scripting
Data Integrity	Native IRE support	Uses coalesce fields by default
Targeting	Optimized for CMDB/CSDM	General-purpose ingestion
Logic Handling	Built-in transformation functions	Relies on transform scripts

Using the IRE with Transform Maps

By default, Transform Maps rely on coalesce fields to determine uniqueness. This approach can lead to duplicate records when unique identifiers are inconsistent or distributed across multiple fields.

Problem Scenario

In many real-world datasets:

Some records may contain a serial number
Others may only contain a hostname
Not all records consistently populate both fields

Using coalesce in this scenario is problematic because:

Coalesce expects a single consistent unique identifier
Missing values prevent proper matching
This can result in duplicate CI creation
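
The failure mode can be reproduced with a simplified model of single-field coalescing. The `importWithCoalesce` function and the sample rows are hypothetical, and the rule that an empty coalesce value never matches is an assumption of this sketch.

```javascript
// Simplified coalesce: match on one field, update on a hit, insert otherwise.
function importWithCoalesce(table, rows, coalesceField) {
  for (const row of rows) {
    const key = row[coalesceField];
    // An empty coalesce value cannot match anything, so the row falls through to insert.
    const match = key ? table.find((rec) => rec[coalesceField] === key) : undefined;
    if (match) {
      Object.assign(match, row);
    } else {
      table.push({ ...row });
    }
  }
  return table;
}

const rows = [
  { serial_number: "SN-100", name: "web01" }, // first feed: serial number present
  { serial_number: "", name: "web01" },       // second feed: same server, hostname only
];

const table = importWithCoalesce([], rows, "serial_number");
console.log(table.length); // 2: the same server now exists twice
```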

Solution: Use the IRE

The Identification and Reconciliation Engine (IRE) uses CI Identifiers, which follow a prioritized matching approach:

Attempt to match using the highest-priority identifier (e.g., serial number)
If unavailable, fall back to secondary identifiers (e.g., name)

This ensures accurate CI matching even when data is incomplete or inconsistent.
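
A minimal sketch of prioritized matching follows, written in plain JavaScript rather than against the IRE itself. The `identifierEntries` rule set, the priority numbers, and both function names are hypothetical; real CI Identifiers are configured per class in the platform.

```javascript
// Identifier entries; in this sketch the lower priority number is tried first.
const identifierEntries = [
  { priority: 100, attributes: ["serial_number"] },
  { priority: 200, attributes: ["name"] },
];

// Return the existing CI matched by the first usable identifier entry, or null.
function findMatch(cmdb, incoming) {
  const ordered = [...identifierEntries].sort((a, b) => a.priority - b.priority);
  for (const entry of ordered) {
    // Skip an entry when the incoming record lacks any of its attributes.
    if (!entry.attributes.every((attr) => incoming[attr])) continue;
    const match = cmdb.find((ci) =>
      entry.attributes.every((attr) => ci[attr] === incoming[attr])
    );
    if (match) return match;
  }
  return null;
}

function identifyAndMerge(cmdb, incoming) {
  const match = findMatch(cmdb, incoming);
  if (!match) {
    cmdb.push({ ...incoming });
    return "inserted";
  }
  // Copy only populated values so blanks do not erase existing data.
  for (const [field, value] of Object.entries(incoming)) {
    if (value) match[field] = value;
  }
  return "updated";
}

const cmdb = [];
const actions = [
  { serial_number: "SN-100", name: "web01" }, // both identifiers present
  { serial_number: "", name: "web01" },       // hostname only
  { serial_number: "SN-100", name: "" },      // serial number only
].map((row) => identifyAndMerge(cmdb, row));

console.log(actions, cmdb.length);
```

All three rows resolve to a single CI here: the second row falls back to the name entry, and the third matches on serial number.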

Implementing IRE in a Transform Map

To use the IRE with a Transform Map, an onBefore script must be added to call the CMDBTransformUtil API.

This removes the need for coalesce logic and ensures that every record is processed through the IRE.

OnBefore Transform Script

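(function runTransformScript(source, map, log, target) {
    // Send the current row through Identification and Reconciliation
    var cmdbUtil = new CMDBTransformUtil();
    cmdbUtil.identifyAndReconcile(source, map, log);

    // Skip the standard transform insert/update; the IRE has already handled this row
    ignore = true;

    // Log IRE failures so they are visible in the import log
    if (cmdbUtil.hasError()) {
        log.error(cmdbUtil.getError());
    }
})(source, map, log, target);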

Undiscoverable Data

Undiscoverable data is a critical component in a CMDB. This type of data is considered undiscoverable because it cannot be collected through automated discovery tools or standard ingestion methods. Common examples include organizational and contextual information such as groups and locations. In some cases, data from Service Graph Connectors may provide portions of this information, depending on the external data source being integrated. However, this is not always guaranteed, and manual input or governance is often required.

Purpose and Importance

Undiscoverable data plays a key role in adding business context to Configuration Items (CIs):

Groups are essential for operational workflows, particularly for task assignment and work routing.
Locations provide physical context for assets, enabling use cases such as cost allocation, asset tracking, and geographically-based support.

In some cases, default values for this data can be applied at the class level. For example, in CI Class Manager, the Managed By Group for the Computer class can be set to the LAN group. All CIs in that class then inherit the specified group automatically, which improves consistency and reduces manual effort.
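
The effect of a class-level default can be pictured with a small plain-JavaScript sketch. The `classDefaults` object, the property names, and the rule that an explicitly set group is left untouched are all assumptions made for illustration, not platform behavior taken from the documentation.

```javascript
// Hypothetical class-level defaults, keyed by CI class name.
const classDefaults = {
  computer: { managedByGroup: "LAN" },
};

// Fill in class defaults only for fields the CI has not populated.
function applyClassDefaults(ci) {
  const result = { ...ci };
  const defaults = classDefaults[ci.className] || {};
  for (const [field, value] of Object.entries(defaults)) {
    if (!result[field]) result[field] = value;
  }
  return result;
}

console.log(applyClassDefaults({ name: "pc-17", className: "computer" }));
```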

Group Field Usage

Managed By Group: Used to streamline the assignment of Data Manager tasks, including attestation, lifecycle management, and certification policies.
Support Group: Maps to the Incident Assignment Group, enabling routing of incidents to the appropriate support teams.
Change Group: Maps to the Change Assignment Group, ensuring changes are directed to the correct group for review and implementation.
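
The mapping between group fields and work routing can be sketched as a simple lookup. The CI record, its property names, the task type keys, and the `assignmentGroupFor` function are hypothetical and are not actual column or table names.

```javascript
// Hypothetical CI record with the three group fields populated.
const ci = {
  name: "web01",
  managedByGroup: "Linux Platform",
  supportGroup: "Linux Support",
  changeGroup: "Linux Change Approvers",
};

// Which CI group field supplies the assignment group for each kind of work.
const groupFieldByTaskType = {
  data_manager_task: "managedByGroup",
  incident: "supportGroup",
  change_request: "changeGroup",
};

function assignmentGroupFor(taskType, ci) {
  const field = groupFieldByTaskType[taskType];
  if (!field) throw new Error("Unknown task type: " + taskType);
  // null signals that the group is missing and the task needs manual triage.
  return ci[field] || null;
}

console.log(assignmentGroupFor("incident", ci));
```

A CI with an empty group field returns null in this sketch, which is the situation that governance of undiscoverable data is meant to prevent.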

Summary

ServiceNow offers a suite of tools for CMDB data ingestion, each designed to meet specific architectural and business needs:

Automated Visibility: Discovery and Service Mapping provide agentless infrastructure visibility and top-down dependency mapping for business services, helping to maintain an accurate and dynamic CMDB.
Endpoint Insights: The Agent Client Collector (ACC) delivers detailed, real-time data from endpoint systems, enhancing the granularity of CMDB records and supporting software asset management. It can also run commands on endpoints for remediation or data collection.
Modern Integrations: Service Graph Connectors and IntegrationHub ETL (IH-ETL) offer structured, scalable frameworks for synchronizing data from third-party platforms such as Azure, AWS, and SCCM. They support CSDM compliance and remove the need to build custom integrations.
Legacy & Custom Workflows: Import Sets and Transform Maps remain available for handling flat files or unique custom ingestion requirements, though they require manual configuration to maintain data integrity.

Regardless of the chosen method, leveraging the Identification and Reconciliation Engine (IRE) is critical to maintaining a “single source of truth” by preventing duplicate configuration items and managing data precedence. No single tool fits every scenario; the optimal strategy depends on the frequency of updates, the complexity of the data source, and the specific governance requirements of your organization.

Last updated on April 1, 2026
