Graph Studio

Graph Studio is the platform’s identity graph building tool. It provides two builders — Edge Builder and Graph Builder — that work together to transform your raw data into a connected identity graph.

How it works

Building an identity graph is a two-step process:

1. Define edges: Tell the platform which identifiers in your data should be used to connect records. Two records that share the same identifier value (like the same email address) are linked together.
2. Build the graph: The platform runs a connected components algorithm over your edges to discover which records belong to the same person or household, resolving transitive connections across multiple hops.

Edges

An edge connects a source record to a target identifier. The Edge Builder creates an edges dataset by combining your data sources with the identity attributes you choose as connection points.

Key concepts

| Term | Description | Example |
| --- | --- | --- |
| Source ID type | A label you choose to identify which system a record comes from | `CRM`, `Website`, `Partner` |
| Source ID | The field that uniquely identifies each record in that system | `CUSTOMER_ID` |
| Target ID type | The category of identifier used as a connection point (always a Rosetta Stone attribute) | `normalized_email`, `clear_text_e164_phone_number` |
| Target ID | The actual identifier value for a given record, derived from the attribute mapping | `[email protected]`, `+15705551234` |

Each source record produces one edge per target ID group. When two records from any source share the same target ID value, the graph recognizes them as connected.
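To make the four terms concrete, here is a minimal sketch of how edge rows could be derived from mapped source records. The `Edge` class, the `build_edges` helper, and the record layout are hypothetical and chosen only for illustration; they are not the Edge Builder's actual schema or API. The sketch assumes single-attribute target ID groups and assumes that a record with no value for an attribute produces no edge for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source_id_type: str  # label for the originating system, e.g. "CRM"
    source_id: str       # unique record key within that system
    target_id_type: str  # Rosetta Stone attribute used as the connection point
    target_id: str       # identifier value carried by this record


def build_edges(records, source_id_type, source_id_field, target_attributes):
    """Emit one edge per (record, target attribute) pair that has a value."""
    edges = []
    for record in records:
        for attribute in target_attributes:
            value = record.get(attribute)
            if value:  # assumption: missing identifiers produce no edge
                edges.append(
                    Edge(source_id_type, str(record[source_id_field]), attribute, value)
                )
    return edges


# Hypothetical CRM rows, already mapped to Rosetta Stone attribute names.
crm_rows = [
    {"CUSTOMER_ID": "c-1", "normalized_email": "a@example.com",
     "clear_text_e164_phone_number": "+15705551234"},
    {"CUSTOMER_ID": "c-2", "normalized_email": "b@example.com"},
]
edges = build_edges(
    crm_rows, "CRM", "CUSTOMER_ID",
    ["normalized_email", "clear_text_e164_phone_number"],
)
```

Any record from another source that yields an edge with the same `target_id_type` and `target_id` would share a connection point with the CRM record.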

Target ID groups

Target IDs are organized into groups. Each group defines one type of connection. A group can contain a single attribute or multiple attributes — when a group has multiple attributes, all values in the group must match for two records to be connected. For example:

- A group with just normalized email connects any two records sharing the same email. This is high confidence, since email is typically unique to a person.
- A group with phone number + first name requires both values to match, which is more precise than phone alone. This is useful when a phone number might be shared across a household.

You can define multiple target ID groups to give the graph different ways to find connections. The algorithm considers all groups when resolving identities.
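The matching rule for groups can be sketched as a composite key per group: two records connect if at least one group matches on all of its attributes. The helper names and the `first_name` attribute name below are assumptions for illustration, and so is the choice to treat a missing value as "no key" for that group.

```python
def group_key(record, group):
    """Return the composite key for one target ID group, or None if incomplete."""
    values = tuple(record.get(attribute) for attribute in group)
    if any(value is None or value == "" for value in values):
        return None  # assumption: a group with a missing value cannot match
    return values


def connected_by_any_group(a, b, groups):
    """Two records connect when some group matches on every attribute in it."""
    for group in groups:
        key_a, key_b = group_key(a, group), group_key(b, group)
        if key_a is not None and key_a == key_b:
            return True
    return False


groups = [
    ("normalized_email",),                           # email alone
    ("clear_text_e164_phone_number", "first_name"),  # phone AND first name
]

alice_crm = {"normalized_email": "alice@example.com",
             "clear_text_e164_phone_number": "+15705551234", "first_name": "alice"}
alice_web = {"normalized_email": None,
             "clear_text_e164_phone_number": "+15705551234", "first_name": "alice"}
bob_web = {"normalized_email": None,
           "clear_text_e164_phone_number": "+15705551234", "first_name": "bob"}

same_person = connected_by_any_group(alice_crm, alice_web, groups)
shared_phone_only = connected_by_any_group(alice_crm, bob_web, groups)
```

Here the two "alice" records connect through the phone + first name group, while the shared household phone alone is not enough to connect "alice" and "bob".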

Data sources

The Edge Builder accepts two types of sources:

- Datasets (first-party): Your own data, mapped to Rosetta Stone attributes
- Access rules (third-party): Data shared with you by other companies. Third-party sources introduce connections that your first-party data cannot see on its own.

Graph

The Graph Builder takes one or more edges datasets and runs a Label Connected Components algorithm. It follows connections between records — including transitive chains — and groups every connected record into a single identity.
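The idea of following transitive chains can be illustrated with a small union-find sketch. This is a stand-in technique for illustration, not necessarily how the platform's Label Connected Components task is implemented; the `label_components` function and the tuple-shaped node names are hypothetical.

```python
def label_components(edges):
    """edges: iterable of (source_node, target_node). Returns {node: label}."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return
        # Keep the smaller root so the final label is deterministic.
        if root_b < root_a:
            root_a, root_b = root_b, root_a
        parent[root_b] = root_a

    for source, target in edges:
        union(source, target)
    return {node: find(node) for node in list(parent)}


# Records and identifier values are both nodes; an edge links a record to a value.
example_edges = [
    (("CRM", "c-1"), ("normalized_email", "a@example.com")),
    (("Website", "w-9"), ("normalized_email", "a@example.com")),
    (("Website", "w-9"), ("clear_text_e164_phone_number", "+15705551234")),
    (("Partner", "p-4"), ("clear_text_e164_phone_number", "+15705551234")),
    (("CRM", "c-2"), ("normalized_email", "b@example.com")),
]
labels = label_components(example_edges)
```

The CRM record `c-1` and the partner record `p-4` share no identifier directly, yet they receive the same label because the website record `w-9` bridges them through an email and a phone number.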

Algorithm parameters

| Parameter | Default | Description |
| --- | --- | --- |
| Max Component Size | 100 | Caps how many records can merge into one identity. Prevents over-connection. |
| Max Iterations | 10 | How many passes the algorithm makes to resolve transitive chains. |
| Max Degree Threshold | 100 | Excludes nodes with too many connections (e.g., shared corporate emails) to avoid merging unrelated records. |

The defaults work well for most use cases.
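One way to build intuition for the three parameters is the toy label-propagation sketch below. It is an assumption-laden illustration, not the platform's implementation: the `resolve` function is hypothetical, it counts every node (records and identifier values) toward component size, and it simply drops oversized components, whereas the platform may handle both differently.

```python
from collections import Counter, defaultdict


def resolve(edges, max_component_size=100, max_iterations=10, max_degree_threshold=100):
    # Build adjacency, then exclude nodes with more connections than the threshold.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    kept = {node for node, adj in neighbors.items() if len(adj) <= max_degree_threshold}

    # Each pass lets a node adopt the smallest label among itself and its neighbors.
    labels = {node: node for node in kept}
    for _ in range(max_iterations):
        changed = False
        for node in sorted(kept):
            candidates = [labels[node]] + [labels[m] for m in neighbors[node] if m in kept]
            best = min(candidates)
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break  # converged before using all passes

    # Assumption: components above the size cap are dropped from the output.
    sizes = Counter(labels.values())
    return {node: label for node, label in labels.items() if sizes[label] <= max_component_size}


# Five records share one corporate mailbox; two records share a phone number.
hub_edges = [(f"rec:{i}", "email:shared@corp.example") for i in range(5)]
hub_edges += [("rec:0", "phone:+15705551234"), ("rec:9", "phone:+15705551234")]

merged = resolve(hub_edges)                             # defaults: everything merges
filtered = resolve(hub_edges, max_degree_threshold=3)   # shared mailbox is excluded
```

With the defaults, the shared mailbox pulls all records into one identity. Lowering the degree threshold excludes the mailbox node, so only the two records that share a phone number remain connected.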

Output

The graph produces a dataset where each record is assigned a component ID — all records with the same component ID belong to the same resolved identity. You can join this back to your original data for analytics, segmentation, and activation.
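Joining the output back to source data could look like the sketch below, which rolls order totals up to resolved identities. The column names (`source_id_type`, `source_id`, `component_id`) and the sample rows are assumptions for illustration; check the actual output schema of your graph dataset before relying on them.

```python
from collections import defaultdict

# Hypothetical graph output: one row per record with its component ID.
graph_output = [
    {"source_id_type": "CRM", "source_id": "c-1", "component_id": "id-001"},
    {"source_id_type": "CRM", "source_id": "c-7", "component_id": "id-001"},
    {"source_id_type": "CRM", "source_id": "c-2", "component_id": "id-002"},
]

# Hypothetical first-party table keyed by the same CUSTOMER_ID.
orders = [
    {"CUSTOMER_ID": "c-1", "total": 40.0},
    {"CUSTOMER_ID": "c-7", "total": 25.0},
    {"CUSTOMER_ID": "c-2", "total": 10.0},
    {"CUSTOMER_ID": "c-99", "total": 5.0},  # not present in the graph
]

# Lookup from CRM record key to resolved identity.
component_by_customer = {
    row["source_id"]: row["component_id"]
    for row in graph_output
    if row["source_id_type"] == "CRM"
}

# Aggregate spend per resolved identity, skipping unresolved customers.
spend_by_identity = defaultdict(float)
for order in orders:
    component_id = component_by_customer.get(order["CUSTOMER_ID"])
    if component_id is not None:
        spend_by_identity[component_id] += order["total"]
```

Customers `c-1` and `c-7` resolve to the same identity, so their orders are counted together.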

Automation

You can run the graph build once or set a refresh schedule to keep it current as source data changes. You can also optionally encode identifiers in the output using your company’s encryption material.

Source eligibility

Not every dataset or access rule appears in the Graph Studio source pickers. Each builder only lists sources that have been prepared for the step you’re on — the Edge Builder lists sources that are ready to become edges, and the Graph Builder lists the edges datasets that are ready to be resolved into a graph. If a source you expect is missing, it hasn’t met the requirement for that list. Under the hood, eligibility is driven by dataset and access-rule tags and Rosetta Stone attribute mappings:

| Builder | Source list | Requirement |
| --- | --- | --- |
| Edge Builder | First-party datasets | Dataset carries the `_nio_ci_components` tag |
| Edge Builder | Third-party access rules | Access rule carries the `_nio_ci_components` tag |
| Graph Builder | First-party input datasets | Dataset is mapped to the `graph_edge` Rosetta Stone attribute (this is what an edges dataset produced by the Edge Builder looks like) |
| Graph Builder | Third-party access rules | Access rule carries both the `_nio_ci_components` tag and a `_nio_ci_source_id_type:<name>` tag (for example `_nio_ci_source_id_type:CUSTOMER_ID`) |

The `_nio_ci_*` tags are managed on the dataset or access rule itself. First-party data becomes edge-eligible when it’s tagged `_nio_ci_components`; the Edge Builder’s output edges datasets are mapped to `graph_edge`, which is what makes them appear in the Graph Builder. If a source is missing from a picker, check that it carries the tag or mapping listed above.
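When debugging a missing source, the eligibility rules can be expressed as two small predicates. This is a sketch of the rules as documented here, not a platform API: the function names and the `kind`, `tags`, and `mapped_attributes` parameters are hypothetical, and how you obtain a source's tags and mappings is left out.

```python
COMPONENTS_TAG = "_nio_ci_components"
SOURCE_ID_TYPE_PREFIX = "_nio_ci_source_id_type:"


def edge_builder_eligible(tags):
    """Datasets and access rules both need the components tag."""
    return COMPONENTS_TAG in tags


def graph_builder_eligible(kind, tags=(), mapped_attributes=()):
    """kind is 'dataset' (first-party) or 'access_rule' (third-party)."""
    if kind == "dataset":
        return "graph_edge" in mapped_attributes
    if kind == "access_rule":
        has_source_id_type = any(
            tag.startswith(SOURCE_ID_TYPE_PREFIX) and len(tag) > len(SOURCE_ID_TYPE_PREFIX)
            for tag in tags
        )
        return COMPONENTS_TAG in tags and has_source_id_type
    raise ValueError(f"unknown source kind: {kind!r}")


partner_rule_tags = ["_nio_ci_components", "_nio_ci_source_id_type:CUSTOMER_ID"]
partner_ok = graph_builder_eligible("access_rule", tags=partner_rule_tags)
untagged_ok = graph_builder_eligible("access_rule", tags=["_nio_ci_components"])
```

An access rule with only the components tag appears in the Edge Builder but not in the Graph Builder, which matches the table above.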

Data plane support

Graph Studio runs on both Snowflake and AWS data planes. Both the Edge Builder and the Graph Builder are available in either environment. The Graph Builder uses the platform’s `LabelConnectedComponents` workflow task for step 2, which is cross-platform.

Related content

- Identity Graphs: How connected components and graph structure unify identifiers
- Building an Identity Graph: Step-by-step guide to creating your first graph
- Graph Enrichment: Strengthen graph structure with third-party linkage data
- Mapping Schemas: Map your data to Rosetta Stone attributes
