Glossary

This glossary defines key terms and concepts used throughout the Narrative I/O documentation and platform.

A

Addressability expansion

The practice of appending additional identifiers to existing identity graph records to improve match rates in downstream activation platforms, without changing the graph’s structure. Distinct from graph enrichment, which adds edges between nodes to improve resolution. Related: Addressability Expansion, Identity Graphs, Graph Enrichment

Access rules

Business rules that define how other organizations can access data within the Narrative platform. Access rules give data owners fine-grained control over who can query their data, what records and fields are accessible, and how much access costs. Every dataset requires at least one access rule before other organizations can query it. Access rules are enforced at query execution time by the control plane and data planes. They support flexible pricing (including free access for non-commercial sharing), partner-specific permissions, and field-level controls. Related: Access Rules, Security Model

API key

A long-lived authentication token for programmatic access to the Narrative API. Each API key is scoped with permissions that control which resources it can access and at what level (Read or Write). Keys support expiration dates and should be rotated regularly. Related: API Keys, API Key Permissions, Permissions Reference

Attribute lineage

The automatic tracking and preservation of Rosetta Stone attribute mappings when creating materialized views. When source datasets have existing attribute mappings, lineage ensures these mappings persist in derived views without requiring manual re-mapping. Mappings are preserved when selecting unaltered attributes, querying from existing materialized views, or selecting all fields that constitute an attribute mapping. Related: Materialized Views, How Rosetta Stone Works

Attributes

Standardized field definitions in Rosetta Stone that form the common schema for data normalization. Each attribute has an immutable numeric ID, a name, description, type, and optional validations. Attributes can be global (platform-wide standards like hl7_gender) or organization-specific (scoped to one organization and its collaborators). Also referred to as normalized attributes or Rosetta Stone attributes. Related: How Rosetta Stone Works, Creating Normalized Attributes, Attribute Types

B

Base model

A pre-trained foundation model that serves as the starting point for fine-tuning. Base models have been trained on large datasets to develop general capabilities, which can then be specialized for specific tasks or domains through fine-tuning with custom training data. In Model Studio, you select a base model before providing your training dataset. Related: Model Studio Interface, Fine-tuning

C

CCPA (California Consumer Privacy Act)

A comprehensive privacy law that gives California residents rights over their personal information, including the rights to know what data is collected, to delete it, to opt out of its sale, and to non-discrimination for exercising these rights. As amended by CPRA, it applies to for-profit businesses meeting certain revenue or data volume thresholds. Related: CCPA, GDPR

Chunking

An automatic query execution strategy that splits large dataset scans into smaller, time-bounded segments. Chunking improves stability and cost efficiency for queries over very large datasets (tens of gigabytes to hundreds of terabytes) by limiting the blast radius of failures and enabling targeted retries. The platform determines when chunking applies based on query structure—no user configuration is required. Related: Chunking, Query Processing, Materialized Views
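
As a rough illustration of time-bounded segmentation, the sketch below splits a scan window into consecutive half-open intervals. The function name `time_chunks` and the fixed three-day step are assumptions made for this example; the platform chooses its own chunk boundaries automatically and its actual strategy is not described here.

```python
from datetime import datetime, timedelta


def time_chunks(start, end, step):
    """Yield half-open [chunk_start, chunk_end) windows that cover [start, end)."""
    cursor = start
    while cursor < end:
        # The final chunk is clipped so it never runs past the end of the scan.
        chunk_end = min(cursor + step, end)
        yield cursor, chunk_end
        cursor = chunk_end


# Hypothetical one-week scan split into three-day segments.
chunks = list(time_chunks(datetime(2024, 1, 1), datetime(2024, 1, 8), timedelta(days=3)))
for chunk_start, chunk_end in chunks:
    print(chunk_start.date(), "->", chunk_end.date())
```

Because each segment is independent, a failure in one window can be retried without rescanning the others, which is the property the definition describes.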

Cardinality

The number of distinct values in a column. Approximate cardinality uses probabilistic algorithms (HyperLogLog) for efficiency on large datasets, while exact cardinality counts every unique value. Related: Dataset Statistics, Dataset Statistics Reference

Committed-Usage

A pricing model that provides discounted usage rates when you commit to a set amount of platform usage over a period of time. Unlike pay-as-you-go pricing, Committed-Usage requires a contractual agreement and offers rates negotiated with Narrative’s sales team. Related: Pricing, Processing Fee, Transfer Fee

Consent

In privacy law, a data subject’s freely given, specific, informed, and unambiguous indication of agreement to the processing of their personal data. Under GDPR, consent is one of six legal bases for processing. Under CCPA, consent is typically required for selling personal information of minors. Related: Legal basis, GDPR

Control plane

The centralized orchestration layer that manages metadata, enforces permissions, and coordinates queries across data planes. The control plane never touches raw data directly—it only handles the coordination and governance aspects of data collaboration. Related: Control Plane Architecture

Context selector

The UI element in the platform’s top navigation that lets you view and change your execution context: the combination of data plane, compute pool, database, and schema that controls where operations run and what data is visible. Related: Using the Context Selector, Execution Context

Cookie syncing

The process of sharing cookie-based identifiers between domains by passing IDs as query parameters. Because web browsers enforce same-origin policies that prevent one domain from reading another domain’s cookies, cookie syncing creates a shared understanding of user identity across separate cookie namespaces. Related: Data Collection Endpoint

Connected component

In an identity graph, a cluster of identifiers that the system has determined belong to the same individual or household. Every identifier in a connected component is reachable from every other through some path of linkages. Related: Identity Graphs, ID Mapping
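
The grouping described above can be sketched with a union-find structure. This is a generic graph technique shown for illustration, not a description of how Narrative resolves identities; the identifier strings and the function name `connected_components` are made up for the example.

```python
def connected_components(edges):
    """Group identifiers into clusters, given pairwise linkages between them."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps trees shallow
            node = parent[node]
        return node

    for left, right in edges:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right  # merge the two clusters

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# Hypothetical linkages: two emails, two device IDs, one cookie.
components = connected_components([
    ("email:a", "maid:1"),
    ("maid:1", "cookie:x"),
    ("email:b", "maid:2"),
])
```

Here `email:a` and `cookie:x` land in the same component even though no edge links them directly, because both are reachable through `maid:1`.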

Company ID

A unique identifier assigned to each partner or customer organization on the Narrative data collaboration platform. Company IDs are used internally to identify organizations in API calls, integrations, and data collaboration workflows. To find the company name associated with a Company ID, or to determine your own organization’s Company ID, contact your Narrative partner success representative or reach out to support. Related: Data Collection Endpoint

Completeness

The ratio of non-null values to total rows for a column, expressed as a value between 0 and 1. A completeness of 1.0 means every row has a value; 0.0 means the column is entirely null. Related: Dataset Statistics, Dataset Statistics Reference
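
A minimal sketch of the ratio, using Python's `None` to stand in for null. The function name and the choice to report 0.0 for an empty column are assumptions for this example, not documented platform behavior.

```python
def completeness(values):
    """Return the share of non-null entries in a column, between 0 and 1."""
    if not values:
        return 0.0  # assumption: an empty column is treated as 0.0
    non_null = sum(1 for value in values if value is not None)
    return non_null / len(values)


print(completeness(["a@example.com", None, "b@example.com", None]))
```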

Confidence score

A numerical rating (0-100%) generated by AI that indicates the likely accuracy of a Rosetta Stone mapping. Higher scores suggest the mapping is more reliable. Confidence scores are calculated based on column name analysis, data sample inspection, and transformation logic evaluation. Scores are grouped into tiers: high (80-100%), medium (50-79%), and low (0-49%). Related: Confidence Scoring, Managing Evaluations
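
The tier boundaries above can be expressed as a small lookup. The function name is hypothetical, and treating fractional scores between 79 and 80 as medium is an assumption, since the stated ranges only cover whole percentages.

```python
def confidence_tier(score):
    """Map a 0-100 confidence score to the tier names used in this glossary."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


for score in (92, 64, 17):
    print(score, confidence_tier(score))
```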

Compute pool

A compute resource allocation that determines the processing power available for query execution within a data plane. Compute pools come in three types: dedicated (isolated resources), shared (pooled resources), and default (Snowflake-managed). The available compute pool options depend on your data plane’s underlying provider. Related: Compute Pools, Execution Context

Connector

A pre-built integration that connects Narrative to an external destination platform. Connectors handle the complexities of data delivery including format translation, identity matching, API integration, and audience management. Each connector is specific to a destination (e.g., The Trade Desk Connector, Meta Connector). Related: Data Activation, Destination, Connector Reference

CPRA (California Privacy Rights Act)

An amendment to CCPA that took effect January 1, 2023, strengthening California’s privacy law by adding new consumer rights (correction, limiting sensitive data use), creating the California Privacy Protection Agency (CPPA) for enforcement, and expanding business obligations. Related: CCPA, CCPA Concepts

D

Data collaboration

The process by which multiple organizations share, query, and analyze each other’s data while maintaining control and compliance. Data collaboration in Narrative enables organizations to unlock value from combined datasets without sacrificing data governance.

Data controller

Under GDPR, an entity that determines the purposes and means of processing personal data. The controller decides what data to collect, why to collect it, and how it will be used. Controllers bear primary responsibility for compliance with data protection principles. Related: Data processor, GDPR

Data crosswalk

A method for linking or matching records across different data sources without exposing the underlying identifiers. In privacy-safe implementations, crosswalks use pseudonymization techniques like hashing to enable record matching while protecting individual privacy. Data crosswalks are essential for data collaboration scenarios where organizations need to find common records (such as overlapping customers) without sharing raw PII. Related: Data Pseudonymization, Hashing

Data enrichment

The process of enhancing existing customer or audience data with additional attributes from external sources. Common enrichment attributes include demographics (age, gender), location data, purchase behavior, and interest signals. In Narrative, enrichment is performed by joining seed datasets with provider data through Rosetta Stone identity matching. Unlike per-use data licensing models, enriched data can typically be used across multiple platforms and use cases under an omni-use license. Related: Data collaboration, Demographic Enrichment

Data plane

The infrastructure where data physically resides and where queries execute. Data planes can be hosted by Narrative (Narrative-hosted) or deployed within a customer’s own infrastructure (customer-hosted) for maximum control over data residency. Related: Data Planes

Data portability

A data subject right under GDPR allowing individuals to receive their personal data in a structured, commonly used, machine-readable format and to transmit that data to another controller. This right supports individuals’ ability to switch between service providers. Related: GDPR

Data processor

Under GDPR, an entity that processes personal data on behalf of a data controller. Processors act only on the controller’s documented instructions and must maintain appropriate security measures. Service providers under CCPA serve a similar role. Related: Data controller, GDPR

Data Processing Agreement (DPA)

A legally binding contract between a data controller and data processor that governs the processing of personal data. Required under GDPR, DPAs specify the subject matter, duration, nature, and purpose of processing, as well as the processor’s obligations regarding data security, sub-processors, and data subject rights. Related: Data controller, Data processor, GDPR

Data products

Curated, packaged datasets designed for sharing with partner organizations. Data products include relevant metadata, access controls, and documentation to enable effective data collaboration.

Data subject

Under GDPR, an identified or identifiable natural person whose personal data is being processed. Data subjects have various rights including access, rectification, erasure, and data portability. Related: GDPR

Data Subject Request (DSR)

A formal request from a data subject exercising their rights under privacy laws such as GDPR or CCPA. DSRs include opt-out requests (where the data subject asks to be removed from marketing or advertising) and data erasure requests (where the data subject requests complete deletion of their personal data). In Narrative, DSRs are submitted by mapping identifiers to the Data Privacy Request Identifier attribute, which automatically excludes those identifiers from queries and propagates requests to downstream data collaborators. Related: Data subject, Managing Data Subject Requests, GDPR, CCPA

Dataset statistics

Column-level metrics computed over dataset contents, including counts, bounds, distributions, and quality indicators. Statistics are computed per column and vary by data type—numeric columns support the full set of metrics, while complex types like arrays support only basic counts and storage metrics. Related: Dataset Statistics, Dataset Statistics Reference

Dataset

A structured collection of data registered in Narrative with a defined schema. Datasets function like database tables—they have fields (columns), hold records (rows), and can be queried using NQL. Each dataset is owned by a single company, scoped to a specific data plane, and can be shared with other organizations through access rules. Related: Datasets, Managing Datasets

Data sample

A preview of up to 1,000 rows from a dataset, retrieved via a sampling job and stored in the control plane for quick access through the API or UI. Samples are used to view interactive query results and inspect dataset contents without downloading the full dataset from the data plane. Samples can be requested on demand and cleared when no longer needed. Related: Data Flow, Managing Datasets

Device alias

A data field that provides details about a user’s device, broken out into make, model, and operating system. Device alias data helps categorize user devices and is collected through SDKs or real-time bidding environments. Examples include “iPhone X, iOS 13.3.1” or “Pixel 3, Android 10.0”.

E

Encoding space

The per-partner namespace for Narrative IDs. The same clear text identifier encoded for different partners produces different Narrative IDs, ensuring that identifiers cannot be correlated across partners without going through Narrative’s translation functionality. Related: Narrative ID, Using Narrative ID

Execution context

The combination of data plane, compute pool, database, and schema that determines where and how platform operations execute. Your execution context controls which datasets are visible, where queries run, and what compute resources are used. The context selector in the platform’s top navigation lets you view and change these settings. Related: Execution Context, Compute Pools, Using the Context Selector

F

Fine-tuning

The process of training a pre-trained AI model on a specific dataset to adapt its behavior for particular tasks or domains. Fine-tuning adjusts the model’s weights based on your training examples, enabling the model to perform better on your use case while retaining its general capabilities. Narrative’s Model Studio provides an interface for fine-tuning models using datasets prepared in Prompt Studio. Related: Model Studio Interface, Prompt Studio Interface, Base model

fine_tuning_conversation

A Rosetta Stone attribute format for training data used in AI model fine-tuning. Each row represents a structured conversation with system, user, and assistant roles. Datasets must be mapped to this attribute and materialized before use in Model Studio. Related: Model Studio Interface, Prompt Studio Interface

G

Graph enrichment

The practice of using third-party identity data to create or strengthen connections between nodes in an identity graph, improving person-level and household-level resolution. Distinct from addressability expansion, which appends identifiers without changing graph structure. Related: Graph Enrichment, Identity Graphs, Addressability Expansion

GDPR (General Data Protection Regulation)

The European Union’s comprehensive data protection law that governs the collection, processing, and storage of personal data for individuals in the EU. Effective since May 2018, GDPR establishes data subject rights, requires legal bases for processing, mandates breach notification, and imposes significant penalties for non-compliance. It has influenced privacy legislation worldwide. Related: GDPR, CCPA

H

Hash function

A one-way mathematical algorithm that converts input data (like an email address) into a fixed-length string of characters. Hash functions are deterministic (the same input always produces the same output) and non-reversible (the original input cannot feasibly be recovered from the hash). Cryptographically sound hash functions are also collision-resistant: different inputs can in principle share an output, but finding two such inputs is computationally infeasible. Common hash functions include MD5, SHA-1, and SHA-256; practical collision attacks are known for MD5 and SHA-1. Related: Data Pseudonymization, Hashing PII for Upload
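
The determinism and fixed output length can be seen with Python's standard `hashlib` module. The helper name `hash_identifier` is hypothetical, and trimming and lowercasing the input before hashing is an assumption made for this sketch; the required normalization for a given upload should be confirmed against the hashing guide.

```python
import hashlib


def hash_identifier(value, algorithm="sha256"):
    """Return the hex digest of a normalized identifier."""
    # Assumed normalization: strip surrounding whitespace and lowercase.
    normalized = value.strip().lower()
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()


email = "  Jane.Doe@Example.com "
for name in ("md5", "sha1", "sha256"):
    digest = hash_identifier(email, name)
    print(name, len(digest), digest)
```

Two parties that apply the same normalization and the same algorithm obtain the same digest for the same email, which is what makes hashed identifiers matchable.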

Histogram

A frequency distribution showing how values in a column are distributed across distinct buckets. In the context of dataset statistics, histograms are computed per column and power UI features like filter dropdowns in Data Studio. Related: Dataset Statistics, Dataset Statistics Reference

Hash join

An efficient join algorithm where the query planner builds a hash table from one table’s join keys and probes it with the other table’s keys. Hash joins require single-key equality conditions and are significantly faster than nested loop joins for large datasets. Related: Understanding JOIN Performance
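
The build and probe phases can be sketched in a few lines. This is a textbook in-memory version for illustration only; the rows, column names, and function name are invented, and a real query engine's implementation is considerably more involved.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Inner-join two lists of dict rows on an equality condition over `key`."""
    # Build phase: index one side by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)

    # Probe phase: look up each row of the other side in the hash table.
    joined = []
    for row in probe_rows:
        for match in table.get(row[key], []):
            joined.append({**match, **row})
    return joined


profiles = [{"id": 1, "email": "a"}, {"id": 2, "email": "b"}]
locations = [{"id": 1, "city": "NYC"}, {"id": 3, "city": "LA"}]
joined = hash_join(profiles, locations, "id")
print(joined)
```

Each probe row costs one dictionary lookup rather than a scan of the other table, which is why the approach depends on an equality condition.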

I

Identity graph

A data structure that connects identifiers (emails, device IDs, phone numbers, cookies) across devices, channels, and platforms into unified profiles. Each cluster of connected identifiers (a connected component) represents an individual or household. Related: Identity Graphs, Graph Enrichment, Addressability Expansion

Identity provider (IdP)

An enterprise system (such as Okta, Azure AD, or OneLogin) that manages user authentication and maintains the authoritative record of user identities and credentials. Organizations can integrate their IdP with Narrative for single sign-on access. Related: SSO Configuration, SSO Concepts

Interactive query

A query executed through the Query Editor or NQL API that stores results as a temporary dataset. Interactive queries are implemented as materialized views with a 24-hour retention policy and an automatic row limit. Users view results through data sampling rather than receiving the full result set directly. Related: Query Processing, Data Flow, Write Your First Query

Incremental View Maintenance (IVM)

An optimization technique that updates materialized views by processing only changed data rather than recomputing the entire result. IVM dramatically reduces refresh time for views over large datasets with relatively small changes between refreshes. Related: Incremental View Maintenance, Materialized view
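
To make the idea concrete, the sketch below maintains a grouped row count by applying only inserted and deleted rows instead of rescanning the source. It is a generic illustration of incremental maintenance under simplified assumptions (a single COUNT aggregate, deletes only for rows that exist); the class name is hypothetical and this is not the platform's implementation.

```python
class IncrementalCount:
    """Keeps a per-key row count up to date from change batches."""

    def __init__(self):
        self.counts = {}

    def apply(self, inserted=(), deleted=()):
        # Only the changed rows are touched; unchanged keys are never revisited.
        for key in inserted:
            self.counts[key] = self.counts.get(key, 0) + 1
        for key in deleted:
            # Assumes the deleted row was previously counted.
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.counts[key]


view = IncrementalCount()
view.apply(inserted=["US", "US", "CA"])   # initial load
view.apply(inserted=["CA"], deleted=["US"])  # small refresh
print(view.counts)
```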

Inference configuration

A set of parameters that control how an LLM processes a Model Inference request. Configuration options include output_format_schema (JSON Schema defining the expected response structure), max_tokens (maximum response length), temperature (response randomness from 0-1), top_p (nucleus sampling parameter), and stop_sequences (tokens that terminate generation). Related: Model Inference Overview, Running Model Inference
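
A configuration using the option names listed above might look like the following. The values, the schema contents, and the `validate_inference_config` helper are illustrative assumptions; only the temperature range of 0 to 1 comes from the definition, and the other checks are generic sanity checks rather than documented platform rules.

```python
inference_config = {
    # JSON Schema describing the structure the response must follow.
    "output_format_schema": {
        "type": "object",
        "properties": {"category": {"type": "string"}},
        "required": ["category"],
    },
    "max_tokens": 256,
    "temperature": 0.2,
    "top_p": 0.9,
    "stop_sequences": ["\n\n"],
}


def validate_inference_config(config):
    """Run basic sanity checks before submitting a configuration."""
    if not 0 <= config["temperature"] <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if config["max_tokens"] <= 0:  # assumption: a positive token budget is required
        raise ValueError("max_tokens must be positive")
    if not 0 < config["top_p"] <= 1:  # assumption: usual nucleus sampling range
        raise ValueError("top_p must be in (0, 1]")
    return True


validate_inference_config(inference_config)
```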

Idempotency key

A unique identifier included in each webhook subscription event delivery. If the same event is delivered more than once, the idempotency key remains the same, allowing your endpoint to detect and discard duplicate notifications. Related: Webhook Event Reference, Subscribing to Notifications
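
On the receiving side, duplicate detection can be sketched as follows. The class name, the method signature, and the in-memory set are assumptions for this example; how the key is carried in the delivery is defined in the webhook event reference, and a production endpoint would keep seen keys in durable storage.

```python
class WebhookReceiver:
    """Processes each event once, keyed by its idempotency key."""

    def __init__(self):
        self._seen_keys = set()
        self.processed = []

    def handle(self, idempotency_key, payload):
        """Return True if the event was processed, False if it was a duplicate."""
        if idempotency_key in self._seen_keys:
            return False  # redelivery of an event already handled
        self._seen_keys.add(idempotency_key)
        self.processed.append(payload)
        return True


receiver = WebhookReceiver()
first = receiver.handle("evt-123", {"status": "completed"})
retry = receiver.handle("evt-123", {"status": "completed"})
print(first, retry, len(receiver.processed))
```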

Inference job

A job type that submits a prompt to an LLM hosted within a data plane and returns a structured output. Unlike external AI API calls, inference jobs keep data within customer infrastructure—no data is sent to external model providers. The job returns token usage statistics and a response conforming to the specified JSON Schema. Related: Model Inference Overview, Job Types

J

Job queue

A control plane component that coordinates work between the control plane and data planes. The job queue holds various types of jobs—including compiled queries, dataset operations, and system tasks—that data plane operators poll and execute. This pull-based architecture ensures the control plane never needs direct access to data plane infrastructure. Related: Query Processing, Operator, Job Types

Just-in-time provisioning (JIT)

The automatic creation of user accounts on first login, eliminating the need for manual account creation before SSO access. When enabled, Narrative creates user accounts automatically when users authenticate through SSO for the first time. Related: SSO Configuration

L

Legal basis

Under GDPR, a lawful ground that permits the processing of personal data. GDPR recognizes six legal bases: consent, contract performance, legal obligation, vital interests, public task, and legitimate interests. Organizations must identify and document a valid legal basis before processing personal data. Related: Consent, GDPR

Licensed Data Fees

Fees paid to Data Licensors for the licensing of data transacted through Narrative’s platform. These fees are set by and paid to the Data Licensor (the party providing the data), not to Narrative. Narrative charges a Transaction Services Fee as a percentage of the Licensed Data Fees. Related: Pricing, Transaction Services Fee

M

Mapping (Rosetta Stone)

A translation rule that associates a Rosetta Stone attribute with a specific column in a dataset. Mappings can include transformation expressions to convert source data formats to the standard attribute format. Each mapping connects a source column to a target attribute and optionally includes NQL logic to transform values. Related: How Rosetta Stone Works, Mapping Schemas

Mapping evaluation

The AI-powered analysis process that assesses the quality of existing Rosetta Stone mappings. Evaluations generate confidence scores for each mapping and identify potential issues such as missing transformation cases, type mismatches, or edge case handling. Run evaluations to understand the overall health of your normalizations. Related: Confidence Scoring, Managing Evaluations

Mapping suggestion

An AI-generated recommendation for mapping a dataset column to a Rosetta Stone attribute. Suggestions include the proposed transformation expression, confidence score, reasoning, and sample output preview showing before/after values. Users can accept suggestions to create mappings or reject them if they don’t fit the use case. Related: Accepting AI Suggestions

Marketplace

The Narrative platform’s catalog where organizations discover available datasets from other organizations. The marketplace enables data discovery and facilitates data collaboration relationships.

Mobile Advertising ID (MAID)

A unique pseudonymous identifier tied to a mobile device, provided by the mobile operating system. iOS calls this the Identifier for Advertisers (IDFA), while Android calls it the Advertising ID (GAID or Ad ID). Both are formatted as 32 hexadecimal characters in five hyphen-separated groups (8-4-4-4-12). MAIDs are resettable and users can opt out of their collection. Related: Mobile Ad IDs
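
IDFA and Android Advertising ID values use the standard UUID text layout, so a basic format check can be written with a regular expression. The helper names are hypothetical, and the sample value is made up. The all-zero check reflects the value operating systems return when a user has limited ad tracking.

```python
import re

# 8-4-4-4-12 hexadecimal groups, upper or lower case.
MAID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
ZEROED_MAID = "00000000-0000-0000-0000-000000000000"


def looks_like_maid(value):
    """Return True if the value has the UUID-style layout used by MAIDs."""
    return MAID_PATTERN.fullmatch(value) is not None


def is_zeroed(value):
    """Return True for the all-zero placeholder sent for opted-out devices."""
    return value == ZEROED_MAID


print(looks_like_maid("EA7583CD-A667-48BC-B806-42ECB2B48606"))
```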

Match table

A pre-generated table that pairs Narrative IDs with clear text identifiers, allowing partners to use Narrative IDs for matching independently of the Narrative platform. Match tables enable offline workflows while maintaining privacy through the encoding. Related: Narrative ID, Using Narrative ID

MD5

Message-Digest Algorithm 5, a widely used hash function that produces a 128-bit (32-character hexadecimal) hash value. While cryptographically broken for security purposes, MD5 remains acceptable for non-security applications like identifier matching in data collaboration. Narrative supports MD5 alongside SHA-1 and SHA-256 for hashing PII. Related: Hash function, Hashing PII for Upload

Materialized view

A pre-computed query result stored as a dataset. Unlike executing a query each time, a materialized view stores results physically for faster access. Views can be refreshed on a schedule or manually to incorporate changes in underlying data. Related: Materialized Views, Creating Materialized Views, Materialized View Syntax

Model Inference

A job type that enables AI-powered operations by running LLM inference within a customer’s data plane. Model Inference supports multiple models from Anthropic (Claude Haiku, Sonnet, Opus) and OpenAI (GPT-4.1, o4-mini) and guarantees structured output via JSON Schema. Because inference runs within the data plane, data never leaves customer-controlled infrastructure—no external API calls are made to model providers. Related: Model Inference Overview, Running Model Inference, Supported Models

Model Studio

A UI tool for training and fine-tuning AI models using datasets within Narrative’s platform. Model Studio integrates dataset selection, base model selection, and compute resource configuration into a unified workflow. Training datasets must be prepared in the fine_tuning_conversation format using Prompt Studio before use. Related: Model Studio Interface, Prompt Studio, Fine-tuning

N

Native Apps

Pre-built applications within the Narrative platform that perform specific data collaboration tasks. Native Apps provide ready-to-use functionality for common workflows without requiring custom development. Users can launch and configure Native Apps through the Web UI to accomplish tasks like data ingestion, transformation, and activation. Related: Web UI

Nested loop join

A join algorithm that iterates through each row of one table and, for each row, scans the other table for matches. Less efficient than hash joins, nested loops are often the fallback when query conditions prevent hash join optimization, such as when OR operators appear in JOIN clauses. Related: Understanding JOIN Performance
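
A textbook version of the algorithm is sketched below with an OR condition, the kind of predicate that cannot be served by a single hash lookup. The rows, column names, and function name are invented for the example.

```python
def nested_loop_join(left_rows, right_rows, predicate):
    """Return every (left, right) pair of rows for which the predicate holds."""
    matches = []
    for left_row in left_rows:
        # The inner table is scanned in full for each outer row.
        for right_row in right_rows:
            if predicate(left_row, right_row):
                matches.append((left_row, right_row))
    return matches


customers = [
    {"email": "a@example.com", "phone": "555-0100"},
    {"email": "b@example.com", "phone": "555-0101"},
]
events = [
    {"contact_email": "a@example.com", "contact_phone": None},
    {"contact_email": None, "contact_phone": "555-0101"},
    {"contact_email": "z@example.com", "contact_phone": "555-0199"},
]
pairs = nested_loop_join(
    customers,
    events,
    lambda c, e: c["email"] == e["contact_email"] or c["phone"] == e["contact_phone"],
)
print(len(pairs))
```

The work grows with the product of the two table sizes, which is why this fallback becomes expensive on large datasets.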

Narrative ID

A secure, pseudonymous identifier derived from a clear text identifier (such as an email or hashed email) through Narrative’s secure encoding methodology. Unlike simple hashing, Narrative IDs are encoded per-partner, meaning the same identifier produces different Narrative IDs for different partners. This enables privacy-safe data collaboration without exposing underlying identifiers. Related: Narrative ID Concepts, Using Narrative ID, Pseudonymization
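
The per-partner property can be illustrated with a keyed hash, where each partner has its own secret key. This is only an analogy built on HMAC-SHA-256 from the Python standard library; it is not Narrative's encoding methodology, and the function name and keys are hypothetical.

```python
import hashlib
import hmac


def partner_scoped_id(identifier, partner_key):
    """Derive a partner-specific pseudonym from a clear text identifier."""
    return hmac.new(partner_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical per-partner secrets.
key_partner_a = b"secret-for-partner-a"
key_partner_b = b"secret-for-partner-b"

id_for_a = partner_scoped_id("jane@example.com", key_partner_a)
id_for_b = partner_scoped_id("jane@example.com", key_partner_b)
print(id_for_a == id_for_b)
```

With a plain hash, both partners would hold the same value for the same email and could correlate records directly; with per-partner keys, the two values differ and cannot be compared without access to both keys.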

Normalized dataset

A dataset with active Rosetta Stone mappings, making its data queryable through the narrative.rosetta_stone table. Normalized datasets appear in the Normalized Datasets interface where you can evaluate mapping quality using AI-powered confidence scoring and review AI-generated mapping suggestions. Related: Rosetta Stone Overview, Managing Evaluations

NQL (Narrative Query Language)

Narrative’s proprietary query language designed specifically for data collaboration and normalization. NQL is an interpreted language—queries are parsed by the control plane and transpiled to native SQL dialects (Snowflake SQL, Spark SQL, etc.) depending on the target data plane. This abstraction enables a single query to run against data in different database systems while enforcing permissions and integrating Rosetta Stone normalization. Related: NQL Design Philosophy, Query Processing

O

Operator (data plane)

A software component running within a customer-hosted data plane that bridges the control plane and the customer’s data infrastructure. The operator polls the control plane’s job queue for compiled queries, executes them against the local database engine (Snowflake, Spark, etc.), and reports results back. Because the operator runs in the customer’s infrastructure, raw data never leaves the data plane. Related: Data Planes, Query Processing

P

PII (Personally Identifiable Information)

Data that can be used to identify a specific individual, either directly or in combination with other data. Common examples include email addresses, phone numbers, names, addresses, and government-issued IDs. Narrative requires that PII be pseudonymized (hashed) before upload to protect individual privacy. Related: Data Pseudonymization, Hashing PII for Upload

Pseudonymization

A data protection technique that replaces directly identifying information with artificial identifiers (pseudonyms). In Narrative’s context, pseudonymization is achieved by hashing PII such as email addresses and phone numbers. Unlike anonymization, pseudonymized data can still be matched across datasets (same input produces same hash) while protecting individual privacy. Related: Data Pseudonymization, Hashing PII for Upload

Permission (API key)

A pair of an access level and a resource that defines what an API key can do. Access levels are Read (view and list) and Write (create, update, and delete). Resources represent functional platform areas such as Datasets, Connections, or Jobs. Related: API Key Permissions, Permissions Reference

Placeholder (query template)

A variable element in a query template marked with {{name}} syntax that is replaced with a user-provided value at execution time. Placeholder types include literal values (strings, numbers, dates), columns, filters, and output fields. Each placeholder has metadata including name, type, description, and whether it is required. Related: Query Template Syntax, Using Query Templates
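
The substitution step can be sketched with a regular expression over the {{name}} markers. This is a simplified illustration: the function name and sample query are invented, and plain string substitution like this ignores the placeholder types, quoting, and validation that a real template engine would need.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template, values):
    """Replace each {{name}} marker with the matching user-provided value."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder {name!r}")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


query = fill_template(
    "SELECT * FROM my_dataset WHERE age > {{min_age}} LIMIT {{row_limit}}",
    {"min_age": 21, "row_limit": 100},
)
print(query)
```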

Processing Fee

A service fee based on the volume of bytes processed per month during usage of Narrative’s products and services. Processing includes ingestion, evaluation, transactions, forecasting, and order management. For pay-as-you-go customers, the rate is $0.65/GB. Processing fees may be waived for activity directly supporting data transactions under certain conditions. Related: Pricing, Transfer Fee, Committed-Usage

Prompt Studio

A UI tool for transforming datasets into structured, fine-tuning-ready examples for AI models. Prompt Studio allows you to configure system, user, and assistant prompts with dynamic macros that pull values from dataset fields, NQL expressions, or literal text. Each row in the source dataset becomes a conversation-formatted training example. Related: Prompt Studio Interface, Model Inference

Q

Query compilation

The process by which the control plane transforms an NQL query into executable SQL for a target data plane. Query compilation includes parsing, permission validation, optimization, and transpilation to the target database’s SQL dialect. Related: Query Processing, Transpilation

Query template

A reusable NQL query pattern with configurable placeholders that users fill in at execution time. Templates enable users to run customized queries without writing NQL directly, making complex queries accessible to non-technical users while ensuring consistent query structure. Related: Query Templates, Using Query Templates, Query Template Syntax

R

Relationships

Connections between datasets that enable querying across multiple related data sources. Relationships allow organizations to join and analyze data across different datasets while respecting access controls.

REST API

The programmatic interface for interacting with the Narrative platform. The REST API follows standard conventions including resource-oriented URLs, standard HTTP verbs (GET, POST, PUT, DELETE), JSON-encoded request and response bodies, and Bearer token authentication. All functionality available in the Web UI is also accessible via the API. Related: REST API Documentation, SDKs, API Keys

Retention policy

A set of rules that governs how long data within a dataset is kept before automatic deletion. Retention policies help manage storage costs, comply with data governance requirements, and ensure data is only stored for as long as necessary. Narrative supports three policy classes: Row TTL (performs row-level hard deletes based on a timestamp column), Snapshot TTL (deletes old ingestion snapshots based on snapshot age), and Table TTL (drops the entire dataset when table age exceeds the interval). Policies use ISO 8601 durations (for example, P90D for 90 days) to specify retention intervals. Related: Dataset Retention Policies, Datasets, EXPIRE clause
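
To show how an ISO 8601 duration such as P90D translates into a row-level expiry check, here is a small sketch. The parser handles only weeks, days, hours, minutes, and seconds (month and year components are left out because their length varies), and both function names are hypothetical; the platform evaluates retention policies itself.

```python
import re
from datetime import datetime, timedelta, timezone

_DURATION = re.compile(
    r"^P(?:(\d+)W)?(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$"
)


def parse_duration(text):
    """Convert a simple ISO 8601 duration (for example P90D or PT12H) to a timedelta."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    weeks, days, hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return timedelta(weeks=weeks, days=days, hours=hours, minutes=minutes, seconds=seconds)


def is_expired(row_timestamp, retention, now):
    """Return True when the row is older than the retention interval."""
    return now - row_timestamp > parse_duration(retention)


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(is_expired(datetime(2024, 1, 1, tzinfo=timezone.utc), "P90D", now))
```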

Right to erasure

A data subject right under GDPR (also known as the “right to be forgotten”) allowing individuals to request deletion of their personal data under certain circumstances, such as when the data is no longer necessary, when consent is withdrawn, or when processing was unlawful. Similar deletion rights exist under CCPA. Related: Data subject, GDPR, CCPA

Rosetta AI Assistant

An AI-powered assistant that interprets plain-English instructions and executes actions within the Narrative platform. Rosetta (accessible at rosett.st) helps users accomplish data collaboration tasks, including complex operations such as querying data. The assistant makes data collaboration more accessible by removing the need to learn query syntax or navigate complex interfaces. Related: NQL, Web UI

Rosetta Stone

Narrative’s schema normalization system that enables data collaboration across organizations with different data structures. Rosetta Stone uses two core primitives—attributes (standardized field definitions) and mappings (translations between source columns and attributes)—to automatically normalize data from multiple sources into a queryable common schema. The system combines machine learning and human curation to create accurate mappings, handles schema changes transparently, and eliminates the need for custom ETL pipelines between partners. Related: Rosetta Stone Overview, How Rosetta Stone Works, Mapping Schemas

Rosetta Stone access patterns

The three ways to scope Rosetta Stone queries in NQL, each controlling which data sources are included:

Global access (narrative.rosetta_stone): Queries all normalized data from all datasets shared with you across all companies.
Company-scoped access (company_data._rosetta_stone or <company_slug>._rosetta_stone): Queries normalized data from a specific company’s datasets.
Dataset-scoped access (company_data.<dataset>._rosetta_stone): Queries normalized data from a specific dataset, enabling combination with non-normalized columns.

S

SAML 2.0

Security Assertion Markup Language 2.0—the industry standard protocol enabling secure authentication between an identity provider and Narrative. SAML 2.0 is used for enterprise single sign-on integrations. Related: SSO Configuration, SSO Concepts

Schema

The structural definition of a dataset that specifies its fields, data types, and validation rules. A schema acts as the blueprint that determines what data can be stored in a dataset and how it should be organized. Schemas ensure data consistency and enable Narrative to validate incoming records during ingestion. Related: Datasets, Managing Datasets

Schema inference

The automated process by which Narrative analyzes uploaded data to identify column types and suggest appropriate Rosetta Stone attribute mappings. Schema inference uses pattern recognition and machine learning to propose mappings that can be accepted, modified, or rejected. Related: How Rosetta Stone Works, Mapping Schemas

Schema mapping

The process of defining how fields from one data schema correspond to fields in another schema or to a standard schema. Schema mapping is a key component of Narrative’s Rosetta Stone functionality. Related: Rosetta Stone Overview, Mapping Schemas

Schema normalization

The process of mapping different data schemas to a common standardized format, enabling organizations to work with data in consistent ways regardless of how it was originally structured. Related: The Normalization Model, Rosetta Stone Overview

SDK (Software Development Kit)

A collection of libraries, tools, and documentation that enables developers to integrate with the Narrative platform programmatically. The official TypeScript SDK (@narrative.io/data-collaboration-sdk-ts) provides type-safe access to all Narrative APIs, including dataset management, NQL query execution, and job tracking. Related: TypeScript SDK Reference, SDK Quickstart, SDK Guides

Secret sharing

A Tools feature that enables one-time-use encrypted sharing of sensitive information such as API keys, passwords, and credentials through unique retrieval links. Related: Secret Sharing Guide

Schema preset

A pre-defined collection of Rosetta Stone attributes designed for common use cases such as demographic data, marketing events, or identity resolution. Organizations can use public presets to accelerate mapping or create private presets to standardize across teams. Related: The Normalization Model

SHA-1

Secure Hash Algorithm 1: a hash function that produces a 160-bit (40-character hexadecimal) hash value. SHA-1 is more collision-resistant than MD5 but is deprecated for cryptographic security purposes because practical collision attacks exist. It remains acceptable for identifier matching in data collaboration. Narrative supports SHA-1 alongside MD5 and SHA-256 for hashing PII. Related: Hash function, Hashing PII for Upload

Snapshot

A point-in-time collection of files that were ingested together into a dataset. When you upload data, the ingestion process creates a new snapshot containing the validated records. Snapshot TTL retention policies evaluate data based on snapshot age—the time since the snapshot was created—to determine when snapshots should be automatically deleted. Related: Datasets, Dataset Retention Policies

SHA-256

Secure Hash Algorithm 256-bit: a member of the SHA-2 family that produces a 256-bit (64-character hexadecimal) hash value. SHA-256 is currently considered cryptographically secure and is the recommended algorithm for new implementations. Narrative supports SHA-256 as the preferred hashing algorithm alongside MD5 and SHA-1. Related: Hash function, Hashing PII for Upload
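
The following sketch hashes an email address with Python's standard `hashlib` module and shows the digest lengths for the three supported algorithms. The normalization step (trimming whitespace and lowercasing) and the function name are assumptions made for illustration; consult the upload guide for the exact normalization the platform expects.

```python
import hashlib


def hash_identifier(value: str, algorithm: str = "sha256") -> str:
    """Hash a normalized identifier and return the hexadecimal digest."""
    normalized = value.strip().lower()  # assumed normalization, not a documented rule
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()


email = "  Jane.Doe@Example.com "
for name in ("md5", "sha1", "sha256"):
    digest = hash_identifier(email, name)
    # Hex digest lengths: md5 = 32, sha1 = 40, sha256 = 64 characters.
    print(name, len(digest), digest)
```

Because both parties must produce identical digests for a match, any normalization applied before hashing has to be agreed on and applied consistently.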

SSO (Single Sign-On)

An authentication method that allows users to access Narrative using their organization’s existing credentials and identity provider. SSO eliminates the need for separate Narrative-specific passwords and enables centralized access management. Related: SSO Configuration, SSO Concepts

Structured output (inference)

A Model Inference response that conforms to a predefined JSON Schema. Structured output ensures the LLM returns data in a predictable, typed format suitable for programmatic consumption. The schema is specified in the output_format_schema field of the inference configuration and can define required fields, data types, enums, and nested structures. Related: Structured Output Concepts, JSON Schema Reference
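
As an illustration, the sketch below defines a small JSON Schema of the kind that could be supplied in output_format_schema and checks a response against it. The schema contents and the `conforms` helper are hypothetical, and the check covers only required keys and enum membership, not the full JSON Schema specification.

```python
import json

# Hypothetical schema for a sentiment-classification task.
output_format_schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "confidence": {"type": "number"},
    },
    "required": ["sentiment", "confidence"],
}


def conforms(response: dict, schema: dict) -> bool:
    """Check required keys and enum membership only (not full JSON Schema)."""
    if any(key not in response for key in schema["required"]):
        return False
    for key, rules in schema["properties"].items():
        if key in response and "enum" in rules and response[key] not in rules["enum"]:
            return False
    return True


good = json.loads('{"sentiment": "positive", "confidence": 0.93}')
bad = {"sentiment": "angry", "confidence": 0.5}
print(conforms(good, output_format_schema))  # True
print(conforms(bad, output_format_schema))   # False
```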

T

Transaction Services Fee

A service fee calculated as a percentage of the Licensed Data Fees transacted on Narrative’s platform. For pay-as-you-go customers, the rate is 25% of Licensed Data Fees. This fee applies to data you transact, source, or facilitate on the platform or via Narrative Services, including via any Connector App. Related: Pricing, Licensed Data Fees
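
A short worked example of the pay-as-you-go calculation follows. The function name is hypothetical, and rounding to whole cents is an assumption for display purposes rather than a documented billing rule.

```python
from decimal import Decimal

PAYG_RATE = Decimal("0.25")  # 25% pay-as-you-go rate from the definition above


def transaction_services_fee(licensed_data_fees: Decimal, rate: Decimal = PAYG_RATE) -> Decimal:
    """Return the fee as a percentage of Licensed Data Fees, rounded to cents (assumed)."""
    return (licensed_data_fees * rate).quantize(Decimal("0.01"))


print(transaction_services_fee(Decimal("1200.00")))  # 300.00
```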

Transfer Fee

A service fee based on the volume of bytes transferred out of your Narrative account per month, also known as an egress fee. For pay-as-you-go customers, the rate is $1.50/GB. Egress transfers to client-owned locations within AWS Region “us-east-1” (US East, N. Virginia) do not incur Transfer Fees. Related: Pricing, Processing Fee, Committed-Usage
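
The sketch below estimates a monthly Transfer Fee at the pay-as-you-go rate. It assumes a gigabyte means 10^9 bytes and that the fee is rounded to cents; neither is stated in the definition above, and the `exempt` flag is a hypothetical stand-in for the us-east-1 exemption.

```python
from decimal import Decimal

RATE_PER_GB = Decimal("1.50")   # pay-as-you-go rate from the definition above
BYTES_PER_GB = Decimal(10**9)   # assumption: decimal gigabytes


def transfer_fee(bytes_out: int, exempt: bool = False) -> Decimal:
    """Estimate the egress fee; exempt transfers (e.g. to us-east-1) cost nothing."""
    if exempt:
        return Decimal("0.00")
    gigabytes = Decimal(bytes_out) / BYTES_PER_GB
    return (gigabytes * RATE_PER_GB).quantize(Decimal("0.01"))


print(transfer_fee(250 * 10**9))               # 375.00
print(transfer_fee(250 * 10**9, exempt=True))  # 0.00
```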

Transpilation

The process of converting NQL code to equivalent native SQL code for a specific database system. The control plane transpiles NQL to Snowflake SQL, Spark SQL, or other dialects depending on where the target data plane stores its data. Transpilation is distinct from compilation in that it produces human-readable code in another query language rather than machine code. Related: Query Processing, Query compilation

Transformation expression

An NQL expression used within a Rosetta Stone mapping to convert source data to the target attribute format. Transformation expressions can include type conversions, string manipulation, conditional logic (CASE statements), and null handling. They enable mappings to normalize diverse source formats into consistent attribute values. Related: Mapping Schemas, Transformation Functions, Edge Cases

Task output

The structured JSON result produced by a workflow task after it executes. Task output contains metadata such as dataset IDs, snapshot IDs, and row counts. Output can be captured into the workflow context via export and injected into subsequent tasks via variable expressions. Related: Task Reference, Specification Syntax

U

UID2 (Unified ID 2.0)

A privacy-focused user identifier developed by the Identity Consortium (including The Trade Desk) that enables cross-platform data matching without compromising user privacy. UID2 is an open-source, standalone solution that can be generated from hashed emails or phone numbers. Narrative supports automatic UID2 generation from uploaded data. Related: UID2

UNNEST

An NQL function that expands an array column into multiple rows, with one row generated for each element in the array. UNNEST is commonly used to flatten multi-key matching scenarios into single-key joins that the query planner can optimize efficiently. Related: NQL Functions Reference, Avoid OR in JOIN Clauses
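
The row-expansion behavior can be emulated in plain Python, as sketched below. This models only the general semantics of unnesting and is not NQL; the function and column names are hypothetical, and dropping rows with empty arrays is an assumption based on how unnesting commonly behaves in SQL dialects.

```python
from typing import Any, Dict, Iterable, Iterator


def unnest(rows: Iterable[Dict[str, Any]], array_column: str, alias: str) -> Iterator[Dict[str, Any]]:
    """Yield one output row per element of the array column."""
    for row in rows:
        for element in row[array_column]:
            expanded = {k: v for k, v in row.items() if k != array_column}
            expanded[alias] = element
            yield expanded


rows = [
    {"user_id": 1, "emails": ["a@example.com", "b@example.com"]},
    {"user_id": 2, "emails": ["c@example.com"]},
    {"user_id": 3, "emails": []},  # produces no output rows in this sketch
]

for flat in unnest(rows, "emails", "email"):
    print(flat)
```

After expansion, each row carries a single `email` value, so a join on that column becomes an ordinary single-key equality join.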

Unix time

The number of seconds (or milliseconds) elapsed since January 1st, 1970 at 00:00:00 UTC (the Unix epoch). Narrative’s Data Streaming Platform uses Unix time in milliseconds for all timestamp fields, providing millisecond-level precision for data collaboration. Related: Unix Time
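
For reference, here is a small sketch of converting between timezone-aware datetimes and Unix time in milliseconds using Python's standard library. The function names are illustrative, and integer arithmetic is used to avoid floating-point rounding of the millisecond component.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_unix_millis(millis: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


moment = datetime(2024, 1, 1, 0, 0, 0, 123000, tzinfo=timezone.utc)
print(to_unix_millis(moment))             # 1704067200123
print(from_unix_millis(1704067200123))    # 2024-01-01 00:00:00.123000+00:00
```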

Upsert

A database operation that inserts a new record if it does not exist, or updates the existing record if it does. The term combines “update” and “insert.” In NQL, upsert behavior is achieved using the MERGE ON clause within CREATE MATERIALIZED VIEW statements. Related: Incremental Upserts with MERGE ON, Materialized View Syntax
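
The insert-or-update semantics can be sketched with an in-memory dictionary keyed on the merge column. This is a conceptual illustration in Python, not NQL, and it does not show the MERGE ON syntax; the function and field names are hypothetical.

```python
from typing import Any, Dict, List


def upsert(table: Dict[Any, Dict[str, Any]], records: List[Dict[str, Any]], key: str) -> Dict[Any, Dict[str, Any]]:
    """Insert records whose key is new; update records whose key already exists."""
    for record in records:
        existing = table.get(record[key])
        if existing is None:
            table[record[key]] = dict(record)  # insert: key not seen before
        else:
            existing.update(record)            # update: overwrite changed fields
    return table


table = {"u1": {"id": "u1", "city": "Austin", "plan": "free"}}
incoming = [
    {"id": "u1", "plan": "pro"},      # existing key -> update
    {"id": "u2", "city": "Boston"},   # new key -> insert
]
print(upsert(table, incoming, key="id"))
```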

URI (Uniform Resource Identifier)

A string that represents a particular resource. Common resources include web pages (full URLs like http://example.org/page) and mobile applications (iOS app IDs like 553834731 or Android package names like com.example.app). In Narrative, URI is a required attribute in the Digital Consumption data type, denoting where a digital behavior took place.

User agent

A string from a user’s device that identifies the operating system, browser, carrier, and hardware information. User agent strings are collected in digital consumption scenarios and help identify device characteristics. Example: Mozilla/5.0 (iPhone; CPU iPhone OS 13_3 like Mac OS X) AppleWebKit/605.1.15.

V

Variable expression

A ${…} placeholder in workflow task parameters that is evaluated before the task executes. Variable expressions use jq syntax and can reference the previous task’s output (.) or the accumulated workflow context ($context). A pure expression preserves the JSON type of the result; a string-interpolated expression (e.g., ${"text \(.value)"}) always produces a string. Related: Specification Syntax, Task output, Workflow context

W

Webhook subscription

A registration that tells the Narrative platform to send an HTTP POST notification to a specified URL when job state changes occur. Subscriptions can filter by job IDs, job types, and job states, so you receive only the events you care about. Each subscription includes a shared secret for verifying that incoming requests are authentic. Related: Webhook Concepts, Subscribing to Notifications, Webhook Event Reference
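
One common way to use a shared secret for verification is an HMAC signature over the raw request body, sketched below. The signing scheme here (hex-encoded HMAC-SHA256 of the body) and the sample payload are assumptions for illustration; the definition above does not specify how Narrative signs webhook requests, so check the webhook reference for the actual mechanism.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid timing attacks."""
    return hmac.compare_digest(sign(secret, body), received_signature)


secret = b"example-shared-secret"                        # hypothetical secret
body = b'{"job_id": "123", "state": "completed"}'        # hypothetical payload
signature = sign(secret, body)

print(is_authentic(secret, body, signature))             # True
print(is_authentic(secret, body + b" ", signature))      # False
```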

Web UI

The browser-based interface for the Narrative data collaboration platform, accessible at app.narrative.io. The Web UI provides visual access to all platform functionality including dataset management, access rule configuration, NQL query execution, and Native App deployment. Organized like a file explorer, the interface enables users to accomplish data collaboration tasks without writing code. Related: Native Apps, REST API

Workflow context

An accumulator object ($context) that threads through every task in a workflow run. It starts as an empty object {} and grows as tasks use export.as to merge their output into it. Subsequent tasks can read values from $context using variable expressions (${…}), enabling structured data passing across multiple workflow steps. Related: Specification Syntax, Task output, Variable expression
