What datasets are for
Datasets serve as the primary way to bring data into Narrative and make it available for collaboration:
- Storage and organization: Datasets provide a structured container for your data. Each dataset has a defined schema that specifies what fields exist, their data types, and how they should be validated.
- Querying: Once data is in a dataset, you can query it using NQL. Datasets are the foundation of all query operations in Narrative.
- Collaboration: Through access rules, you can grant other organizations permission to query your datasets, enabling data sharing, monetization, or joint analysis.
How datasets are structured
Schema
Every dataset has a schema that acts as its structural blueprint. The schema defines:
- Field names: The columns that exist in the dataset
- Field types: The data type for each field (string, number, timestamp, etc.)
- Descriptions: Documentation explaining what each field contains
- Validations: Rules that ensure data integrity when records are added
Schemas are designed to be stable. While you can add new fields to a schema, changing or removing existing fields requires careful consideration to avoid breaking queries or integrations that depend on them.
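To make the four parts of a schema concrete, here is a minimal sketch in Python. It is an illustrative model only, not Narrative's API: the `Field` class, the `check_record` helper, and the example field names (`user_id`, `event_time`, `amount`) are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Optional


# Hypothetical model of one schema field; not Narrative's actual API.
@dataclass(frozen=True)
class Field:
    name: str                 # field name (the column)
    dtype: type               # field type
    description: str          # documentation for the field
    validate: Optional[Callable[[Any], bool]] = None  # optional validation rule


def check_record(schema: list[Field], record: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field in schema:
        if field.name not in record:
            problems.append(f"missing field: {field.name}")
            continue
        value = record[field.name]
        if not isinstance(value, field.dtype):
            problems.append(f"{field.name}: expected {field.dtype.__name__}")
        elif field.validate is not None and not field.validate(value):
            problems.append(f"{field.name}: failed validation")
    return problems


# Example schema with assumed field names.
schema = [
    Field("user_id", str, "Hashed user identifier", lambda v: len(v) > 0),
    Field("event_time", datetime, "When the event occurred"),
    Field("amount", float, "Purchase amount", lambda v: v >= 0),
]
```

In this sketch, adding a new `Field` leaves existing records readable by existing queries, whereas renaming or retyping one changes what `check_record` accepts, which is the kind of breakage the stability guidance is meant to avoid.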
Records and snapshots
Data in a dataset is organized into records (rows) and snapshots:
- Records are individual data entries that conform to the dataset’s schema
- Snapshots represent a point-in-time collection of files that were ingested together
Adding data to datasets
Datasets support multiple ways to add data:
- Append mode: New data is added alongside existing data. Use this for event-style data where each upload contains new records.
- Overwrite mode: New data replaces existing data. Use this when you want to refresh the entire dataset with an updated version.
For procedural details on uploading data, see Uploading Data.
Retention policies
Datasets can have retention policies that automatically manage data lifecycle. A retention policy defines how long data is kept before automatic deletion, helping you manage storage costs and comply with data governance requirements. Common retention configurations include:
- Time-based retention: Automatically remove data older than a specified period (e.g., 90 days, 1 year)
- Retain everything: Keep all data indefinitely until manually deleted
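The interaction between snapshots, write modes, and retention can be sketched with a small in-memory model. This is a hypothetical illustration, not Narrative's implementation: the `Snapshot` and `Dataset` classes and their method names are invented here, and the sketch assumes that time-based retention is evaluated per snapshot using its ingestion time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


# Hypothetical in-memory model; names are illustrative, not Narrative's API.
@dataclass
class Snapshot:
    ingested_at: datetime
    records: list[dict]


@dataclass
class Dataset:
    retention: Optional[timedelta] = None  # None models "retain everything"
    snapshots: list[Snapshot] = field(default_factory=list)

    def append(self, records: list[dict], now: datetime) -> None:
        """Append mode: keep existing snapshots and add a new one."""
        self.snapshots.append(Snapshot(now, records))

    def overwrite(self, records: list[dict], now: datetime) -> None:
        """Overwrite mode: the new snapshot replaces all existing data."""
        self.snapshots = [Snapshot(now, records)]

    def apply_retention(self, now: datetime) -> None:
        """Time-based retention (assumed per snapshot, by ingestion time)."""
        if self.retention is None:
            return
        cutoff = now - self.retention
        self.snapshots = [s for s in self.snapshots if s.ingested_at >= cutoff]

    def row_count(self) -> int:
        return sum(len(s.records) for s in self.snapshots)
```

For example, with a 90-day retention window, a snapshot appended on day 0 is dropped when retention runs on day 100, while one appended on day 60 is kept. A dataset with `retention=None` keeps every snapshot until it is removed manually.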
Ownership and access
Single-company ownership
Every dataset is owned by exactly one company. The owner has full control over:
- The dataset’s schema and configuration
- Who can access the data and under what terms
- Whether to archive or delete the dataset
Access through access rules
By default, a dataset is private to its owner. To share data with other organizations, you create access rules that define:
- Which organizations can query the dataset
- Which fields and records they can access
- What pricing applies (if any)
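As a rough illustration of how these three elements combine, the sketch below models an access rule as a small Python structure. Everything in it is an assumption made for the example: `AccessRule`, `visible_rows`, the organization names, and the pricing unit are hypothetical and do not describe Narrative's actual access rule format.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


# Hypothetical model of an access rule; not Narrative's actual API.
@dataclass(frozen=True)
class AccessRule:
    grantee_orgs: frozenset[str]                     # which organizations may query
    allowed_fields: frozenset[str]                   # which fields they can see
    record_filter: Callable[[dict[str, Any]], bool]  # which records they can see
    price_per_thousand_rows: Optional[float] = None  # assumed unit; None = no charge


def visible_rows(
    owner: str,
    requester: str,
    rules: list[AccessRule],
    rows: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return the rows and fields a requester may see under the given rules."""
    if requester == owner:
        return list(rows)  # the owner has full access
    for rule in rules:
        if requester in rule.grantee_orgs:
            return [
                {k: v for k, v in row.items() if k in rule.allowed_fields}
                for row in rows
                if rule.record_filter(row)
            ]
    return []  # no matching rule: the dataset stays private
```

In this model, an organization with no matching rule sees nothing, which mirrors the private-by-default behaviour described above.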
Where datasets live
Datasets are scoped to a specific data plane. The data plane determines:
- Where the data physically resides (Narrative-hosted or your own infrastructure)
- Which query engine processes queries against the dataset
- What data residency and compliance requirements are met
Datasets vs. materialized views
Narrative supports two types of data containers:

| Type | Source | Updates |
|---|---|---|
| Dataset | External data you upload | Manual uploads or automated ingestion |
| Materialized view | Query results from other datasets | Automatic refresh on schedule |
Related content
- Retention Policies: Configure automatic data lifecycle management
- Access Rules: Control who can query your datasets and at what price
- Data Planes: Understand where your datasets physically reside
- Dataset Statistics: Column-level metrics computed over your dataset contents
- Managing Datasets: Create and manage datasets with the SDK

