Sample data is one of only two cases where actual data leaves your data plane, and in this case it is stored in the control plane. For organizations with data residency requirements or strict governance policies, this distinction matters.
What are data samples?
A data sample is a preview of up to 1,000 rows from a dataset. Samples enable you to view query results, validate data quality, and inspect datasets without downloading full result sets.
When you click to view query results in the UI or retrieve results via the API, the platform creates a sample from your dataset and stores it in the control plane for display.
Why samples cross the boundary
Narrative’s architecture is built around a core principle: data stays in place. Queries are sent to your data plane, executed locally, and results remain within your infrastructure.
Samples are the exception. To display query results in the UI or return them via the API, actual row-level data must be transmitted from your data plane to the control plane and stored there.
Full query results stay in your data plane, but sample data is persisted in Narrative’s control plane infrastructure. If your datasets contain sensitive information, that information will be present in any samples you create.
This is one of only two scenarios where data egresses from your data plane:
| Scenario | Description | Destination |
|---|---|---|
| Data samples | Preview rows for viewing results | Control plane database |
| Connector delivery | Intentional export for activation | External platforms (S3, DSPs, etc.) |
What data is included in samples
When a sample is created:
- Row limit — Up to 1,000 rows are retrieved
- All columns — Every column in the dataset is included
- Rosetta Stone attributes — If the dataset has Rosetta Stone mappings, normalized attribute values are included alongside the raw columns
- No transformation — Raw data is not masked, hashed, or modified (Rosetta Stone attributes reflect their normalized transformations)
- Point-in-time snapshot — The sample reflects the dataset at the moment of retrieval
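The properties listed above can be illustrated with a minimal sketch. This is not the platform's implementation: the `SampleSnapshot` shape and the `buildSample` function are hypothetical names, the choice of the first 1,000 rows is an assumption (the text only says "up to 1,000 rows"), and Rosetta Stone attributes are not modeled.

```typescript
// Hypothetical types for illustration; not the platform's actual schema.
type Row = Record<string, unknown>;

interface SampleSnapshot {
  rows: Row[];        // copied row values, unmasked
  columns: string[];  // every column seen in the sampled rows
  takenAt: string;    // ISO timestamp of the snapshot
}

// Row limit stated in the text.
const SAMPLE_ROW_LIMIT = 1000;

function buildSample(dataset: Row[], takenAt: Date = new Date()): SampleSnapshot {
  // Row limit: keep at most 1,000 rows (assumption: the first ones).
  // Point-in-time snapshot: copy each row so later changes to the
  // dataset do not alter the sample.
  const rows = dataset.slice(0, SAMPLE_ROW_LIMIT).map((row) => ({ ...row }));

  // All columns: collect every column name; nothing is dropped.
  const columns: string[] = [];
  rows.forEach((row) => {
    Object.keys(row).forEach((name) => {
      if (columns.indexOf(name) === -1) columns.push(name);
    });
  });

  // No transformation: values are copied as-is, with no masking or hashing.
  return { rows, columns, takenAt: takenAt.toISOString() };
}
```

In this sketch, a column such as `email` would appear in the sample in clear text, which is why the governance considerations for samples apply to every column in the dataset.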
Sample lifecycle
Samples are created on-demand, not automatically.
Creation
A sample is created when:
- You click to view query results in the Query Editor
- You call requestDatasetSample() via the SDK
- You request results through the API
Each sample creation triggers a sample job that retrieves rows from your data plane and transmits them to the control plane.
Storage
Once created, samples persist in the control plane database until explicitly deleted. They do not expire automatically.
Deletion
Samples can be removed:
- Programmatically via deleteDatasetSample() in the SDK
- Through the UI when clearing query results
- When the underlying dataset is deleted
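The lifecycle described above can be summarized as a toy in-memory model. Everything here is hypothetical and exists only to make the rules concrete: the `ControlPlaneSampleStore` class and its methods are not part of the SDK, and the model assumes at most one stored sample per dataset, which the text does not state.

```typescript
// Toy model of the sample lifecycle; all names are hypothetical.
type SampleRow = Record<string, unknown>;

class ControlPlaneSampleStore {
  // Stored samples keyed by dataset ID. There is no timer or TTL:
  // entries stay until one of the delete methods removes them.
  private samples = new Map<string, SampleRow[]>();

  // Creation: happens only on an explicit request, never automatically.
  requestSample(datasetId: string, dataPlaneRows: SampleRow[]): SampleRow[] {
    const sample = dataPlaneRows.slice(0, 1000);
    this.samples.set(datasetId, sample);
    return sample;
  }

  hasSample(datasetId: string): boolean {
    return this.samples.has(datasetId);
  }

  // Deletion: explicit removal (SDK call or clearing results in the UI).
  // Returns true if a sample existed and was removed.
  deleteSample(datasetId: string): boolean {
    return this.samples.delete(datasetId);
  }

  // Deletion: removing the dataset also removes its sample.
  deleteDataset(datasetId: string): void {
    this.samples.delete(datasetId);
  }
}
```

The point of the model is the absence of any expiry path: once `requestSample` has run, only `deleteSample` or `deleteDataset` removes the stored rows.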
Governance considerations
Data residency
Sample data resides in Narrative’s control plane infrastructure, which may be in a different geographic region than your data plane. If you operate a customer-hosted data plane specifically to meet data residency requirements, be aware that samples bypass this isolation.
Sensitive data exposure
If your dataset contains PII or other sensitive information, that data will be present in samples stored in the control plane. Consider:
- Whether viewing results is necessary for your workflow
- Clearing samples promptly after inspection
- Using direct data plane access for sensitive datasets
Retention practices
Since samples persist until deleted, establish practices for clearing them when no longer needed. This reduces the footprint of sensitive data in the control plane.
For datasets containing sensitive information, clear samples after inspection using the SDK’s deleteDatasetSample() method. See Managing Datasets for implementation details.
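One way to make "clear samples after inspection" hard to forget is to wrap both steps in a single helper. The sketch below is an assumption-heavy illustration: the `SampleClient` interface is hypothetical, and although `requestDatasetSample` and `deleteDatasetSample` are the method names mentioned in this article, their parameters and return types here are guesses, so check the SDK reference for the real signatures.

```typescript
// Hypothetical minimal client surface. The real SDK's signatures,
// argument types, and return values may differ.
interface SampleClient {
  requestDatasetSample(datasetId: string): Promise<unknown[]>;
  deleteDatasetSample(datasetId: string): Promise<void>;
}

// Request a sample, hand it to `inspect`, and always delete it afterwards,
// even if `inspect` throws.
async function inspectThenDelete<T>(
  client: SampleClient,
  datasetId: string,
  inspect: (rows: unknown[]) => T | Promise<T>,
): Promise<T> {
  const rows = await client.requestDatasetSample(datasetId);
  try {
    return await inspect(rows);
  } finally {
    // Runs on success and on failure, so the sample does not linger
    // in the control plane after inspection.
    await client.deleteDatasetSample(datasetId);
  }
}
```

A call might look like `await inspectThenDelete(client, "my-dataset", (rows) => rows.length)`, where `client` is whatever object in your codebase exposes the two SDK methods. Note that if the delete call itself fails, its error replaces any error thrown by `inspect`, so you may want to log both in production code.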
Access control
Sample access follows the same permission model as the underlying dataset. Users who can query a dataset can view its samples.
Alternatives to sampling
If you need to avoid storing data in the control plane:
- Direct data plane access — Query and process data within your data plane infrastructure
- Cloud storage export — Use connectors to deliver data to your own S3 bucket or data warehouse
- Materialized views — Access the underlying dataset directly rather than viewing samples
These approaches keep data within infrastructure you control, at the cost of not being able to preview results in the Narrative UI.