A retention policy is a set of rules that governs how long data within a dataset is kept before automatic deletion. Retention policies help organizations manage storage costs, comply with data governance requirements, and ensure data is only stored for as long as necessary.
Why retention policies matter
Cost management. Data storage incurs ongoing costs. Retention policies automatically remove data you no longer need, preventing unbounded storage growth.
Compliance. Many regulations (GDPR, CCPA, industry-specific rules) require organizations to delete data after a certain period. Retention policies automate this requirement.
Data hygiene. Stale data can lead to incorrect analysis or decisions. Retention policies ensure your datasets contain only current, relevant information.
How retention policies work
Retention policies are applied at the dataset level. Each dataset can have one or more policies that Narrative evaluates on a configurable schedule to determine which data to remove.
Each policy specifies:
- A policy class that determines what gets deleted (rows, snapshots, or the entire table)
- An interval that determines when deletion occurs, expressed as an ISO 8601 duration (for example, P30D for 30 days)
- An enabled flag that controls whether the policy is actively enforced
You can combine multiple policy classes on a single dataset. For example, you might use a Row TTL policy for routine data expiration alongside a Table TTL policy as a safety net to drop the dataset entirely if it becomes stale.
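As a rough illustration, the three policy fields and the Row TTL plus Table TTL combination can be modeled as below. This is a sketch only: the class name, field names, and policy class labels are assumptions for illustration, not Narrative's actual API payload.

```python
from dataclasses import dataclass


# Hypothetical model of the three policy fields described above.
# Field names and class labels are illustrative assumptions.
@dataclass(frozen=True)
class RetentionPolicy:
    policy_class: str     # assumed labels: "row_ttl", "snapshot_ttl", "table_ttl"
    interval: str         # ISO 8601 duration, for example "P30D"
    enabled: bool = True  # disabled policies are kept but not enforced


# Routine row expiration plus a table-level safety net on one dataset.
dataset_policies = [
    RetentionPolicy(policy_class="row_ttl", interval="P30D"),
    RetentionPolicy(policy_class="table_ttl", interval="P1Y"),
]


def active_policies(policies):
    """Return only the policies that are currently enforced."""
    return [p for p in policies if p.enabled]
```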
Policy classes
Narrative supports three policy classes, each targeting a different level of the data lifecycle.
Row TTL
Performs row-level hard deletes based on a timestamp column in your data. Row TTL evaluates each row individually and deletes rows whose timestamp is older than the retention interval. This is the most commonly used policy class and works across all data plane types.
Row TTL uses an event time clock—a Rosetta Stone attribute that maps to a timestamp column in your dataset. If you don’t specify a clock, the system defaults to the nio_last_modified attribute mapping when one exists. If no default mapping is available, the request fails.
Row TTL is useful when:
- You need fine-grained control over which rows expire
- Your dataset contains rows with different ages that shouldn’t all expire together
- You want to enforce retention based on when events occurred, not when data was ingested
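The row-level check can be pictured as a cutoff comparison against the clock column. The sketch below runs over in-memory rows and only illustrates the rule; the function name and the row shape are assumptions, and the real deletion happens inside the data plane.

```python
from datetime import datetime, timedelta, timezone


def expired_rows(rows, clock_column, interval, now):
    """Return rows whose clock timestamp is older than `interval` at `now`."""
    cutoff = now - interval
    return [row for row in rows if row[clock_column] < cutoff]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
rows = [
    {"user_id": 1, "event_timestamp": datetime(2024, 6, 25, tzinfo=timezone.utc)},
    {"user_id": 2, "event_timestamp": datetime(2024, 4, 1, tzinfo=timezone.utc)},
]

# With a 30-day interval the cutoff is 2024-05-31, so only user 2's row expires.
to_delete = expired_rows(rows, "event_timestamp", timedelta(days=30), now)
```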
Snapshot TTL
Deletes old snapshots based on snapshot age—the time since data was ingested into the dataset. This is the original retention model and operates at the Iceberg snapshot level.
Snapshot TTL is only available on AWS-based data planes that use Iceberg storage.
Within a Snapshot TTL policy, you specify a retention value that controls behavior:
| Value | Behavior |
|---|---|
| Time-based | Delete snapshots older than a specified duration (for example, P90D for 90 days) |
| Retain everything | Keep all snapshots indefinitely until manually deleted |
| Expire everything | Remove all snapshots immediately |
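The three behaviors in the table can be sketched as one selection function. The mode strings `"retain_everything"` and `"expire_everything"`, the `ingested_at` key, and the function name are hypothetical stand-ins, not the values the API accepts.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_expire(snapshots, retention, now):
    """Pick snapshots to remove.

    `retention` is either a timedelta (time-based) or one of the
    hypothetical mode strings "retain_everything" / "expire_everything".
    """
    if retention == "retain_everything":
        return []
    if retention == "expire_everything":
        return list(snapshots)
    cutoff = now - retention  # time-based: compare ingestion time to the cutoff
    return [s for s in snapshots if s["ingested_at"] < cutoff]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
snapshots = [
    {"id": "s1", "ingested_at": datetime(2024, 1, 15, tzinfo=timezone.utc)},
    {"id": "s2", "ingested_at": datetime(2024, 6, 1, tzinfo=timezone.utc)},
]
old = snapshots_to_expire(snapshots, timedelta(days=90), now)
```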
Common ISO 8601 duration examples:
| Duration | Meaning |
|---|---|
| P30D | 30 days |
| P90D | 90 days |
| P6M | 6 months |
| P1Y | 1 year |
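Python's standard library has no ISO 8601 duration parser, so a small helper can be useful for checking values before you submit them. The sketch below handles only the date-only forms shown in the table (years, months, days). The conversion to a fixed number of days is an approximation chosen for illustration; how Narrative resolves months and years is not specified here.

```python
import re
from datetime import timedelta

# Matches date-only durations such as P30D, P6M, P1Y, or P1Y6M.
_DURATION = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")


def parse_date_duration(text):
    """Split a date-only ISO 8601 duration into its components."""
    match = _DURATION.match(text)
    if not match or text == "P":
        raise ValueError(f"unsupported duration: {text!r}")
    years, months, days = (int(g) if g else 0 for g in match.groups())
    return {"years": years, "months": months, "days": days}


def approximate_timedelta(text, days_per_month=30, days_per_year=365):
    """Approximate the duration in days (assumed month and year lengths)."""
    parts = parse_date_duration(text)
    total_days = (
        parts["years"] * days_per_year
        + parts["months"] * days_per_month
        + parts["days"]
    )
    return timedelta(days=total_days)
```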
Table TTL
Drops the entire dataset when the table age exceeds the retention interval. This is the most aggressive policy class—instead of removing individual snapshots or rows, it deletes the dataset object itself.
Table TTL supports three clock types that determine how table age is measured:
| Clock | Measures age from |
|---|---|
| created_at | When the dataset was created |
| max_event_time | The most recent event timestamp in the data (requires a column reference) |
| static_time | A fixed timestamp you provide (requires a column reference) |
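To make the clock choice concrete, the sketch below measures table age from whichever reference time the clock selects and compares it to the interval. The function and its parameters are hypothetical; only the three clock names come from the table above.

```python
from datetime import datetime, timedelta, timezone


def table_is_expired(clock, interval, now, created_at=None,
                     max_event_time=None, static_time=None):
    """Return True when table age, measured from the chosen clock, exceeds `interval`."""
    reference_times = {
        "created_at": created_at,
        "max_event_time": max_event_time,
        "static_time": static_time,
    }
    if clock not in reference_times:
        raise ValueError(f"unknown clock: {clock!r}")
    start = reference_times[clock]
    if start is None:
        raise ValueError(f"no reference time supplied for clock {clock!r}")
    return now - start > interval


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
created = datetime(2024, 1, 1, tzinfo=timezone.utc)
latest_event = datetime(2024, 2, 20, tzinfo=timezone.utc)

# The same table can be expired under one clock and still live under another.
by_creation = table_is_expired("created_at", timedelta(days=30), now, created_at=created)
by_events = table_is_expired("max_event_time", timedelta(days=30), now,
                             max_event_time=latest_event)
```

In this example the table is 60 days old by creation time but only 10 days past its latest event, so the `created_at` clock would drop it while `max_event_time` would not.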
Table TTL is useful for:
- Temporary or staging datasets that should be automatically cleaned up
- Datasets with a defined useful lifespan
- Enforcing hard data deletion deadlines for compliance
Table TTL permanently deletes the dataset, not just its data. Use this policy class with care—once the dataset is dropped, it cannot be recovered.
Choosing the right policy class
| Scenario | Recommended policy |
|---|---|
| Delete individual rows that have aged out based on event timestamps | Row TTL |
| Remove old ingestion batches based on when they arrived (Iceberg only) | Snapshot TTL |
| Drop a temporary or staging dataset after a fixed period | Table TTL |
| Combine routine row expiration with a hard dataset deadline | Row TTL + Table TTL |
Retention evaluation schedule
Retention policies are not evaluated continuously. Instead, Narrative evaluates them on a configurable schedule. You can set the evaluation schedule when configuring your retention policies through the API.
Between evaluation runs, data that has exceeded its retention period remains in the dataset until the next evaluation occurs.
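The practical effect is that data can outlive its retention period by up to one evaluation period. The sketch below assumes the schedule is a fixed period starting from a known first run, which is a simplification; the actual schedule format is set through the API and may differ.

```python
from datetime import datetime, timedelta, timezone


def first_evaluation_at_or_after(expiry, first_run, period):
    """Return the first scheduled run at or after the moment data expires."""
    if expiry <= first_run:
        return first_run
    elapsed = expiry - first_run
    runs_needed = -((-elapsed) // period)  # ceiling division on timedeltas
    return first_run + runs_needed * period


first_run = datetime(2024, 6, 1, tzinfo=timezone.utc)
expiry = datetime(2024, 6, 10, 6, 0, tzinfo=timezone.utc)

# With a daily schedule, data that expires at 06:00 is removed at the next midnight run.
removal = first_evaluation_at_or_after(expiry, first_run, timedelta(days=1))
lag = removal - expiry
```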
Retention and dataset deletion
When you delete a dataset, what happens to the underlying data depends on the data plane:
On AWS-based data planes, Narrative applies a default 30-day retention period before permanent deletion. This grace period protects against accidental deletions and provides an opportunity to restore the dataset if needed.
On Snowflake-based data planes, data is deleted immediately when you delete a dataset. There is no grace period.
Understand which data plane your dataset resides in before deleting. Snowflake-based data planes do not offer a recovery window.
Data plane differences
Retention policy behavior differs depending on where your data is stored:
| Behavior | AWS Data Planes | Snowflake Data Planes |
|---|---|---|
| Default deletion grace period | 30 days | None (immediate) |
| Row TTL | Supported | Supported |
| Table TTL | Supported | Supported |
| Snapshot TTL | Supported | Not supported |
Row TTL and Table TTL work across both data plane types. Snapshot TTL operates on Iceberg snapshots, which are specific to AWS-based data planes.
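If you manage datasets across both data plane types, it can help to encode the table above as a lookup and check a policy before submitting it. The keys and function names below are illustrative assumptions; the support matrix and grace periods are taken from the table.

```python
from datetime import datetime, timedelta, timezone

# Support matrix and deletion grace periods from the table above.
# The string keys are hypothetical labels, not API values.
SUPPORTED_POLICIES = {
    "aws": {"row_ttl", "table_ttl", "snapshot_ttl"},
    "snowflake": {"row_ttl", "table_ttl"},
}
DELETION_GRACE = {
    "aws": timedelta(days=30),
    "snowflake": timedelta(0),  # immediate, no recovery window
}


def is_policy_supported(data_plane, policy_class):
    """Return True if the policy class is available on the data plane."""
    return policy_class in SUPPORTED_POLICIES[data_plane]


def permanent_deletion_time(data_plane, deleted_at):
    """Return when a deleted dataset's data is permanently removed."""
    return deleted_at + DELETION_GRACE[data_plane]
```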
Retention policies for materialized views
Materialized views support retention policies through the EXPIRE clause in NQL. When creating or modifying a materialized view, you can specify how long data should be retained:
CREATE MATERIALIZED VIEW "my_view"
EXPIRE = 'P90D'
AS
SELECT
  user_id,
  event_type,
  event_timestamp
FROM my_dataset
For complete syntax details, see the EXPIRE clause reference.
Choosing a retention period
Consider these factors when setting retention policies:
Regulatory requirements. Check what your compliance obligations require. Some regulations mandate maximum retention periods, while others require minimum retention.
Business needs. How far back does your analysis typically need to go? Set retention to cover your longest reasonable lookback period plus a buffer.
Cost sensitivity. Longer retention means higher storage costs. Balance completeness against budget constraints.
Data refresh patterns. If you’re regularly refreshing data (for example, daily snapshots), you may not need to retain old versions for long.
Start with a longer retention period and reduce it based on observed usage patterns. It’s easier to shorten retention than to recover deleted data.