A retention policy is a set of rules that governs how long data within a dataset is kept before automatic deletion. Retention policies help organizations manage storage costs, comply with data governance requirements, and ensure data is only stored for as long as necessary.

Why retention policies matter

  • Cost management. Data storage incurs ongoing costs. Retention policies automatically remove data you no longer need, preventing unbounded storage growth.
  • Compliance. Many regulations (GDPR, CCPA, industry-specific rules) require organizations to delete data after a certain period. Retention policies automate this requirement.
  • Data hygiene. Stale data can lead to incorrect analysis or decisions. Retention policies ensure your datasets contain only current, relevant information.

How retention policies work

Retention policies are applied at the dataset level. Each dataset can have one or more policies that Narrative evaluates on a configurable schedule to determine which data to remove. Each policy specifies:
  • A policy class that determines what gets deleted (rows, snapshots, or the entire table)
  • An interval that determines when deletion occurs, expressed as an ISO 8601 duration (for example, P30D for 30 days)
  • An enabled flag that controls whether the policy is actively enforced
You can combine multiple policy classes on a single dataset. For example, you might use a Row TTL policy for routine data expiration alongside a Table TTL policy as a safety net to drop the dataset entirely if it becomes stale.
Retention policy behavior varies by data plane. See Data plane differences below for details.
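The three fields described above can be modeled as a small record. The following is a minimal sketch for illustration only: the names `PolicyClass`, `RetentionPolicy`, and `active_policies` are hypothetical and do not reflect the actual Narrative API schema.

```python
from dataclasses import dataclass
from enum import Enum


class PolicyClass(Enum):
    # Hypothetical identifiers for the three policy classes.
    ROW_TTL = "row_ttl"
    SNAPSHOT_TTL = "snapshot_ttl"
    TABLE_TTL = "table_ttl"


@dataclass(frozen=True)
class RetentionPolicy:
    policy_class: PolicyClass  # what gets deleted
    interval: str              # when deletion occurs, as an ISO 8601 duration
    enabled: bool = True       # whether the policy is actively enforced


def active_policies(policies):
    """Return only the policies that are currently enforced."""
    return [p for p in policies if p.enabled]


# Routine row expiration combined with a (currently disabled) safety net.
dataset_policies = [
    RetentionPolicy(PolicyClass.ROW_TTL, "P30D"),
    RetentionPolicy(PolicyClass.TABLE_TTL, "P1Y", enabled=False),
]
```

Here only the Row TTL policy would be enforced until the Table TTL policy is enabled.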

Policy classes

Narrative supports three policy classes, each targeting a different level of the data lifecycle.

Row TTL

Performs row-level hard deletes based on a timestamp column in your data. Row TTL evaluates each row individually and deletes rows whose timestamp is older than the retention interval. This is the most commonly used policy class and works across all data plane types.
Row TTL uses an event time clock: a Rosetta Stone attribute that maps to a timestamp column in your dataset. If you don’t specify a clock, the system defaults to the nio_last_modified attribute mapping when one exists. If no default mapping is available, the request fails.
Row TTL is useful when:
  • You need fine-grained control over which rows expire
  • Your dataset contains rows with different ages that shouldn’t all expire together
  • You want to enforce retention based on when events occurred, not when data was ingested
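The row-level check can be sketched as a simple cutoff comparison. This is an illustrative sketch, not Narrative's implementation: the function `expired_rows`, the column name `event_timestamp`, and the strict "older than" comparison at the boundary are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def expired_rows(rows, clock_column, interval, now):
    """Return rows whose clock timestamp is older than the retention interval."""
    cutoff = now - interval
    # Assumption: a row exactly at the cutoff is kept.
    return [row for row in rows if row[clock_column] < cutoff]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
rows = [
    {"id": 1, "event_timestamp": datetime(2024, 6, 29, tzinfo=timezone.utc)},
    {"id": 2, "event_timestamp": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]

# With a 30-day interval, only the May 1 row falls before the cutoff.
to_delete = expired_rows(rows, "event_timestamp", timedelta(days=30), now)
```

Because each row is compared individually, rows of different ages in the same dataset expire at different times.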

Snapshot TTL

Deletes old snapshots based on snapshot age, that is, the time since data was ingested into the dataset. This is the original retention model and operates at the Iceberg snapshot level. Snapshot TTL is only available on AWS-based data planes that use Iceberg storage.
Within a Snapshot TTL policy, you specify a retention value that controls behavior:
Value | Behavior
Time-based | Delete snapshots older than a specified duration (for example, P90D for 90 days)
Retain everything | Keep all snapshots indefinitely until manually deleted
Expire everything | Remove all snapshots immediately
Common ISO 8601 duration examples:
Duration | Meaning
P30D | 30 days
P90D | 90 days
P6M | 6 months
P1Y | 1 year
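Python's standard library does not parse ISO 8601 durations, so a small helper is useful when reasoning about these values locally. The sketch below handles only date components (years, months, weeks, days) and approximates a month as 30 days and a year as 365 days. That approximation is an assumption made for illustration; how Narrative resolves calendar units is not stated here.

```python
import re
from datetime import timedelta

# Date-only ISO 8601 durations such as P30D, P6M, or P1Y.
_DURATION = re.compile(r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)W)?(?:(\d+)D)?")


def approximate_duration(text):
    """Convert a date-only ISO 8601 duration to an approximate timedelta."""
    match = _DURATION.fullmatch(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    years, months, weeks, days = (int(g) if g else 0 for g in match.groups())
    # Assumption: 1 year = 365 days and 1 month = 30 days.
    return timedelta(days=years * 365 + months * 30 + weeks * 7 + days)


ninety_days = approximate_duration("P90D")
six_months = approximate_duration("P6M")
```

Time components (for example, PT12H) are deliberately rejected by this helper rather than silently ignored.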

Table TTL

Drops the entire dataset when the table age exceeds the retention interval. This is the most aggressive policy class: instead of removing individual snapshots or rows, it deletes the dataset object itself.
Table TTL supports three clock types that determine how table age is measured:
Clock | Measures age from
created_at | When the dataset was created
max_event_time | The most recent event timestamp in the data (requires a column reference)
static_time | A fixed timestamp you provide (requires a column reference)
Table TTL is useful for:
  • Temporary or staging datasets that should be automatically cleaned up
  • Datasets with a defined useful lifespan
  • Enforcing hard data deletion deadlines for compliance
Table TTL permanently deletes the dataset, not just its data. Use this policy class with care—once the dataset is dropped, it cannot be recovered.
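The choice of clock changes when a dataset becomes eligible to be dropped. The following sketch shows one way to think about it. The functions `table_reference_time` and `should_drop_table` are hypothetical helpers, and the strict "exceeds" comparison is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone


def table_reference_time(clock, created_at, event_times=(), static_time=None):
    """Pick the timestamp that table age is measured from for a given clock."""
    if clock == "created_at":
        return created_at
    if clock == "max_event_time":
        if not event_times:
            raise ValueError("max_event_time needs at least one event timestamp")
        return max(event_times)
    if clock == "static_time":
        if static_time is None:
            raise ValueError("static_time needs a fixed timestamp")
        return static_time
    raise ValueError(f"unknown clock: {clock!r}")


def should_drop_table(reference_time, interval, now):
    """True when the table age exceeds the retention interval."""
    return now - reference_time > interval


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
created_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
event_times = [datetime(2024, 6, 20, tzinfo=timezone.utc)]
interval = timedelta(days=30)

by_creation = should_drop_table(
    table_reference_time("created_at", created_at), interval, now)
by_events = should_drop_table(
    table_reference_time("max_event_time", created_at, event_times), interval, now)
```

In this example the dataset is old by creation date but still receiving recent events, so the created_at clock would drop it while the max_event_time clock would keep it.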

Choosing the right policy class

Scenario | Recommended policy
Delete individual rows that have aged out based on event timestamps | Row TTL
Remove old ingestion batches based on when they arrived (Iceberg only) | Snapshot TTL
Drop a temporary or staging dataset after a fixed period | Table TTL
Combine routine row expiration with a hard dataset deadline | Row TTL + Table TTL

Retention evaluation schedule

Retention policies are not evaluated continuously. Instead, Narrative evaluates them on a configurable schedule. You can set the evaluation schedule when configuring your retention policies through the API. Between evaluation runs, data that has exceeded its retention period remains in the dataset until the next evaluation occurs.
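To estimate how long expired data can linger, it helps to compute the first evaluation run that falls at or after the expiry time. This sketch assumes a fixed-interval schedule. The actual schedule format is not described here, and `removal_time` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def removal_time(expires_at, first_run, run_every):
    """Return the first evaluation run at or after the expiry time."""
    if expires_at <= first_run:
        return first_run
    elapsed = expires_at - first_run
    # Ceiling division: number of whole schedule periods needed to reach expiry.
    runs_needed = -(-elapsed // run_every)
    return first_run + runs_needed * run_every


first_run = datetime(2024, 6, 1, tzinfo=timezone.utc)
expires_at = datetime(2024, 6, 3, 6, 0, tzinfo=timezone.utc)

removed_at = removal_time(expires_at, first_run, timedelta(days=1))
lag = removed_at - expires_at  # how long expired data stays in the dataset
```

Under this assumption, the worst-case lag approaches one full schedule period, which is worth factoring in when a regulation sets a hard deletion deadline.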

Retention and dataset deletion

When you delete a dataset, the retention policy determines what happens to the underlying data:
  • On AWS-based data planes, Narrative applies a default 30-day retention period before permanent deletion. This grace period protects against accidental deletions and provides an opportunity to restore the dataset if needed.
  • On Snowflake-based data planes, data is deleted immediately when you delete a dataset. There is no grace period.
Understand which data plane your dataset resides in before deleting. Snowflake-based data planes do not offer a recovery window.

Data plane differences

Retention policy behavior differs depending on where your data is stored:
Behavior | AWS Data Planes | Snowflake Data Planes
Default deletion grace period | 30 days | None (immediate)
Row TTL | Supported | Supported
Table TTL | Supported | Supported
Snapshot TTL | Supported | Not supported
Row TTL and Table TTL work across both data plane types. Snapshot TTL operates on Iceberg snapshots, which are specific to AWS-based data planes.
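The differences above can be encoded as a lookup for client-side checks before you submit a policy or delete a dataset. This is a sketch under stated assumptions: the keys `"aws"` and `"snowflake"`, the policy class identifiers, and both helper functions are hypothetical, and the 30-day value is the documented default rather than a guaranteed setting.

```python
from datetime import datetime, timedelta, timezone

# Mirrors the support matrix in this section.
SUPPORTED_POLICY_CLASSES = {
    "aws": {"row_ttl", "table_ttl", "snapshot_ttl"},
    "snowflake": {"row_ttl", "table_ttl"},
}

# Default grace period between dataset deletion and permanent removal.
DELETION_GRACE_PERIOD = {
    "aws": timedelta(days=30),
    "snowflake": timedelta(0),
}


def is_supported(data_plane, policy_class):
    """Check whether a policy class is available on a data plane type."""
    return policy_class in SUPPORTED_POLICY_CLASSES[data_plane]


def permanent_deletion_time(data_plane, deleted_at):
    """Return when data is permanently removed after a dataset is deleted."""
    return deleted_at + DELETION_GRACE_PERIOD[data_plane]


deleted_at = datetime(2024, 6, 1, tzinfo=timezone.utc)
aws_gone_at = permanent_deletion_time("aws", deleted_at)
snowflake_gone_at = permanent_deletion_time("snowflake", deleted_at)
```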

Retention policies for materialized views

Materialized views support retention policies through the EXPIRE clause in NQL. When creating or modifying a materialized view, you can specify how long data should be retained:
CREATE MATERIALIZED VIEW "my_view"
EXPIRE = 'P90D'
AS
SELECT
    user_id,
    event_type,
    event_timestamp
FROM my_dataset
For complete syntax details, see the EXPIRE clause reference.

Choosing a retention period

Consider these factors when setting retention policies:
  • Regulatory requirements. Check what your compliance obligations require. Some regulations mandate maximum retention periods, while others require minimum retention.
  • Business needs. How far back does your analysis typically need to go? Set retention to cover your longest reasonable lookback period plus a buffer.
  • Cost sensitivity. Longer retention means higher storage costs. Balance completeness against budget constraints.
  • Data refresh patterns. If you’re regularly refreshing data (for example, daily snapshots), you may not need to retain old versions for long.
Start with a longer retention period and reduce it based on observed usage patterns. It’s easier to shorten retention than to recover deleted data.