When building materialized views that refresh periodically, you often need to update existing records rather than simply appending new ones. The MERGE ON clause lets you match incoming rows against existing data and decide whether to update or insert—avoiding duplicates while keeping data fresh.

The problem

Consider a view that refreshes monthly with recent activity data. With WRITE_MODE = 'overwrite', each refresh replaces all data—losing historical records. With WRITE_MODE = 'append', each refresh adds new rows—but if the same record appears in multiple refreshes, you get duplicates. MERGE ON solves this by matching rows between the query result and existing view data, then updating matches and inserting non-matches.
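The trade-off between the three write modes can be sketched in plain Python (not NQL). Rows are represented as (key, payload) tuples, with the key standing in for the MERGE ON match condition; the function names are illustrative, not part of NQL.

```python
def refresh_overwrite(existing, incoming):
    # 'overwrite': every refresh replaces all data, losing history
    return list(incoming)

def refresh_append(existing, incoming):
    # 'append': rows accumulate; recurring keys become duplicates
    return existing + incoming

def refresh_merge(existing, incoming):
    # MERGE ON semantics: matched keys are updated, unmatched keys inserted
    merged = dict(existing)
    merged.update(incoming)
    return sorted(merged.items())

jan = [("u1", "jan"), ("u2", "jan")]
feb = [("u2", "feb"), ("u3", "feb")]

print(refresh_overwrite(jan, feb))  # u1's history is gone
print(refresh_append(jan, feb))     # u2 appears twice
print(refresh_merge(jan, feb))      # u1 kept, u2 updated, u3 inserted
```

Only the merge variant both preserves u1's historical row and avoids a duplicate for u2.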

How MERGE ON works

MERGE ON compares each row from your query (the source) against existing rows in the materialized view (the target). Based on the match condition, it either updates the existing row or inserts a new one.

Source and target aliases

NQL automatically generates two aliases for use in MERGE ON clauses:
  • source: references columns from the incoming query result
  • target: references columns from the existing materialized view
Use these aliases in your MERGE ON condition and in the WHEN MATCHED/WHEN NOT MATCHED clauses.
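The matching mechanics can be sketched in Python (illustrative only, not how the engine is implemented): each source row is tested against target rows with a predicate standing in for the MERGE ON condition, then updated or inserted.

```python
def merge(target_rows, source_rows, match):
    """target_rows / source_rows: lists of dicts.
    match(target, source) -> bool plays the role of the MERGE ON condition."""
    result = list(target_rows)
    for src in source_rows:
        matched = False
        for i, tgt in enumerate(result):
            if match(tgt, src):
                # WHEN MATCHED THEN UPDATE: source columns overwrite target's
                result[i] = {**tgt, **src}
                matched = True
        if not matched:
            # WHEN NOT MATCHED THEN INSERT
            result.append(dict(src))
    return result

target = [{"user_id": 1, "region": "EU"}]
source = [{"user_id": 1, "region": "US"}, {"user_id": 2, "region": "APAC"}]
merged_rows = merge(target, source, lambda t, s: t["user_id"] == s["user_id"])
print(merged_rows)  # user 1 updated to US, user 2 inserted
```

Note that if the predicate matches multiple target rows for one source row, every match is updated; this is exactly why the pitfalls section below warns about non-unique match keys.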

Use case 1: Incremental upsert with time-bounded matching

This example maintains an audience feed that refreshes monthly. The MERGE ON condition restricts updates to rows considered “fresh” (modified within the last 20 days), which can help respect license refresh windows or avoid stale overwrites.
CREATE MATERIALIZED VIEW "ongoing_Feed"
REFRESH_SCHEDULE = '@monthly'
WRITE_MODE = 'append'
AS
SELECT
    "ip_address" AS ip,
    "sha256_hashed_email"."value" AS sha256,
    "sha1_hashed_email"."value" AS sha1,
    "md5_hashed_email"."value" AS md5,
    "mobile_id_unique_identifier"."value" AS maid,
    "mobile_id_unique_identifier"."type" AS maid_type,
    CAST("event_timestamp" AS TIMESTAMP) AS "event_timestamp",
    CURRENT_TIMESTAMP as last_modified_at
FROM
    company_data.audience_feed
WHERE
    "event_timestamp" >= CURRENT_TIMESTAMP - INTERVAL '30' DAY
    AND "sha256_hashed_email"."value" IS NOT NULL
    AND "ip_address" IS NOT NULL
    AND "mobile_id_unique_identifier"."value" IS NOT NULL
MERGE ON target.last_modified_at > CURRENT_TIMESTAMP - INTERVAL '20' DAY
WHEN MATCHED THEN UPDATE SET
  ip = source.ip,
  sha256 = source.sha256,
  sha1 = source.sha1,
  md5 = source.md5,
  maid = source.maid,
  maid_type = source.maid_type,
  event_timestamp = source.event_timestamp,
  last_modified_at = source.last_modified_at
WHEN NOT MATCHED THEN INSERT (
  ip,
  sha256,
  sha1,
  md5,
  maid,
  maid_type,
  event_timestamp,
  last_modified_at
) VALUES (
  source.ip,
  source.sha256,
  source.sha1,
  source.md5,
  source.maid,
  source.maid_type,
  source.event_timestamp,
  source.last_modified_at
);
Key points:
  • The time-bounded condition (target.last_modified_at > CURRENT_TIMESTAMP - INTERVAL '20' DAY) is optional but useful for limiting which rows are considered for updates
  • WRITE_MODE = 'append' is required for incremental merge semantics
  • The last_modified_at column tracks when each row was last updated
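The time-bounded freshness check from the MERGE ON condition above can be expressed as a small Python predicate (a sketch; the function name and 20-day window are taken from the example, nothing else is NQL):

```python
from datetime import datetime, timedelta, timezone

FRESH_WINDOW = timedelta(days=20)

def is_fresh(target_last_modified_at, now):
    # Mirrors: target.last_modified_at > CURRENT_TIMESTAMP - INTERVAL '20' DAY
    return target_last_modified_at > now - FRESH_WINDOW

now = datetime(2024, 6, 30, tzinfo=timezone.utc)
print(is_fresh(datetime(2024, 6, 25, tzinfo=timezone.utc), now))  # True: eligible for update
print(is_fresh(datetime(2024, 5, 1, tzinfo=timezone.utc), now))   # False: left untouched
```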

Use case 2: Enrichment with composite keys

This example enriches internal user data with geographic information from Rosetta Stone. It uses a composite key (user_id + sha256) with IS NOT DISTINCT FROM to handle nullable values safely.
CREATE MATERIALIZED VIEW "email_geo_enrichment"
WRITE_MODE = 'append'
REFRESH_SCHEDULE = '@weekly'
AS (
  SELECT
    ds.user_id,
    rs."sha256_hashed_email"."value" AS sha256,
    rs."geo_country_code" AS country_code,
    rs."geo_region" AS region,
    rs."nio_last_modified_at" AS rs_modified_at,
    CURRENT_TIMESTAMP AS last_modified_at
  FROM company_data."internal_users" ds
  JOIN narrative."rosetta_stone" rs
    ON rs."unique_id"."value" = ds.unique_id
)
MERGE ON
  target.user_id IS NOT DISTINCT FROM source.user_id
  AND target.sha256 IS NOT DISTINCT FROM source.sha256
WHEN MATCHED THEN UPDATE SET
  country_code = source.country_code,
  region = source.region,
  rs_modified_at = source.rs_modified_at,
  last_modified_at = source.last_modified_at
WHEN NOT MATCHED THEN INSERT (
  user_id,
  sha256,
  country_code,
  region,
  rs_modified_at,
  last_modified_at
) VALUES (
  source.user_id,
  source.sha256,
  source.country_code,
  source.region,
  source.rs_modified_at,
  source.last_modified_at
);
Key points:
  • IS NOT DISTINCT FROM treats two NULL values as equal, unlike =, which returns NULL when comparing NULLs
  • Composite keys work well when no single column uniquely identifies a record
  • The enriched columns (country_code, region) update on match while the key columns remain stable

Best practices

  • Use a stable, unique key for matching: non-unique or changing keys cause repeated inserts or updates to the wrong rows.
  • Use IS NOT DISTINCT FROM for nullable fields: plain = on nullable fields can misclassify matches (NULL = NULL returns NULL, not true).
  • Consider time-bounded conditions: scoping to recent data respects license windows and improves performance.
  • Always use WRITE_MODE = 'append': required for incremental merge semantics; overwrite mode replaces all data.
  • Keep MERGE ON conditions minimal: overly broad conditions cause frequent rewrites; scope to the minimum key set needed.

Common pitfalls

Non-unique match keys
If your MERGE ON condition matches multiple target rows for a single source row, you may see unexpected updates or duplicate inserts. Ensure your match key (or composite key) uniquely identifies records.
Using = on nullable fields
Using plain = to compare nullable fields can cause matches to be missed:
-- Wrong: will not match when both values are NULL
MERGE ON target.user_id = source.user_id

-- Correct: treats NULL = NULL as a match
MERGE ON target.user_id IS NOT DISTINCT FROM source.user_id
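You can observe this NULL behavior directly with SQLite from Python. SQLite does not accept the IS NOT DISTINCT FROM spelling, but its IS operator gives the same NULL-safe comparison for scalar values:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Plain = yields NULL (Python None) when both operands are NULL,
# so a MERGE ON condition built on it would not count this as a match.
eq_result, = con.execute("SELECT NULL = NULL").fetchone()

# The NULL-safe comparison (IS in SQLite, IS NOT DISTINCT FROM in NQL)
# treats the two NULLs as equal and yields true (1).
is_result, = con.execute("SELECT NULL IS NULL").fetchone()

print(eq_result)  # None
print(is_result)  # 1
```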
MERGE ON only works in CREATE MATERIALIZED VIEW
MERGE ON is a clause within the CREATE MATERIALIZED VIEW statement. It cannot be used as a standalone MERGE command in Data Studio.