This guide covers the fundamentals of querying data with NQL: selecting columns, filtering rows with WHERE clauses, and transforming data using expressions and functions.

Prerequisites

  • A Narrative I/O account with at least one dataset
  • Basic familiarity with SQL syntax

What you’ll learn

  • How to select specific columns and use aliases
  • How to filter data with comparison and logical operators
  • How to transform data using expressions and functions
  • How to work with complex types (arrays, structs, maps)

Selecting columns

Basic column selection

Select specific columns from a dataset:
SELECT
  user_id,
  email,
  created_at
FROM company_data."123"

Using column aliases

Rename columns in your output with AS:
SELECT
  user_id AS id,
  email AS contact_email,
  created_at AS signup_date
FROM company_data."123"

Fully qualified column names

When joining datasets or for clarity, use fully qualified names:
SELECT
  company_data."123".user_id,
  company_data."123".email
FROM company_data."123"
Always use fully qualified column names in joins to avoid ambiguity.
NQL requires explicit column lists. Wildcards (SELECT *) are not supported. See Explicit Column Selection for why and how to list columns explicitly.

Select distinct values

Remove duplicates with DISTINCT:
SELECT DISTINCT category
FROM company_data."123"

Filtering with WHERE

Comparison filters

Filter rows using comparison operators:
SELECT user_id, email, age, status
FROM company_data."123"
WHERE
  age >= 18
  AND age < 65
  AND status = 'active'
Operator   Meaning
=          Equal to
<>         Not equal to
<          Less than
>          Greater than
<=         Less than or equal
>=         Greater than or equal

Combining conditions

Use AND, OR, and NOT to combine conditions:
SELECT user_id, email, status, region
FROM company_data."123"
WHERE
  (status = 'active' OR status = 'pending')
  AND NOT is_test_account
  AND region = 'US'
Use parentheses to control evaluation order. AND is evaluated before OR.
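Because AND binds more tightly than OR, dropping the parentheses changes the meaning of the query above. A sketch of the same filter without them:

```sql
-- Without parentheses, this is parsed as
--   status = 'active' OR (status = 'pending' AND region = 'US')
-- so active users in ANY region are returned, not just US ones
SELECT user_id, email, status, region
FROM company_data."123"
WHERE
  status = 'active' OR status = 'pending'
  AND NOT is_test_account
  AND region = 'US'
```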

IN and NOT IN

Check membership in a list:
SELECT user_id, category, source, created_at
FROM company_data."123"
WHERE
  category IN ('electronics', 'clothing', 'home')
  AND source NOT IN ('test', 'internal')

BETWEEN

Filter within a range (inclusive):
SELECT user_id, email, age, score
FROM company_data."123"
WHERE
  age BETWEEN 25 AND 45
  AND score BETWEEN 80 AND 100

NULL checks

Check for missing values:
SELECT user_id, email, phone
FROM company_data."123"
WHERE
  email IS NOT NULL
  AND phone IS NOT NULL
  AND secondary_email IS NULL
Don’t use = NULL or <> NULL. These always return NULL, not true/false. Use IS NULL and IS NOT NULL.
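To see why this matters, compare the two filters below (a sketch using the email column from the example above):

```sql
-- WRONG: email = NULL evaluates to NULL for every row,
-- so the WHERE clause never passes and no rows come back
SELECT user_id
FROM company_data."123"
WHERE email = NULL

-- RIGHT: IS NULL evaluates to true or false as expected
SELECT user_id
FROM company_data."123"
WHERE email IS NULL
```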

Pattern matching with LIKE

Match string patterns using LIKE wildcards:
SELECT user_id, email, name, code
FROM company_data."123"
WHERE
  email LIKE '%@gmail.com'      -- Ends with @gmail.com
  AND name LIKE 'John%'          -- Starts with John
  AND code LIKE 'A__'            -- A followed by exactly 2 characters
Wildcard   Matches
%          Zero or more characters
_          Exactly one character
For case-insensitive matching, normalize the case first:
WHERE LOWER(email) LIKE '%@gmail.com'

Date and time filtering

Current date comparisons

Filter relative to the current date:
SELECT user_id, event_date, created_at
FROM company_data."123"
WHERE
  -- Last 30 days
  event_date >= CURRENT_DATE - INTERVAL '30' DAY

  -- Last 3 months
  AND created_at >= CURRENT_DATE - INTERVAL '3' MONTH

Date ranges

Filter within specific date ranges:
SELECT user_id, email, event_date
FROM company_data."123"
WHERE
  event_date BETWEEN '2024-01-01' AND '2024-12-31'

Extracting date parts

Filter by specific date components:
SELECT user_id, event_date, event_type
FROM company_data."123"
WHERE
  EXTRACT(YEAR FROM event_date) = 2024
  AND EXTRACT(MONTH FROM event_date) IN (1, 2, 3)
  AND EXTRACT(DAYOFWEEK FROM event_date) = 1  -- Sunday

Price filtering

Control costs by filtering on the special _price_cpm_usd column:
SELECT user_id, email, category
FROM company_data."123"
WHERE
  category = 'premium'
  AND _price_cpm_usd <= 1.00  -- Max $1 per 1000 rows
Always include price filtering to control data acquisition costs, especially when creating materialized views.

Transforming data

Expressions in SELECT

Compute values using expressions:
SELECT
  quantity * unit_price AS line_total,
  (quantity * unit_price) * 1.0825 AS total_with_tax,
  first_name || ' ' || last_name AS full_name
FROM company_data."123"

CASE expressions

Apply conditional logic:
SELECT
  user_id,
  CASE
    WHEN lifetime_value > 1000 THEN 'platinum'
    WHEN lifetime_value > 500 THEN 'gold'
    WHEN lifetime_value > 100 THEN 'silver'
    ELSE 'bronze'
  END AS tier
FROM company_data."123"

String transformations

Clean and format text:
SELECT
  UPPER(name) AS uppercase_name,
  LOWER(email) AS lowercase_email,
  TRIM(user_input) AS cleaned_input,
  CONCAT(city, ', ', state) AS location,
  SUBSTRING(phone, 1, 3) AS area_code
FROM company_data."123"

Type casting

Convert between types:
SELECT
  CAST(string_number AS LONG) AS numeric_value,
  CAST(timestamp_value AS DATE) AS date_only,
  CAST(amount AS DECIMAL) AS precise_amount
FROM company_data."123"

Working with complex types

Accessing array elements

Use bracket notation with zero-based indexing:
SELECT
  identifiers[0] AS first_id,
  identifiers[1] AS second_id,
  tags[0] AS primary_tag
FROM company_data."123"

Accessing struct fields

Use dot notation:
SELECT
  person.name,
  person.age,
  address.city,
  address.postal_code
FROM company_data."123"

Nested access

Combine notation for deeply nested data:
SELECT
  -- Array of structs
  identifiers[0].type AS first_id_type,
  identifiers[0].value AS first_id_value,

  -- Struct with nested struct
  user.preferences.language,
  user.preferences.timezone
FROM company_data."123"

Accessing map values

Use ELEMENT_AT for maps:
SELECT
  ELEMENT_AT(properties, 'color') AS color,
  ELEMENT_AT(metadata, 'source') AS source
FROM company_data."123"

Expanding arrays with UNNEST

Convert array elements to rows:
SELECT
  t.user_id,
  tag
FROM company_data."123" t, UNNEST(t.tags) AS tag
This creates one row per array element. If a user has tags ['a', 'b', 'c'], the result includes three rows for that user.
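Once expanded, the array elements behave like ordinary column values. For example, a sketch (combining the UNNEST syntax above with GROUP BY) that counts how many users carry each tag:

```sql
SELECT
  tag,
  COUNT(1) AS users_with_tag
FROM company_data."123" t, UNNEST(t.tags) AS tag
GROUP BY tag
```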

Aggregating data

Basic aggregations

Compute summary statistics:
SELECT
  COUNT(1) AS total_rows,
  APPROX_COUNT_DISTINCT(user_id) AS unique_users,
  SUM(amount) AS total_amount,
  AVG(score) AS average_score,
  MIN(created_at) AS earliest,
  MAX(created_at) AS latest
FROM company_data."123"

Grouping

Aggregate by categories:
SELECT
  category,
  COUNT(1) AS count,
  SUM(amount) AS total,
  AVG(amount) AS average
FROM company_data."123"
GROUP BY category

Filtering aggregated results

Use HAVING to filter after aggregation:
SELECT
  category,
  COUNT(1) AS count
FROM company_data."123"
GROUP BY category
HAVING COUNT(1) > 100
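WHERE and HAVING can be combined in one query: WHERE prunes rows before grouping, while HAVING prunes groups after aggregation. A sketch using columns from the earlier examples:

```sql
SELECT
  category,
  COUNT(1) AS count,
  SUM(amount) AS total
FROM company_data."123"
WHERE status = 'active'       -- filters individual rows before grouping
GROUP BY category
HAVING SUM(amount) > 1000     -- filters whole groups after aggregation
```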

Deduplication with QUALIFY

Remove duplicates based on a window function:
SELECT
  user_id,
  email,
  last_seen
FROM company_data."123"
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY user_id
  ORDER BY last_seen DESC
) = 1
This keeps only the most recent record for each user.

Putting it together

Here’s a complete example combining multiple techniques:
SELECT
  user_id,
  LOWER(email) AS normalized_email,
  CASE
    WHEN age < 25 THEN '18-24'
    WHEN age < 35 THEN '25-34'
    WHEN age < 45 THEN '35-44'
    ELSE '45+'
  END AS age_bracket,
  identifiers[0].type AS primary_id_type,
  created_at
FROM company_data."123"
WHERE
  email IS NOT NULL
  AND status = 'active'
  AND created_at >= CURRENT_DATE - INTERVAL '90' DAY
  AND _price_cpm_usd <= 0.50
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY user_id
  ORDER BY created_at DESC
) = 1
ORDER BY created_at DESC
LIMIT 1000

Next steps

  • Joining Datasets: combine data from multiple sources
  • NQL Syntax Reference: complete syntax documentation
  • Operators Reference: all comparison and logical operators
  • Functions Reference: string, date, and transformation functions