Each entry in a workflow’s do block calls a supported task. This reference documents every available task, its parameters, output schema, and usage examples.
For how tasks fit into the overall workflow specification, see Workflow Specification Syntax.
Supported tasks
CreateMaterializedViewIfNotExists
Task that creates a materialized view if it does not already exist.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| nql | string | Yes | An NQL CREATE MATERIALIZED VIEW statement. |
| computePoolId | string | No | The compute pool ID to use for running the task. When omitted, resolution depends on whether the task operates on an existing dataset. If it does (e.g. RefreshMaterializedView, ExecuteDml, CreateDatasetSample), the dataset’s default compute pool is used; if the dataset has no default, the dataplane’s default compute pool is used. If it does not (e.g. CreateMaterializedViewIfNotExists, where the dataset is being created, or RunModelInference, which is not tied to a dataset), the dataplane’s default compute pool is used directly. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| datasetId | integer | No | The ID of the created or existing dataset. |
| created | boolean | No | Whether the materialized view was newly created by this task. |
| snapshotId | integer or null | No | The Iceberg snapshot ID of the initial refresh. Non-null only when created is true. |
| recalculationId | string or null | No | The recalculation ID, if applicable. Non-null only when created is true. |
| rowStats | object or null | No | Row-level statistics produced by a refresh. Platform behavior: Snowflake dataplanes populate this object with real counts; AWS dataplanes return null, because row-level statistics are not yet produced for materialized-view refreshes on AWS. |

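
The nullability rules in the output table amount to a small decision: snapshotId and recalculationId carry values only when the task actually created the view. The following is a minimal Python sketch of how a caller might interpret the output, assuming it arrives as a plain dictionary keyed by the field names above. The function name `describe_create_result` is hypothetical and not part of the platform.

```python
from typing import Any, Mapping


def describe_create_result(output: Mapping[str, Any]) -> str:
    """Summarize a CreateMaterializedViewIfNotExists output (illustrative only)."""
    # No field is marked "always present", so read everything defensively.
    dataset_id = output.get("datasetId")
    if output.get("created"):
        # snapshotId is documented as non-null only when created is true.
        return f"created dataset {dataset_id} at snapshot {output.get('snapshotId')}"
    # An existing view was found, so this task ran no initial refresh.
    return f"dataset {dataset_id} already existed; nothing was created"
```

Reading fields with `get` rather than indexing reflects the "Always present: No" column; a stricter caller could raise instead when datasetId is missing.
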
RefreshMaterializedView
Task that triggers a refresh of an existing materialized view.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| datasetId | integer | No | The numeric ID of an existing dataset. |
| datasetName | string | No | The name of a dataset. Must contain only alphanumeric characters and underscores, with a maximum length of 256 characters. |
| computePoolId | string | No | The compute pool ID to use for running the task. When omitted, resolution depends on whether the task operates on an existing dataset. If it does (e.g. RefreshMaterializedView, ExecuteDml, CreateDatasetSample), the dataset’s default compute pool is used; if the dataset has no default, the dataplane’s default compute pool is used. If it does not (e.g. CreateMaterializedViewIfNotExists, where the dataset is being created, or RunModelInference, which is not tied to a dataset), the dataplane’s default compute pool is used directly. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| datasetId | integer | No | The ID of the refreshed dataset. |
| snapshotId | integer | No | The new Iceberg snapshot ID after the refresh. |
| recalculationId | string or null | No | The recalculation ID, if applicable. |
| rowStats | object or null | No | Row-level statistics produced by a refresh. Platform behavior: Snowflake dataplanes populate this object with real counts; AWS dataplanes return null, because row-level statistics are not yet produced for materialized-view refreshes on AWS. |

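
The computePoolId fallback described in the parameter tables is a three-step resolution order. The sketch below restates that order in Python for clarity. It is not the platform's implementation: the function name, argument names, and the pool IDs in the usage example are all hypothetical.

```python
from typing import Optional


def resolve_compute_pool(
    explicit_pool_id: Optional[str],
    dataset_default_pool_id: Optional[str],
    dataplane_default_pool_id: str,
    operates_on_existing_dataset: bool,
) -> str:
    """Return the compute pool a task would run on (illustrative sketch)."""
    # 1. A computePoolId given on the task is used as-is.
    if explicit_pool_id is not None:
        return explicit_pool_id
    # 2. Dataset-bound tasks fall back to the dataset's default pool, if it has one.
    if operates_on_existing_dataset and dataset_default_pool_id is not None:
        return dataset_default_pool_id
    # 3. Otherwise the dataplane's default pool is used.
    return dataplane_default_pool_id
```

For example, `resolve_compute_pool(None, None, "pool-dp", True)` falls through to the dataplane default because the dataset has no default of its own.
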
ExecuteDml
Task that executes a DML statement on a dataset.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| nql | string | Yes | An NQL DML statement. Supports INSERT, UPDATE, and DELETE. |
| computePoolId | string | No | The compute pool ID to use for running the task. When omitted, resolution depends on whether the task operates on an existing dataset. If it does (e.g. RefreshMaterializedView, ExecuteDml, CreateDatasetSample), the dataset’s default compute pool is used; if the dataset has no default, the dataplane’s default compute pool is used. If it does not (e.g. CreateMaterializedViewIfNotExists, where the dataset is being created, or RunModelInference, which is not tied to a dataset), the dataplane’s default compute pool is used directly. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| affectedRows | integer | Yes | Total rows affected by the DML statement (insert + update + delete). |
| insertedRows | integer | Yes | Rows inserted by the DML statement. |
| updatedRows | integer | Yes | Rows updated by the DML statement. |
| deletedRows | integer | Yes | Rows deleted by the DML statement. |

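
Because affectedRows is documented as the sum of the three per-operation counts, a caller can sanity-check an ExecuteDml output before acting on it. A minimal sketch, assuming the output is available as a dictionary keyed by the field names above (the function name is hypothetical):

```python
from typing import Mapping


def dml_counts_are_consistent(output: Mapping[str, int]) -> bool:
    """Check that affectedRows equals insertedRows + updatedRows + deletedRows."""
    expected_total = (
        output["insertedRows"] + output["updatedRows"] + output["deletedRows"]
    )
    return output["affectedRows"] == expected_total
```

All four fields are marked as always present, so the sketch indexes them directly rather than guarding against missing keys.
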
RunModelInference
Task that runs a model inference job.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | enum (anthropic.claude-haiku-4.5, anthropic.claude-sonnet-4.5, anthropic.claude-opus-4.5, openai.gpt-oss-120b, openai.gpt-4.1, openai.o4-mini) | Yes | The Narrative model ID to use for inference. |
| messages | array | Yes | A list of messages to send to the model. |
| inferenceConfig | object | Yes | Configuration for the model inference. |
| computePoolId | string | No | The compute pool ID to use for running the task. When omitted, resolution depends on whether the task operates on an existing dataset. If it does (e.g. RefreshMaterializedView, ExecuteDml, CreateDatasetSample), the dataset’s default compute pool is used; if the dataset has no default, the dataplane’s default compute pool is used. If it does not (e.g. CreateMaterializedViewIfNotExists, where the dataset is being created, or RunModelInference, which is not tied to a dataset), the dataplane’s default compute pool is used directly. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| structuredOutput | object | No | The structured output from the model, conforming to the provided outputFormatSchema. |
| usage | object | No | Token usage information. |

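
The required parameters and the model enum above lend themselves to a pre-flight check before a workflow is submitted. The sketch below is one way to do that in Python. It validates only what the parameter table states; the internal shape of messages and inferenceConfig is not documented here, so the sketch does not inspect them. The function and constant names are hypothetical.

```python
from typing import Any, List, Mapping

# Model IDs copied from the parameter table above.
SUPPORTED_MODELS = frozenset({
    "anthropic.claude-haiku-4.5",
    "anthropic.claude-sonnet-4.5",
    "anthropic.claude-opus-4.5",
    "openai.gpt-oss-120b",
    "openai.gpt-4.1",
    "openai.o4-mini",
})
REQUIRED_PARAMS = ("model", "messages", "inferenceConfig")


def validate_inference_params(params: Mapping[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = [
        f"missing required parameter: {name}"
        for name in REQUIRED_PARAMS
        if name not in params
    ]
    model = params.get("model")
    if model is not None and model not in SUPPORTED_MODELS:
        problems.append(f"unsupported model: {model}")
    # messages is typed as an array; its element shape is not checked here.
    if "messages" in params and not isinstance(params["messages"], list):
        problems.append("messages must be an array")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.
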
LabelConnectedComponents
Task that runs bipartite label propagation for cross-system customer identity resolution. Finds connected components in a customer identity graph by linking customer IDs across platforms via shared identifiers.

Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| edgeDataset | string | Yes | - | The name of a dataset. Must contain only alphanumeric characters and underscores, with a maximum length of 256 characters. |
| outputDataset | string | Yes | - | The name of a dataset. Must contain only alphanumeric characters and underscores, with a maximum length of 256 characters. |
| maxDegreeThreshold | integer | No | 100 | Maximum number of connections a single vertex can have before it is excluded as a “supernode.” Prevents a single overly connected identifier from incorrectly merging thousands of unrelated customers. |
| maxComponentSize | integer | No | 100 | Maximum number of members allowed in a single resolved component. Prevents runaway merges that would create implausibly large identity groups. |
| maxIterations | integer | No | 10 | Upper bound on how many times the label propagation loop can run before stopping, even if not fully converged. Safety valve against infinite loops. |
| convergenceThreshold | number | No | 0.000001 | Stop label propagation when the fraction of vertices that changed label in an iteration drops below this value. Must be in the range [0, 1]. |
| sourceIdCol | string | Yes | - | Column name in the edge table containing the customer ID. |
| sourceSystemCol | string | Yes | - | Column name identifying which platform the customer ID came from. |
| bridgeKeyCol | string | Yes | - | Column name for the shared identifier value. |
| bridgeKeyTypeCol | string | Yes | - | Column name for the type/category of the shared identifier. |
| firstPartySources | array | No | [] | Ordered list of first-party platform identifiers. Order determines priority when selecting the representative component ID. |
| thirdPartySources | array | No | [] | List of third-party platform identifiers. |
| computePoolId | string | No | - | The compute pool ID to use for running the task. When omitted, resolution depends on whether the task operates on an existing dataset. If it does (e.g. RefreshMaterializedView, ExecuteDml, CreateDatasetSample), the dataset’s default compute pool is used; if the dataset has no default, the dataplane’s default compute pool is used. If it does not (e.g. CreateMaterializedViewIfNotExists, where the dataset is being created, or RunModelInference, which is not tied to a dataset), the dataplane’s default compute pool is used directly. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| datasetId | integer | Yes | The ID of the dataset. |

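
To make the roles of maxDegreeThreshold, maxIterations, and convergenceThreshold concrete, here is a small in-memory sketch of generic min-label propagation over a bipartite customer/bridge-key graph. It is not the task's actual implementation, which is not described in this reference. The sketch makes several simplifying assumptions: the degree cap is applied only to bridge keys, maxComponentSize and the firstPartySources priority rule are not modeled, and the smallest label wins ties. All names in the code are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Customer = Tuple[str, str]   # (source system, source ID)
BridgeKey = Tuple[str, str]  # (bridge key type, bridge key value)


def label_components(
    edges: Iterable[Tuple[Customer, BridgeKey]],
    max_degree_threshold: int = 100,
    max_iterations: int = 10,
    convergence_threshold: float = 0.000001,
) -> Dict[Customer, Customer]:
    """Map each customer to a component label via min-label propagation."""
    customers_by_key: Dict[BridgeKey, Set[Customer]] = defaultdict(set)
    for customer, key in edges:
        customers_by_key[key].add(customer)

    # Every customer starts in its own component, labeled by itself.
    labels: Dict[Customer, Customer] = {
        c: c for members in customers_by_key.values() for c in members
    }
    if not labels:
        return labels

    # Exclude "supernode" bridge keys shared by too many customers.
    usable_groups = [
        members
        for members in customers_by_key.values()
        if len(members) <= max_degree_threshold
    ]

    for _ in range(max_iterations):
        changed: Set[Customer] = set()
        for members in usable_groups:
            # Customers sharing a bridge key adopt the smallest label among them.
            smallest = min(labels[m] for m in members)
            for m in members:
                if labels[m] != smallest:
                    labels[m] = smallest
                    changed.add(m)
        # Stop once the fraction of relabeled customers falls below the threshold.
        if len(changed) / len(labels) < convergence_threshold:
            break
    return labels
```

In this sketch, two customers end up with the same label when a chain of shared bridge keys connects them, and lowering `max_degree_threshold` breaks chains that pass through heavily shared keys.
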
CreateRosettaStoneMappingsIfNotExist
Task that creates Rosetta Stone attribute mappings for a dataset.

Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| datasetId | integer | No | - | The numeric ID of an existing dataset. |
| datasetName | string | No | - | The name of a dataset. Must contain only alphanumeric characters and underscores, with a maximum length of 256 characters. |
| mappings | array | Yes | - | A list of mapping definitions to create. |
| allowPartial | boolean | No | true | When true, individual mapping failures don’t prevent other valid mappings from being created. When false, any single failure causes the entire operation to fail. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| createdMappings | array | Yes | Mappings that were successfully created. |
| failedMappings | array | Yes | Mappings that failed to create. |
| conflictMappings | array | Yes | Mappings skipped because an identical mapping already exists. |

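
With allowPartial left at its default of true, a run can finish with a mix of created, failed, and skipped mappings, so it can help to tally the three output arrays before deciding what to do next. A minimal sketch, assuming the output is a dictionary keyed by the field names above; the function name is hypothetical, and the contents of each array element are not inspected because their shape is not documented here.

```python
from typing import Any, Dict, Mapping

OUTPUT_FIELDS = ("createdMappings", "failedMappings", "conflictMappings")


def count_mapping_results(output: Mapping[str, Any]) -> Dict[str, int]:
    """Count the entries in each of the three documented output arrays."""
    return {field: len(output[field]) for field in OUTPUT_FIELDS}
```

A nonzero failedMappings count is the signal worth acting on; conflictMappings only records mappings that already existed in identical form.
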
CreateDatasetSample
Task that generates a sample for a dataset.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| datasetId | integer | No | The numeric ID of an existing dataset. |
| datasetName | string | No | The name of a dataset. Must contain only alphanumeric characters and underscores, with a maximum length of 256 characters. |
| computePoolId | string | No | The compute pool ID to use for running the task. When omitted, resolution depends on whether the task operates on an existing dataset. If it does (e.g. RefreshMaterializedView, ExecuteDml, CreateDatasetSample), the dataset’s default compute pool is used; if the dataset has no default, the dataplane’s default compute pool is used. If it does not (e.g. CreateMaterializedViewIfNotExists, where the dataset is being created, or RunModelInference, which is not tied to a dataset), the dataplane’s default compute pool is used directly. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| datasetId | integer | Yes | The ID of the dataset whose sample was generated. |
| rowCount | integer | Yes | The number of rows captured in the sample. |

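
The datasetName constraint that recurs in the parameter tables (alphanumeric characters and underscores only, at most 256 characters) can be checked locally before a workflow is submitted. The sketch below is one possible check, under two assumptions not stated in this reference: "alphanumeric" means ASCII letters and digits, and an empty name is invalid. The function name is hypothetical.

```python
import re

# Assumption: ASCII letters, digits, and underscores; between 1 and 256 characters.
_DATASET_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,256}")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if name satisfies the documented datasetName constraint."""
    return _DATASET_NAME_PATTERN.fullmatch(name) is not None
```

Using `fullmatch` rather than `match` ensures the whole string is checked, so a name such as `events-2024` is rejected instead of matching only its valid prefix.
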
RecalculateStatistics
Task that triggers a recalculation of a dataset’s column statistics and waits for it to complete.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| datasetId | integer | No | The numeric ID of an existing dataset. |
| datasetName | string | No | The name of a dataset. Must contain only alphanumeric characters and underscores, with a maximum length of 256 characters. |
| computePoolId | string | No | The compute pool ID to use for running the task. When omitted, resolution depends on whether the task operates on an existing dataset. If it does (e.g. RefreshMaterializedView, ExecuteDml, CreateDatasetSample), the dataset’s default compute pool is used; if the dataset has no default, the dataplane’s default compute pool is used. If it does not (e.g. CreateMaterializedViewIfNotExists, where the dataset is being created, or RunModelInference, which is not tied to a dataset), the dataplane’s default compute pool is used directly. |

Output:

| Field | Type | Always present | Description |
|---|---|---|---|
| totalRows | integer | Yes | The total number of rows the statistics were calculated over. |
| columnCount | integer | Yes | The number of columns the statistics were calculated for. |

Related content
- Workflow Specification Syntax: Full specification format for document, schedule, and task blocks
- Automating Multi-Step Pipelines: Step-by-step guide to creating and running workflows
- Materialized Views: How materialized views work
- Workflows API: REST API endpoints for managing workflows

