The Model Inference API enables you to run LLM inference within your data plane. This reference documents all methods, types, and interfaces available in the TypeScript SDK.

Methods

runModelInference

Submits a model inference request and returns a job that can be polled for results.
```typescript
async runModelInference(request: ModelInferenceRunRequest): Promise<ModelInferenceRunJob>
```
Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| request | `ModelInferenceRunRequest` | Yes | The inference request configuration |

Returns: `Promise<ModelInferenceRunJob>` - a job object that can be polled for completion. Example:
```typescript
import { NarrativeApi } from '@narrative.io/data-collaboration-sdk-ts';

const api = new NarrativeApi({
  apiKey: process.env.NARRATIVE_API_KEY,
});

const job = await api.runModelInference({
  data_plane_id: 'dp_abc123',
  model: 'anthropic.claude-sonnet-4.5',
  messages: [
    { role: 'system', text: 'You are a helpful assistant.' },
    { role: 'user', text: 'Summarize this data in one sentence.' }
  ],
  inference_config: {
    output_format_schema: {
      type: 'object',
      properties: {
        summary: { type: 'string' }
      },
      required: ['summary']
    },
    max_tokens: 500,
    temperature: 0.7
  }
});

console.log('Job ID:', job.id);
```

Types

InferenceModel

Model identifiers supported by the Narrative model inference API.
```typescript
type InferenceModel =
  | 'anthropic.claude-haiku-4.5'
  | 'anthropic.claude-sonnet-4.5'
  | 'anthropic.claude-opus-4.5'
  | 'openai.gpt-oss-120b'
  | 'openai.gpt-4.1'
  | 'openai.o4-mini';
```
| Model | Provider | Use Case |
| --- | --- | --- |
| `anthropic.claude-haiku-4.5` | Anthropic | Fast, cost-effective tasks |
| `anthropic.claude-sonnet-4.5` | Anthropic | Balanced performance and capability |
| `anthropic.claude-opus-4.5` | Anthropic | Complex reasoning and analysis |
| `openai.gpt-oss-120b` | OpenAI | Open-source large model |
| `openai.gpt-4.1` | OpenAI | Advanced reasoning |
| `openai.o4-mini` | OpenAI | Fast, efficient responses |

MessageRole

The role of a message in the conversation.
```typescript
type MessageRole = 'user' | 'assistant' | 'system';
```

| Role | Description |
| --- | --- |
| `system` | Sets the model’s behavior and context |
| `user` | Input from the user or application |
| `assistant` | Previous model responses (for multi-turn conversations) |
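In a multi-turn conversation, earlier model output is replayed with the `assistant` role so the model sees the full exchange as context. A minimal sketch (the message contents are illustrative):

```typescript
type MessageRole = 'user' | 'assistant' | 'system';

// Earlier model output is replayed as an `assistant` message; the new
// question follows it as the latest `user` message.
const conversation: { role: MessageRole; text: string }[] = [
  { role: 'system', text: 'You are a data classification expert.' },
  { role: 'user', text: 'Classify record A.' },
  { role: 'assistant', text: '{"category": "retail"}' }, // previous model response
  { role: 'user', text: 'Now classify record B using the same categories.' }
];
```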

InferenceMessage

A message in the inference conversation.
```typescript
interface InferenceMessage {
  role: MessageRole;
  text: string;
}
```

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| role | `MessageRole` | Yes | The role of the message sender |
| text | `string` | Yes | The message content |

Example:

```typescript
const messages: InferenceMessage[] = [
  { role: 'system', text: 'You are a data classification expert.' },
  { role: 'user', text: 'Classify the following record: {...}' }
];
```

InferenceConfig

Configuration parameters for the inference request.
```typescript
interface InferenceConfig {
  output_format_schema: Record<string, unknown>;
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  stop_sequences?: string[];
}
```

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| output_format_schema | `Record<string, unknown>` | Yes | JSON Schema defining the expected output format |
| max_tokens | `number` | No | Maximum number of tokens to generate |
| temperature | `number` | No | Sampling temperature (0-1); lower values are more deterministic |
| top_p | `number` | No | Nucleus sampling parameter (0-1) |
| stop_sequences | `string[]` | No | Sequences that will stop generation |
Example:
```typescript
const config: InferenceConfig = {
  output_format_schema: {
    type: 'object',
    properties: {
      category: {
        type: 'string',
        enum: ['retail', 'finance', 'healthcare', 'technology']
      },
      confidence: {
        type: 'number',
        minimum: 0,
        maximum: 1
      },
      reasoning: {
        type: 'string'
      }
    },
    required: ['category', 'confidence']
  },
  max_tokens: 1000,
  temperature: 0.3
};
```

ModelInferenceRunRequest

The complete request body for running a model inference job.
```typescript
interface ModelInferenceRunRequest {
  data_plane_id: string;
  model: InferenceModel;
  messages: InferenceMessage[];
  inference_config: InferenceConfig;
  tags?: string[];
}
```

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| data_plane_id | `string` | Yes | The data plane ID where inference will execute |
| model | `InferenceModel` | Yes | The model to use for inference |
| messages | `InferenceMessage[]` | Yes | The conversation messages |
| inference_config | `InferenceConfig` | Yes | Configuration for the inference |
| tags | `string[]` | No | Optional tags for organizing and filtering jobs |
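The only field not shown in the earlier example is `tags`. A request carrying tags might look like this (the data plane ID and tag values are illustrative, not real identifiers):

```typescript
// Same shape as ModelInferenceRunRequest; tags are free-form strings
// you can later use to organize and filter jobs.
const request = {
  data_plane_id: 'dp_abc123', // illustrative ID
  model: 'anthropic.claude-sonnet-4.5',
  messages: [
    { role: 'user', text: 'Summarize this data in one sentence.' }
  ],
  inference_config: {
    output_format_schema: {
      type: 'object',
      properties: { summary: { type: 'string' } },
      required: ['summary']
    }
  },
  tags: ['daily-summary', 'team-analytics']
};
```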

InferenceUsage

Token usage metrics from the inference response.
```typescript
interface InferenceUsage {
  total_tokens: number;
  prompt_tokens: number;
  completion_tokens: number;
}
```

| Property | Type | Description |
| --- | --- | --- |
| total_tokens | `number` | Total tokens used (prompt + completion) |
| prompt_tokens | `number` | Tokens in the input messages |
| completion_tokens | `number` | Tokens in the generated response |
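As documented above, `total_tokens` is the sum of the other two fields. A small sketch of a sanity check you might run before feeding the split into per-token cost accounting:

```typescript
interface InferenceUsage {
  total_tokens: number;
  prompt_tokens: number;
  completion_tokens: number;
}

// Returns true when the documented invariant holds:
// total_tokens === prompt_tokens + completion_tokens.
function usageIsConsistent(usage: InferenceUsage): boolean {
  return usage.prompt_tokens + usage.completion_tokens === usage.total_tokens;
}
```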

ModelInferenceRunResult

The result from a completed model inference job.
```typescript
interface ModelInferenceRunResult<T = unknown> {
  usage: InferenceUsage;
  structured_output: T;
}
```

| Property | Type | Description |
| --- | --- | --- |
| usage | `InferenceUsage` | Token usage metrics |
| structured_output | `T` | The model’s response, typed according to your schema |
Example with typed output:
```typescript
interface ClassificationResult {
  category: string;
  confidence: number;
  reasoning?: string;
}

// After job completion
const result = job.result as ModelInferenceRunResult<ClassificationResult>;

console.log('Category:', result.structured_output.category);
console.log('Confidence:', result.structured_output.confidence);
console.log('Tokens used:', result.usage.total_tokens);
```

ModelInferenceRunJob

The job object returned when submitting an inference request. Extends the base job type with inference-specific result typing.
```typescript
interface ModelInferenceRunJob extends Job {
  type: 'model_inference';
  result?: ModelInferenceRunResult;
}
```

| Property | Type | Description |
| --- | --- | --- |
| id | `string` | Unique job identifier |
| type | `'model_inference'` | The job type |
| state | `'pending' \| 'running' \| 'completed' \| 'failed'` | Current job state |
| result | `ModelInferenceRunResult` | Present when the job completes successfully |
| failures | `object[]` | Present when the job fails |
| created_at | `string` | ISO timestamp of job creation |
| updated_at | `string` | ISO timestamp of last update |

Error handling

Model Inference jobs can fail for several reasons:
| Error | Cause | Solution |
| --- | --- | --- |
| Invalid schema | JSON Schema is malformed | Validate the schema before submission |
| Model unavailable | Requested model not available in the data plane | Check the supported models list |
| Token limit exceeded | Response would exceed `max_tokens` | Increase `max_tokens` or simplify the request |
| Invalid data plane | Data plane ID not found or no access | Verify the data plane ID and permissions |
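Part of the "validate schema before submission" advice can be automated client-side. The sketch below is not a full JSON Schema validator (a library such as Ajv would be); it is just a cheap pre-flight check, written for this guide, that every name in `required` actually appears under `properties` — a common authoring mistake in `output_format_schema`:

```typescript
// Pre-flight check on an output_format_schema: returns the names listed
// in `required` that do not exist under `properties`. An empty result
// means this particular check passed; it is not a substitute for full
// JSON Schema validation.
function missingRequiredProps(schema: Record<string, unknown>): string[] {
  const props = (schema.properties ?? {}) as Record<string, unknown>;
  const required = (schema.required ?? []) as string[];
  return required.filter((name) => !(name in props));
}
```

If the returned array is non-empty, fix the schema before submitting the job.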
Example error handling:
```typescript
const job = await api.runModelInference(request);

// Poll until the job reaches a terminal state
const completedJob = await waitForJob(job.id);

if (completedJob.state === 'failed') {
  console.error('Inference failed:', completedJob.failures);
  // Handle specific failure types
} else {
  // `result` is optional on the job type, so access it defensively
  console.log('Result:', completedJob.result?.structured_output);
}
```