Why Model Inference matters
When working with sensitive data, sending it to external AI services creates compliance and security risks. Model Inference solves this by hosting models within your data plane infrastructure:

| Traditional AI APIs | Model Inference |
|---|---|
| Data sent to external servers | Data stays in your infrastructure |
| Subject to third-party data policies | You control data residency |
| Network latency to external services | Local execution within data plane |
| Compliance complexity | Simplified compliance posture |
How Model Inference works
Model Inference operates through Narrative’s job queue, following the same pattern as other asynchronous operations:

- Submit request: Your application sends an inference request specifying the model, messages, and output schema
- Job creation: The control plane creates an inference job and queues it
- Local execution: The data plane operator picks up the job and runs inference locally
- Structured response: Results are returned in a predictable format defined by your JSON Schema
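The four steps above can be sketched as a tiny job lifecycle. This is an illustrative model only: the class names, fields, and status strings are assumptions, not the actual Narrative API.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Optional

@dataclass
class InferenceRequest:
    model: str
    messages: list          # [{"role": ..., "content": ...}, ...]
    output_schema: dict     # JSON Schema constraining the response

@dataclass
class InferenceJob:
    request: InferenceRequest
    status: str = "queued"
    result: Optional[dict] = None

def submit_request(job_queue: Queue, request: InferenceRequest) -> InferenceJob:
    """Control plane: wrap the request in a job and queue it."""
    job = InferenceJob(request=request)
    job_queue.put(job)
    return job

def run_next_job(job_queue: Queue) -> InferenceJob:
    """Data plane operator: pick up the job and run inference locally."""
    job = job_queue.get()
    job.status = "running"
    # A real operator would invoke the locally hosted model here;
    # this stand-in returns a structurally valid placeholder.
    job.result = {"summary": f"response from {job.request.model}"}
    job.status = "completed"
    return job

queue: Queue = Queue()
req = InferenceRequest(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Describe this dataset"}],
    output_schema={"type": "object", "properties": {"summary": {"type": "string"}}},
)
job = submit_request(queue, req)
job = run_next_job(queue)
```

The point of the sketch is the separation of concerns: the submitter never runs the model, and the operator never talks to an external service.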
Key capabilities
Supported models
Model Inference supports models from multiple providers, all hosted within your data plane:

| Provider | Models |
|---|---|
| Anthropic | Claude Haiku 4.5, Claude Sonnet 4.5, Claude Opus 4.5 |
| OpenAI | GPT-4.1, o4-mini, GPT-oss-120b |
Structured output
Every inference request includes a JSON Schema that defines the expected response format. The model is constrained to return valid JSON matching your schema, making responses predictable and easy to parse programmatically.

Conversation context
Inference requests support multi-turn conversations through the messages array. Each message has a role (system, user, or assistant) and text content.
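A request body combining a multi-turn messages array with an output schema might look like the following sketch. The field names (`model`, `messages`, `output_schema`) are assumptions based on the description above, not a documented wire format.

```python
import json

# Hypothetical inference request: a short conversation plus a JSON Schema
# that constrains the structured response to an object with a "cron" string.
request = {
    "model": "gpt-4.1",
    "messages": [
        {"role": "system", "content": "You translate schedules to CRON."},
        {"role": "user", "content": "Every weekday at 9am"},
        {"role": "assistant", "content": "0 9 * * 1-5"},
        {"role": "user", "content": "Now every Sunday at midnight"},
    ],
    "output_schema": {
        "type": "object",
        "properties": {"cron": {"type": "string"}},
        "required": ["cron"],
    },
}

# Serialize for submission through the job queue.
payload = json.dumps(request)
```

Including earlier assistant turns in `messages` is what gives the model conversation context: the final user turn is interpreted relative to the exchange that precedes it.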
Configuration options
Fine-tune model behavior with inference configuration parameters:

| Parameter | Description | Default |
|---|---|---|
| max_tokens | Maximum tokens in the response | Model default |
| temperature | Randomness (0 = deterministic, 1 = creative) | Model default |
| top_p | Nucleus sampling parameter | Model default |
| stop_sequences | Strings that stop generation | None |
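A client might validate these parameters before submitting a request. The sketch below is illustrative: the accepted ranges are assumptions (the table only states that temperature runs from 0 to 1), and unset parameters are simply omitted so the model defaults apply.

```python
def build_config(max_tokens=None, temperature=None, top_p=None, stop_sequences=None):
    """Assemble an inference configuration, keeping only explicitly set values."""
    config = {}
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must be positive")
        config["max_tokens"] = max_tokens
    if temperature is not None:
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0 and 1")
        config["temperature"] = temperature
    if top_p is not None:
        if not 0.0 <= top_p <= 1.0:
            raise ValueError("top_p must be between 0 and 1")
        config["top_p"] = top_p
    if stop_sequences:
        config["stop_sequences"] = list(stop_sequences)
    return config
```

Omitting a key rather than sending a sentinel keeps the "Model default" behavior from the table: the server only overrides what the client actually set.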
Common use cases
Model Inference powers AI features throughout the Narrative platform:

- Dataset descriptions: Automatically generate human-readable descriptions from dataset metadata and samples
- Schedule translation: Convert natural language schedules (“every weekday at 9am”) to CRON expressions
- Data classification: Categorize records based on content analysis
- Schema suggestions: Recommend Rosetta Stone attribute mappings based on column names and sample data
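The schedule-translation use case shows how structured output makes these features reliable: the schema pins the response to a CRON string, and the caller can sanity-check it locally. The schema shape and the regex below are illustrative assumptions, not the platform's actual validation logic.

```python
import re

# One regex alternative per CRON field: "*" or digits/commas/dashes/slashes.
CRON_FIELD = r"(\*|[0-9,\-/]+)"
# Five whitespace-separated fields: minute, hour, day, month, weekday.
CRON_PATTERN = re.compile(r"^" + r"\s".join([CRON_FIELD] * 5) + r"$")

# Schema constraining the model to return {"cron": "<expression>"}.
output_schema = {
    "type": "object",
    "properties": {"cron": {"type": "string"}},
    "required": ["cron"],
}

def looks_like_cron(value: str) -> bool:
    """Cheap local check that a returned string has five CRON fields."""
    return CRON_PATTERN.match(value) is not None
```

With this in place, "every weekday at 9am" coming back as `0 9 * * 1-5` passes the check, while a free-text answer like "weekdays at nine" fails fast instead of propagating into a scheduler.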
Data privacy
Because inference runs within your data plane:

- No external API calls: Data is never sent to Anthropic, OpenAI, or any external service
- Your infrastructure: Models run on compute resources within your data plane
- Compliance-friendly: Simplifies GDPR, CCPA, and other regulatory requirements
- Audit trail: All inference jobs are logged through the standard job system

