Model Inference supports multiple models with different capabilities. This guide helps you choose the right model for your use case.

Quick selection guide

| Your Priority | Recommended Model |
| --- | --- |
| Speed and cost | anthropic.claude-haiku-4.5 or openai.o4-mini |
| Balanced performance | anthropic.claude-sonnet-4.5 |
| Complex reasoning | anthropic.claude-opus-4.5 or openai.gpt-4.1 |
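The quick-selection table can be expressed as a small lookup helper. This is an illustrative sketch, not part of the Model Inference API: the priority names are invented here, while the model IDs come from this guide.

```typescript
// Illustrative lookup for the quick-selection table above.
// The Priority names are assumptions; the model IDs are from this guide.
type Priority = 'speed_and_cost' | 'balanced' | 'complex_reasoning';

const MODEL_BY_PRIORITY: Record<Priority, string> = {
  speed_and_cost: 'anthropic.claude-haiku-4.5',   // or 'openai.o4-mini'
  balanced: 'anthropic.claude-sonnet-4.5',
  complex_reasoning: 'anthropic.claude-opus-4.5', // or 'openai.gpt-4.1'
};

function pickModel(priority: Priority): string {
  return MODEL_BY_PRIORITY[priority];
}
```

A default of this kind keeps model choice in one place, so upgrading a workflow later means changing a single mapping rather than every call site.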

Model comparison

Anthropic Claude models

| Model | Speed | Capability | Best For |
| --- | --- | --- | --- |
| Claude Haiku 4.5 | Fastest | Good | Simple classification, extraction |
| Claude Sonnet 4.5 | Fast | Better | Analysis, summarization, most tasks |
| Claude Opus 4.5 | Slower | Best | Complex reasoning, nuanced decisions |

OpenAI models

| Model | Speed | Capability | Best For |
| --- | --- | --- | --- |
| o4-mini | Fastest | Good | Quick responses, simple tasks |
| GPT-4.1 | Moderate | Better | Advanced reasoning |
| GPT-oss-120b | Moderate | Good | General-purpose tasks |

When to use smaller models

Use Claude Haiku or o4-mini when:
  • Task is straightforward: Binary classification, simple extraction
  • High volume: Processing many items where speed matters
  • Cost sensitivity: Budget constraints require efficiency
  • Latency matters: User-facing features needing fast response

// Good use case for Haiku: Simple classification
const job = await api.runModelInference({
  data_plane_id: 'dp_your_data_plane_id',
  model: 'anthropic.claude-haiku-4.5',
  messages: [
    { role: 'user', text: 'Is this email spam? Subject: "You won $1M!"' }
  ],
  inference_config: {
    output_format_schema: {
      type: 'object',
      properties: {
        is_spam: { type: 'boolean' },
        confidence: { type: 'number', minimum: 0, maximum: 1 }
      },
      required: ['is_spam', 'confidence']
    }
  }
});

When to use medium models

Use Claude Sonnet 4.5 when:
  • Task requires understanding: Content analysis, summarization
  • Balanced needs: Good quality without excessive cost
  • Most production use cases: Default choice for typical workflows

// Good use case for Sonnet: Content analysis
const job = await api.runModelInference({
  data_plane_id: 'dp_your_data_plane_id',
  model: 'anthropic.claude-sonnet-4.5',
  messages: [
    { role: 'system', text: 'Analyze dataset descriptions for quality and completeness.' },
    { role: 'user', text: 'Dataset: Customer transactions\nColumns: user_id, amount, date, category' }
  ],
  inference_config: {
    output_format_schema: {
      type: 'object',
      properties: {
        summary: { type: 'string' },
        completeness_score: { type: 'number', minimum: 0, maximum: 1 },
        missing_elements: { type: 'array', items: { type: 'string' } },
        suggestions: { type: 'array', items: { type: 'string' } }
      },
      required: ['summary', 'completeness_score']
    }
  }
});

When to use larger models

Use Claude Opus 4.5 or GPT-4.1 when:
  • Complex reasoning required: Multi-step analysis, nuanced judgment
  • High stakes: Decisions with significant impact
  • Ambiguous inputs: Tasks requiring interpretation
  • Quality over speed: Accuracy is paramount

// Good use case for Opus: Complex analysis
const job = await api.runModelInference({
  data_plane_id: 'dp_your_data_plane_id',
  model: 'anthropic.claude-opus-4.5',
  messages: [
    {
      role: 'system',
      text: 'You are a data governance expert. Analyze datasets for compliance risks.'
    },
    {
      role: 'user',
      text: `Analyze this dataset schema for privacy compliance:
        - email (string)
        - phone (string)
        - purchase_history (array)
        - ip_address (string)
        - device_fingerprint (string)`
    }
  ],
  inference_config: {
    output_format_schema: {
      type: 'object',
      properties: {
        risk_level: { type: 'string', enum: ['low', 'medium', 'high', 'critical'] },
        pii_fields: {
          type: 'array',
          items: {
            type: 'object',
            properties: {
              field: { type: 'string' },
              pii_type: { type: 'string' },
              risk: { type: 'string' }
            },
            required: ['field', 'pii_type', 'risk']
          }
        },
        compliance_concerns: { type: 'array', items: { type: 'string' } },
        recommendations: { type: 'array', items: { type: 'string' } }
      },
      required: ['risk_level', 'pii_fields', 'compliance_concerns', 'recommendations']
    }
  }
});

Task-based recommendations

Classification tasks

| Complexity | Recommended Model |
| --- | --- |
| Binary (yes/no, spam/not spam) | Claude Haiku |
| Multi-class (3-5 categories) | Claude Haiku or Sonnet |
| Complex taxonomy (many categories, nuanced) | Claude Sonnet or Opus |

Extraction tasks

| Complexity | Recommended Model |
| --- | --- |
| Simple fields (dates, names, numbers) | Claude Haiku |
| Structured entities (addresses, products) | Claude Sonnet |
| Complex relationships (multi-entity) | Claude Opus |

Generation tasks

| Complexity | Recommended Model |
| --- | --- |
| Short text (taglines, labels) | Claude Haiku or Sonnet |
| Medium content (descriptions, summaries) | Claude Sonnet |
| Long-form (reports, analysis) | Claude Sonnet or Opus |

Transformation tasks

| Complexity | Recommended Model |
| --- | --- |
| Format conversion (dates, units) | Claude Haiku |
| Language translation (technical to plain) | Claude Sonnet |
| Complex interpretation (natural language to code) | Claude Opus |
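The task-based tables above can be collapsed into one lookup. This is an illustrative sketch: the task and complexity keys are invented for this example, and where a table lists two models the cheaper one is chosen; the model IDs themselves are from this guide.

```typescript
// Illustrative lookup combining the task-based tables above.
// Where a table offers two models, the cheaper option is used here.
type Task = 'classification' | 'extraction' | 'generation' | 'transformation';
type Complexity = 'simple' | 'moderate' | 'complex';

const RECOMMENDATIONS: Record<Task, Record<Complexity, string>> = {
  classification: {
    simple: 'anthropic.claude-haiku-4.5',    // binary
    moderate: 'anthropic.claude-haiku-4.5',  // multi-class
    complex: 'anthropic.claude-sonnet-4.5',  // complex taxonomy
  },
  extraction: {
    simple: 'anthropic.claude-haiku-4.5',    // simple fields
    moderate: 'anthropic.claude-sonnet-4.5', // structured entities
    complex: 'anthropic.claude-opus-4.5',    // multi-entity relationships
  },
  generation: {
    simple: 'anthropic.claude-haiku-4.5',    // short text
    moderate: 'anthropic.claude-sonnet-4.5', // medium content
    complex: 'anthropic.claude-sonnet-4.5',  // long-form
  },
  transformation: {
    simple: 'anthropic.claude-haiku-4.5',    // format conversion
    moderate: 'anthropic.claude-sonnet-4.5', // language translation
    complex: 'anthropic.claude-opus-4.5',    // natural language to code
  },
};

function recommendModel(task: Task, complexity: Complexity): string {
  return RECOMMENDATIONS[task][complexity];
}
```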

Testing different models

Try multiple models on sample data to compare quality:

async function compareModels(prompt: string, schema: object) {
  const models = [
    'anthropic.claude-haiku-4.5',
    'anthropic.claude-sonnet-4.5',
    'anthropic.claude-opus-4.5'
  ];

  const results = await Promise.all(
    models.map(async (model) => {
      const start = Date.now();
      const job = await api.runModelInference({
        data_plane_id: 'dp_your_data_plane_id',
        model,
        messages: [{ role: 'user', text: prompt }],
        inference_config: { output_format_schema: schema }
      });

      const result = await waitForJob(job.id);
      const duration = Date.now() - start;

      return {
        model,
        duration,
        tokens: result.result?.usage.total_tokens,
        output: result.result?.structured_output
      };
    })
  );

  console.table(results.map(r => ({
    model: r.model,
    duration_ms: r.duration,
    tokens: r.tokens
  })));

  return results;
}
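The comparison helper above calls a waitForJob function that this guide does not define. A minimal polling sketch follows; it is an assumption, not the platform's API. For testability the status call is injected as a getJob parameter (in practice it would wrap whatever endpoint returns a job's current status, and the status names used here are assumed):

```typescript
// Hypothetical polling helper for job completion. The getJob parameter
// and the status values are assumptions; adapt them to the real API.
interface JobState {
  status: 'pending' | 'running' | 'completed' | 'failed';
  result?: unknown;
}

async function waitForJob(
  jobId: string,
  getJob: (id: string) => Promise<JobState>,
  opts: { intervalMs?: number; timeoutMs?: number } = {}
): Promise<JobState> {
  const { intervalMs = 1000, timeoutMs = 120_000 } = opts;
  const deadline = Date.now() + timeoutMs;
  while (Date.now() <= deadline) {
    const job = await getJob(jobId);
    if (job.status === 'completed') return job;
    if (job.status === 'failed') throw new Error(`Job ${jobId} failed`);
    // Still pending or running: wait before polling again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} timed out after ${timeoutMs}ms`);
}
```

Bounding the wait with a timeout matters in comparison runs: one slow or stuck job should fail loudly rather than hang the whole Promise.all.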

Best practices

| Practice | Description |
| --- | --- |
| Start small | Begin with Haiku, upgrade if quality is insufficient |
| Test on samples | Compare models on representative data before production |
| Monitor quality | Track output quality metrics over time |
| Balance cost and quality | Don't over-engineer simple tasks |
| Consider latency | User-facing features may need faster models |
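The "start small" practice can be automated as a simple escalation loop: run the cheap model first and re-run with a stronger one only when the structured output reports low confidence. This is a sketch, not a platform feature; runInference is a hypothetical stand-in for a real inference call returning the is_spam/confidence schema from the Haiku example, and the 0.8 threshold is an arbitrary example value.

```typescript
// Sketch of "start small": escalate from Haiku to Sonnet when the
// model's self-reported confidence falls below a threshold.
// runInference is a hypothetical stand-in for a real inference call.
interface SpamVerdict {
  is_spam: boolean;
  confidence: number;
}

const ESCALATION_ORDER = [
  'anthropic.claude-haiku-4.5',
  'anthropic.claude-sonnet-4.5',
];

async function classifyWithEscalation(
  runInference: (model: string) => Promise<SpamVerdict>,
  threshold = 0.8 // example value; tune against your own samples
): Promise<{ model: string; verdict: SpamVerdict }> {
  let last: { model: string; verdict: SpamVerdict } | undefined;
  for (const model of ESCALATION_ORDER) {
    const verdict = await runInference(model);
    last = { model, verdict };
    if (verdict.confidence >= threshold) return last;
  }
  return last!; // Every model was low-confidence; return the strongest.
}
```

Because most items resolve at the first tier, the per-item cost stays close to Haiku's while hard cases still get a stronger model.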