What is a confidence score?
A confidence score is an AI-generated rating from 0-100% that indicates how likely a mapping is to produce correct results. Higher scores mean the AI found strong evidence that the mapping is accurate; lower scores indicate potential issues that warrant review. Confidence scores differ from manual validation in important ways:

| Approach | What it checks | When it runs |
|---|---|---|
| Manual validation | Specific test cases you define | On demand |
| Confidence scoring | Pattern analysis across all data | Automatically |
How scores are calculated
The AI evaluates each mapping by analyzing multiple factors:

Column name analysis

The AI examines whether the source column name semantically matches the target attribute. A column named email_address mapping to the raw_email attribute scores higher than a column named field_7.
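To make the idea concrete, here is a minimal sketch of lexical name matching. The product's actual analysis is semantic and more sophisticated; `name_similarity` is a hypothetical helper, not part of any real API:

```python
from difflib import SequenceMatcher

def name_similarity(column: str, attribute: str) -> float:
    """Crude lexical similarity between a source column name and a target attribute."""
    norm = lambda s: s.lower().replace("_", " ")
    return SequenceMatcher(None, norm(column), norm(attribute)).ratio()

# A descriptive column name resembles the target attribute far more
# closely than an opaque one does.
name_similarity("email_address", "raw_email")  # relatively high
name_similarity("field_7", "raw_email")        # relatively low
```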
Data sample inspection
The system samples actual data values and checks whether they match expected patterns for the target attribute. If the hl7_gender attribute expects values like male or female, but the source column contains 1 and 2, the AI will flag this unless there’s an appropriate transformation.
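A simplified version of that check might look like the following. The expected vocabulary for hl7_gender is assumed for illustration; real attribute definitions live in the platform:

```python
# Assumed expected vocabulary for illustration only.
EXPECTED_VALUES = {"hl7_gender": {"male", "female"}}

def sample_match_rate(attribute: str, samples: list) -> float:
    """Fraction of sampled values that match the attribute's expected vocabulary."""
    expected = EXPECTED_VALUES[attribute]
    hits = sum(1 for v in samples if str(v).strip().lower() in expected)
    return hits / len(samples)

sample_match_rate("hl7_gender", ["male", "Female", "male"])  # 1.0 — looks right
sample_match_rate("hl7_gender", ["1", "2", "1"])             # 0.0 — flagged
```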
Transformation logic evaluation
When a mapping includes a transformation expression, the AI analyzes whether the logic correctly handles the conversion. It looks for:
- Missing case handling (what happens to unexpected values?)
- Type mismatches (is a string being cast to a number correctly?)
- Edge cases (null values, empty strings, special characters)
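Continuing the hypothetical gender-code example, a transformation that would score well on these checks handles all three concerns explicitly:

```python
def map_gender(code):
    """Hypothetical transformation: numeric source codes -> hl7_gender values."""
    if code is None or str(code).strip() == "":
        return "unknown"                      # edge cases: null, empty string
    mapping = {"1": "male", "2": "female"}
    # str() absorbs type mismatches (int vs. string codes);
    # .get() with a default handles unexpected values.
    return mapping.get(str(code).strip(), "unknown")
```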
Understanding score ranges
Confidence scores fall into three tiers:

| Score range | Classification | Recommended action |
|---|---|---|
| 80-100% | High confidence | Generally reliable; spot-check if desired |
| 50-79% | Medium confidence | Review the mapping and AI feedback |
| 0-49% | Low confidence | Manual review required before use |
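The tier boundaries above are straightforward to encode, as in this sketch:

```python
def tier(score: int) -> str:
    """Map a 0-100 confidence score to its tier."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    return "low"

tier(92)  # "high"
tier(65)  # "medium"
tier(12)  # "low"
```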
Medium scores typically stem from issues such as:
- Partial transformation coverage (handles most but not all values)
- Column names that don’t clearly indicate content
- Data patterns that vary from typical examples
Low scores typically indicate problems such as:
- Missing or incomplete transformations
- Source data that doesn’t match expected patterns
- Ambiguous column names with multiple possible interpretations
Evaluation vs. suggestions
Confidence scoring powers two distinct workflows:

Mapping evaluation analyzes your existing mappings. Run an evaluation to:
- Assess the overall quality of your normalizations
- Identify specific mappings that need attention
- Get AI-generated explanations for quality issues
Mapping suggestions propose new mappings. Each suggestion includes:
- Which attribute each column should map to
- What transformation expression to use
- Sample output showing before/after values
The confidence gradient
In the Normalized Datasets interface, confidence is visualized as a gradient bar showing the distribution of your mappings across quality tiers:

| Color | Meaning |
|---|---|
| Green | High confidence mappings |
| Yellow | Medium confidence mappings |
| Red | Low confidence mappings |
| Gray | Not yet scored |
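Conceptually, the gradient is the tier distribution over all mappings. A sketch, treating unscored mappings as None:

```python
from collections import Counter

def gradient(scores):
    """Fraction of mappings in each color band; None means not yet scored."""
    def color(s):
        if s is None:
            return "gray"
        return "green" if s >= 80 else "yellow" if s >= 50 else "red"
    counts = Counter(color(s) for s in scores)
    return {c: counts[c] / len(scores) for c in ("green", "yellow", "red", "gray")}

gradient([95, 85, 60, 20, None])
# {'green': 0.4, 'yellow': 0.2, 'red': 0.2, 'gray': 0.2}
```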
When to re-evaluate
Confidence scores can become stale when:
- You modify transformation expressions
- The source data changes significantly
- New data samples reveal patterns not present in the original evaluation
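One way to operationalize these triggers is to record, at evaluation time, the transformation expression and a fingerprint of the sampled data, and compare them later. A hypothetical sketch; none of these field names come from the product:

```python
import hashlib

def fingerprint(samples) -> str:
    """Stable hash of a list of sample values."""
    return hashlib.sha256("|".join(map(str, samples)).encode()).hexdigest()

def is_stale(mapping: dict, current_expr: str, current_samples) -> bool:
    """True when the expression or the underlying data has changed since scoring."""
    return (mapping["scored_expr"] != current_expr
            or mapping["scored_sample_hash"] != fingerprint(current_samples))
```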

