This page is a lookup reference for dataset statistics—the column-level metrics the platform computes for each dataset. For an explanation of why statistics matter and how they work, see Dataset Statistics.
Statistics reference
Each statistic is computed per column. The Computation column shows the underlying SQL-level operation.
| Statistic | Description | Computation |
|---|---|---|
| valueCount | Total number of non-null values in the column | `COUNT(column)` |
| nullValueCount | Number of null values in the column | `COUNT(*) - COUNT(column)` |
| nanValueCount | Number of NaN (not-a-number) values; floating-point columns only | `COUNT_IF(IS_NAN(column))` |
| approxCountDistinct | Approximate number of distinct values, using HyperLogLog for efficiency on large datasets | `APPROX_COUNT_DISTINCT(column)` |
| countDistinct | Exact number of distinct values | `COUNT(DISTINCT column)` |
| lowerBound | Minimum value in the column | `MIN(column)` |
| upperBound | Maximum value in the column | `MAX(column)` |
| histogram | Frequency distribution of values across distinct buckets | Aggregation over value frequencies |
| mean | Arithmetic mean; numeric columns only | `AVG(column)` |
| standardDeviation | Population standard deviation; numeric columns only | `STDDEV(column)` |
| columnStoredBytes | Bytes of storage consumed by the column | Storage metadata lookup |
| completeness | Ratio of non-null values to total rows (0 to 1) | `COUNT(column) / COUNT(*)` |
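
To make the definitions concrete, here is a minimal Python sketch that mirrors several of the computations for a single in-memory column. It is an illustration only, not the platform's implementation. The function name `column_statistics`, the use of `None` for SQL NULL, and the choice to skip NaN values when computing bounds, distinct counts, the histogram, and the moments are all assumptions made for this example; SQL engines differ in how they treat NaN. `approxCountDistinct` and `columnStoredBytes` are omitted because they depend on the engine's HyperLogLog implementation and on storage metadata.

```python
import math
from collections import Counter
from statistics import fmean, pstdev


def _is_nan(value):
    # NaN only exists for floating-point values.
    return isinstance(value, float) and math.isnan(value)


def column_statistics(values):
    """Compute several of the listed statistics for one column.

    `values` is a list of column values; None stands in for SQL NULL.
    """
    total_rows = len(values)                         # COUNT(*)
    non_null = [v for v in values if v is not None]
    value_count = len(non_null)                      # COUNT(column)

    # Assumption: NaN is excluded from bounds, distinct counts,
    # the histogram, and the mean / standard deviation.
    comparable = [v for v in non_null if not _is_nan(v)]
    numeric = bool(comparable) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in comparable
    )

    return {
        "valueCount": value_count,
        "nullValueCount": total_rows - value_count,  # COUNT(*) - COUNT(column)
        "nanValueCount": sum(1 for v in non_null if _is_nan(v)),
        "countDistinct": len(set(comparable)),
        "lowerBound": min(comparable) if comparable else None,
        "upperBound": max(comparable) if comparable else None,
        "histogram": dict(Counter(comparable)),      # value -> frequency
        "mean": fmean(comparable) if numeric else None,
        # Population standard deviation, following the table's description.
        "standardDeviation": pstdev(comparable) if numeric else None,
        "completeness": value_count / total_rows if total_rows else None,
    }
```

For the column `[1.5, 2.5, None, float("nan"), 2.5]`, this sketch reports a `valueCount` of 4, a `nullValueCount` of 1, a `nanValueCount` of 1, and a `completeness` of 0.8. Note that NaN counts as a non-null value, so it contributes to `valueCount` and `completeness`.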
Type compatibility matrix
Not all statistics apply to all data types. The table below shows which statistics are computed for each type.
| Type | Supported statistics |
|---|---|
| DoubleType | All 12: valueCount, nullValueCount, nanValueCount, approxCountDistinct, countDistinct, lowerBound, upperBound, histogram, mean, standardDeviation, columnStoredBytes, completeness |
| LongType | 11 (all except nanValueCount): valueCount, nullValueCount, approxCountDistinct, countDistinct, lowerBound, upperBound, histogram, mean, standardDeviation, columnStoredBytes, completeness |
| StringType | 9: valueCount, nullValueCount, approxCountDistinct, countDistinct, lowerBound, upperBound, histogram, columnStoredBytes, completeness |
| BooleanType | 9: valueCount, nullValueCount, approxCountDistinct, countDistinct, lowerBound, upperBound, histogram, columnStoredBytes, completeness |
| TimestampTzType | 9: valueCount, nullValueCount, approxCountDistinct, countDistinct, lowerBound, upperBound, histogram, columnStoredBytes, completeness |
| ArrayType | 4: valueCount, nullValueCount, columnStoredBytes, completeness |
| ObjectType | 4: valueCount, nullValueCount, columnStoredBytes, completeness |
DoubleType is the only type that supports nanValueCount, since NaN is a floating-point concept. mean and standardDeviation are limited to numeric types (DoubleType and LongType).
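
The matrix reduces to three rules: NaN counts need a floating-point type, mean and standard deviation need a numeric type, and nested types keep only the four count and storage statistics. The sketch below encodes those rules as a lookup. The function name `supported_statistics` and the constant names are hypothetical, chosen for this example; they are not part of any platform API.

```python
# Statistic names in the order used by the reference table.
ALL_STATISTICS = (
    "valueCount", "nullValueCount", "nanValueCount", "approxCountDistinct",
    "countDistinct", "lowerBound", "upperBound", "histogram", "mean",
    "standardDeviation", "columnStoredBytes", "completeness",
)

FLOAT_ONLY = {"nanValueCount"}
NUMERIC_ONLY = {"mean", "standardDeviation"}
NESTED_STATISTICS = {
    "valueCount", "nullValueCount", "columnStoredBytes", "completeness",
}

FLOAT_TYPES = {"DoubleType"}
NUMERIC_TYPES = {"DoubleType", "LongType"}
SCALAR_TYPES = NUMERIC_TYPES | {"StringType", "BooleanType", "TimestampTzType"}
NESTED_TYPES = {"ArrayType", "ObjectType"}


def supported_statistics(type_name):
    """Return the statistics listed in the matrix for a given type name."""
    if type_name in NESTED_TYPES:
        return [s for s in ALL_STATISTICS if s in NESTED_STATISTICS]
    if type_name not in SCALAR_TYPES:
        raise ValueError(f"type not covered by the matrix: {type_name}")

    excluded = set()
    if type_name not in FLOAT_TYPES:
        excluded |= FLOAT_ONLY      # NaN is a floating-point concept
    if type_name not in NUMERIC_TYPES:
        excluded |= NUMERIC_ONLY    # mean and standardDeviation need numbers
    return [s for s in ALL_STATISTICS if s not in excluded]
```

Calling `supported_statistics("LongType")` returns the 11 statistics listed for that type, and a type name outside the matrix raises `ValueError` rather than guessing.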