Epistemic Uncertainty

Understanding the difference between what models don't know (epistemic) and what's inherently uncertain (aleatoric)

What is Epistemic Uncertainty?

Epistemic uncertainty (Q2) represents reducible model ignorance — the uncertainty that arises from insufficient training data, out-of-distribution queries, or knowledge gaps that the model could theoretically learn given more information.

Key Characteristics

  • Reducible: Can be decreased with more training data
  • Signals ignorance: Model doesn't have enough knowledge
  • Detects hallucination: High Q2 = high hallucination risk
  • Out-of-distribution: Identifies queries outside training domain
# Example: High epistemic uncertainty
question = "What will Bitcoin's price be tomorrow?"
result = auditor.evaluate(question)
print(result.q2) # High Q2 - future is unknowable
print(result.verdict) # "REFUSED"

Q1 (Aleatoric) vs Q2 (Epistemic)

Q1 - Aleatoric Uncertainty

Irreducible data noise and inherent ambiguity in the question itself.

Source: Data ambiguity
Reducibility: Irreducible
Solution: Clarify question
Action: Ask user for details
Example: "What's the capital of the Netherlands?"
(Amsterdam is the constitutional capital, but The Hague is the seat of government - the question admits two defensible answers)

Q2 - Epistemic Uncertainty

Reducible uncertainty from model ignorance and knowledge gaps.

Source: Model ignorance
Reducibility: Reducible
Solution: More training data
Action: Escalate to expert
Example: "What's the GDP of Narnia?"
(Model lacks knowledge - fictional place)
Aspect       | Aleatoric (Q1)         | Epistemic (Q2)
Source       | Data ambiguity         | Model ignorance
Reducibility | ❌ Irreducible         | ✅ Reducible
When High    | Question is ambiguous  | Model lacks knowledge
Verdict      | "MAYBE"                | "REFUSED"
Action       | Ask for clarification  | Retrieve more context

How Q1 and Q2 Are Measured

Neural Network Gates

AletheionGuard uses specialized neural networks to predict Q1 and Q2 from sentence embeddings.

# Q1 Gate (Aleatoric)
# Input: 384-dim embeddings
# Output: Q1 ∈ [0, 1]
class Q1Gate(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers are built once in __init__, not on every forward pass
        self.net = nn.Sequential(
            nn.Linear(384, 256),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(256, 1),
        )

    def forward(self, embeddings):
        return torch.sigmoid(self.net(embeddings))

# Q2 Gate (Epistemic - conditioned on Q1)
# Input: 384-dim embeddings + Q1 value
# Q2 is conditioned on Q1 for better calibration
class Q2Gate(nn.Module):
    def forward(self, embeddings, q1):
        # Concatenate Q1 so the gate can account for question ambiguity
        features = self.feature_net(torch.cat([embeddings, q1], dim=-1))
        q2 = self.q2_head(features)
        return torch.sigmoid(q2)

Why Q2 is Conditioned on Q1

Conditioning Q2 on Q1 improves calibration by 21%. If a question is very ambiguous (high Q1), the model should account for that when assessing its own knowledge (Q2).
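The conditioning step can be sketched by appending the Q1 scalar to the sentence embedding before the feature network. This is a minimal illustration: the layer sizes and module names here are assumptions, not AletheionGuard's actual architecture.

```python
import torch
import torch.nn as nn

class Q2GateSketch(nn.Module):
    """Toy Q2 gate that conditions on Q1 via feature concatenation."""
    def __init__(self, embed_dim: int = 384, hidden: int = 256):
        super().__init__()
        self.feature_net = nn.Sequential(
            nn.Linear(embed_dim + 1, hidden),  # +1 input for the Q1 scalar
            nn.ReLU(),
        )
        self.q2_head = nn.Linear(hidden, 1)

    def forward(self, embeddings: torch.Tensor, q1: torch.Tensor) -> torch.Tensor:
        # Append Q1 as an extra feature dimension, then score Q2 in [0, 1]
        x = torch.cat([embeddings, q1.unsqueeze(-1)], dim=-1)
        return torch.sigmoid(self.q2_head(self.feature_net(x)))

gate = Q2GateSketch()
emb = torch.randn(2, 384)       # batch of 2 sentence embeddings
q1 = torch.tensor([0.1, 0.8])   # low vs high aleatoric uncertainty
q2 = gate(emb, q1)
print(q2.shape)  # torch.Size([2, 1])
```

Because Q1 is part of the input, the network can learn different Q2 behavior for ambiguous versus clear questions, which is where the calibration gain comes from.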

Derived Metrics

Height (Proximity to Truth)

height = 1 - sqrt(q1² + q2²)
  • Range: [0, 1]
  • 0 = Base (completely uncertain)
  • 1 = Apex (perfect confidence)

Total Uncertainty

u = sqrt(q1² + q2²)
  • Range: [0, √2 ≈ 1.41]
  • Combines both sources of uncertainty
  • Used in verdict decision logic
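A quick worked example of both derived metrics, with q1 = 0.3 and q2 = 0.4:

```python
import math

q1, q2 = 0.3, 0.4
u = math.sqrt(q1**2 + q2**2)  # total uncertainty: sqrt(0.09 + 0.16)
height = 1 - u                # proximity to truth

print(round(u, 2))       # 0.5
print(round(height, 2))  # 0.5
```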

Verdict Decision Logic

AletheionGuard uses Q1, Q2, and total uncertainty to make verdicts:

u = 1.0 - height  # Total uncertainty
if q2 >= 0.35 or u >= 0.60:
    verdict = "REFUSED"  # High epistemic uncertainty
elif q1 >= 0.35 or (0.30 <= u < 0.60):
    verdict = "MAYBE"  # High aleatoric uncertainty
else:
    verdict = "ACCEPT"  # Low uncertainty
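The rules above can be wrapped into a small runnable function (thresholds copied from this page; the function name is illustrative, not part of the AletheionGuard API):

```python
import math

def verdict(q1: float, q2: float) -> str:
    """Apply the Q1/Q2 verdict thresholds described on this page."""
    u = math.sqrt(q1**2 + q2**2)  # total uncertainty (= 1 - height)
    if q2 >= 0.35 or u >= 0.60:
        return "REFUSED"   # high epistemic uncertainty
    if q1 >= 0.35 or 0.30 <= u < 0.60:
        return "MAYBE"     # high aleatoric uncertainty
    return "ACCEPT"        # low uncertainty

print(verdict(0.10, 0.05))  # ACCEPT
print(verdict(0.40, 0.10))  # MAYBE
print(verdict(0.10, 0.50))  # REFUSED
```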

ACCEPT

Low Q1 and Q2. Model is confident and likely correct.

MAYBE

High Q1 (aleatoric). Question is ambiguous, needs clarification.

REFUSED

High Q2 (epistemic). Model lacks knowledge, high hallucination risk.

Why Epistemic Uncertainty Matters

1. Enables Safe Automation

Without calibration, you cannot safely automate decisions. With epistemic uncertainty, you can:

if result['uncertainty'] < 0.15:
    return "AUTO_APPROVE"  # Safe to automate
elif result['uncertainty'] > 0.40:
    return "ESCALATE_TO_EXPERT"  # Too uncertain

2. Detects Hallucinations

High Q2 signals hallucination risk. AletheionGuard achieves ROC-AUC 0.94 for hallucination detection.

Precision@0.9 Recall: 0.87

3. Enables RAG Optimization

Trigger additional retrieval when epistemic uncertainty is high:

if audit.q2 > 0.3:
    # High epistemic uncertainty - get more context
    additional_docs = retriever.retrieve(query, k=10)
    response = llm.generate(query, additional_docs)

4. Compliance & Auditability

Critical for regulated industries (medical, legal, financial). Know when AI cannot make a decision with confidence.
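A minimal sketch of what an auditable decision record might look like, assuming a JSON-lines log; the field names are illustrative, not a prescribed AletheionGuard schema.

```python
import json
from datetime import datetime, timezone

# One record per decision; field names are hypothetical
record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "question": "Should I take this new experimental drug?",
    "q1": 0.22,
    "q2": 0.68,
    "verdict": "REFUSED",
    "action": "escalate_to_expert",
}
line = json.dumps(record, sort_keys=True)
print(line)  # one JSON line per decision, ready for an append-only audit log
```

Persisting Q1, Q2, and the verdict alongside each answer gives auditors a concrete trail of when and why the system declined to decide.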

Real-World Example

Healthcare Q&A System

Low Epistemic Uncertainty

Q: "What is the normal heart rate?"

Q1: 0.12
Q2: 0.08
Verdict: ACCEPT

✅ Safe to answer automatically

High Epistemic Uncertainty

Q: "Should I take this new experimental drug?"

Q1: 0.22
Q2: 0.68
Verdict: REFUSED

⚠️ Escalate to medical professional
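The routing step in this example can be sketched as a simple mapping from verdict to action (the action names are illustrative, not part of any prescribed API):

```python
def route(verdict: str) -> str:
    """Map an AletheionGuard verdict to a downstream action (hypothetical names)."""
    return {
        "ACCEPT": "answer_automatically",
        "MAYBE": "ask_for_clarification",
        "REFUSED": "escalate_to_medical_professional",
    }[verdict]

print(route("ACCEPT"))   # answer_automatically
print(route("REFUSED"))  # escalate_to_medical_professional
```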

Impact

  • Reduced hallucination rate from 18% to 4%
  • Improved user trust scores by 37%
  • 60% of queries auto-approved safely

The Pyramidal Architecture

AletheionGuard uses a geometric pyramid model to represent uncertainty:

          Apex (Truth)
              /\
             /  \
            / Q2 \       ← Epistemic
           /______\
          /        \
         /    Q1    \    ← Aleatoric
        /____________\
     Base (Max Uncertainty)

Apex (Height = 1)

Perfect knowledge. Q1 = 0, Q2 = 0. Model is certain and correct.

Middle (0.3 < Height < 0.7)

Moderate uncertainty. Some Q1 or Q2. Requires human review.

Base (Height = 0)

Maximum uncertainty. High Q1 and/or Q2. Cannot make reliable prediction.
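The three zones can be expressed as a small helper; the boundaries are taken from this page, and the function name is an illustration, not part of the AletheionGuard API.

```python
def pyramid_zone(height: float) -> str:
    """Classify a height value into the pyramid zones described above."""
    if height >= 0.7:
        return "apex"    # near-certain, safe to answer
    if height > 0.3:
        return "middle"  # moderate uncertainty, human review
    return "base"        # cannot make a reliable prediction

print(pyramid_zone(0.95))  # apex
print(pyramid_zone(0.5))   # middle
print(pyramid_zone(0.1))   # base
```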

Next Steps

Want to Learn More?

Explore our research paper and technical documentation