Python SDK

Official Python client library for AletheionGuard

Quick Start

Installation

pip install aletheion-guard

Basic Usage

from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()
result = auditor.evaluate("Paris is the capital of France")
print(result.verdict)  # "ACCEPT"

Installation

Via pip (Recommended)

pip install aletheion-guard

From Source

git clone https://github.com/AletheionAGI/AletheionGuard-Pypi.git
cd AletheionGuard-Pypi
pip install -e .

Requirements

  • Python 3.8+ (3.10+ recommended)
  • torch >= 2.0.0
  • transformers >= 4.30.0
  • sentence-transformers >= 2.2.0
  • numpy >= 1.24.0
  • scipy >= 1.10.0
  • pydantic >= 2.0.0

EpistemicAuditor Class

Initialization

from aletheion_guard import EpistemicAuditor

# Default configuration
auditor = EpistemicAuditor()

# Custom configuration
auditor = EpistemicAuditor(
    config={
        "q1_threshold": 0.4,
        "q2_threshold": 0.3,
        "device": "cuda",  # "cuda" or "cpu"
        "model_path": "models/real_finetuned",
    }
)

Configuration Options

  • q1_threshold: Threshold for aleatoric uncertainty (default: 0.35)
  • q2_threshold: Threshold for epistemic uncertainty (default: 0.35)
  • device: Compute device - "cuda" or "cpu" (auto-detected)
  • model_path: Path to custom model weights
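
These thresholds drive the verdict. For intuition only, here is a minimal sketch of one plausible decision rule; the actual logic inside EpistemicAuditor is internal to the library and may differ:

# Hypothetical reconstruction of the verdict rule, for intuition only.
def sketch_verdict(q1: float, q2: float,
                   q1_threshold: float = 0.35,
                   q2_threshold: float = 0.35) -> str:
    if q1 <= q1_threshold and q2 <= q2_threshold:
        return "ACCEPT"   # both uncertainties low
    if q1 > q1_threshold and q2 > q2_threshold:
        return "REFUSED"  # both uncertainties high
    return "MAYBE"        # mixed signal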

evaluate() - Single Audit

result = auditor.evaluate(
    text="Paris is the capital of France",
    context="Optional context"  # Optional
)

# Access results
print(result.q1)       # 0.15
print(result.q2)       # 0.08
print(result.height)   # 0.83
print(result.verdict)  # "ACCEPT"
print(result.ece)      # 0.042

Return Type: EpistemicAudit

  • q1 (float): Aleatoric uncertainty (0.0-1.0)
  • q2 (float): Epistemic uncertainty (0.0-1.0)
  • height (float): Proximity to truth (0.0-1.0)
  • verdict (str): "ACCEPT" | "MAYBE" | "REFUSED"
  • ece (float): Expected Calibration Error
  • brier (float): Brier score
  • confidence_interval (Tuple[float, float]): 95% confidence interval for height
  • explanation (str): Human-readable reasoning
  • metadata (dict): Additional diagnostics
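
Beyond q1, q2, and the verdict, the audit exposes calibration and diagnostic fields. For example:

result = auditor.evaluate("Water boils at 100°C at sea level")

low, high = result.confidence_interval  # 95% CI for height
print(f"height={result.height:.2f} (95% CI {low:.2f}-{high:.2f})")
print(f"ECE={result.ece:.3f}, Brier={result.brier:.3f}")
print(result.explanation)  # human-readable reasoning
print(result.metadata)     # additional diagnostics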

batch_evaluate() - Batch Audit

texts = [
    "Paris is the capital of France",
    "The sky is green",
    "Water boils at 100°C",
]

results = auditor.batch_evaluate(
    texts=texts,
    batch_size=32  # Optional, default: 32
)

# Iterate over results
for i, result in enumerate(results):
    print(f"Text {i}: {result.verdict}")

Performance Tip: Batch processing is 5-10x faster than individual calls for multiple texts.
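
To verify the speedup on your own hardware, you can time both paths (a sketch; absolute numbers will vary):

import time

texts = ["Paris is the capital of France"] * 100

# One forward pass per text
start = time.perf_counter()
for text in texts:
    auditor.evaluate(text)
loop_time = time.perf_counter() - start

# Texts processed in batches of 32
start = time.perf_counter()
auditor.batch_evaluate(texts=texts, batch_size=32)
batch_time = time.perf_counter() - start

print(f"loop: {loop_time:.2f}s, batch: {batch_time:.2f}s")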

Complete Examples

1. OpenAI Integration

import openai
from aletheion_guard import EpistemicAuditor

# Initialize
client = openai.OpenAI(api_key="your-key")
auditor = EpistemicAuditor()

# Get LLM response
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is AI?"}]
)
text = response.choices[0].message.content

# Audit the response
audit = auditor.evaluate(text)
if audit.verdict == "REFUSED":
    print("⚠️ High uncertainty detected")
    print(f"Q2 (epistemic): {audit.q2:.3f}")
else:
    print(text)

2. RAG Integration

from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()

# `llm`, `retriever`, `query`, and `initial_docs` are placeholders
# for your own generation and retrieval components.

# Generate initial response
response = llm.generate(query, context=initial_docs)

# Audit response
audit = auditor.evaluate(response)

# If Q2 is high, retrieve more context and regenerate
if audit.q2 > 0.3:
    print("High epistemic uncertainty - retrieving more context")
    extra_docs = retriever.retrieve(query, k=10)
    response = llm.generate(query, context=extra_docs)
    audit = auditor.evaluate(response)

print(f"Final verdict: {audit.verdict}")

3. LangChain Integration

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from aletheion_guard import EpistemicAuditor

# Setup LangChain
llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer this question: {question}"
)
chain = LLMChain(llm=llm, prompt=prompt)

# Setup auditor
auditor = EpistemicAuditor()

# Run chain and audit
response = chain.run("What is quantum computing?")
audit = auditor.evaluate(response)

print(f"Response: {response}")
print(f"Verdict: {audit.verdict} (Q2: {audit.q2:.3f})")

4. Compare Multiple Models

from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()

# Get responses from different models
# (`gpt4`, `claude`, and `llama` are placeholders for your own clients)
prompt = "Explain neural networks"
responses = {
    "gpt-4": gpt4.generate(prompt),
    "claude-3": claude.generate(prompt),
    "llama-3": llama.generate(prompt),
}

# Audit all responses
results = {}
for model, text in responses.items():
    results[model] = auditor.evaluate(text)

# Rank by Q2 (epistemic uncertainty, lower is better)
ranked = sorted(results.items(), key=lambda x: x[1].q2)

print("Model Ranking (best to worst):")
for model, audit in ranked:
    print(f"{model}: Q2={audit.q2:.3f}, Verdict={audit.verdict}")

Advanced Features

Custom Thresholds

# More conservative (fewer ACCEPTs, more REFUSEDs)
auditor = EpistemicAuditor(config={
    "q1_threshold": 0.25,  # Lower threshold
    "q2_threshold": 0.25,  # Lower threshold
})

# More permissive (more ACCEPTs, fewer REFUSEDs)
auditor = EpistemicAuditor(config={
    "q1_threshold": 0.45,  # Higher threshold
    "q2_threshold": 0.45,  # Higher threshold
})

GPU Acceleration

# Use GPU if available
auditor = EpistemicAuditor(config={"device": "cuda"})
# Force CPU
auditor = EpistemicAuditor(config={"device": "cpu"})
# Auto-detect (default)
auditor = EpistemicAuditor() # Uses CUDA if available

Error Handling

from aletheion_guard import EpistemicAuditor, AuditorError

try:
    auditor = EpistemicAuditor()
    result = auditor.evaluate("Some text")
except AuditorError as e:
    print(f"Error: {e}")
except ValueError as e:
    print(f"Invalid input: {e}")

Best Practices

✓ Use Batch Processing

For multiple texts, use batch_evaluate() instead of looping evaluate(). It's 5-10x faster.

✓ Reuse Auditor Instance

Create one EpistemicAuditor instance and reuse it. Initialization loads models (~500MB), which is expensive.
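
A simple way to enforce this is a lazily created, cached instance (a sketch; the get_auditor helper is our own, not part of the SDK):

from functools import lru_cache

from aletheion_guard import EpistemicAuditor

@lru_cache(maxsize=1)
def get_auditor() -> EpistemicAuditor:
    # First call pays the model-loading cost (~500MB); later calls reuse it.
    return EpistemicAuditor()

result = get_auditor().evaluate("Paris is the capital of France")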

✓ Use GPU for Production

Enable CUDA for 3-5x faster inference. CPU is fine for testing but slower for production workloads.
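
Since torch is already a dependency, you can select the device explicitly at startup (a sketch using the documented "device" option):

import torch

from aletheion_guard import EpistemicAuditor

device = "cuda" if torch.cuda.is_available() else "cpu"
auditor = EpistemicAuditor(config={"device": device})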

⚠ Handle REFUSED Verdicts

Always check verdict before using the response. REFUSED means high hallucination risk.
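
A minimal guard pattern (the fallback messages are illustrative, not prescribed by the SDK):

response = "Paris is the capital of France"  # stand-in for an LLM answer
audit = auditor.evaluate(response)

if audit.verdict == "REFUSED":
    # High hallucination risk: do not surface the raw response
    reply = "I'm not confident in this answer; please verify independently."
elif audit.verdict == "MAYBE":
    reply = f"{response}\n\n(Note: moderate uncertainty, verify key facts.)"
else:  # "ACCEPT"
    reply = response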

⚠ Monitor Q2 Over Time

Track Q2 metrics to detect model drift or distribution shift in your application.
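
For example, keep a rolling mean of Q2 and alert when it climbs (the window size and alert threshold here are illustrative):

from collections import deque

q2_window = deque(maxlen=500)  # most recent audits

def record_audit(audit) -> None:
    q2_window.append(audit.q2)
    mean_q2 = sum(q2_window) / len(q2_window)
    if len(q2_window) == q2_window.maxlen and mean_q2 > 0.3:
        print(f"⚠️ Mean Q2 {mean_q2:.3f} - possible drift or distribution shift")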

Performance

  • Single evaluate(): ~50ms on CPU (Intel i7), ~15ms on GPU (RTX 3090)
  • Batch (32 items): ~500ms on CPU, ~80ms on GPU
  • Throughput (single): ~20 req/sec on CPU, ~65 req/sec on GPU
  • Throughput (batch 32): ~60 req/sec on CPU, ~400 req/sec on GPU

Next Steps

Need Help?

Join our community or reach out to our support team.