Python SDK
Official Python client library for AletheionGuard
Quick Start
Installation
```bash
pip install aletheion-guard
```
Basic Usage
```python
from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()
result = auditor.evaluate("Paris is the capital of France")

print(result.verdict)  # "ACCEPT"
```
Installation
Via pip (Recommended)
```bash
pip install aletheion-guard
```
From Source
```bash
git clone https://github.com/AletheionAGI/AletheionGuard-Pypi.git
cd AletheionGuard-Pypi
pip install -e .
```
Requirements
- Python 3.8+ (3.10+ recommended)
- torch >= 2.0.0
- transformers >= 4.30.0
- sentence-transformers >= 2.2.0
- numpy >= 1.24.0
- scipy >= 1.10.0
- pydantic >= 2.0.0
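To confirm the install resolved correctly, a quick import check is usually enough. A minimal sketch (note that constructing the auditor loads roughly 500MB of model weights, as described under Best Practices):

```python
# Minimal install check: verifies the package imports and reports the compute device.
import torch
from aletheion_guard import EpistemicAuditor

print("CUDA available:", torch.cuda.is_available())

auditor = EpistemicAuditor()  # loads model weights on construction
result = auditor.evaluate("Water is composed of hydrogen and oxygen.")
print(result.verdict, result.q1, result.q2)
```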
EpistemicAuditor Class
Initialization
```python
from aletheion_guard import EpistemicAuditor

# Default configuration
auditor = EpistemicAuditor()

# Custom configuration
auditor = EpistemicAuditor(
    config={
        "q1_threshold": 0.4,
        "q2_threshold": 0.3,
        "device": "cuda",  # "cuda" or "cpu"
        "model_path": "models/real_finetuned"
    }
)
```
Configuration Options
- q1_threshold: Threshold for aleatoric uncertainty (default: 0.35)
- q2_threshold: Threshold for epistemic uncertainty (default: 0.35)
- device: Compute device, "cuda" or "cpu" (auto-detected by default; the sketch below shows how to make the choice explicit)
- model_path: Path to custom model weights
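If you want the device choice to be visible in your own logs rather than relying on auto-detection, you can build the config explicitly. A minimal sketch (the threshold values are just the documented defaults):

```python
import torch
from aletheion_guard import EpistemicAuditor

# Mirror the auditor's auto-detection explicitly so the choice is logged.
device = "cuda" if torch.cuda.is_available() else "cpu"

auditor = EpistemicAuditor(
    config={
        "q1_threshold": 0.35,  # documented default
        "q2_threshold": 0.35,  # documented default
        "device": device,
    }
)
print(f"Auditor running on {device}")
```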
evaluate() - Single Audit
```python
result = auditor.evaluate(
    text="Paris is the capital of France",
    context="Optional context"  # Optional
)

# Access results
print(result.q1)       # 0.15
print(result.q2)       # 0.08
print(result.height)   # 0.83
print(result.verdict)  # "ACCEPT"
print(result.ece)      # 0.042
```
Return Type: EpistemicAudit
| Attribute | Type | Description |
|---|---|---|
| q1 | float | Aleatoric uncertainty (0.0-1.0) |
| q2 | float | Epistemic uncertainty (0.0-1.0) |
| height | float | Proximity to truth (0.0-1.0) |
| verdict | str | "ACCEPT" \| "MAYBE" \| "REFUSED" |
| ece | float | Expected Calibration Error |
| brier | float | Brier score |
| confidence_interval | Tuple[float, float] | 95% CI for height |
| explanation | str | Human-readable reasoning |
| metadata | dict | Additional diagnostics |
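Beyond the headline scores, the calibration fields and diagnostics are often worth logging. A short sketch using only the attributes listed above:

```python
result = auditor.evaluate("Paris is the capital of France")

low, high = result.confidence_interval  # 95% CI for height
print(f"height = {result.height:.2f} (95% CI: {low:.2f}-{high:.2f})")
print(f"calibration: ECE = {result.ece:.3f}, Brier = {result.brier:.3f}")
print(result.explanation)  # human-readable reasoning
print(result.metadata)     # additional diagnostics
```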
batch_evaluate() - Batch Audit
```python
texts = [
    "Paris is the capital of France",
    "The sky is green",
    "Water boils at 100°C"
]

results = auditor.batch_evaluate(
    texts=texts,
    batch_size=32  # Optional, default: 32
)

# Iterate over results
for i, result in enumerate(results):
    print(f"Text {i}: {result.verdict}")
```
Performance Tip: Batch processing is 5-10x faster than individual calls for multiple texts.
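A common follow-up is to partition a batch by verdict so only flagged items get a second look. A minimal sketch, assuming batch_evaluate() returns results in the same order as the input texts:

```python
# Group batch results by verdict for downstream handling.
flagged, passed = [], []
for text, result in zip(texts, results):
    if result.verdict == "REFUSED":
        flagged.append((text, result))
    else:
        passed.append((text, result))

print(f"{len(passed)} accepted/maybe, {len(flagged)} refused")
for text, result in flagged:
    print(f"REFUSED (Q2={result.q2:.3f}): {text}")
```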
Complete Examples
1. OpenAI Integration
```python
import openai
from aletheion_guard import EpistemicAuditor

# Initialize
client = openai.OpenAI(api_key="your-key")
auditor = EpistemicAuditor()

# Get LLM response
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is AI?"}]
)
text = response.choices[0].message.content

# Audit the response
audit = auditor.evaluate(text)

if audit.verdict == "REFUSED":
    print("⚠️ High uncertainty detected")
    print(f"Q2 (epistemic): {audit.q2:.3f}")
else:
    print(text)
```
2. RAG Integration
```python
from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()

# `llm`, `retriever`, `query`, and `initial_docs` come from your own RAG stack.
# Generate initial response
response = llm.generate(query, context=initial_docs)

# Audit response
audit = auditor.evaluate(response)

# If Q2 is high, retrieve more context
if audit.q2 > 0.3:
    print("High epistemic uncertainty - retrieving more context")
    extra_docs = retriever.retrieve(query, k=10)
    response = llm.generate(query, context=extra_docs)
    audit = auditor.evaluate(response)

print(f"Final verdict: {audit.verdict}")
```
3. LangChain Integration
```python
# Note: this example targets the legacy LangChain interfaces (LLMChain, chain.run);
# newer LangChain releases expose an equivalent runnable/pipe API.
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from aletheion_guard import EpistemicAuditor

# Setup LangChain
llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer this question: {question}"
)
chain = LLMChain(llm=llm, prompt=prompt)

# Setup auditor
auditor = EpistemicAuditor()

# Run chain and audit
response = chain.run("What is quantum computing?")
audit = auditor.evaluate(response)

print(f"Response: {response}")
print(f"Verdict: {audit.verdict} (Q2: {audit.q2:.3f})")
```
4. Compare Multiple Models
```python
from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()

# Get responses from different models
# (`gpt4`, `claude`, and `llama` are placeholders for your own model clients)
prompt = "Explain neural networks"
responses = {
    "gpt-4": gpt4.generate(prompt),
    "claude-3": claude.generate(prompt),
    "llama-3": llama.generate(prompt)
}

# Audit all responses
results = {}
for model, text in responses.items():
    results[model] = auditor.evaluate(text)

# Rank by Q2 (epistemic uncertainty) - lower is better
ranked = sorted(
    results.items(),
    key=lambda x: x[1].q2
)

print("Model Ranking (best to worst):")
for model, audit in ranked:
    print(f"{model}: Q2={audit.q2:.3f}, Verdict={audit.verdict}")
```
Advanced Features
Custom Thresholds
```python
# More conservative (fewer ACCEPT, more REFUSED verdicts)
auditor = EpistemicAuditor(config={
    "q1_threshold": 0.25,  # Lower threshold
    "q2_threshold": 0.25   # Lower threshold
})

# More permissive (more ACCEPT, fewer REFUSED verdicts)
auditor = EpistemicAuditor(config={
    "q1_threshold": 0.45,  # Higher threshold
    "q2_threshold": 0.45   # Higher threshold
})
```
GPU Acceleration
```python
# Use GPU if available
auditor = EpistemicAuditor(config={"device": "cuda"})

# Force CPU
auditor = EpistemicAuditor(config={"device": "cpu"})

# Auto-detect (default)
auditor = EpistemicAuditor()  # Uses CUDA if available
```
Error Handling
```python
from aletheion_guard import EpistemicAuditor, AuditorError

try:
    auditor = EpistemicAuditor()
    result = auditor.evaluate("Some text")
except AuditorError as e:
    print(f"Error: {e}")
except ValueError as e:
    print(f"Invalid input: {e}")
```
Best Practices
✓ Use Batch Processing
For multiple texts, use batch_evaluate() instead of looping evaluate(). It's 5-10x faster.
✓ Reuse Auditor Instance
Create one EpistemicAuditor instance and reuse it. Initialization loads models (~500MB), which is expensive.
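One simple way to enforce this is a module-level accessor that every call site shares. A minimal sketch using only the documented constructor:

```python
# auditor_singleton.py - share one EpistemicAuditor across the application.
from functools import lru_cache

from aletheion_guard import EpistemicAuditor


@lru_cache(maxsize=1)
def get_auditor() -> EpistemicAuditor:
    """Load model weights once; later calls return the cached instance."""
    return EpistemicAuditor()
```

Call get_auditor() wherever an audit is needed; the model weights are only loaded on the first call.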
✓ Use GPU for Production
Enable CUDA for 3-5x faster inference. CPU is fine for testing but slower for production workloads.
⚠ Handle REFUSED Verdicts
Always check verdict before using the response. REFUSED means high hallucination risk.
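In practice this means branching on the verdict before anything reaches the user. A minimal sketch, where fallback_response() is a hypothetical handler you would supply:

```python
def safe_reply(auditor, text: str) -> str:
    """Return the text only when the audit does not flag it; otherwise fall back."""
    audit = auditor.evaluate(text)
    if audit.verdict == "REFUSED":
        # High hallucination risk: do not surface the original text.
        return fallback_response(text, audit)  # hypothetical handler
    if audit.verdict == "MAYBE":
        # Borderline case: surface it, but flag the uncertainty.
        return f"{text}\n\n(Note: moderate uncertainty, Q2={audit.q2:.2f})"
    return text
```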
⚠ Monitor Q2 Over Time
Track Q2 metrics to detect model drift or distribution shift in your application.
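A lightweight way to do this is to record Q2 for every audit and watch a rolling average; a sustained rise suggests drift or distribution shift. A minimal sketch using only the standard library:

```python
import time
from collections import deque

# Rolling window of recent Q2 (epistemic uncertainty) values.
recent_q2 = deque(maxlen=500)

def audit_and_track(auditor, text: str):
    audit = auditor.evaluate(text)
    recent_q2.append(audit.q2)
    rolling_avg = sum(recent_q2) / len(recent_q2)
    # Ship these numbers to your metrics backend; print is just a placeholder.
    print(f"{time.time():.0f} q2={audit.q2:.3f} rolling_avg={rolling_avg:.3f}")
    return audit
```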
Performance
| Operation | CPU (Intel i7) | GPU (RTX 3090) |
|---|---|---|
| Single evaluate() | ~50ms | ~15ms |
| Batch (32 items) | ~500ms | ~80ms |
| Throughput (single) | ~20 req/sec | ~65 req/sec |
| Throughput (batch 32) | ~60 req/sec | ~400 req/sec |
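Throughput depends heavily on hardware and text length, so it is worth re-measuring on your own setup. A minimal benchmarking sketch (your numbers will differ from the table above):

```python
import time

from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()
texts = ["Paris is the capital of France"] * 128

start = time.perf_counter()
results = auditor.batch_evaluate(texts=texts, batch_size=32)
elapsed = time.perf_counter() - start

print(f"{len(texts)} texts in {elapsed:.2f}s "
      f"({len(texts) / elapsed:.1f} texts/sec)")
```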
Need Help?
Join our community or reach out to our support team.