Python SDK

Official Python client library for AletheionGuard

Quick Start

Installation

pip install aletheion-guard

Basic Usage

from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()
result = auditor.evaluate("Paris is the capital of France")
print(result.verdict)  # "ACCEPT"

Installation

Via pip (Recommended)

pip install aletheion-guard

From Source

git clone https://github.com/AletheionAGI/AletheionGuard-Pypi.git
cd AletheionGuard-Pypi
pip install -e .

Requirements

  • Python 3.8+ (3.10+ recommended)
  • torch >= 2.0.0
  • transformers >= 4.30.0
  • sentence-transformers >= 2.2.0
  • numpy >= 1.24.0
  • scipy >= 1.10.0
  • pydantic >= 2.0.0

EpistemicAuditor Class

Initialization

from aletheion_guard import EpistemicAuditor

# Default configuration
auditor = EpistemicAuditor()

# Custom configuration
auditor = EpistemicAuditor(
    config={
        "q1_threshold": 0.4,
        "q2_threshold": 0.3,
        "device": "cuda",  # "cuda" or "cpu"
        "model_path": "models/real_finetuned",
    }
)

Configuration Options

  • q1_threshold: Threshold for aleatoric uncertainty (default: 0.35)
  • q2_threshold: Threshold for epistemic uncertainty (default: 0.35)
  • device: Compute device - "cuda" or "cpu" (auto-detected)
  • model_path: Path to custom model weights
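
These thresholds drive the verdict. For intuition only, here is a minimal sketch of one plausible decision rule; the actual logic inside EpistemicAuditor is internal to the library and may differ:

# Hypothetical reconstruction of the verdict rule, for intuition only.
def sketch_verdict(q1: float, q2: float,
                   q1_threshold: float = 0.35,
                   q2_threshold: float = 0.35) -> str:
    if q1 <= q1_threshold and q2 <= q2_threshold:
        return "ACCEPT"   # both uncertainties low
    if q1 > q1_threshold and q2 > q2_threshold:
        return "REFUSED"  # both uncertainties high
    return "MAYBE"        # mixed signal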

evaluate() - Single Audit

result = auditor.evaluate(
    text="Paris is the capital of France",
    context="Optional context"  # Optional
)

# Access results
print(result.q1)       # 0.15
print(result.q2)       # 0.08
print(result.height)   # 0.83
print(result.verdict)  # "ACCEPT"
print(result.ece)      # 0.042

Return Type: EpistemicAudit

  • q1 (float): Aleatoric uncertainty (0.0-1.0)
  • q2 (float): Epistemic uncertainty (0.0-1.0)
  • height (float): Proximity to truth (0.0-1.0)
  • verdict (str): "ACCEPT" | "MAYBE" | "REFUSED"
  • ece (float): Expected Calibration Error
  • brier (float): Brier score
  • confidence_interval (Tuple[float, float]): 95% confidence interval for height
  • explanation (str): Human-readable reasoning
  • metadata (dict): Additional diagnostics
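
Beyond q1, q2, and the verdict, the audit exposes calibration and diagnostic fields. For example:

result = auditor.evaluate("Water boils at 100°C at sea level")

low, high = result.confidence_interval  # 95% CI for height
print(f"height={result.height:.2f} (95% CI {low:.2f}-{high:.2f})")
print(f"ECE={result.ece:.3f}, Brier={result.brier:.3f}")
print(result.explanation)  # human-readable reasoning
print(result.metadata)     # additional diagnostics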

batch_evaluate() - Batch Audit

texts = [
    "Paris is the capital of France",
    "The sky is green",
    "Water boils at 100°C",
]

results = auditor.batch_evaluate(
    texts=texts,
    batch_size=32  # Optional, default: 32
)

# Iterate over results
for i, result in enumerate(results):
    print(f"Text {i}: {result.verdict}")

Performance Tip: Batch processing is 5-10x faster than individual calls for multiple texts.
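
To verify the speedup on your own hardware, you can time both paths (a sketch; absolute numbers will vary):

import time

texts = ["Paris is the capital of France"] * 100

# One forward pass per text
start = time.perf_counter()
for text in texts:
    auditor.evaluate(text)
loop_time = time.perf_counter() - start

# Texts processed in batches of 32
start = time.perf_counter()
auditor.batch_evaluate(texts=texts, batch_size=32)
batch_time = time.perf_counter() - start

print(f"loop: {loop_time:.2f}s, batch: {batch_time:.2f}s")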

Complete Examples

1. OpenAI Integration

import openai
from aletheion_guard import EpistemicAuditor

# Initialize
client = openai.OpenAI(api_key="your-key")
auditor = EpistemicAuditor()

# Get LLM response
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is AI?"}]
)
text = response.choices[0].message.content

# Audit the response
audit = auditor.evaluate(text)
if audit.verdict == "REFUSED":
    print("⚠️ High uncertainty detected")
    print(f"Q2 (epistemic): {audit.q2:.3f}")
else:
    print(text)

2. RAG Integration

from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()

# `llm`, `retriever`, `query`, and `initial_docs` are placeholders
# for your own generation and retrieval components.

# Generate initial response
response = llm.generate(query, context=initial_docs)

# Audit response
audit = auditor.evaluate(response)

# If Q2 is high, retrieve more context and regenerate
if audit.q2 > 0.3:
    print("High epistemic uncertainty - retrieving more context")
    extra_docs = retriever.retrieve(query, k=10)
    response = llm.generate(query, context=extra_docs)
    audit = auditor.evaluate(response)

print(f"Final verdict: {audit.verdict}")

3. LangChain Integration

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from aletheion_guard import EpistemicAuditor

# Setup LangChain
llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer this question: {question}"
)
chain = LLMChain(llm=llm, prompt=prompt)

# Setup auditor
auditor = EpistemicAuditor()

# Run chain and audit
response = chain.run("What is quantum computing?")
audit = auditor.evaluate(response)

print(f"Response: {response}")
print(f"Verdict: {audit.verdict} (Q2: {audit.q2:.3f})")

4. Compare Multiple Models

from aletheion_guard import EpistemicAuditor

auditor = EpistemicAuditor()

# Get responses from different models
# (`gpt4`, `claude`, and `llama` are placeholders for your own clients)
prompt = "Explain neural networks"
responses = {
    "gpt-4": gpt4.generate(prompt),
    "claude-3": claude.generate(prompt),
    "llama-3": llama.generate(prompt),
}

# Audit all responses
results = {}
for model, text in responses.items():
    results[model] = auditor.evaluate(text)

# Rank by Q2 (epistemic uncertainty, lower is better)
ranked = sorted(results.items(), key=lambda x: x[1].q2)

print("Model Ranking (best to worst):")
for model, audit in ranked:
    print(f"{model}: Q2={audit.q2:.3f}, Verdict={audit.verdict}")

Advanced Features

Custom Thresholds

# More conservative (fewer ACCEPTs, more REFUSEDs)
auditor = EpistemicAuditor(config={
    "q1_threshold": 0.25,  # Lower threshold
    "q2_threshold": 0.25,  # Lower threshold
})

# More permissive (more ACCEPTs, fewer REFUSEDs)
auditor = EpistemicAuditor(config={
    "q1_threshold": 0.45,  # Higher threshold
    "q2_threshold": 0.45,  # Higher threshold
})

GPU Acceleration

# Use GPU if available
auditor = EpistemicAuditor(config={"device": "cuda"})
# Force CPU
auditor = EpistemicAuditor(config={"device": "cpu"})
# Auto-detect (default)
auditor = EpistemicAuditor() # Uses CUDA if available

Error Handling

from aletheion_guard import EpistemicAuditor, AuditorError

try:
    auditor = EpistemicAuditor()
    result = auditor.evaluate("Some text")
except AuditorError as e:
    print(f"Error: {e}")
except ValueError as e:
    print(f"Invalid input: {e}")

Best Practices

✓ Use Batch Processing

For multiple texts, use batch_evaluate() instead of looping evaluate(). It's 5-10x faster.

✓ Reuse Auditor Instance

Create one EpistemicAuditor instance and reuse it. Initialization loads models (~500MB), which is expensive.
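
A simple way to enforce this is a lazily created, cached instance (a sketch; the get_auditor helper is our own, not part of the SDK):

from functools import lru_cache

from aletheion_guard import EpistemicAuditor

@lru_cache(maxsize=1)
def get_auditor() -> EpistemicAuditor:
    # First call pays the model-loading cost (~500MB); later calls reuse it.
    return EpistemicAuditor()

result = get_auditor().evaluate("Paris is the capital of France")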

✓ Use GPU for Production

Enable CUDA for 3-5x faster inference. CPU is fine for testing but slower for production workloads.
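
Since torch is already a dependency, you can select the device explicitly at startup (a sketch using the documented "device" option):

import torch

from aletheion_guard import EpistemicAuditor

device = "cuda" if torch.cuda.is_available() else "cpu"
auditor = EpistemicAuditor(config={"device": device})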

⚠ Handle REFUSED Verdicts

Always check verdict before using the response. REFUSED means high hallucination risk.
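
A minimal guard pattern (the fallback messages are illustrative, not prescribed by the SDK):

response = "Paris is the capital of France"  # stand-in for an LLM answer
audit = auditor.evaluate(response)

if audit.verdict == "REFUSED":
    # High hallucination risk: do not surface the raw response
    reply = "I'm not confident in this answer; please verify independently."
elif audit.verdict == "MAYBE":
    reply = f"{response}\n\n(Note: moderate uncertainty, verify key facts.)"
else:  # "ACCEPT"
    reply = response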

⚠ Monitor Q2 Over Time

Track Q2 metrics to detect model drift or distribution shift in your application.
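
For example, keep a rolling mean of Q2 and alert when it climbs (the window size and alert threshold here are illustrative):

from collections import deque

q2_window = deque(maxlen=500)  # most recent audits

def record_audit(audit) -> None:
    q2_window.append(audit.q2)
    mean_q2 = sum(q2_window) / len(q2_window)
    if len(q2_window) == q2_window.maxlen and mean_q2 > 0.3:
        print(f"⚠️ Mean Q2 {mean_q2:.3f} - possible drift or distribution shift")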

Performance

  • Single evaluate(): ~50ms on CPU (Intel i7), ~15ms on GPU (RTX 3090)
  • Batch (32 items): ~500ms on CPU, ~80ms on GPU
  • Throughput (single): ~20 req/sec on CPU, ~65 req/sec on GPU
  • Throughput (batch 32): ~60 req/sec on CPU, ~400 req/sec on GPU

Next Steps

Need Help?

Join our community or reach out to our support team.