Documentation
Groundr is an AI reliability and arbitration platform built on the Reality Negotiation Protocol (RNP). It answers one question for developers: "Can I safely show this AI output to my users?"
The system forces multiple AI models to debate each other using evidence, then uses mathematical arbitration to determine verified truth. Every disagreement is logged, every agent builds a reputation, and every claim is grounded in external proof.
Quick Start
Get Groundr running locally in under 2 minutes:
1. Clone & Install
git clone https://github.com/your-org/groundr.git
cd groundr
pip install -r requirements.txt
2. Configure Environment
Copy the example environment file and add your API keys:
cp .env.example .env
# Edit .env with your keys:
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# GOOGLE_API_KEY=AI...
3. Start the Server
uvicorn api.main:app --reload --port 8000
Visit http://localhost:8000 for the landing page, or http://localhost:8000/docs
for the interactive API explorer.
4. Make Your First Request (Verify v1)
curl -X POST http://localhost:8000/v1/verify \
-H "Content-Type: application/json" \
-H "X-API-Key: rnp_your_secret_key" \
-d '{
"query": "Does the EU Pro plan include feature X?",
"domain": "policy",
"mode": "balanced",
"outputs": [
{"model":"gpt-4o","provider":"openai","claim":"Yes, it is included.","confidence":0.90},
{"model":"claude-3-5-sonnet","provider":"anthropic","claim":"No, it is not included.","confidence":0.86}
]
}'
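The same request from Python (a minimal sketch; assumes the third-party requests library and the local server started above):

import requests

# Ask Groundr to arbitrate two conflicting model outputs
# (same payload as the curl example above).
resp = requests.post(
    "http://localhost:8000/v1/verify",
    headers={"X-API-Key": "rnp_your_secret_key"},
    json={
        "query": "Does the EU Pro plan include feature X?",
        "domain": "policy",
        "mode": "balanced",
        "outputs": [
            {"model": "gpt-4o", "provider": "openai", "claim": "Yes, it is included.", "confidence": 0.90},
            {"model": "claude-3-5-sonnet", "provider": "anthropic", "claim": "No, it is not included.", "confidence": 0.86},
        ],
    },
)
verdict = resp.json()
print(verdict["recommended_action"], verdict["risk_level"])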
Authentication
Production: API key is required via X-API-Key.
Development: local environments may allow relaxed auth depending on
GROUNDR_ENV.
Verify v1 HTTP routes (including GET /v1/kpis, GET /v1/audit/{tenant_id}, and
POST /v1/verify) follow the same rule: when GROUNDR_ENV=production, a valid
X-API-Key is required; in development the server may accept requests without a key,
so local tools and the analytics page can call the KPI endpoint from the browser.
X-API-Key: rnp_your_secret_key
Provider feedback reports require a separate provider token:
X-Provider-Token: grv-prov-...
BYOK headers are supported per request:
X-BYOK-OpenAI: sk-...
X-BYOK-Anthropic: sk-ant-...
X-BYOK-Google: AI...
X-BYOK-XAI: xai-...
X-BYOK-Brave: brave-...
X-BYOK-Goggles: https://my-goggles-config
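For example, BYOK headers ride along on a normal verify call (a sketch using the requests library; key values are placeholders):

import requests

# Per-request BYOK (bring your own key) headers; values are placeholders.
headers = {
    "X-API-Key": "rnp_your_secret_key",
    "X-BYOK-OpenAI": "sk-...",          # your OpenAI key
    "X-BYOK-Anthropic": "sk-ant-...",   # your Anthropic key
}
resp = requests.post(
    "http://localhost:8000/v1/verify",
    headers=headers,
    json={
        "query": "Does the EU Pro plan include feature X?",
        "domain": "policy",
        "mode": "fast",
        "outputs": [{"model": "gpt-4o", "provider": "openai",
                     "claim": "Yes, it is included.", "confidence": 0.90}],
    },
)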
Generate keys using the included utility:
python manage_keys.py generate --name "my-app"
python manage_keys.py list
python manage_keys.py revoke rnp_abc123
System Architecture
Groundr exposes two related surfaces. Verify v1 (POST /v1/verify) is the
production trust contract; the legacy negotiation engine (/agents,
/negotiate, WebSocket) remains available for experiments and older integrations.
Verify v1 stack (production contract)
src/verify_orchestrator.py — end-to-end verify pipeline
src/verify_schemas.py — frozen request/response models
src/policy_engine.py and config/policy.yaml — declarative trust decisions
src/atoms.py — claim atomization
src/brave_research.py — evidence retrieval
api/v1_verify.py — public HTTP routes for Verify v1
Legacy negotiation engine
Four primary components backing the engine API and WebSocket:
Negotiation Engine
src/rnp_core.py — The brain. Collects assertions, detects conflicts using proprietary
semantic analysis, and executes negotiation strategies.
Evidence Gatherer
src/evidence_gatherer.py — Interfaces with web search APIs to pull live
evidence. Automatically identifies Truth Anchors from high-authority domains.
Model Broker
src/broker.py — Normalizes API calls to OpenAI, Anthropic, and Gemini so all models
communicate using the same Assertion format.
Persistence Layer
api/database.py — SQLAlchemy-based storage for agents, assertions, conflicts, and results.
Supports SQLite (dev) and PostgreSQL (prod).
Data Flow
User Query
→ Model Broker (queries GPT-4, Claude, Gemini in parallel)
→ Assertions submitted to Negotiation Engine
→ Evidence Gatherer pulls live web sources
→ Conflict Detection (semantic similarity)
→ Arbitration (40/60 Rule + Truth Anchors)
→ Shared Reality (verified answer returned)
Truth-Augmented Generation API
The /generate endpoint provides a complete "Truth-in, Content-out" pipeline. It takes a prompt, arbitrates facts behind the scenes, corrects any detected AI hallucinations using live web evidence, and synthesizes polished, fact-checked content in your requested format.
POST /generate — Generate polished, truth-grounded content.
{
"prompt": "Write a short essay on Bitcoin",
"format": "essay", // essay, summary, bullet_points, report, answer
"tone": "professional", // professional, casual, academic
"max_words": 500,
"mode": "fast" // fast, balanced, deep
}
Response:
{
"content": "Bitcoin, a digital cryptocurrency...",
"format": "essay",
"grounding_status": "corrected", // verified, corrected, or unverifiable
"corrections_made": [
{
"wrong_claim": "Bitcoin is backed by gold",
"correct_fact": "Bitcoin is not backed by physical assets",
"source": "https://investopedia.com/..."
}
],
"confidence_score": 0.95,
"citations": [
{
"id": "src_1",
"url": "https://investopedia.com/...",
"title": "Bitcoin Definition",
"trust_tier": "tier_2",
"snippet": "..."
}
],
"sources_used": 4,
"models_consulted": 3,
"factual_queries": ["What is Bitcoin?", "What is Bitcoin backed by?"],
"latency_ms": 14000
}
Key Concept: Hallucination Correction. If the internal arbitration models hallucinate a fact (e.g., they all claim the current US President is Joe Biden), the grounding engine detects the contradiction against live web evidence, flags the claim as CRITICAL, and extracts the true fact (Donald Trump). The synthesis engine is then forced to use the corrected fact in the final content. These corrections are returned in the corrections_made array for transparency.
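A minimal client sketch for /generate (assumes the requests library; field names match the request and response shown above):

import requests

resp = requests.post(
    "http://localhost:8000/generate",
    headers={"X-API-Key": "rnp_your_secret_key"},
    json={"prompt": "Write a short essay on Bitcoin", "format": "essay",
          "tone": "professional", "max_words": 500, "mode": "fast"},
)
result = resp.json()
# Surface corrections to the user when the engine had to fix a hallucination.
if result["grounding_status"] == "corrected":
    for fix in result["corrections_made"]:
        print(f'Corrected: "{fix["wrong_claim"]}" -> {fix["correct_fact"]} ({fix["source"]})')
print(result["content"])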
Verify v1 API (Primary)
The primary product surface for production integrations. The authoritative JSON models are the
interactive OpenAPI schema at /docs and src/verify_schemas.py in the repo.
Submit one or more model outputs and receive a structured trust verdict with explicit action guidance.
Optional request fields include domain_risk_class, tenant_id,
allow_escalation, and policy_overrides; each output may include
cited_sources (URLs).
{
"query": "Does the EU Pro plan include feature X?",
"domain": "policy",
"mode": "balanced",
"outputs": [
{"model":"gpt-4o","provider":"openai","claim":"Yes...","confidence":0.90},
{"model":"claude-3-5-sonnet","provider":"anthropic","claim":"No...","confidence":0.86}
]
}
Response (illustrative; arrays/objects may be longer in production):
{
"schema_version": "1.0.0",
"run_id": "v_ab12cd34ef56",
"created_at": "2026-01-15T12:00:00+00:00",
"safe_to_display": false,
"risk_level": "HIGH",
"recommended_action": "HUMAN_REVIEW",
"confidence_score": 0.42,
"consensus_claim": "",
"warnings": ["Multiple models disagree on material facts."],
"failure_modes": [
{
"kind": "search_outage",
"description": "Live evidence fetch degraded.",
"severity": "HIGH",
"related_atom_ids": ["a_abc123def0"]
}
],
"blocked_atoms": [],
"edit_suggestions": [],
"atoms": [
{
"id": "a_abc123def0",
"text": "The EU Pro plan includes feature X.",
"type": "fact",
"sensitivity": "medium",
"source_models": ["gpt-4o"]
}
],
"atom_verdicts": [
{
"atom_id": "a_abc123def0",
"status": "conflicting",
"authority_summary": {"tier_1": 0, "tier_2": 1, "tier_3": 0, "tier_4": 0},
"nli_summary": {"entailment": 0, "contradiction": 1, "neutral": 0},
"diversity_groups": 1,
"freshness_ok": true,
"notes": []
}
],
"evidence": [
{
"id": "e_xyz789",
"url": "https://example.com/pricing",
"hostname": "example.com",
"authority_tier": "tier_2",
"diversity_group": "example.com",
"title": "",
"snippet": "...",
"relevance": 0.71,
"nli_label": "contradiction",
"nli_confidence": 0.82,
"is_outdated": false
}
],
"models_used": ["gpt-4o", "claude-3-5-sonnet"],
"policy": {
"version": "1.0.0",
"rules_fired": ["evidence.search_outage"],
"domain_risk_class": "HIGH",
"mode_used": "balanced",
"escalated": false,
"escalation_reason": null
},
"uncertainty": {
"disclaimer": "Groundr reduces risk by arbitrating AI outputs and checking sources; it does not guarantee factual truth.",
"notes": ["Review cited sources before acting."]
},
"atomizer_version": "1.0.0",
"latency_ms": 1840,
"cost_estimate_cents": 0.0
}
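A typical host-side pattern is to branch on recommended_action before the draft reaches users (a sketch; the enum values come from the feedback section below, and the review handling is your own):

def handle_verdict(verdict: dict, draft: str) -> str:
    # Gate the AI draft on Groundr's verdict.
    action = verdict["recommended_action"]
    if verdict["safe_to_display"] and action == "SHOW":
        return draft
    if action == "SHOW_WITH_WARNING":
        return draft + "\n\n" + " ".join(verdict["warnings"])
    # EDIT_SUGGESTED / HUMAN_REVIEW / BLOCK: keep the draft away from users.
    raise RuntimeError(f"Held for {action}: {verdict['warnings']}")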
Record what the user did after seeing the verdict; this powers the override-rate KPI.
{"user_action_taken": "HUMAN_REVIEW"}
Use user_action_taken to describe the action the host product actually took. Common pattern:
when the user explicitly followed Groundr, send the same string as recommended_action from the
verify response (SHOW, SHOW_WITH_WARNING, EDIT_SUGGESTED,
HUMAN_REVIEW, or BLOCK). Send IGNORED when the user dismissed or
overrode Groundr's guidance (IGNORED is not an enum value on the verify response, but it is accepted here for analytics).
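Reporting back might look like this (a sketch; the feedback route shown is hypothetical, so check /docs for the exact path):

import requests

# NOTE: /v1/feedback is a hypothetical path for illustration only;
# see the OpenAPI schema at /docs for the real feedback route.
requests.post(
    "http://localhost:8000/v1/feedback",
    headers={"X-API-Key": "rnp_your_secret_key"},
    json={"user_action_taken": "IGNORED"},  # user overrode Groundr's guidance
)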
GET /v1/kpis — Returns operational trust KPIs (false-safe rate, interception rate, action mix, override rate, p95 latency).
See Authentication for when X-API-Key is required.
GET /v1/audit/{tenant_id} — Tenant-scoped decision replay export for governance and compliance review.
Provider-isolated feedback report. Requires X-Provider-Token.
GET /v1/health — Health and schema/policy version metadata for the v1 stack.
Legacy API
Legacy compatibility endpoints are still available but are no longer the primary integration path.
POST /validate/query — Legacy query arbitration endpoint.
Legacy multi-output validation endpoint.
GET /validate/health — Legacy health endpoint.
Migration guide
POST /validate/query -> POST /v1/verify (supply outputs[] or host-generated drafts)
GET /validate/health -> GET /v1/health
legacy UI verdict parsing -> use recommended_action + policy.rules_fired + uncertainty.notes
Engine API
Low-level endpoints for direct interaction with the negotiation engine.
Register a new AI agent with the negotiation engine.
{
"agent_id": "gpt-4",
"name": "GPT-4 Turbo",
"expertise": {"geography": 0.9, "finance": 0.85}
}
List all registered agents with reputation info, wins, losses, and active assertions.
Get detailed reputation breakdown for a specific agent, including win rate and expertise scores.
Submit an assertion from a registered agent.
{
"agent_id": "gpt-4",
"domain": "geography",
"claim": "Lusaka is the capital of Zambia",
"confidence": 0.98,
"evidence": [
{
"source": "https://en.wikipedia.org/wiki/Zambia",
"reliability": 0.85,
"verification_tier": "verified",
"is_truth_anchor": false
}
]
}
POST /negotiate — Trigger the full negotiation pipeline: detect conflicts and resolve them. Supports all strategies.
{
"strategy": "majority_vote"
}
// Valid strategies: "majority_vote", "bayesian_update", "weighted_consensus"
Retrieve the current "Shared Reality" — all verified, arbitrated truths across all domains.
List all detected conflicts with semantic similarity scores between competing claims.
WebSocket API
Connect to ws://localhost:8000/ws/negotiate for real-time streaming of negotiation events.
Events Received
- agent_registered — When a new agent joins
- assertion_submitted — When a new claim is made
- negotiation_complete — When arbitration finishes with results
Commands
// Trigger a negotiation
{"action": "negotiate", "strategy": "bayesian_update"}
// Get current system status
{"action": "status"}
JavaScript Example
const ws = new WebSocket('ws://localhost:8000/ws/negotiate');
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
if (data.event === 'negotiation_complete') {
console.log(`Resolved ${data.total_conflicts} conflicts`);
}
};
// Trigger negotiation
ws.send(JSON.stringify({
action: 'negotiate',
strategy: 'majority_vote'
}));
Core Algorithms
Evidence-First Scoring
A proprietary formula ensures evidence always outweighs raw model confidence. Unsupported claims are systematically down-ranked, regardless of how confident the model appears.
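The formula itself is proprietary, but the shape of the idea can be sketched (an illustrative toy only; the weights, and reading the data-flow diagram's "40/60 Rule" as a 40% confidence / 60% evidence split, are assumptions, not Groundr's actual math):

def score(confidence: float, evidence_weight: float) -> float:
    # Toy evidence-first score: evidence dominates stated confidence.
    return 0.4 * confidence + 0.6 * evidence_weight

# An unsupported-but-confident claim loses to a well-evidenced modest one:
score(0.95, 0.0)   # 0.38 -> down-ranked despite high confidence
score(0.70, 0.85)  # 0.79 -> wins on evidence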
Truth Anchor Multiplier
When evidence comes from a high-authority source, it receives a significant reliability bonus. This creates a "gravity well" for verified truth, allowing a single well-sourced minority claim to defeat an unsourced majority.
Hallucination Snowball Detection
When all models agree but none have evidence, Groundr flags it as a "Consensus Hallucination." This prevents the dangerous scenario where multiple AI models reinforce each other's lies because they share the same training-data bias.
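A sketch of the check (illustrative; the 0.3 threshold comes from the Majority Vote description below, and string equality stands in for the real semantic clustering):

def is_consensus_hallucination(assertions: list[dict]) -> bool:
    # All models make the same claim, but the cluster's average evidence is weak.
    claims_agree = len({a["claim"] for a in assertions}) == 1
    avg_evidence = sum(a["evidence_weight"] for a in assertions) / len(assertions)
    return claims_agree and avg_evidence < 0.3

is_consensus_hallucination([
    {"claim": "X is true", "evidence_weight": 0.1},
    {"claim": "X is true", "evidence_weight": 0.0},
])  # True: unanimous but unsupported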
Temporal Decay
Claims lose confidence over time, encouraging frequent re-validation. This prevents stale assertions from persisting in the Shared Reality when newer, more accurate information becomes available.
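One common decay form (an assumption; the docs don't specify Groundr's actual schedule) is exponential:

import math

def decayed_confidence(confidence: float, age_days: float,
                       half_life_days: float = 30.0) -> float:
    # Exponential decay: confidence halves every `half_life_days`.
    # The half-life value here is illustrative, not Groundr's setting.
    return confidence * math.exp(-math.log(2) * age_days / half_life_days)

decayed_confidence(0.9, age_days=30)  # 0.45 -> due for re-validation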
Negotiation Strategies
Majority Vote
The most common cluster of semantically similar claims wins. Enhanced with evidence-awareness: if the majority cluster's average evidence weight is below 0.3, a "Consensus Hallucination" warning is triggered. A minority with a Truth Anchor can override the majority.
Bayesian Update
Uses agent reputation as a prior and evidence as the likelihood function. Agents with higher win rates get more weight, but evidence can still override reputation. Best for domains requiring long-term expertise tracking.
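Conceptually (a textbook sketch, not Groundr's implementation), the posterior belief in a claim is proportional to the agent's reputation prior times the evidence likelihood:

def bayesian_belief(reputation_prior: float, evidence_likelihood: float,
                    rival_prior: float, rival_likelihood: float) -> float:
    # Two competing claims; normalize the posterior over both.
    p = reputation_prior * evidence_likelihood
    q = rival_prior * rival_likelihood
    return p / (p + q)

# A low-reputation agent (0.3) with strong evidence (0.9) beats
# a high-reputation agent (0.7) with weak evidence (0.2):
bayesian_belief(0.3, 0.9, 0.7, 0.2)  # ~0.66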
Weighted Consensus
Direct weighting based on pre-defined expertise scores per domain. Useful when you know which models excel in specific areas (e.g., GPT-4 for code, Gemini for multimodal).
Evidence & Grounding
Groundr supports three tiers of evidence verification:
Verified
Evidence from external, authoritative sources that have been cross-referenced. Highest reliability.
Self-Reported
Evidence provided by the agent itself without external verification. Moderate reliability.
Unverified
No evidence provided. Lowest reliability — the agent is making a claim purely on internal knowledge.
Truth Anchor Sources
The following domains are automatically flagged as Truth Anchors with a 1.5× reliability bonus:
.gov domains — Government sources
reuters.com — Reuters news agency
nature.com — Nature research journal
who.int — World Health Organization
worldbank.org — World Bank
un.org — United Nations
nasa.gov — NASA
esa.int — ESA
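A sketch of the anchor check (illustrative; the host list mirrors the one above, and capping the boosted reliability at 1.0 is an assumption):

TRUTH_ANCHOR_HOSTS = {"reuters.com", "nature.com", "who.int",
                      "worldbank.org", "un.org", "nasa.gov", "esa.int"}

def anchor_reliability(hostname: str, base_reliability: float) -> float:
    # .gov domains and the listed hosts get the 1.5x Truth Anchor bonus.
    is_anchor = hostname.endswith(".gov") or hostname in TRUTH_ANCHOR_HOSTS
    bonus = 1.5 if is_anchor else 1.0
    return min(base_reliability * bonus, 1.0)  # cap at 1.0 (assumption)

anchor_reliability("who.int", 0.6)      # 0.9
anchor_reliability("example.com", 0.6)  # 0.6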
Risk Levels
Every validation response includes a risk assessment:
| Level | Confidence | Criteria | Recommendation |
|---|---|---|---|
| LOW | ≥ 0.8 | High agreement, strong evidence (ratio ≤ 0.2) | Safe to display to users |
| MEDIUM | 0.6 – 0.8 | Partial agreement or moderate evidence (ratio ≤ 0.5) | Display with disclaimer |
| HIGH | 0.4 – 0.6 | Conflict detected, weak evidence | Do not display without review |
| CRITICAL | < 0.4 | Consensus hallucination suspected | Block output entirely |
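Mapping a confidence score to a level (a sketch of the table's confidence column only; the real verdict also weighs the Criteria column, and the boundary handling at exactly 0.6 and 0.8 is an assumption):

def risk_level(confidence: float) -> str:
    if confidence >= 0.8:
        return "LOW"       # safe to display
    if confidence >= 0.6:
        return "MEDIUM"    # display with disclaimer
    if confidence >= 0.4:
        return "HIGH"      # do not display without review
    return "CRITICAL"      # block output entirely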
Deployment
Docker
# Build and run with Docker Compose
docker-compose up --build
# The API will be available at http://localhost:8000
Environment Variables
| Variable | Description | Required |
|---|---|---|
| OPENAI_API_KEY | OpenAI API key for GPT models | Yes |
| ANTHROPIC_API_KEY | Anthropic API key for Claude | Yes |
| GOOGLE_API_KEY | Google API key for Gemini | Yes |
| DATABASE_URL | PostgreSQL connection string | Production only |
| BRAVE_SEARCH_API_KEY | Brave Search for evidence gathering | Optional |
| SERPAPI_KEY | SerpAPI for evidence gathering | Optional |