// ALETHEIA · v0.1 ALPHA

Find signal.
Expose truth.

The analyst workspace where documents, knowledge graphs, and AI-assisted research compose into one defensible chain of evidence.

GET IN TOUCH → SEE IT WORK

// THE PROBLEM

Analysts drown in documents.

Existing tools split into two camps: collection platforms that ingest everything and surface nothing, or generic chat that sounds confident but cannot defend a conclusion. Neither is built for the analyst whose job is to be right and to show their work.

// WHAT IT IS

Four datastores, one chain of evidence.

PostgreSQL is the canonical state. Solr finds documents. Qdrant finds the passage inside the document. Neo4j holds the analyst-curated knowledge graph — entities, relationships, provenance. Each store has a single job. Together they compose into evidence you can defend.

// HOW WE'RE DIFFERENT

Not just RAG.

The market has settled into two shapes. On one end, collection platforms that ingest everything and surface nothing — heavy, expensive, built for organisations with their own analyst army. On the other, generic chat wrapped around a vector store: fast to demo, impossible to defend.

Aletheia is the workspace in between. The full pipeline an analyst actually runs — from entity extraction and enrichment, through agentic research with token-budgeted retrieval, to adversarial deep research and defensible reporting — composed as one coherent system.

Not just RAG

Vector search is one of four datastores, not the whole product. Solr finds documents. Qdrant finds passages. Neo4j holds the analyst-curated graph. PostgreSQL is the canonical state. Each does the job it is best at.
Not auto-extraction

The knowledge graph is curated, not hallucinated. Entities and relationships are promoted by the analyst with provenance attached. The graph is evidence, not a guess.
Not a black box

Every retrieved chunk, every team verdict, every citation is inspectable. Deep Research runs Blue, Red, and Yellow stages before any synthesis ships — adversarial review is part of the pipeline, not a checkbox.
Not a hosted silo

BYOK across every provider. Self-host where the data lives. Your documents, your model, your infrastructure.

// AI ARCHITECTURE

How we use the LLM.

The system schedules, plans, and retrieves. The LLM is the analytical instrument — not the orchestrator. Aletheia composes deterministic planners with token-budgeted agentic loops, MCP-exposed tools, and adversarial multi-agent review.

Agentic loop

OpenAI function-calling spec owned by the application — portable across Claude, GPT, and local models. The model returns tool calls; we dispatch, append results, and call again.
MCP server

Thirteen analyst tools exposed over JSON-RPC at /mcp — usable from Claude Desktop, LM Studio, or any MCP client.
Two LLM roles

Low-temperature enrichment for semantic analysis (claim extraction, sentiment, classification) and higher-temperature chat for agentic research. Separate priority queues, separate concurrency caps.
Token & cost accounting

Every retrieved chunk counts against an explicit token ceiling. Per-request token use and provider-priced cost estimates roll up live across the gateway — no silent overruns, no surprise bills.
Blue · Red · Yellow team analysis

Deep Research runs Blue (build the case), Red (challenge it), Yellow (resolve disagreements) before any synthesis ships. Adversarial review is part of the pipeline, not a checkbox.
Deterministic pipelines

Deep Research stages run as ordered, hand-built code with explicit early-exit conditions. The LLM is invoked at named decision points — not as the orchestrator.

// FEATURES

The workspace, in four moves.

// FEATURE 01

Knowledge graph workspace

Entities, relationships, and provenance — analyst-curated, not auto-generated. Search connections, expand networks.

Aletheia chat interface mid-research-session with token-budget indicator

// FEATURE 02

Chat with research sessions

Agentic tool loops with a hard token budget. Every retrieved chunk is tracked against the budget. The model searches, reads, and synthesizes — then stops when the evidence is in. Documents used are cited.

DECOMPOSE

Aletheia deep research decomposition view showing original query and ~20 generated sub-queries

REVIEW

Aletheia deep research team analysis view showing Blue, Red, and Yellow stage verdicts

REPORT

// FEATURE 03

Deep Research & Reporting

Five-stage pipeline that turns one question into a defensible report. Pre-flight enrichment seeds queries from what your project already knows. Decomposition fans out across all four datastores. Blue/Red/Yellow team analysis stress-tests the evidence. Every query, every ranked chunk, every team verdict — inspectable.

DECOMPOSE

ask · optimise · 20 queries

REVIEW

blue · red · yellow

REPORT

citations, inspectable

// FEATURE 04

Findings & report workspace

Capture findings with citations. Compose reports backed by evidence — and promote them back to the catalog when complete. Reports become evidence themselves, available to future investigations.

// HOW IT WORKS

Ingest. Investigate. Defend.

01 · INGEST

Upload documents, fetch web pages, or pull from configured corpora. Aletheia enriches each one with summary, sentiment, keywords, and entities.

02 · INVESTIGATE

Search the catalog, ask the agentic chat, build the knowledge graph. Research sessions track every retrieved chunk against a token budget.

03 · DEFEND

Capture findings with citations. Compose reports backed by evidence. Promote to the catalog when ready. Every claim traces back to a document.

// BUILT FOR ANALYSTS

Text in. Text out. No ETL.

Aletheia is an analyst tool, not a collection platform. The system schedules and plans; the LLM is an analytical instrument; you decide. We do not chase the latest agentic fad. We build the workspace that lets you find signal and expose truth.

// IN PRACTICE

Months of work, confirmed in minutes.

A statistician working on a public-sector research project recently used Aletheia as an independent check on a body of evidence she had spent months collating by hand.

She wasn't using it as her primary tool — she was using it to validate her own conclusions. What surprised her was the speed: connections and supporting passages it had taken her weeks to surface, Aletheia returned in minutes. Same answers. Same documents. A fraction of the time.

That is the test Aletheia is built to pass: not replacing the analyst, but standing up to one who already knows where the evidence leads.

// WHO IT'S FOR

Three analysts. One workspace.

Aletheia is built for people whose job is to be right and to show their work. Three patterns we keep seeing:

The independent OSINT analyst

Working solo or in a small team, often on retainer. Needs to move fast across open sources, hold a coherent picture across investigations, and ship reports a client can act on. Aletheia gives them the workspace a larger team would have — without the larger team.

The boutique intelligence consultancy

Five to fifty analysts, mixed disciplines, projects that span weeks. Needs shared context across the team, a knowledge graph that survives staff turnover, and reports that can be defended in front of a client or a court. Aletheia is the institutional memory.

The in-house investigations team

Corporate intelligence, due diligence, fraud, integrity. Working under compliance constraints, often with sensitive data that cannot leave the building. BYOK and self-host posture means Aletheia runs where the data lives.

If you recognise yourself here — or you don't, but the architecture speaks to a problem you have — get in touch.

// TRUST POSTURE

Your data. Your model. Your infrastructure.

Aletheia is BYOK across every LLM provider — Claude, GPT, or local models via LM Studio or Ollama. Documents, embeddings, and the knowledge graph live in your PostgreSQL, your Solr, your Qdrant, your Neo4j. We do not see your data. We do not host your data. We do not train on your data.

For sensitive work, Aletheia runs fully self-hosted, air-gapped if required. The four datastores and the application are containerised; the LLM gateway can point at a local inference endpoint. Nothing leaves your network unless you choose a hosted model and explicitly send it there.

This is not a feature we added. It is the default.

// ABOUT

Built by an engineer, for analysts.

Aletheia is built by E. Reyes — twenty years in software engineering, with deeper specialisation in distributed systems and HPC. Production experience spans Kafka-based data platforms, federated metadata catalogs, and defence-adjacent and geospatial software. The architecture reflects that background: four datastores composed by concern, deterministic pipelines around the LLM, and a hard refusal to let agentic hand-waving stand in for evidence.

Aletheia is not a pivot from another product or a wrapper around someone else's stack. It is built deliberately, by someone who has spent a career making distributed systems behave under pressure, for analysts who need to be right and to show their work.

For access or a conversation — hello@aletheia-systems.io.

// CONTACT

Get in touch.

For access, demos, or more information — write to us. Aletheia is in alpha; we're talking with analysts, researchers, and teams curious about defensible AI-assisted research.

hello@aletheia-systems.io

Find signal. Expose truth.