// ALETHEIA · v0.1 ALPHA
Find signal.
Expose truth.
The analyst workspace where documents, knowledge graphs, and AI-assisted research compose into one defensible chain of evidence.
// THE PROBLEM
Analysts drown in documents.
Existing tools split into two camps: collection platforms that ingest everything and surface nothing, or generic chat that sounds confident but cannot defend a conclusion. Neither is built for the analyst whose job is to be right and to show their work.
// WHAT IT IS
Four datastores, one chain of evidence.
PostgreSQL is the canonical state. Solr finds documents. Qdrant finds the passage inside the document. Neo4j holds the analyst-curated knowledge graph — entities, relationships, provenance. Each store has a single job. Together they compose into evidence you can defend.
// HOW WE'RE DIFFERENT
Not just RAG.
The market has settled into two shapes. On one end, collection platforms that ingest everything and surface nothing — heavy, expensive, built for organisations with their own analyst army. On the other, generic chat wrapped around a vector store: fast to demo, impossible to defend.
Aletheia is the workspace in between. The full pipeline an analyst actually runs — from entity extraction and enrichment, through agentic research with token-budgeted retrieval, to adversarial deep research and defensible reporting — composed as one coherent system.
- Not just RAG
Vector search is one of four datastores, not the whole product. Solr finds documents. Qdrant finds passages. Neo4j holds the analyst-curated graph. PostgreSQL is the canonical state. Each does the job it is best at.
- Not auto-extraction
The knowledge graph is curated, not hallucinated. Entities and relationships are promoted by the analyst with provenance attached. The graph is evidence, not a guess.
- Not a black box
Every retrieved chunk, every team verdict, every citation is inspectable. Deep Research runs Blue, Red, and Yellow stages before any synthesis ships — adversarial review is part of the pipeline, not a checkbox.
- Not a hosted silo
BYOK across every provider. Self-host where the data lives. Your documents, your model, your infrastructure.
// AI ARCHITECTURE
How we use the LLM.
The system schedules, plans, and retrieves. The LLM is the analytical instrument — not the orchestrator. Aletheia composes deterministic planners with token-budgeted agentic loops, MCP-exposed tools, and adversarial multi-agent review.
- Agentic loop
OpenAI function-calling spec owned by the application — portable across Claude, GPT, and local models. The model returns tool calls; we dispatch, append results, and call again.
- MCP server
Thirteen analyst tools exposed over JSON-RPC at /mcp — usable from Claude Desktop, LM Studio, or any MCP client.
- Two LLM roles
Low-temperature enrichment for semantic analysis (claim extraction, sentiment, classification) and higher-temperature chat for agentic research. Separate priority queues, separate concurrency caps.
- Token & cost accounting
Every retrieved chunk counts against an explicit token ceiling. Per-request token use and provider-priced cost estimates roll up live across the gateway — no silent overruns, no surprise bills.
- Blue · Red · Yellow team analysis
Deep Research runs Blue (build the case), Red (challenge it), Yellow (resolve disagreements) before any synthesis ships. Adversarial review is part of the pipeline, not a checkbox.
- Deterministic pipelines
Deep Research stages run as ordered, hand-built code with explicit early-exit conditions. The LLM is invoked at named decision points — not as the orchestrator.
// FEATURES
The workspace, in four moves.
// FEATURE 01
Knowledge graph workspace
Entities, relationships, and provenance — analyst-curated, not auto-generated. Search connections, expand networks.
// FEATURE 02
Chat with research sessions
Agentic tool loops with a hard token budget. Every retrieved chunk is tracked against the budget. The model searches, reads, and synthesizes — then stops when the evidence is in. Documents used are cited.
// FEATURE 03
Deep Research & Reporting
Five-stage pipeline that turns one question into a defensible report. Pre-flight enrichment seeds queries from what your project already knows. Decomposition fans out across all four datastores. Blue/Red/Yellow team analysis stress-tests the evidence. Every query, every ranked chunk, every team verdict — inspectable.
// FEATURE 04
Findings & report workspace
Capture findings with citations. Compose reports backed by evidence — and promote them back to the catalog when complete. Reports become evidence themselves, available to future investigations.
// HOW IT WORKS
Ingest. Investigate. Defend.
Upload documents, fetch web pages, or pull from configured corpora. Aletheia enriches each one with summary, sentiment, keywords, and entities.
Search the catalog, ask the agentic chat, build the knowledge graph. Research sessions track every retrieved chunk against a token budget.
Capture findings with citations. Compose reports backed by evidence. Promote to the catalog when ready. Every claim traces back to a document.
// BUILT FOR ANALYSTS
Text in. Text out. No ETL.
Aletheia is an analyst tool, not a collection platform. The system schedules and plans; the LLM is an analytical instrument; you decide. We do not chase the latest agentic fad. We build the workspace that lets you find signal and expose truth.
// IN PRACTICE
Months of work, confirmed in minutes.
A statistician working on a public-sector research project recently used Aletheia as an independent check on a body of evidence she had spent months collating by hand.
She wasn't using it as her primary tool — she was using it to validate her own conclusions. What surprised her was the speed: connections and supporting passages it had taken her weeks to surface, Aletheia returned in minutes. Same answers. Same documents. A fraction of the time.
That is the test Aletheia is built to pass: not replacing the analyst, but standing up to one who already knows where the evidence leads.
// WHO IT'S FOR
Three analysts. One workspace.
Aletheia is built for people whose job is to be right and to show their work. Three patterns we keep seeing:
Working solo or in a small team, often on retainer. Needs to move fast across open sources, hold a coherent picture across investigations, and ship reports a client can act on. Aletheia gives them the workspace a larger team would have — without the larger team.
Five to fifty analysts, mixed disciplines, projects that span weeks. Needs shared context across the team, a knowledge graph that survives staff turnover, and reports that can be defended in front of a client or a court. Aletheia is the institutional memory.
Corporate intelligence, due diligence, fraud, integrity. Working under compliance constraints, often with sensitive data that cannot leave the building. BYOK and self-host posture means Aletheia runs where the data lives.
If you recognise yourself here — or you don't, but the architecture speaks to a problem you have — get in touch.
// TRUST POSTURE
Your data. Your model. Your infrastructure.
Aletheia is BYOK across every LLM provider — Claude, GPT, or local models via LM Studio or Ollama. Documents, embeddings, and the knowledge graph live in your PostgreSQL, your Solr, your Qdrant, your Neo4j. We do not see your data. We do not host your data. We do not train on your data.
For sensitive work, Aletheia runs fully self-hosted, air-gapped if required. The four datastores and the application are containerised; the LLM gateway can point at a local inference endpoint. Nothing leaves your network unless you choose a hosted model and explicitly send it there.
This is not a feature we added. It is the default.
// ABOUT
Built by an engineer, for analysts.
Aletheia is built by E. Reyes — twenty years in software engineering, with deeper specialisation in distributed systems and HPC. Production experience spans Kafka-based data platforms, federated metadata catalogs, and defence-adjacent and geospatial software. The architecture reflects that background: four datastores composed by concern, deterministic pipelines around the LLM, and a hard refusal to let agentic hand-waving stand in for evidence.
Aletheia is not a pivot from another product or a wrapper around someone else's stack. It is built deliberately, by someone who has spent a career making distributed systems behave under pressure, for analysts who need to be right and to show their work.
For access or a conversation — hello@aletheia-systems.io.
// CONTACT
Get in touch.
For access, demos, or more information — write to us. Aletheia is in alpha; we're talking with analysts, researchers, and teams curious about defensible AI-assisted research.
hello@aletheia-systems.io