Ask Concord
Answers from our documentation
Ask anything about Concord. Every answer comes from our actual documentation.
Capability: Knowledge Graph + Retrieval
Concord by IaxaI indexes every entity, event, and relationship the engine produces into a dense-sparse knowledge graph. Analysts query it directly. Semantic Alert Dedup correlates through it. The Detection Portability Layer retrieves analogous detections from it. Compliance Auto-Packets assembles evidence out of it. Three latency tiers, all deterministic, all CPU-cheap.
The Problem
Dedup needs a correlation engine. Detection portability needs an analog finder. Auto-Packets needs an evidence assembler. The analyst chat needs all three. Most platforms ship four disconnected indexes, each with its own freshness lag, its own relevance bugs, and its own latency budget. Then someone wires a language model in front of the query path because the indexes disagree, and now every analyst question hits a black-box model that nobody can replay.
The Impact
A correlation lookup takes seconds because it traverses three stores. The same question asked twice returns different answers because the model in front is sampling. An auditor asks how an evidence packet was assembled and the honest answer is "the model decided." In a regulated SOC that's the wrong answer.
What Concord Does Differently
One graph. One retrieval engine. Three surfaces query it through the same API. The query path is embedding search plus Personalized PageRank traversal. No language model, fully deterministic, CPU-cheap. Language models run only in nightly batch enrichment to surface relationships rule-based extraction missed, never on the live path.
The Outcome
Dedup decisions, detection translations, and audit packets all rest on the same retrieval substrate. Every query writes a ledger entry with the seed set, the result set, the tier, and the latency. Regulated end-clients get an answer they can verify after the fact. Operators get a query path they can actually load-test.
What It Is
Two node types: phrase nodes for entities, controls, regulations, techniques, and indicators; passage nodes for events, documents, alerts, and incidents. Three edge classes: relations between entities, synonym links by embedding similarity, and context links between entities and the passages they appear in. Translation writes the passage. Entity Resolution writes the phrase. The retrieval engine reads both.
Phrase nodes
Canonical entities: a user, a host, an IP, a MITRE technique, a control objective, an indicator of compromise. Each carries a 768-dimensional embedding so semantic neighbors are discoverable, even when the literal strings disagree.
Passage nodes
Events, alerts, incidents, policy chunks, evidence documents. Each links back to the phrase nodes it touches. The dense-sparse bridge: passage matches expand into their phrase neighbors, phrase matches surface the passages that mention them.
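The two node types and three edge classes can be sketched in a few dozen lines. This is an illustrative shape, not Concord's actual schema — the class names, the tiny embeddings, and the `expand` helper are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class PhraseNode:
    node_id: str            # canonical entity key, e.g. "host:web-01"
    embedding: list[float]  # 768-dim in production; tiny here for the sketch

@dataclass
class PassageNode:
    node_id: str            # event / alert / document identifier
    text: str
    mentions: list[str] = field(default_factory=list)  # phrase node ids

class Graph:
    def __init__(self):
        self.phrases: dict[str, PhraseNode] = {}
        self.passages: dict[str, PassageNode] = {}
        # Three edge classes, keyed by (src, dst)
        self.relations: set[tuple[str, str]] = set()  # entity -> entity
        self.synonyms: set[tuple[str, str]] = set()   # embedding-similar entities
        self.contexts: set[tuple[str, str]] = set()   # entity <-> passage

    def link_passage(self, passage: PassageNode):
        """The dense-sparse bridge: every passage is wired to the phrase
        nodes it mentions, so a hit on either side expands into the other."""
        self.passages[passage.node_id] = passage
        for pid in passage.mentions:
            self.contexts.add((pid, passage.node_id))

    def expand(self, phrase_id: str) -> list[str]:
        # phrase match -> the passages that mention it
        return sorted(dst for (src, dst) in self.contexts if src == phrase_id)
```

Translation writing the passage and Entity Resolution writing the phrase both reduce to populating these two maps; the bridge is just the `contexts` edge set read in both directions.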
How Retrieval Works
A query classifier routes each request to the cheapest tier that can answer it honestly. The technical buyer cares about latency budgets. Here they are.
Tier 1: Lookup
<100ms p95
Direct ID or entity-value resolution. No embedding, no graph walk. Used by the analyst omnibox when an analyst pastes a hash, an IP, or a username and wants the canonical record back without a roundtrip.
Tier 2: Semantic
<500ms p95
The common case. Embed the query, run dual-path search across phrase and passage indexes with similarity thresholds, seed Personalized PageRank on the graph, blend the scores, return ranked results with reasoning paths. This is what Dedup, DPL, and most analyst questions hit.
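The Tier 2 blend step can be sketched as follows — dual-path candidates thresholded on cosine similarity, then mixed with graph scores. The `alpha` weight and the 0.35 threshold are illustrative knobs, not Concord's actual tuning.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def tier2_blend(query_vec, phrase_hits, passage_hits, ppr_scores,
                sim_threshold=0.35, alpha=0.6):
    """Blend embedding similarity with Personalized PageRank mass.
    phrase_hits / passage_hits map node_id -> vector; ppr_scores
    maps node_id -> traversal score."""
    blended = {}
    for node_id, vec in {**phrase_hits, **passage_hits}.items():
        sim = cosine(query_vec, vec)
        if sim < sim_threshold:
            continue  # dual-path thresholding: drop weak semantic matches
        blended[node_id] = alpha * sim + (1 - alpha) * ppr_scores.get(node_id, 0.0)
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)
```

Everything here is arithmetic over precomputed vectors and graph scores, which is why the tier stays deterministic and inside a sub-500ms budget.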
Tier 3: Deep Analysis
<5s p95
Extended graph traversal: wider hop budget, lower decay. Used by Compliance Auto-Packets when an audit window needs every event touching a control across months of telemetry. Optional language-model synthesis is opt-in only and ledgered when used.
Hot Path Discipline
Concord's retrieval path is embedding search plus graph traversal. Both are CPU-cheap and deterministic. Same input, same output, every time. The language model only runs in nightly batch enrichment, where it picks up passages with thin extracted triples and proposes new relationships for the next day's queries. Audit-grade systems can't have non-deterministic inference sitting between an analyst question and an answer that ends up in an evidence packet. Ours doesn't.
Query path
Sentence-transformer embedding plus Personalized PageRank on the in-memory graph. Around 80ms for the embedding, under 100ms for the traversal, deterministic on the same corpus.
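Personalized PageRank itself is a short, fully deterministic loop. Here is a minimal power-iteration sketch over an adjacency dict; the damping factor and iteration count are conventional defaults, not Concord's settings.

```python
def personalized_pagerank(adj, seeds, damping=0.85, iters=30):
    """Power-iteration PPR on an adjacency dict {node: [neighbors]}.
    Deterministic: same graph and seed set always yield the same scores."""
    nodes = list(adj)
    # Restart mass concentrated on the seed set instead of spread uniformly
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = adj[n]
            if not out:
                continue
            share = damping * rank[n] / len(out)
            for m in out:
                nxt[m] += share
        rank = nxt
    return rank
```

Because the restart vector is pinned to the query's seed nodes, score mass stays in the seed neighborhood, which is what makes the reasoning paths local and replayable.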
Indexing path
Each ingested event becomes a passage node with its entities linked as phrase nodes. About 30 regex extractors plus per-OCSF-category templates pull triples on the way in. Sub-50ms budget per event.
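Two extractors in the spirit of the ~30 described might look like this. The patterns and triple shapes are assumptions made for the sketch, not Concord's actual rules.

```python
import re

# Each extractor pairs a pattern with a triple builder. Illustrative only.
EXTRACTORS = [
    (re.compile(r"(?P<user>\w+) logged in from (?P<ip>(?:\d{1,3}\.){3}\d{1,3})"),
     lambda m: (f"user:{m['user']}", "logged_in_from", f"ip:{m['ip']}")),
    (re.compile(r"(?P<proc>[\w.]+) spawned (?P<child>[\w.]+)"),
     lambda m: (f"proc:{m['proc']}", "spawned", f"proc:{m['child']}")),
]

def extract_triples(event_text: str):
    """Run every extractor over the raw event text and collect
    (subject, predicate, object) triples for the indexing path."""
    triples = []
    for pattern, build in EXTRACTORS:
        for m in pattern.finditer(event_text):
            triples.append(build(m))
    return triples
```

Pure regex plus templates is what keeps the per-event budget under 50ms: no model call anywhere on the ingest path.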
Enrichment path
Nightly batch only. A small open-weights model runs over under-extracted passages from the previous 24 hours and proposes new triples and synonym edges. Throttled, throttle-able, never on live traffic.
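Selecting the nightly batch is itself a cheap, deterministic step. A sketch, with the batch cap and "thin extraction" threshold as illustrative values:

```python
def enrichment_batch(passage_ids, triple_counts, max_batch=500, min_triples=2):
    """Pick yesterday's passages whose rule-based extraction came up thin,
    capped so the nightly enrichment job stays throttled."""
    thin = [p for p in passage_ids if triple_counts.get(p, 0) < min_triples]
    # Poorest coverage first, then stable id order; the cap is the throttle
    thin.sort(key=lambda p: (triple_counts.get(p, 0), p))
    return thin[:max_batch]
```

Only the passages this returns ever reach the batch model, and its proposals land as new edges for the next day's queries — the live path never waits on it.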
One Brain, Three Surfaces
Without this layer, every surface would re-implement retrieval and the answers would diverge. With it, all three surfaces query the same graph through the same public API.
Semantic Alert Dedup
On every incoming event, Dedup asks the graph "anything in the last four hours involving this identity or this source IP that semantically matches this alert?" Tier 2 query, blended score, reasoning path. If the top hit is similar enough and close enough in time, the new event collapses into the existing Security Narrative card instead of producing a duplicate alert.
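The collapse decision reduces to three deterministic checks. A sketch, assuming events carry an epoch timestamp and an entity list, with the 0.8 similarity threshold as an illustrative value:

```python
FOUR_HOURS = 4 * 3600

def should_collapse(new_event, top_hit, sim_threshold=0.8):
    """Collapse the new event into an existing Security Narrative card iff
    the best graph hit is recent, shares an identity or source entity, and
    scores above the similarity bar."""
    if top_hit is None:
        return False  # nothing in the window: emit a fresh alert
    close_in_time = abs(new_event["ts"] - top_hit["ts"]) <= FOUR_HOURS
    shares_entity = bool(set(new_event["entities"]) & set(top_hit["entities"]))
    return close_in_time and shares_entity and top_hit["score"] >= sim_threshold
```

Because the inputs are a ledgered Tier 2 result, the same event replayed against the same corpus collapses the same way.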
Detection Portability Layer
When a customer asks for a CrowdStrike rule ported to SentinelOne, the graph gets queried for analogous detections from MITRE, Sigma, and the canonical detection corpus already indexed as phrase and passage nodes. The reasoning path becomes the "why this translation" justification the customer sees alongside the ported rule.
Compliance Evidence Auto-Packets
For each control in FFIEC, GLBA, PCI, and HIPAA, Auto-Packets queries the graph scoped to the audit window. Returns every passage touching that control phrase node, every triple where the subject or object is a tenant asset under that control, and the audit-trail reasoning path the renderer turns into the exam packet.
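The window-scoped sweep can be sketched as a filter over context edges. The graph shape here (a dict of `(phrase, passage)` pairs plus timestamped passages) is an assumption for illustration, not Concord's API.

```python
def control_evidence(graph, control_id, window_start, window_end):
    """Gather every passage in the audit window that touches the
    control's phrase node. Timestamps are epoch seconds."""
    hits = []
    for phrase, passage_id in graph["contexts"]:
        if phrase != control_id:
            continue
        passage = graph["passages"][passage_id]
        if window_start <= passage["ts"] <= window_end:
            hits.append(passage_id)
    return sorted(hits)  # stable order so the packet renders identically on replay
```

A real Tier 3 sweep also walks the triple edges for tenant assets under the control; the shape is the same — deterministic filters over ledgered graph state.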
Analyst chat and omnibox
Tier 1 for ID lookups, Tier 2 for everything else, Tier 3 for explicit deep-analysis requests. The chat surface already renders the reasoning path so analysts can see which seed nodes the query started from, which passages it walked, and why the top result ranked where it did.
The embedding model and the optional batch enrichment model both run on CPU. No external API calls in the retrieval path. The full engine plus the retrieval layer can run on-prem at the Edge Gateway, which matters for regulated banks that can't let telemetry leave their network. The cloud SaaS deployment runs the same code with the same query semantics. Same answers, different tenancy.
Status
The retrieval shape is built and tested. The V1 work is rebuilding the index under the new dense-sparse schema and wiring it to live ingestion. Existing analyst surfaces (chat, omnibox, NLP search) keep their public APIs through the migration.
Latency targets are the design budgets. Coverage is what the existing engine ships with today.
<100ms
Tier 1 lookup latency, p95 target
<500ms
Tier 2 semantic latency, p95 target
<5s
Tier 3 deep-analysis latency, p95 target
Why It Matters
Every credible SOC platform ships a search box. What separates Concord is what sits behind it. The same graph powers correlation on the dedup surface, analog discovery on the detection-portability surface, and evidence assembly on the auto-packet surface, so the three surfaces never disagree about what the corpus says. The retrieval is deterministic, so the same question returns the same answer twice. The query path is CPU-only, so it runs in an air-gapped bank or a regulated MSSP's on-prem rack without any cloud dependency.
The design borrows from current academic work on hybrid embedding-and-graph retrieval, with adaptations made specifically for CPU-only deployment and the audit requirements of regulated buyers. It's engineering excellence, not an IP claim. The patent moat lives upstream in Translation and Entity Resolution. This layer is how those patented engines actually pay off in front of an analyst.
30-minute walkthrough. Your tools. Your tenants. Your audit cycle. We will show you exactly where Concord earns its keep.