After ten thousand decisions, show me how your system got smarter
A working demo of compounding intelligence for security operations.
12 minutes · Running code · Four controlled experiments validating the math
Your SOC has amnesia.
Every analyst investigating a login anomaly today starts with exactly the same knowledge as the analyst who investigated the same anomaly last month. The patterns from ten thousand closed tickets? Gone. Nothing accumulated. Nothing compounded.
Your SIEM got better detection rules — written by humans, one at a time. Every other AI SOC vendor got a better model. But after a year of deployment, alert number ten thousand is investigated with exactly the same intelligence as alert number one.
The architecture we built breaks that pattern — and it isn't limited to security. SOC is domain one. The same compounding mechanism applies wherever enterprise AI makes repeated decisions over structured context: ITSM, procurement, compliance. Every new domain inherits everything every other domain learned from day one.
Here is what it looks like in the highest-stakes domain we know.
What the demo shows
Not a pitch deck. Running code — a FastAPI backend, Neo4j graph database, and React frontend — all local, with no cloud dependency at runtime.
Twelve minutes. Five moments that matter.
The full decision lifecycle.
Select an alert. Watch the system traverse 47 connected nodes — the user's profile, travel calendar, device history, known attack patterns, active policies — and classify it as a specific situation type with a specific reasoning approach. Four response options appear, each showing exact time saved, cost avoided, and residual risk. Your CFO can read this screen without a briefing.
The moment CISOs remember.
Two policies apply to this alert. One says auto-close travel anomalies when VPN matches the travel record. |
The other says escalate all high-risk users. John Smith's risk score is 0.85. They conflict. |
You have conflicting policies in your SOC right now. You just don't know it. |
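The conflict above is mechanically detectable: two applicable policies demanding different actions. A minimal sketch of that check, assuming hypothetical policy and alert shapes (these names are illustrative, not the demo's actual schema):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Policy:
    name: str
    action: str                     # e.g. "auto_close" or "escalate"
    applies: Callable[[dict], bool]  # predicate over an alert

# Hypothetical policies mirroring the demo scenario
auto_close_travel = Policy(
    "auto-close travel anomalies",
    "auto_close",
    lambda a: a["vpn_matches_travel"],
)
escalate_high_risk = Policy(
    "escalate high-risk users",
    "escalate",
    lambda a: a["user_risk"] >= 0.8,
)

def find_conflicts(alert, policies):
    """Return pairs of applicable policies that demand different actions."""
    applicable = [p for p in policies if p.applies(alert)]
    return [
        (p, q)
        for i, p in enumerate(applicable)
        for q in applicable[i + 1:]
        if p.action != q.action
    ]

alert = {"vpn_matches_travel": True, "user_risk": 0.85}
conflicts = find_conflicts(alert, [auto_close_travel, escalate_high_risk])
# Exactly one conflict: auto_close vs escalate for the same alert
```

Once a conflict pair surfaces, resolution is a separate policy-priority step — in the demo's scenario, security-first priority wins.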

🖼 SCREENSHOT 1: Policy Conflict Panel
What happens when the AI is wrong — before it acts.
Every decision passes through four structural safety checks before execution. Click "Simulate Failed Gate" — the gate fires red, a BLOCKED banner appears, the candidate is rejected. Not oversight after the fact. Structural enforcement before the fact.
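"Structural enforcement before the fact" means the gates sit in the execution path, not beside it. A minimal sketch, assuming hypothetical gate names and thresholds (the demo's actual four checks are not published here):

```python
def run_gates(candidate, gates):
    """Run every safety check before execution; the first failure blocks the action."""
    for gate in gates:
        ok, reason = gate(candidate)
        if not ok:
            return {"status": "BLOCKED", "failed_gate": gate.__name__, "reason": reason}
    return {"status": "APPROVED"}

# Illustrative gates — assumptions for the sketch, not the demo's real checks
def confidence_floor(c):
    return (c["confidence"] >= 0.7, "confidence below floor")

def blast_radius(c):
    return (c["affected_users"] <= 1, "action touches multiple users")

decision = {"confidence": 0.55, "affected_users": 1}
result = run_gates(decision, [confidence_floor, blast_radius])
# result["status"] == "BLOCKED" — the candidate never reaches execution
```

The point of the structure: nothing downstream of `run_gates` ever sees a rejected candidate, so oversight is not a review step that can be skipped.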

🖼 SCREENSHOT 2: BLOCKED Banner — Failed Eval Gate
Self-correction with asymmetric trust.
Twenty-four hours after a decision executes, the system asks: was that right? Correct decisions add 0.3 confidence points. Incorrect decisions subtract 6 — a 20:1 asymmetry, calibrated for a domain where a single wrong call costs $4.44 million on average.†
This system is designed to earn trust slowly and lose it fast.
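The 20:1 asymmetry is a one-line update rule. A sketch using the deltas stated above, with the score range and clamping behavior assumed for illustration:

```python
CORRECT_DELTA = 0.3     # trust is earned slowly
INCORRECT_DELTA = -6.0  # and lost fast: a 20:1 asymmetry

def update_confidence(score, was_correct, floor=0.0, ceiling=100.0):
    """Asymmetric trust update, clamped to an assumed [floor, ceiling] range."""
    delta = CORRECT_DELTA if was_correct else INCORRECT_DELTA
    return min(ceiling, max(floor, score + delta))

score = 50.0
for _ in range(20):                       # twenty correct decisions in a row...
    score = update_confidence(score, True)
score = update_confidence(score, False)   # ...undone by a single wrong call
```

Twenty consecutive correct decisions are erased by one miss — exactly the behavior you want when one wrong call carries a seven-figure expected cost.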
The compounding curve.
|  | Week 1 ✦ | Week 4 ✦ | Month 6 ◆ | Month 12 ◆ |
| --- | --- | --- | --- | --- |
| Learned patterns | 23 | 127 | 400+ | 1,000+ |
| Auto-close rate | 68% | 89% | 94% | — |
| MTTR | 12.4 min | 3.1 min | — | — |
| Cost avoided/quarter | — | — | $127K | — |

✦ Measured outcomes from controlled deployment. ◆ Projected from the validated n^2.3 scaling model.
Same model. Same code. Smarter graph. No manual intervention between columns.
The architecture

🖼 GRAPHIC 1: What Makes Intelligence Compound — CI-TRIANGLE (#22)
Three components. All three must exist together.
A context graph that holds every decision, pattern, outcome, and policy as traversable relationships — not a log, not a static knowledge base. A living structure where every new reasoning traversal is richer than the last because everything before it is still there.
Two learning loops, both writing back to the same graph. Loop 1 gets smarter within each decision. Loop 2 gets smarter across all decisions: it tracks which reasoning approaches produce better outcomes and auto-promotes winners. One prompt improvement in Loop 2 eliminated 36 false escalations per month — $4,800 in recovered analyst time — that the system made on its own.
Decision economics that tag every automated action with time saved, cost avoided, and risk delta. Not a dashboard someone checks. The objective function the loops optimize for — so "better" means better for the organization, not just more accurate in the abstract.
Without the graph: decisions don't accumulate. Each alert starts fresh. Day 365 = Day 1.
Without the loops: the graph doesn't evolve. Rich data, no learning.
Without economics: you can't define what "better" means. The loops optimize for nothing.
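Loop 2's core mechanism — track which reasoning approach wins, then promote the winner — is simple enough to sketch. All names here are hypothetical; the demo's actual promotion logic presumably adds guardrails this sketch omits:

```python
from collections import defaultdict

class ApproachTracker:
    """Sketch of Loop 2: track outcome rates per reasoning approach,
    promote the best performer once it has enough evidence."""

    def __init__(self):
        self.stats = defaultdict(lambda: {"wins": 0, "total": 0})

    def record(self, approach, success):
        s = self.stats[approach]
        s["total"] += 1
        s["wins"] += int(success)

    def best(self, min_samples=10):
        """Return the highest win-rate approach with at least min_samples outcomes."""
        eligible = {a: s for a, s in self.stats.items() if s["total"] >= min_samples}
        if not eligible:
            return None
        return max(eligible, key=lambda a: eligible[a]["wins"] / eligible[a]["total"])

tracker = ApproachTracker()
for _ in range(12):
    tracker.record("pattern_first", True)       # 12/12 correct
for i in range(12):
    tracker.record("policy_first", i % 2 == 0)  # 6/12 correct
winner = tracker.best()  # "pattern_first" gets auto-promoted
```

The `min_samples` floor is the important design choice: without it, a lucky approach with two data points would outrank a proven one with two hundred.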
The experiments. The code. The failure modes.
Four controlled experiments using synthetic SOC data. Published repository. Every claim falsifiable.
Scoring convergence: The weight matrix converges to 69.4% accuracy from a 25% random baseline across 5,000 decisions — with three documented learning phases and three documented failure modes. We show where it breaks, not just where it works.
Cross-graph discovery: Cross-attention between entity embeddings discovers semantically meaningful relationships at 110× above random baseline. Embedding normalization is a prerequisite, not an optimization — and we measured the 4× penalty for skipping it.
Scaling law: Discovery capacity scales as D(n) ∝ n^2.30 (R² = 0.9995). Super-quadratic. Steeper than the theoretical lower bound. The excess exponent has a structural explanation: cross-domain discoveries enrich entities, making them more discoverable by other domain pairs. |
Phase transition: Discovery quality doesn't degrade gradually as embedding quality drops. It holds — then collapses suddenly at a specific threshold. A cliff, not a slope. The production implication: monitor embedding quality and alert before you cross the boundary, not after. |
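The scaling-law claim is the kind you can check yourself: an exponent like 2.30 falls out of a straight-line fit in log-log space. A sketch on synthetic data generated to follow the stated law (this is illustrative data, not the experiment's):

```python
import numpy as np

# Synthetic discovery counts following D(n) = c * n**2.30
# (illustrative only — the real data lives in the published repository)
n = np.array([2, 4, 8, 16, 32], dtype=float)
D = 1.5 * n ** 2.30

# A power law D(n) = c * n**k is linear in log-log space:
# log D = log c + k * log n, so the fitted slope recovers the exponent k
slope, intercept = np.polyfit(np.log(n), np.log(D), 1)
```

On the experiment's real measurements, the same fit yields k ≈ 2.30 with R² = 0.9995 — which is what makes the super-quadratic claim falsifiable rather than rhetorical.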

🖼 GRAPHIC 3: Discovery Scaling — n^2.30 Power Law — EXP3-BLOG (#27)
Code and data: github.com/ArindamBanerji/cross-graph-experiments
What did your current system learn last quarter?
For the CISO
Specifically — and this is what the demo shows, not claims — a system that:
· Detected a policy conflict your policy team didn't know existed, resolved it by security-first priority, and created a full audit trail your compliance team can trace
· Caught its own mistake before a human touched a rule, and automatically re-routed the next five similar alerts to Tier 2 human review
· Generated a CFO-ready case: $1.08M annual savings for a mid-size SOC (500 alerts/day, 8 analysts, $85K salary), 6-week payback, 9× ROI in year one — with your numbers, your headcount, your alert volume
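The ROI case is arithmetic you can rerun with your own numbers. A back-of-envelope sketch — every default below (auto-close rate, minutes per alert, platform cost, working hours) is an illustrative assumption, not the demo's calibrated model:

```python
def soc_roi(alerts_per_day, analyst_salary,
            auto_close_rate=0.89,      # assumed: week-4 rate from the demo table
            minutes_per_alert=12.0,    # assumed: analyst triage time saved per alert
            platform_cost=120_000,     # assumed annual platform cost
            hours_per_year=2080):
    """Back-of-envelope SOC automation ROI. All defaults are assumptions."""
    hourly = analyst_salary / hours_per_year
    automated_alerts = alerts_per_day * 365 * auto_close_rate
    annual_savings = automated_alerts * (minutes_per_alert / 60) * hourly
    return {
        "annual_savings": annual_savings,
        "roi_multiple": annual_savings / platform_cost,
        "payback_weeks": 52 * platform_cost / annual_savings,
    }

# The mid-size SOC from the text: 500 alerts/day, $85K analyst salary
case = soc_roi(alerts_per_day=500, analyst_salary=85_000)
```

With these assumed defaults the sketch lands in the same seven-figure range as the demo's case; the demo's calculator replaces the assumptions with your measured rates.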
The question isn't whether this is impressive. You can judge that in twelve minutes.
The question is: what has your current system learned from the last ten thousand decisions it processed?
If the answer is nothing — you're not running an AI. You're running an expensive rule engine with a better user interface.
Not a SOC product. A platform.

🖼 GRAPHIC 2: The Gap Widens Every Month — GM-04-v2 (#12)
For the VC: The AI SOC market is large and accelerating — Gartner placed AI SOC Agents on the 2025 Hype Cycle, 40+ vendors are competing, and every major platform is bolting AI onto existing products. But the market has split: workflow automators on one side, agentic AI analysts on the other. Neither accumulates intelligence.
This is a third position: compounding intelligence as architecture. And it is not a SOC product.
The same four-layer structure — context graph, learning loops, eval gates, decision economics — applies identically to ITSM, procurement, compliance, and anti-money laundering. SOC is domain one. Every new domain that connects to the graph inherits everything every other domain learned from day one. And makes every other domain smarter in return.
That is not a product roadmap. That is a network effect operating at the knowledge layer.
Ten of twenty architectural capabilities are running in this demo. Not a concept deck — production-grade FastAPI backend, Neo4j graph, React frontend. Four controlled experiments and a public repository backing every architectural claim.
The moat is not the model. Any competitor can swap the model. The moat is the graph — the accumulated decision traces, pattern calibrations, and organizational context that compound as n^2.3 with every connected domain and every passing month. A competitor deploying today doesn't start six months behind.
They start at zero. And the gap is still widening.
See it. Then decide.
The demo runs in 12 minutes. A session with your own SOC numbers takes 30.
Watch the Loom: loom.com/share/b45444f85a3241128d685d0eaeb59379
Book 30 minutes: Email — I'll send a calendar link the same day. Bring your alert volume and analyst headcount and we'll run the ROI calculator live with your numbers.
I'm Arindam Banerji. I designed and built this system. The experiments are published, the code is public, and I'll walk you through any claim in this document — or let you break it. That's what running code is for.
Go deeper
Mathematical Framework & Experiments — The equations, four experiments, six charts, three failure modes, and seven design principles behind every architectural claim in this document.
Compounding Intelligence — Why compounding intelligence is structurally different from every other AI improvement pattern, and what it takes to build a system that actually gets smarter. — https://www.dakshineshwari.net/post/compounding-intelligence-4-0-how-enterprise-ai-develops-self-improving-judgment
Demo Walkthrough — The full architecture behind the demo: context graph design, the two-loop mechanism, and how each tab connects to the ACCP control plane.
† Average cost of a data breach: $4.44M — IBM Security, Cost of a Data Breach Report 2025