After ten thousand decisions, show me how your system got smarter (v3.1)

Feb 25
9 min read

A working demo of compounding intelligence for security operations.

15 minutes · Running code · Three learning loops · Live threat intelligence · Tamper-evident audit trail

Every triage decision should make the next one smarter. Yours doesn't.

Your analysts spend two hours per shift on mechanical collection — five browser tabs, copy-paste, Pulsedive says high-risk, GreyNoise says known scanner, the fusion happens in someone's head and walks out the door at shift change. That's a real problem. But it's the visible one.

The deeper problem is what happens after the decision. Alert number ten thousand is triaged with exactly the same intelligence as alert number one. The system didn't notice that the last forty VPN anomalies from APAC all resolved the same way. It didn't learn that a specific combination of travel context and device trust score predicts false positives with 94% accuracy. It didn't adjust its confidence after catching its own mistake last Tuesday. Nine thousand nine hundred and ninety-nine decisions worth of pattern, judgment, and correction — gone. Nothing accumulated. Nothing compounded.

A good analyst develops this instinct naturally. By month six, they're not just following playbooks — they're reading situations, weighing conflicting signals, recognizing patterns that cut across alert types. The architecture we built gives that same trajectory to the system itself. Every triage decision refines the next one. Confidence scores adjust from outcomes. Operational artifacts evolve at runtime. The system that triages alert ten thousand is measurably, provably sharper than the one that triaged alert one — and it can show you exactly why.

That mechanism isn't limited to security. SOC is domain one. The same compounding architecture applies wherever enterprise AI makes repeated decisions over structured context: ITSM, procurement, compliance. Every new domain inherits everything every other domain learned from day one.

What the demo shows

Not a pitch deck. Running code — FastAPI backend, Neo4j graph database, React frontend — running locally. Fourteen of twenty-one architectural capabilities working today.

Fifteen minutes. Six moments that matter.

1. What the graph knows — before you ask

The system loads. Nobody queried anything. IOCs already ingested and severity-scored. Relationships mapped. Policy conflicts detected.

This is what the graph knows at rest.

🖼 SCREENSHOT: Tab 1 — Threat Landscape at a Glance panel

A knowledge graph doesn't store events — it stores relationships. The moment a second source confirms what the first source suspected, the graph creates a connection that didn't exist before. That connection enriches every future query, every future decision.

"Which users traveled internationally and triggered login anomalies this week?" That's not a SIEM query. It requires correlating six data sources: user profiles, travel calendars, auth logs, threat intel, device inventory, and alert history. The graph traverses them in one pass. A SIEM can't — because it doesn't know they're related.

2. The full decision lifecycle — transparent

Select an alert. Watch the system traverse 47 connected nodes — the user's profile, travel calendar, device history, known attack patterns, active policies — and classify it as a specific situation type with a specific reasoning approach. Four response options appear, each showing exact time saved, cost avoided, and residual risk.

Then the system shows its work. Six weighted factors: travel context, asset criticality, threat intel enrichment, time anomaly, device trust, pattern history. Color-coded bars. No black box.

🖼 SCREENSHOT: Tab 3 — "Why This Decision?" six-factor breakdown

The threat intel factor is live. Pulsedive feeds directly into the knowledge graph through a connector pattern. The graph fuses the external verdict with internal context — alert history, user profile, device trust — into one enriched decision. The analyst doesn't hold the fusion in their head anymore. The graph holds it. And the connector pattern scales: GreyNoise, your sector ISAC, your internal IOC list — same integration, same graph, richer intelligence.

3. The moment CISOs remember — policy conflicts

Two policies apply to this alert. One says auto-close travel anomalies when VPN matches the travel record.

The other says escalate all high-risk users. John Smith's risk score is 0.85. They conflict.

You have conflicting policies in your SOC right now. You just don't know it.

🖼 SCREENSHOT: Tab 3 — Policy Conflict Panel with amber override banner

The system detects the conflict, resolves by security-first priority, and the Recommendation panel shows an amber Policy Override banner. The button says "Apply Policy Resolution" — not "Apply Recommendation." The system respects the governance chain. Full audit trail.

4. What happens when the AI is wrong — before it acts

Every decision passes through four structural quality gates before execution. Click "Simulate Failed Gate" — the gate fires red, a BLOCKED banner appears, the candidate is rejected. Structural enforcement before the fact.

🖼 SCREENSHOT: Tab 2 — BLOCKED Banner — Failed Quality Gate

5. Self-correction with asymmetric trust

Twenty-four hours after a decision executes, the system asks: was that right? Correct decisions add 0.3 confidence points. Incorrect decisions subtract 6 — a 20:1 asymmetry, calibrated for a domain where a single wrong call costs $4.44 million on average.†

The system catches its own mistake and reroutes the next five similar alerts to Tier 2 human review. No human wrote a rule. The graph adjusted its own behavior.

This system is designed to earn trust slowly and lose it fast.

6. Tamper-evident decisions

Every decision — timestamp, alert type, action taken, outcome, and a SHA-256 hash. Each record is chained to the previous one. If anyone alters a record, the chain breaks and the verification badge turns red. Exportable CSV. Hand it to your compliance team.

🖼 SCREENSHOT: Tab 4 — Evidence Ledger with "Chain verified ✓" badge

The compounding curve


	Week 1 ✦	Week 4 ✦	Month 6 ◆	Month 12 ◆
Learned patterns	23	127	400+	1,000+
Auto-close rate	68%	89%	94%	—
MTTR	12.4 min	3.1 min	—	—
Cost avoided/quarter	—	—	$127K	—

✦ Measured outcomes from controlled demo environment ◆ Projected from validated n^2.3 scaling model

Same model. Same code. Smarter graph. No manual intervention between columns.

The experiments. The code. The failure modes.

Four controlled experiments using synthetic SOC data. Published repository. Every claim falsifiable.

Scoring convergence: The weight matrix converges to 69.4% accuracy from a 25% random baseline across 5,000 decisions — with three documented learning phases and three documented failure modes. We show where it breaks, not just where it works.

Cross-graph discovery: Cross-attention between entity embeddings discovers semantically meaningful relationships at 110× above random baseline. Embedding normalization is a prerequisite, not an optimization — and we measured the 4× penalty for skipping it.

Scaling law: Discovery capacity scales as D(n) ∝ n^2.30 (R² = 0.9995). Super-quadratic. The excess exponent has a structural explanation: cross-domain discoveries enrich entities, making them more discoverable by other domain pairs.

Phase transition: Discovery quality doesn't degrade gradually as embedding quality drops. It holds — then collapses suddenly at a specific threshold. A cliff, not a slope. The production implication: monitor embedding quality actively.

🖼 GRAPHIC: Discovery Scaling — n^2.30 Power Law — EXP3-BLOG (#27)

Code and data: github.com/ArindamBanerji/cross-graph-experiments

The architecture

🖼 GRAPHIC: What Makes Intelligence Compound — CI-TRIANGLE (#22)

Three components. All three must exist together.

A context graph that holds every decision, pattern, outcome, and policy as connections the system can follow — not a log, not a static knowledge base. A living structure where every new reasoning traversal is richer than the last because everything before it is still there. Every IOC source feeds through a connector pattern into this graph. Each source makes every other source's data more valuable.

Three learning loops, all writing back to the same graph. Loop 1 gets smarter within each decision. Loop 2 gets smarter across all decisions: it tracks which reasoning approaches produce better outcomes and auto-promotes winners. Loop 3 — a continuous reinforcement signal with 20:1 asymmetric penalties — governs both. One prompt improvement in Loop 2 eliminated 36 false escalations per month — $4,800 in recovered analyst time — that the system made on its own.

Decision economics that tag every automated action with time saved, cost avoided, and risk delta. The goal the loops optimize for — so "better" means better for the organization, not just more accurate in the abstract.

Without the graph: decisions don't accumulate. Each alert starts fresh. Day 365 = Day 1.

Without the loops: the graph doesn't evolve. Rich data, no learning.

Without economics: you can't define what "better" means. The loops optimize for nothing.

This is not a SOAR. A SOAR runs a fixed playbook — if X, then Y. It automates a process someone already designed. This system writes its own playbook from 127 validated decisions and rewrites it when the evidence changes. The playbook gets smarter. The SOAR's doesn't.

Where we sit: Four Layers

🖼 SCREENSHOT: Tab 4 — Four Layers strip

Layer	What	Example
UCL (Unified Context Layer)	Your existing tools	CrowdStrike · Pulsedive (live today) · GreyNoise · Health-ISAC · SIEM
Agent Engineering	Runtime evolution	Prompt variants, scoring factors, pattern learning
ACCP	Cognitive control	Quality gates, policy resolution, decision economics
SOC Copilot	Domain copilot	What you're looking at

CrowdStrike detects. Pulsedive enriches. We decide — and we get better at deciding with every verified outcome.

Adding a new source — your GreyNoise subscription, your Health-ISAC feed, your internal IOC list — is config, not a project. Same connector pattern. Same graph. Richer intelligence. Pulsedive is live today. The pattern scales.

What did your current system learn last quarter?

For the CISO

Specifically — and this is what the demo shows, not claims — a system that:

Knows your threat landscape before the analyst opens the dashboard — IOCs ingested, severity-scored, and correlated to active alerts automatically
Replaces the manual IOC workflow — no more copy-pasting between five browser tabs
Shows exactly WHY it made each decision — six factors, weighted, transparent, with live threat intelligence feeding directly into the scoring
Fuses external intelligence with internal context on a single graph node — Pulsedive today, with the same connector pattern ready for GreyNoise, your sector ISAC, any source that speaks IOC
Detected a policy conflict your policy team didn't know existed, resolved it by security-first priority, and created a full audit trail your compliance team can trace
Caught its own mistake, and automatically re-routed the next five similar alerts to Tier 2 human review
Records every decision in a tamper-evident SHA-256 chain — exportable, verifiable, board-ready
Generated a CFO-ready case: $1.08M annual savings for a mid-size SOC (500 alerts/day, 8 analysts, $85K salary), 6-week payback, 9× ROI in year one — with your numbers, your headcount, your alert volume

You can judge that in fifteen minutes.

The question is: what has your current system learned from the last ten thousand decisions it processed?

If the answer is nothing — you're not running an AI. You're running an expensive rule engine with a better user interface.

Not a SOC product. A platform.

🖼 GRAPHIC: The Gap Widens Every Month — GM-04-v2 (#12)

For the VC: The AI SOC market is large and accelerating — 40+ vendors competing, every major platform bolting AI onto existing products. But the market has split: workflow automators on one side, agentic AI analysts on the other. Neither accumulates intelligence. Neither fuses intelligence across sources.

This is a third position: compounding intelligence as architecture. And it is not a SOC product.

The same four-layer structure — context graph, learning loops, quality gates, decision economics — applies identically to ITSM, procurement, compliance, and anti-money laundering. SOC is domain one. Every new domain that connects to the graph inherits everything every other domain learned from day one. And makes every other domain smarter in return.

That is not a product roadmap. That is a network effect operating at the knowledge layer.

Fourteen of twenty-one architectural capabilities are running in this demo. Four controlled experiments and a public repository backing every architectural claim. Every new IOC source that connects makes every existing source more valuable — network effects at the data layer, before the decision loops even start.

The moat is not the model. Any competitor can swap the model. The moat is the graph — the accumulated decision traces, pattern calibrations, fused intelligence from multiple sources, and organizational context that compound super-quadratically with every connected domain and every passing month. A competitor deploying today doesn't start six months behind.

They start at zero. And the gap is still widening.

See it. Then decide.

The demo runs in 15 minutes. A session with your own SOC numbers takes 30.

Watch the Loom: loom.com/share/b45444f85a3241128d685d0eaeb59379 (Update this link when Loom v2 is recorded — current link is v1.)

Book 30 minutes: Email arindam@dakshineshwari.net — I'll send a calendar link the same day. Bring your alert volume and analyst headcount and we'll run the ROI calculator live with your numbers.

I'm Arindam Banerji. I designed and built this system. The experiments are published, the code is public, and I'll walk you through any claim in this document — or let you break it. That's what running code is for.

Go deeper

Mathematical Framework & Experiments — The equations, four experiments, six charts, three failure modes, and seven design principles behind every architectural claim in this document: Cross-Graph Attention: Mathematical Foundation with Experimental Validation

Compounding Intelligence — Why compounding intelligence is structurally different from every other AI improvement pattern, and what it takes to build a system that actually gets smarter: Compounding Intelligence 4.0: How Enterprise AI Develops Self-Improving Judgment

† Average cost of a data breach: $4.44M — IBM Security, Cost of a Data Breach Report 2024