Stratum Journal

On agents, infrastructure,
and what comes next.

Thinking from the Stratum team on autonomous systems, the companies building with them, and the infrastructure they depend on.

Compliance · December 9, 2026 · 10 min read

The 2026 Reckoning

2026 was the first year AI governance became mandatory, not advisory. The Colorado AI Act took effect June 30. The EU AI Act's high-risk obligations followed on August 2. What the year taught organizations about the gap between AI capability and AI governance — and what 2027 requires.

Read essay →
Infrastructure · November 18, 2026 · 8 min read

The Compounding Return

AI infrastructure investment compounds over time in ways that raw capability does not. Organizations that built the memory, governance, and coordination layers in 2026 are now seeing returns that purely capability-focused deployments cannot replicate. What compounding AI infrastructure actually produces.

Read essay →
Research · November 4, 2026 · 9 min read

The Institutional Standard

Research institutions and knowledge-intensive organizations are deploying AI at scale without building institutional memory infrastructure. Every model interaction that produces no lasting context is a sunk cost. What an institutional standard for AI knowledge infrastructure actually requires.

Read essay →
Compliance · October 21, 2026 · 9 min read

The Governance Gap

Enforcement has started. The first wave of regulatory inquiries under the Colorado AI Act and EU AI Act is revealing a consistent pattern: organizations had governance intent but not governance infrastructure. The gap between them is measured in enforcement exposure.

Read essay →
Infrastructure · October 7, 2026 · 9 min read

The Fleet at Scale

Single-agent success doesn't translate to multi-agent fleets. The coordination layer that makes fleet deployments reliable — state sharing, failure isolation, accountability at scale — is different infrastructure from what makes individual agents capable.

Read essay →
Compliance · September 1, 2026 · 9 min read

The Evidence Standard

Regulatory compliance and compliance evidence are not the same thing. An organization can have processes, policies, and good intentions — and still fail an enforcement inquiry if what it produces doesn't match what regulators need to see. What AI compliance evidence actually looks like, and what infrastructure produces it.

Read essay →
Regulatory · August 4, 2026 · 10 min read

Brussels Day One

The EU AI Act's high-risk obligations took effect two days ago. For organizations deploying AI in HR, credit scoring, healthcare, education, and critical infrastructure across EU markets, this is the enforcement start. What the Act requires, where most organizations stand, and what the first 90 days of enforcement actually look like.

Read essay →
Regulatory · July 28, 2026 · 9 min read

The Retrofit Problem

Five days before EU AI Act enforcement begins, many organizations are attempting to retrofit compliance onto existing AI deployments. Three common approaches — logging retrofits, review step retrofits, documentation retrofits — each fail in predictable ways. What compliance-by-design actually requires, and why the distinction matters before your first enforcement interaction.

Read essay →
Regulatory · July 21, 2026 · 8 min read

The High-Risk Threshold

Twelve days before EU AI Act enforcement. Most compliance conversations focus on what compliant systems must do. Fewer focus on which systems must comply. The categorical test is not autonomy level or model sophistication — it is use case. And the perimeter is wider than most organizations have drawn it.

Read essay →
Regulatory · July 14, 2026 · 9 min read

The Compliance Window

The EU AI Act's high-risk obligations take effect August 2. Three weeks from now. For organizations deploying AI in HR, credit, healthcare, education, and other high-risk categories, the compliance window is not theoretical — it is closing. What the Act actually requires, why policy documents don't satisfy infrastructure requirements, and what compliant AI deployment actually looks like.

Read essay →
Logistics · July 7, 2026 · 8 min read

The Dispatch Problem

Every freight dispatch is a decision made on incomplete information. Your TMS has three years of carrier performance data. The dispatcher who just booked a load did not consult it — because the system was designed to store that information, not to deliver it at the moment of decision. That gap, accumulated over thousands of shipments, is the carrier intelligence deficit.

Read essay →
Regulatory · June 30, 2026 · 9 min read

Day Zero

The Colorado AI Act is now in effect. Not the threat of enforcement — the thing itself. This is what it requires, what most organizations haven't built, and what happens for organizations still running consequential AI systems without governance infrastructure.

Read essay →
Infrastructure · June 23, 2026 · 8 min read

The Memory Advantage

Two organizations deploy AI assistants. Same models. Comparable compute. At month one, output quality is roughly equivalent. At month twelve, one is dramatically more useful. The divergence is not capability — it is memory. And it compounds.

Read essay →
Operations · June 16, 2026 · 7 min read

The Solopreneur Gap

By the end of a typical workday, a solopreneur has interacted with five or six AI tools. One wrote the first draft. Another handled email. A third processed invoices. None of them know what the others did. The integration burden falls entirely on the one person who has no one to delegate it to.

Read essay →
Operations · June 9, 2026 · 8 min read

The Silent Failure

Your agent ran 47 times last week. No errors. Latency normal. No alerts fired. You don't know what it actually produced or whether any of it was wrong. AI agents fail differently from servers — not with crashes and timeouts, but with systematically wrong output that looks completely healthy to every monitoring signal you have.

Read essay →
Infrastructure · June 2, 2026 · 8 min read

The Handoff Problem

A task passes from one agent to another. The second agent starts from nothing. It has the output — the file, the data, the intermediate result. It does not have the context that produced it: why these metrics, what constraints were active, what was explicitly excluded. Handoff transferred bytes, not understanding. This is not a failure case. It is the default behavior.

Read essay →
Infrastructure · May 26, 2026 · 8 min read

The Knowledge Debt

Technical debt compounds silently until the system breaks. Knowledge debt works the same way — except it's your organization's institutional context, not your codebase, that degrades. AI agents amplify knowledge debt. Every undocumented assumption becomes a default. Every piece of institutional memory that exists only in someone's head is a liability the agent will eventually reveal.

Read essay →
Infrastructure · May 19, 2026 · 8 min read

The Context Problem

An AI assistant that knows everything about the world but nothing about you. Every session starts at zero. The blank-slate problem is not a model limitation — it is an infrastructure failure, and it compounds every time you work with a tool that cannot remember what it learned last time.

Read essay →
Logistics · May 12, 2026 · 8 min read

The Carrier Intelligence Gap

Every shipper has carrier performance data. Almost none of it has ever been used to make a decision. The gap between what logistics systems record and what operations teams can act on is not a data problem — it is an intelligence infrastructure problem.

Read essay →
Compliance · May 5, 2026 · 8 min read

The Audit Gap

System logs capture what happened. They cannot capture why the agent was authorized to act, what context it was operating in, or who bears accountability for the outcome. That gap — between operational records and accountability records — is where most AI deployments break under regulatory scrutiny.

Read essay →
Infrastructure · April 28, 2026 · 8 min read

The Execution Layer

AI models are capable. AI deployments are not. The gap between what a model can do in a demo and what it reliably does in production is an execution infrastructure problem — state persistence, task handoff, failure recovery, accountability. Most organizations are only now discovering it.

Read essay →
Security · April 21, 2026 · 9 min read

The Trust Layer

Authentication tells you who an agent is. Authorization tells you what it's allowed to do — and most AI deployments have no answer for that at all. The permission explosion, the delegation problem, and what the Colorado AI Act and EU AI Act are actually requiring.

Read essay →
Infrastructure · April 14, 2026 · 8 min read

The Coordination Layer

MCP standardizes how agents talk to tools. It does not standardize how agents remember what they did. The protocol layer and the memory layer are different problems — and the gap between them is where most fleet deployments break.

Read essay →
Finance · April 7, 2026 · 8 min read

The Financial Memory Gap

AI-first companies generate more financial signal per day than traditional businesses produce in a quarter. The tools built to surface that signal were designed for quarterly reviews and human-paced decisions. The result is a structural blindspot — and it's an infrastructure problem.

Read essay →
Governance · March 31, 2026 · 9 min read

The Accountability Layer

Regulators aren't asking whether your AI is capable. They're asking whether you can prove what it did, why it did it, and who was responsible. That requires a different kind of infrastructure — one most AI deployments aren't built with. Colorado AI Act: 91 days.

Read essay →
Infrastructure · March 24, 2026 · 8 min read

The Memory Problem

Every serious AI deployment hits the same wall — not capability, not speed, but memory. The agent forgets the institutional context that makes its output useful. Seven markets. One problem. Here's why solving it requires infrastructure, not prompting.

Read essay →
Research · March 18, 2026 · 9 min read

The Research Lab That Builds Agents

CS research groups studying multi-agent coordination, LLM memory, and human-AI teaming publish papers on shared context and knowledge handoff. Their own labs run on email threads and the institutional memory of the fourth-year PhD student. The irony is architectural — and the fix is the same one they'd recommend for any multi-agent system.

Read essay →
Research · March 18, 2026 · 9 min read

The Pharma Audit Problem

When a computational chemistry group's AI tool gets licensed to a pharmaceutical company, a documentation gap emerges that neither party anticipated. The lab's methods section was written to satisfy peer review. The pharma partner's regulatory environment requires something different — and the knowledge that bridges the gap walked out with the postdoc who graduated last spring.

Read essay →
Research · March 17, 2026 · 8 min read

The Memory Problem in AI Research Systems

Research labs — the people who design multi-agent systems — usually have the worst knowledge handoff infrastructure of any technical team. The agents they build know more about shared context than the teams that build them. This is an architecture problem, not a documentation problem.

Read essay →
Research · March 17, 2026 · 7 min read

The Lab Memory Problem

Research institutions lose half their working knowledge every 5–7 years — not because they fail to publish, but because the knowledge that drives decisions never gets written down at all. The problem is architectural, and electronic lab notebooks don't solve it.

Read essay →
Research · March 14, 2026 · 9 min read

The Methods Section Omits the Hard Part

A published methods section tells you what worked. It does not tell you what failed, why the protocol changed, which parameters were tested and discarded, or what the instrument quirk every researcher in the lab already knows. The gap between the methods-as-published and the methods-as-practiced is where most knowledge loss actually lives.

Read essay →
Research · March 6, 2026 · 8 min read

Why Your Commit History Isn't Lab Memory

Git tells you what changed. It doesn't tell you why the change was right for your material class, which three alternatives were ruled out first, or what failure mode the new approach sidesteps. Every computational lab uses version control. Almost none can reconstruct their own decisions from it.

Read essay →
Research · March 6, 2026 · 8 min read

The Discovery Campaign Problem

Your lab ran a computational discovery campaign. Ten thousand DFT calculations. The dataset is saved. What isn't: why those candidates, what the screening criteria encoded, which dead ends were already mapped. A dataset is not a decision trail — and labs are only good at keeping one of them.

Read essay →
Research · March 6, 2026 · 8 min read

The Overnight Job Problem

Your LAMMPS simulation ran for six hours and failed at 3am. You found out at 9am. The scheduler email told you it timed out; it did not tell you what the lab learned last October when the same potential file failed on a similar system. That context exists — it just isn't connected to the alert.

Read essay →
Research · March 5, 2026 · 9 min read

The Persistent Postdoc

Imagine a lab member who never graduated, never forgot anything, and had been in the lab longer than anyone else. That thought experiment defines the design target for persistent research AI — and explains exactly why session-based AI tools can't fill the role.

Read essay →
Infrastructure · March 7, 2026 · 8 min read

The Production Standard

Two camps in AI agent infrastructure are emerging: experimental systems built on crypto billing, and production systems built on cloud platforms. When your workflows depend on agents running reliably at 3am, the difference shows up in whether your credits vanish without a trace or your job just completes.

Read essay →
Research · March 6, 2026 · 8 min read

The Agent That Forgets

Your lab has deployed an AI agent. It runs jobs, answers questions, helps with the literature. And it forgets everything between sessions. An agent without persistent memory is a faster version of the problem you already have — the knowledge layer above the computation is still missing.

Read essay →
Research · March 6, 2026 · 7 min read

The Hidden Cost of Siloed Research

A lab that spans two departments has two bodies of knowledge that only connect inside the PI's head. The MSE group's DFT results and the CHBE group's synthesis insights exist in separate notebooks, separate conversations, separate people. The cost of not connecting them is measured in repeated work, missed connections, and integration overhead that scales badly.

Read essay →
Research · March 6, 2026 · 8 min read

The Grant Renewal Problem

NSF wants you to justify five years of methodological decisions. The OUTCAR files are there. The papers are there. The reasoning behind the choices that determined the results — which functional and why, which convergence threshold and what tradeoff it encoded, which approach was tried and abandoned — graduated with your postdocs. Writing the renewal is an act of reconstruction.

Read essay →
Research · March 13, 2026 · 7 min read

When Your Lab Is Also a Company

Some research groups have built tools that industry actually uses — licensed, spun into startups, deployed at pharmaceutical companies and national labs. That creates a knowledge management problem no tool has been designed for: the lab's knowledge and the company's knowledge diverge silently, from the day the first license is signed.

Read essay →
Research · March 6, 2026 · 8 min read

The ML Potential Training Problem

You trained the potential on 400,000 DFT configurations. The model fails on a new composition class. The person who assembled the training set graduated in May. The data curation decisions — which structures were excluded, which functional was chosen and why, what composition subspaces were left sparse — are gone with her.

Read essay →
Research · March 4, 2026 · 8 min read

The Tool With No Manual

Every computational lab eventually builds a tool nobody else has — a custom LAMMPS extension, a DFT workflow wrapper, an ML potential pipeline. The code is on GitHub. The reasoning behind the design choices, the failure modes discovered over two years, the system classes it silently breaks on: none of that is anywhere. The knowledge is embedded in the grad student who built it.

Read essay →
Research · March 4, 2026 · 7 min read

The Revise & Resubmit Problem

The journal sends back your paper in March. Reviewer 2 wants justification for the DFT settings on the oxide surface calculations. The postdoc who made those choices defended in December. You have the output files — OUTCAR, CONTCAR, the INCAR with two changed lines. You don't have the reasoning. This gap has a name.

Read essay →
Research · March 4, 2026 · 8 min read

What 500 VASP Calculations Don't Record

The POSCAR files are saved. The OUTCAR is saved. The INCAR is there. But the reasoning — why this functional for this material class, which three dead-ends preceded the working protocol, what the unexpected oxygen vacancy result meant — never made it into any file. When the postdoc who ran the campaign graduates, the visible structure stays. The strategy doesn't.

Read essay →
Research · March 4, 2026 · 7 min read

Your HPC Cluster Has Perfect Memory. Your Lab Does Not.

SLURM logs every job you've ever run — timestamp, nodes, exit code. Your lab can't tell you why that job was configured the way it was, what the convergence tests found, or what the postdoc who built the workflow was thinking. That is the reasoning layer. It is almost never captured, and it graduates with your students.

Read essay →
Research · March 4, 2026 · 8 min read

Why Research Wikis Fail (And What Actually Works Instead)

Every few years, a computational research lab starts a new wiki. Two years later it's dead. Not because of the wrong tool — because of three structural failure modes that no wiki solves by design: documentation as afterthought, structure optimized for navigation not synthesis, and the unsolvable maintenance problem.

Read essay →
Research · March 3, 2026 · 7 min read

Why 40% of PhD Research Gets Lost When Students Graduate

Every graduating PhD takes years of tacit laboratory knowledge with them. The VASP settings, the calibration quirks, the failed experiments, the reasoning behind every protocol change — none of it made it into a paper. This is not a documentation failure. It is an infrastructure failure.

Read essay →
Research · March 3, 2026 · 7 min read

The Literature Review Your Last Postdoc Already Did

A new grad student spends six weeks reviewing the MLIPs literature. The postdoc who left six months ago spent the same six weeks, found the same papers, noted the same dead ends. That synthesis — the opinionated map of the field — is gone. This is not a search problem. It is a knowledge continuity problem.

Read essay →
Research · March 3, 2026 · 8 min read

The Onboarding Tax

Every new researcher who joins a lab pays an onboarding tax: weeks of reconstructing institutional knowledge that already exists but isn't accessible. The tax is paid in full by every member, on every rotation. Here's what it costs and how it compounds.

Read essay →
Research · March 3, 2026 · 7 min read

HPC Jobs Don't Fail Loudly

At 3:47 AM a VASP job failed on node cn0142. Nobody found out until 9:15 AM — by then, 11 compute hours were gone and the wallclock reservation had expired. SLURM logs every exit code. It doesn't know that the FFT segfault only happens on the Rome nodes, or that the postdoc who solved it six months ago has already graduated. This is not an HPC problem.

Read essay →
Research · March 3, 2026 · 7 min read

Biomaterial Research Has a Protocol Memory Problem

A hydrogel synthesis protocol took three years to optimize. Your best postdoc defended in June. The parameters are in the paper. The reasoning — why that UV intensity, why that crosslinker ratio, which cell density worked for which scaffold geometry — is nowhere. It graduated with the postdoc who figured it out.

Read essay →
Research · March 3, 2026 · 8 min read

Your Lab's Institutional Memory Is Graduating This May

The README doesn't explain why ENCUT=520 eV for that class of perovskites. The wiki entry doesn't capture the three-month detour into a dead-end functional. The commits show what changed, not why. The student carried all of that in their head. They graduated. Now it's gone — and the same knowledge drain happens every spring in every computational lab in the world.

Read essay →
Research · March 3, 2026 · 8 min read

How Research Labs Actually Buy Software (And How to Sell to Them)

The PI is the entire buying committee. The money lives in grant accounts with expiry dates, startup packages, and PI discretionary funds — not a software budget line. What actually kills academic deals isn't price. It's data sovereignty concerns, IT approval queues, and student adoption failure. What took too long to learn about selling to research labs.

Read essay →
Research · March 10, 2026 · 8 min read

The First 90 Days: What ResearchOS Does in a Computational Lab

Day 1 you connect your lab's existing records. By month three, a new postdoc asks 'what did we try for MXene convergence?' and gets an answer drawn from two years of the lab's actual work. A concrete walkthrough of what happens week by week — not aspirational, what actually accumulates.

Read essay →
Research · March 10, 2026 · 8 min read

The Parameter Problem

Every VASP calculation has 40 parameters. The reasoning behind them — the failed convergence runs, the postdoc who tested the edge cases, the PI's validated choices for each material class — is stored nowhere. When the researcher who made those choices graduates, the lab makes them again from scratch.

Read essay →
Research · March 10, 2026 · 7 min read

Your Lab's Institutional Memory Is Graduating This May

Every spring, a PhD student defends and takes five years of accumulated reasoning with them — the ENCUT thresholds that work for each material class, the dead-end functionals to avoid, the institutional knowledge that never made it into any paper. Why documentation doesn't stop this, and what actually does.

Read essay →
Security · March 10, 2026 · 9 min read

Know Your Agent: The Security Framework AI Teams Are Missing

88% of organizations have had AI agent security incidents. More than half of all corporate AI agents run with no oversight or logging. The problem is architectural: organizations are treating agents like users. Here's the framework that changes that — and what we told NIST about it.

Read essay →
Regulatory · March 10, 2026 · 9 min read

Compliance Is Infrastructure

The Colorado AI Act takes effect June 30. The EU AI Act's high-risk obligations land August 2. Both impose technical requirements — logging, auditability, decision traceability — that cannot be satisfied with a policy document. Compliance is an infrastructure problem, and most AI deployments aren't built for it.

Read essay →
Operations · February 28, 2026 · 7 min read

The Fleet Problem

Every serious AI deployment eventually becomes a fleet problem. Multiple agents, continuous operation, human accountability, regulatory exposure. The tools built for solo experiments were never designed for this — and the infrastructure gap is becoming urgent.

Read essay →
Infrastructure · February 27, 2026 · 8 min read

Why Autonomous Agents Need Persistent Infrastructure

Most AI agents are amnesiac by design. They start fresh every session, lose their context the moment you close the tab, and can't accumulate the institutional knowledge that makes them genuinely useful. Here's why that's the core problem — and what solving it actually requires.

Read essay →
Market Analysis · February 27, 2026 · 6 min read

The Agent Infrastructure Gap

The AI tools market splits neatly into two categories: model providers and application wrappers. Almost no one is building the infrastructure layer in between — the persistent memory, execution environments, and cross-agent coordination that autonomous systems require to operate at scale.

Read essay →