StratumJournal
March 6, 2026 · Probe / ResearchOS

The Discovery Campaign Problem

Your lab ran a computational discovery campaign. Ten thousand DFT calculations. Three months of cluster time. The results are on disk — energies, band gaps, reaction barriers, whatever you were computing. The dataset is clean. The analysis is documented in the paper.

What is not documented: why those ten thousand candidates. How the initial pool was constructed. Which screening criteria were applied at each funnel stage, and who decided them. Which candidates looked promising at first and were abandoned, and what they revealed before you moved on. How the exploration strategy changed in month two when the first batch came back with unexpected results.

The calculations are saved. The scientific judgment behind them is not.


A dataset is not a decision trail

This distinction matters more than it appears to. A discovery campaign is not just a computational experiment — it is a search strategy. The value of the search is not only in what it found, but in what it ruled out, and why.

If your group spent two months exploring one chemical space before pivoting to another, that exploration carries information. It tells future researchers something about the territory that your paper's methods section cannot convey, because methods sections describe what worked — not the map of everything that didn't.

The practical consequence arrives two years later, when a new graduate student starts a related campaign. They will reconstruct search choices that your group already made. They will explore dead ends your group already mapped. They will converge on parameter settings that your group learned from experience. Not because they are inefficient, but because the experience was never stored in a form they could access.

Every discovery campaign your lab runs generates two outputs: the published dataset, and an unpublished map of the territory you searched. The map is usually discarded.

Where the knowledge lives

The decision trail from a discovery campaign does not disappear immediately. It lives in a few places, for a while.

It lives in the group meeting where the exploration strategy was debated. In the Slack thread where a postdoc explained why they filtered on that property threshold. In the PI's memory of the candidate that almost worked before it turned out to be synthetically inaccessible. In the annotated spreadsheet a graduate student kept for the first month before the project got too large to track that way.

These repositories decay at different rates. The annotated spreadsheet survives until the student graduates and doesn't transfer their files. The Slack thread is findable for a year or two, then buried. The group meeting discussion evaporates the same week. The PI's memory persists, but is not queryable by anyone else.

What never exists, in most labs, is a persistent record that synthesizes these sources into something a new researcher can actually use to understand why the campaign unfolded the way it did.

The scale problem

The problem compounds with scale in two directions.

First, the larger the campaign, the more decision points it contains, and the harder it is to reconstruct any individual one. A group that runs one hundred DFT calculations might be able to reconstruct the rationale for most of them. A group that runs ten thousand almost certainly cannot — the decisions were made across months, by multiple researchers, and no one person holds the complete picture.

Second, the more automated the campaign, the more invisible the decisions become. Automated screening pipelines are faster than manual ones, but they also make it easier to forget that someone chose the screening criteria, and that those choices encoded scientific judgment that future researchers might want to understand or modify. When the automation is doing the work, it is easy to stop thinking of the configuration choices as decisions at all — and therefore to stop recording them.
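One way to resist that drift is to treat each filter threshold as a recorded decision rather than a bare number in a config file. The sketch below is illustrative only — the names, people, and rationale text are hypothetical, not drawn from any real pipeline:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ScreeningCriterion:
    """A filter threshold plus the judgment behind it (illustrative only)."""
    name: str
    predicate: object      # callable: candidate -> bool
    set_by: str            # who chose the threshold
    rationale: str         # why this value and not another
    decided_on: date

# Hypothetical example: a band-gap filter like the one in the campaign above.
band_gap_filter = ScreeningCriterion(
    name="band_gap_min",
    predicate=lambda c: c["band_gap_eV"] > 1.2,
    set_by="postdoc A",
    rationale="1.2 eV chosen to exclude candidates below the target "
              "absorption edge; see group meeting notes, month 1",
    decided_on=date(2025, 3, 14),
)

candidates = [{"id": 1, "band_gap_eV": 1.5}, {"id": 2, "band_gap_eV": 0.9}]
passed = [c for c in candidates if band_gap_filter.predicate(c)]
```

The threshold still does its job in the pipeline, but the `set_by` and `rationale` fields travel with it — so two years later the question "who set this filter and why" has an answer on disk rather than in someone's memory.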

# What gets stored in a discovery campaign
STORED:    inputs/candidates_initial.csv        (10,847 candidates)
STORED:    outputs/dft_results_batch01.json     (3,200 calculations)
STORED:    outputs/dft_results_batch02.json     (4,100 calculations)
STORED:    outputs/dft_results_batch03.json     (2,900 calculations)
STORED:    analysis/final_candidates.csv        (47 promising hits)
STORED:    paper/methods.tex                    (what worked)

NOT STORED: Why the initial pool was 10,847 and not 20,000
NOT STORED: Who set the band gap > 1.2 eV filter and why that threshold
NOT STORED: The 8 candidates from batch01 we almost followed up on
NOT STORED: Why we switched functionals between batch02 and batch03
NOT STORED: The conversation where we decided to stop at batch03

When provenance becomes liability

For most academic labs, the loss of campaign provenance is a productivity cost — work is duplicated, knowledge is rebuilt, new students take longer to become effective than they should. Expensive, but not critical.

For labs whose work crosses into commercial applications, the stakes are different. A machine-learning potential trained on a curated dataset gets licensed to an industrial partner. The partner's quality assurance team asks: which configurations were excluded from the training set, and why? The honest answer is that this decision was made by a postdoc who is now at a national lab, in a series of filtering choices that felt obvious at the time and were never written down.

For computational drug discovery groups, negative data has compounding value — a candidate that failed in year one might be exactly the control compound needed for a study in year four. If the reasons for failure were not captured, the compound gets re-synthesized, or re-computed, or the insight is simply lost.

In both cases, the gap is not a documentation failure in the conventional sense. It is an infrastructure failure: the lab never had a system capable of capturing the kinds of decisions that discovery campaigns generate.

What adequate infrastructure would look like

Capturing discovery campaign provenance is not primarily a writing problem. PIs and graduate students are not going to produce comprehensive rationale documents for every screening decision in a ten-thousand-calculation campaign. The overhead would be prohibitive, and the timing is wrong — by the time anyone could write up the rationale, the decision context is already stale.

What works is a system that captures decision context where it is generated: in the conversations where strategy gets debated, in the lab notebooks where initial observations are recorded, in the group meeting notes where pivots get explained. A system that understands which calculations correspond to which stage of which campaign, so that when a new researcher asks two years later what happened with a particular chemical space, there is something to retrieve.

The goal is not a perfect record of every decision. It is a queryable record sufficient to answer the questions that new researchers will actually ask: why did we stop exploring this direction, what failed in that batch, what would we do differently if we ran this campaign again.
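To make this concrete, here is a minimal sketch of what such a queryable record might look like. This is not ResearchOS's actual data model; every name is hypothetical, and the answer text is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    campaign: str
    stage: str        # e.g. "batch03"
    question: str     # the question a future researcher would ask
    answer: str       # the rationale, captured while it was fresh
    source: str       # where the context came from (meeting, thread, notebook)

# Illustrative entries; the questions echo the "NOT STORED" list above,
# but the answers here are invented for the sketch.
log = [
    Decision("oxide-screen-2025", "batch03",
             "Why did we stop at batch03?",
             "Hit rate fell sharply in batch03; remaining pool judged exhausted.",
             "group meeting notes, week 11"),
    Decision("oxide-screen-2025", "batch02",
             "Why switch functionals between batch02 and batch03?",
             "Batch02 gaps disagreed with reference data for the top hits.",
             "Slack thread, #dft-campaign"),
]

def query(log, campaign, keyword):
    """Return decisions for a campaign whose question mentions the keyword."""
    return [d for d in log
            if d.campaign == campaign
            and keyword.lower() in d.question.lower()]

hits = query(log, "oxide-screen-2025", "functionals")
```

The point of the sketch is the shape, not the storage: each entry pairs a question a future researcher will actually ask with the answer as it existed at decision time, and a pointer back to the original context.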

ResearchOS maintains exactly this kind of context across a lab's computational work. Not by requiring additional documentation effort from researchers, but by synthesizing the context that already exists — in conversations, in lab records, in the accumulated history of the research — into something that persists when people leave and surfaces when the next campaign begins.


Your next discovery campaign will start by making choices. Some of those choices will reconstruct knowledge your lab already accumulated the hard way. The question is whether that reconstruction is inevitable — or whether the knowledge from the last campaign is still somewhere you can find it.


ResearchOS is in design partner trials with computational labs running large-scale discovery campaigns. If your group has felt the cost of campaign provenance loss — in rework, in onboarding time, in answers you couldn't give a collaborator — we'd like to hear from you.
