What 500 VASP Calculations Don't Record
When a new postdoc joins a computational materials group, there is often a directory on the cluster with hundreds of VASP output files. Every OUTCAR is there. Every INCAR. Every POSCAR. The data is technically complete.
But there is no file that explains why ENCUT=520 eV for the MXene calculations and ENCUT=600 eV for the oxide supercells. No file that records the three months spent on PBE+U before the group switched to SCAN for the transition metal oxides. No file that captures what the unexpected oxygen vacancy result in batch 47 meant — whether it was an artifact or a real signal — and what the group decided to do about it.
That reasoning is the actual knowledge. And it is almost never in the files.
The structure of a DFT campaign
A computational materials discovery campaign has a visible structure and an invisible one.
The visible structure is the file system: directories organized by material system, calculation type, and date. Git commits for scripts. Jupyter notebooks with analysis. Plots. The narrative of what was tried and in what order can sometimes be reconstructed from this, with effort.
The invisible structure is the decision graph that connects all of it: why the first round of calculations used one functional and the second used another, which parameter combinations were tried and discarded before the convergence tests stabilized, how the unexpected result in one system changed the approach for the next. This graph exists nowhere. It lives in the mind of the researcher who ran the campaign.
When that researcher graduates, the visible structure stays. The invisible structure graduates with them.
A new student inheriting the project can see what was done. They cannot see why. The distinction matters, because the why is what determines whether the next thousand calculations are productive or redundant.
The three missing logs
In practice, there are three kinds of knowledge that DFT campaigns generate but never record:
The parameter rationale. Every INCAR is a set of choices: exchange-correlation functional, ENCUT, k-point density, smearing parameters, convergence criteria. Experienced researchers know that these choices are system-dependent in ways that aren't fully documented anywhere — the right settings for a metallic surface are different from the right settings for an insulating oxide, which are different again for a magnetic transition metal compound. The lab accumulates this knowledge over years of convergence testing. It is never written down explicitly; it's transmitted through apprenticeship and scattered across individual files.
The dead-end map. A discovery campaign is mostly dead ends. The approaches that didn't work, the functionals that gave spurious results for this material class, the structural relaxations that kept converging to the wrong phase — this is the most valuable knowledge a campaign generates, because it prevents the next person from running the same experiments. It is almost never documented. The new student doesn't know what was tried. They try it again.
The interpretation layer. Density of states plots and formation energy tables don't interpret themselves. The researcher who produced them has a running internal commentary: this peak is consistent with the literature value for this termination, this energy difference is within the known DFT error for this functional, this unexpected structure suggests we're seeing a phase transition at this composition. That commentary never makes it into the output files. It may make it into the eventual paper — stripped of the uncertainty, caveated for reviewers, translated into the passive voice of published methods.
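As a concrete illustration of what capturing the parameter rationale could look like, a lab might keep a small machine-readable sidecar next to each INCAR that stores each choice together with the reason it was made. This is a minimal sketch, not an existing tool: the file name RATIONALE.json, the function name, and the field layout are all hypothetical.

```python
import json
from datetime import date
from pathlib import Path

def record_rationale(calc_dir, params, reasons, author):
    """Write a rationale sidecar next to the INCAR in calc_dir.

    Each parameter choice is stored alongside the reason it was made,
    so the 'why' survives with the 'what'. The file name and schema
    here are illustrative, not a standard.
    """
    record = {
        "date": date.today().isoformat(),
        "author": author,
        "parameters": params,   # the choices themselves
        "rationale": reasons,   # why each choice was made
    }
    path = Path(calc_dir) / "RATIONALE.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record, indent=2))
    return path

# Example: record why ENCUT and the functional were chosen
# (values and reasons below are invented for illustration)
path = record_rationale(
    "Ti2C/convergence_tests",
    params={"ENCUT": 520, "functional": "SCAN"},
    reasons={
        "ENCUT": "total energy converged to <1 meV/atom in the ENCUT sweep",
        "functional": "PBE+U gave spurious magnetic moments for this system",
    },
    author="postdoc_a",
)
```

Even a record this small answers the "Why ENCUT=520?" question five years later, and because it is structured, it can be aggregated across a campaign rather than buried in one person's notes.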
The org-mode problem
Some researchers are acutely aware of this gap and try to solve it individually. The most rigorous keep research notebooks — structured logs of what was tried, why, and what was found. A small fraction use org-mode in Emacs or dedicated lab notebook software to maintain this record systematically.
These systems work well for individuals. They do not scale to groups.
A 15-person computational lab with five active researchers running parallel campaigns cannot maintain a coherent group-level knowledge record through individual notebooks. Each person captures their own reasoning in their own format. The synthesis — the lab's collective understanding of what has been tried, what works for which material class, what the dead-ends are — doesn't exist as a queryable artifact. It exists as the distributed mental state of the current group members.
When any one of them leaves, that fragment of the distributed state is gone.
The accumulation problem at scale
Consider a lab that runs a sustained computational program on MXenes over five years. The first postdoc figures out the convergence parameters for the Ti–C system. The second postdoc uses those parameters as a starting point and figures out why they don't transfer cleanly to Mo–N. The third postdoc, working on the oxygen-functionalized variants, discovers that the oxygen coverage changes which functional is appropriate.
All of this knowledge is real and hard-won. After five years and three postdocs, it represents years of compute time and many person-months of careful systematic work.
How much of it is accessible to the postdoc starting today?
In most labs: very little. The first postdoc's thesis is available. Some papers are published. But the intermediate knowledge — the decision rationale, the failed approaches, the interpretation context — is gone with the people who generated it. The new postdoc starts with the files and has to reconstruct the understanding.
# Inherited directory structure
MXene_campaign/
    Ti2C/
        convergence_tests/      # Why ENCUT=520? Unknown.
        surface_calcs/          # Which terminations were tried? See README.
        README.txt              # "See Zhao's thesis, Chapter 3." Zhao graduated 2023.
    MoN_system/
        v1/ v2/ v3_final/       # Why three versions? No notes.
    analysis/                   # analysis_cleaned.ipynb, analysis_final_really.ipynb

# Questions the new postdoc cannot answer from the files alone:
# - Why did we switch from PBE to SCAN for the oxides?
# - What did the unexpected vacancy formation in Ti2C_O mean?
# - Which k-point mesh matters for the metallic systems, and why?

What would actually help
The gap is not in the data. The gap is in the layer that connects the data to the reasoning that produced it.
What a research group actually needs is a system that accumulates the decision context alongside the calculations: why this functional was chosen for this material class, what convergence tests established the parameter choices, what the results in batch 47 meant in context. Not as a separate documentation task — as a byproduct of doing the research.
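One way to make capture a byproduct rather than a chore is to attach it to an action the researcher already performs: submitting the job. A thin wrapper around the scheduler's submit command could ask for one sentence of context and append it to a shared, queryable log. The sketch below is hypothetical — the wrapper, the log file name, and the record fields are invented for illustration, and ResearchOS's actual mechanism may differ.

```python
import json
import getpass
from datetime import datetime, timezone
from pathlib import Path

GROUP_LOG = Path("group_decisions.jsonl")  # hypothetical shared log location

def log_submission(calc_dir, why, tags=()):
    """Append one decision record to the group-level log.

    Intended to be called from a thin wrapper around the scheduler's
    submit command, so each batch of calculations carries one sentence
    of rationale at the moment it is launched.
    """
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": getpass.getuser(),
        "calc_dir": str(calc_dir),
        "why": why,
        "tags": list(tags),
    }
    with GROUP_LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

def search(term):
    """Grep-level query over the accumulated decision log."""
    if not GROUP_LOG.exists():
        return []
    return [
        json.loads(line)
        for line in GROUP_LOG.read_text().splitlines()
        if term.lower() in line.lower()
    ]

# Submitting batch 47 with one sentence of context (invented example):
log_submission(
    "MXene_campaign/Ti2C/surface_calcs/batch_47",
    why="Re-running O-terminated surfaces with SCAN; PBE vacancy "
        "formation energies looked anomalously low.",
    tags=["Ti2C", "SCAN", "vacancies"],
)
```

The point of the sketch is the interaction cost: one sentence at submit time, appended to a log the whole group can search, is a very different proposition from asking anyone to maintain a notebook after the fact.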
ResearchOS is built for this problem. It maintains persistent context across a research group's HPC workflows, captures decision rationale alongside calculation runs, and surfaces institutional knowledge when a new researcher needs it — without requiring any additional documentation effort from the people running the calculations.
The goal is not to replace the files. It is to make the invisible structure of a DFT campaign as persistent as the visible one.
Five years from now, a new postdoc will join your group and inherit your current campaign. The OUTCARs will be there. The INCARs will be there. The question is whether they will also be able to answer: why was it done this way?
Probe / ResearchOS
ResearchOS is in design partner trials with computational materials and chemistry labs at R1 universities. If your group runs VASP, LAMMPS, or Quantum ESPRESSO campaigns and has felt the knowledge continuity problem described here, we'd like to hear from you.
Request early access →