The Terafactory Age

The present

An autonomous lab now turns a model’s proposed targets into finished compounds in seventeen days: it realized 41 compounds from 58 model-proposed targets (Szymanski et al., Nature 2023; a 2026 Author Correction walked back the novelty claim, clarifying the targets were new to the prediction platform rather than necessarily new to science, after challenges to the structural characterization), with the upstream model predicting millions of new inorganic crystals, about 380,000 of them stable (Merchant et al., Nature 2023). That is a live precedent for model-to-bench loops, not evidence that all scientific execution can be compressed at the same rate. A graduate student runs four cell-painting experiments before lunch; an AI agent drafts six hypotheses by eight in the morning. Every result lands somewhere private, and none of it updates a shared record, because no shared record exists. The factories that will run this at industrial scale are being built now, by whoever ships first.

This is the scientific present. Generation has gotten cheaper than institutional absorption. Models draft hypotheses, protocols, and experiment plans faster than wet labs and review systems can test them. That gap is the structural defect this essay follows. The explosion already happened, and it happened in generation: a model proposes more candidate compounds and protocols in a month than a field can run in a year. Generation was never the bottleneck. The missing layer is the writeback that carries forward what survives.

The terafactory is what physical scientific execution looks like when it reads from and writes back to a shared record: the place where corrections reach instruments, factories, capital, regulators, and the physical world. This essay asks who gets to build that body, and on what terms.

The first proof is smaller than the world described here: one bounded corridor where a lab result writes back to public state and changes what a foundation, hospital partner, or manufacturing team does next. It does not move a patient-facing or IND-enabling decision by itself; it produces provenance that can be included in a future packet while agency authority and clinical governance remain outside the engine. If that loop closes once, the gigafactory stops being metaphor. If it closes ten times, gigafactories compose into the terafactory. If it does not close, the agents that arrive write activity that does not become knowledge, and the factories that get built are isolated gigafactories owned by whoever shipped first.

Fig. 01. Minimum viable body. The first proof is not a full terafactory. It is one closed loop: frontier, experiment, handoff, partner, writeback. The writeback closes back to the frontier.

Fig. 02. The body, in four layers. The body is four coupled layers: state, runtime, network, and physical execution. Each reads from the layer below and writes back to it; each can also fail or be captured separately.

Imagine the architecture composed and operating. A null result in Boston weakens a downstream target hypothesis at three programs in Singapore by the end of the week. A wildlife sample routes from a regional surveillance feed into a primer-production line before the second sampling band confirms an outbreak signal. Neither scene turns on a breakthrough alone. Each requires the same thing: state that can command action and action that writes back.

The closed ingredients are already visible: proprietary corpora, internal agent workflows, private wet-lab partnerships, and model access that reaches labs before it reaches public institutions. The open version is a sketch, a few repositories, and a small coalition of funders who understand that this category may be settled before most of science notices. Infrastructure arguments often arrive before the institution exists: Bush’s Science: The Endless Frontier (1945), Engelbart’s “Augmenting Human Intellect” (1962), and Berners-Lee’s CERN proposal (1989). The body argument is the same genre at industrial-scientific scale.

This essay is a fork story. One branch gives science a public body that reads from shared state and writes back to it; the other gives the same capability to closed institutions first, then asks everyone else to negotiate for access after the registries, compilers, and credentials already belong to someone. The dates, names, and scenes below are staged to stress-test that fork, not to forecast it. Slip them five or ten years and the pressure eases, but the fork holds: unless public capital, review authority, and open registries move early, the private bodies form first, and whoever first makes execution convenient decides where scientific memory lives.

The fork. The same physical capacity can become public infrastructure or a closed stack. The difference is whether execution writes back to shared state.

2026

The baseline

In 2026, the parts of a scientific body exist. None compose. Each layer of the architecture has a working precedent in one domain or another, but the layers do not yet connect; each works for its own community without the others reading from it.

Durable shared layers already exist; what they lack is the ability to act. The PDB, Crossref, Materials Project, Nextstrain, GISAID, ClinicalTrials.gov, UK Biobank, and All of Us all coordinate scientific artifacts, identifiers, cohorts, or changing records across competing actors. Reference shapes: Protein Data Bank; Crossref; Materials Project (Jain et al., APL Materials 2013); Nextstrain (Hadfield et al., Bioinformatics 2018); GISAID; ClinicalTrials.gov; UK Biobank; and All of Us. They show durable shared records, not a full state-runtime-body loop. None of them acts. The primitive in 2026 is still the artifact, not the state transition; the capacity to act on a change across institutions does not yet exist.

Runtime arrived last and fastest. Robotic chemistry sites already run live model-to-bench loops at speeds no graduate cohort could match, and multi-agent platforms generate hypotheses, search for contradictions, plan experiments, and chain tools faster than wet labs can test them. Lu et al., “The AI Scientist” (arXiv 2408.06292, 2024); the v2 follow-up (2025) reported an AI-generated paper accepted to a workshop. FutureHouse describes PaperQA2 as a literature-search agent benchmarked against difficult scientific retrieval tasks. Google Research describes AI co-scientist as a Gemini-based multi-agent collaborator for hypotheses and research proposals. Their drafts land in private logs and transient context windows. A scientific body would have them deposit instead.

Collective intelligence has already closed real frontiers at human scale: Polymath solved the density Hales-Jewett theorem through open blog collaboration in seven weeks with more than forty contributors. Gowers & Nielsen, “Massively collaborative mathematics”, Nature 461:879-881 (2009). Launched on Tim Gowers’s blog in January 2009, the first proof that public-blog collaboration could close research-frontier mathematics problems. Nielsen’s Reinventing Discovery (Princeton 2011) is the longer treatment. Galaxy Zoo (Lintott et al., MNRAS 2008) and Foldit (Cooper et al., Nature 2010) are the canonical adjacent precedents.

The lesson is structured contribution, not crowd virtue. Distributed work becomes scientific work when tasks are modular, feedback is immediate, contribution units are structured, and reputation is legible. Inside an operating writeback layer, a contributor can propose a state transition, annotate evidence, replicate a run, challenge a scope condition, add calibration metadata, audit an agent, or route a task. Most contributions stay noncanonical; some become evidence; a very few become accepted state.

Physical execution is the missing category. Labs and CROs exist; what they don’t do is run continuously against shared scientific state the way a fab runs against shared design files. BARDA’s medical-countermeasure capacity is the closest live precedent, and it still does not run as a public writeback body. BARDA underwrites medical countermeasure manufacturing capacity through cooperative agreements (medicalcountermeasures.gov/barda). A frontier-infrastructure BAA is BARDA-shaped, extended from countermeasures to chronic-disease and surveillance corridors, with public substrate writeback required as a deliverable rather than added as a side effect. The body argument requires building the category.

RECOVERY is the proof that consolidation works under pressure. A single shared protocol, one ethics path, one EHR backbone, in a national health system: more than 40,000 participants across 185 UK sites, and a result that changed care worldwide within a hundred days. RECOVERY Collaborative Group, NEJM 2021: dexamethasone reduced 28-day mortality by roughly one third in ventilated patients across 176 UK hospitals. UKRI describes the RECOVERY platform as the world’s largest clinical trial into COVID-19 treatments, with more than 40,000 participants across 185 UK sites (UKRI, updated 2024). The body has to recover that coordination without an NHS to anchor it, by signed-finding portability and grant conditions that align incentives across institutions.

Intelligence is one lever among many. Experiment speed, experiment cost, measurement, regulation, protocols, and human collaboration are the others, each compressing on a different timescale and through different mechanisms. McCarty, “Levers for Biological Progress”. The grounded counterargument to compute-only forecasts: biology has many bottlenecks, and intelligence dissolves only one. AI alone, even at superhuman capability, dissolves only one of them; the rest require physical infrastructure, regulatory alignment, and institutional design. The body argument is what addresses these levers in concert.

The trajectory ahead might be fast: increasingly general AI agents in the late 2020s, superhuman research systems around 2030 in some forecasts, and embodied scientific execution following unevenly behind. Fast-path references include Aschenbrenner, “Situational Awareness” (2024); Amodei, “Machines of Loving Grace”; and Kokotajlo et al., “AI 2027”. They are timeline pressure tests, not premises the essay requires. Even in a slower scenario, the actors that first make execution easy will own the registries, credentials, calibration histories, and facility compilers unless the public version exists before convenience hardens into dependence.

Much visible capital is aimed at models, drugs, datasets, and platforms in 2026. Sovereign funds underwrite ports, pipelines, datacenters, and energy infrastructure. Real assets at the scale of battery factories or semiconductor fabs remain an analogy for science, not a balance-sheet category. ARPA-H funds translational programs, but its public portfolio does not yet present a coordinated body that reads from a substrate, executes physical action across federated sites, and writes results back to public state. ARPA-H’s public program portfolio describes programs led by program managers toward specific health-care challenges, with examples such as PARADIGM for distributed medical care and NITRO for osteoarthritis tissue regeneration. The public portfolio shows ambitious translational programs, but not a full state-runtime-body loop at gigafactory scale. The baseline is the last moment before the structure becomes hard to change.

2028

First moves

The first moves come after the capability gap becomes visible. Through 2027 that gap opened inside the frontier labs: agents integrated literature search, experimental design, protocol synthesis, and code into one continuous workflow run against proprietary corpora, while public models drafted faster than wet labs could absorb and no shared substrate held the failures or corrections. By 2028 the technology already overshoots what institutions know how to govern. The first moves are capital and policy, not technology. The open body does not begin with consensus. It begins with contracts.

ARPA-H, or a successor program, writes the first Frontier Infrastructure BAA. The bridge from today is one envelope that combines things existing agencies already know how to buy separately: translational program management, the surge capacity BARDA underwrites for countermeasures, the strategic manufacturing CHIPS funded for semiconductors, the bottleneck teams FROs already build, and the regulator-readable evidence packages an IND assembles. Mazzucato, The Entrepreneurial State (Anthem 2013). The case that “the state takes the risk; the private sector takes the credit.” The CHIPS and Science Act (HR 4346, 2022) extended the same posture to semiconductors with $280B authorized. Operation Warp Speed (Slaoui & Hepburn, NEJM 2020) compressed vaccine timelines through BARDA by running trials and manufacturing at risk in parallel. A frontier-infrastructure BAA is the same posture extended to physical scientific execution.

The scope is unusual, but the pieces are familiar: federated synthesis halls, autonomous protocol execution, accredited review, manufacturing handoff, and public writeback into a shared substrate. It also reserves public scientific compute for accredited frontier bodies through secure inference environments, eval-gated model access, incident reporting, and procurement terms that keep public science from living permanently on the labs’ weakest API tier.

The new envelope pays for the deposit pathway itself. Construction milestones release the first tranche; useful corrections that travel through the substrate release the rest. The capital scales by phase: a low-eight-figure substrate pilot, a pre-facility body pilot at the next order of magnitude, and a first standing body in infrastructure-class capital. The exact number varies by corridor; the order of magnitude is what’s load-bearing.

Public does not mean fully exposed. The public object is the canonical state transition and enough provenance to audit it (content hashes, reviewer signatures, context boundaries, evidence classes, regulator-readable packets), while raw clinical data, live pathogen details, and process IP live behind tiered access, delayed release, or trusted-reviewer rooms. The hardest case is failed-route topology itself: a sponsor may be willing to disclose that a mechanism weakened and which dependency moved, but not every branch of the search strategy that produced the failure. The compromise is public transition plus protected evidence, with release rules defined before the grant or procurement money arrives.
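The split between public transition and protected evidence can be made concrete with a small sketch. It assumes a SHA-256 content hash stands in publicly for evidence that stays behind tiered access. The `StateTransition` schema, its field names, and the helper functions are hypothetical illustrations, and the reviewer signatures are placeholder strings rather than real cryptographic signatures.

```python
import hashlib
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class StateTransition:
    # Hypothetical schema: every field name here is an illustrative assumption.
    claim_id: str
    old_status: str
    new_status: str
    evidence_class: str
    context_boundary: str
    reviewer_signatures: tuple  # placeholder strings, not real signatures


def content_hash(payload: bytes) -> str:
    """Hex-encoded SHA-256 digest of the protected evidence."""
    return hashlib.sha256(payload).hexdigest()


def public_record(transition: StateTransition, protected_evidence: bytes) -> dict:
    """Build the auditable public object: the transition plus a hash of the
    evidence. The evidence bytes themselves are never copied into the record."""
    record = asdict(transition)
    record["reviewer_signatures"] = list(transition.reviewer_signatures)
    record["evidence_sha256"] = content_hash(protected_evidence)
    return record


def matches(record: dict, candidate_evidence: bytes) -> bool:
    """A reviewer with access to the protected room checks evidence against the hash."""
    return record["evidence_sha256"] == content_hash(candidate_evidence)


transition = StateTransition(
    claim_id="claim-001",
    old_status="moderately supported",
    new_status="weakened",
    evidence_class="failed-replication",
    context_boundary="single cohort",
    reviewer_signatures=("reviewer-a", "reviewer-b"),
)
evidence = b"raw assay data held behind tiered access"
record = public_record(transition, evidence)
```

Anyone can read `record`; only a party holding the evidence can confirm it with `matches`. A real deployment would also need signed records and release rules, which this sketch leaves out.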

A sovereign wealth fund opens a Real Assets sub-mandate for scientific infrastructure in the same quarter. The money is priced like infrastructure because the risk is no longer only scientific; it is utilization, construction, governance, and uptime. The consortium holds the site in a public-benefit vehicle, federal cost-share covers first-loss technical risk, and foundations guarantee use in named disease corridors. If a corridor fails, the equipment, calibration registry, and signed state history remain public rather than reverting into a vendor’s private platform.

FRO incubators turn toward the components a public body depends on. Marblestone et al., Nature 2022, on focused research organizations as time-bound, milestone-driven teams that unblock specific bottlenecks and sunset into standing institutions. FROs are not a generic scaling vehicle for billion-dollar facilities; they are bottleneck-clearing primitives that build the load-bearing components a consortium then plugs together. The first FRO builds the open protocol compiler; the next, the calibration registry; the next, reviewer credentialing. In the scenario, the first compiler FRO starts before the facility does; otherwise the first gigafactory opens as a building with no public nervous system.

Foundation pools that previously underwrote narrative reports begin underwriting the maintenance layer: instrument calibration, protocol curators, frontier stewards. Patient-led foundations begin to test body-conditional language in their disease corridors. The clauses cannot bind non-grantees, but the principle is in the field for the first time: public capital should not pay for private scientific memory.

Regulators move in parallel, more cautiously than funders. FDA already expects chemistry, manufacturing, and control information inside IND submissions, and its real-world-evidence guidance has created a path for regulator-facing evidence histories to matter when they are fit for purpose. FDA’s IND CMC materials specify chemistry, manufacturing, and control information for drug substance, drug product, placebo formulation, labeling, and environmental assessment. FDA’s real-world-evidence guidance frames how real-world data and evidence can support regulatory decision-making when reliability and relevance are established. The body version does not replace these requirements; it makes protocol lineage, evidence provenance, and dependency updates easier to inspect alongside them. ICMJE’s 2005 trial-registration policy is the publication precedent: registration became a condition of serious clinical publishing. A future guidance might say translational submissions may include auditable scientific-state histories alongside trial data. It would not approve anything by itself. It would change what serious institutions expect to be able to inspect.

The body argument assumes scientific infrastructure can move at the pace prior infrastructure moved when state coordination, industrial capital, and a defined deliverable aligned. Patrick Collison, “Fast”: a live catalog of ambitious projects completed quickly. Empire State Building in 410 days. Pentagon in 491. Apollo from program initiation to a man on the moon in roughly eight years. Operation Warp Speed to an authorized vaccine in roughly eleven months from a standing start. The argument by accumulation: humans can build coordinated infrastructure on the timescale required when conditions align. None of the 2028 moves shifts the field on its own. They have to land in the same year, in roughly the same shape, before anything that follows becomes possible.

2030

The gap becomes undeniable

Between 2017 and 2019, major BACE-inhibitor programs in Alzheimer’s converged on a grim lesson through separate corporate pipelines: multiple programs failed for futility or showed cognitive worsening, often within months of each other. Verubecestat (Egan et al., NEJM 2019); lanabecestat (Wessels et al., JAMA Neurology 2020); atabecestat (Henley et al., NEJM 2019 preliminary; Sperling et al., JAMA Neurology 2021 final). Several other BACE programs (Eisai, Novartis, Pfizer, Amgen/Banner) were halted across the same window for related futility, risk-benefit, or worsening signals. The lesson was not identical in every molecule, dose, or population. The point is that negative evidence accumulated in parallel, company by company, without a shared interim-state surface that downstream programs could read. Many patients enrolled across programs before the lesson could be compared, scoped, and routed with enough force to change the next decision.

Superhuman research systems emerge inside the leading AI labs around 2030. The systems that arrived as agentic remote workers in 2027 cross into superhuman across research domains before public-tier models do. The lab-versus-public capability gap becomes a difference in kind, not degree: a different category of access rather than a faster version of the same tool.

What the inflection looks like from inside the public sector: a senior scientist at a university hospital opens her model-access dashboard in May 2030 and notices that answer quality on her research questions has plateaued. The public-tier models have not stopped improving; the gap to what the labs are running internally has widened past usability. She drafts a memo to her institution’s research VP. By summer, similar memos are circulating at most major research universities in the United States and Europe. The labs are no longer racing each other. They are racing a public sector without the infrastructure to compete, and the political case for a publicly underwritten body crystallizes in the same six months.

Meridian is the answer to the BACE failure and the first serious test of the body clause: can a public institution make physical action read from shared state before private stacks become default? Its first corridor is neurovascular Alzheimer’s translation, the same blood-brain-barrier amyloid frontier the pre-facility pilot opened, chosen because it is messy in exactly the right way and because the pilot’s signed writeback is already inheritable. Meridian only becomes financeable after that pre-facility pilot has produced signed writeback, a regulator-readable packet, named utilization guarantees, and an audited governance handoff. It breaks ground near Boston in late 2030. The ceremony is small: staged ground near a biomedical corridor, a folding table by the access road, a few dozen people in coats, the consortium’s first executive director speaking for ten minutes about what the next decade will produce if the build holds. No politicians, no press releases; by noon the construction crew is on the property.

The site is chosen for connection: hospitals, universities, clinical-trial networks, manufacturing talent, regulators, and enough political legitimacy to make the project survivable. Capital comes from the same forms society already knows how to underwrite for physical infrastructure, applied to a domain that has not yet had one. For comparable orders of magnitude, see Tesla’s Gigafactory Nevada ($6.2B invested by 2023) and TSMC’s $165B US semiconductor commitment. SpaceX’s proposed Grimes County Terafab sits at the next scale ($55B initial, up to $119B). Reference comparables for the giga-to-tera jump, not evidence that scientific facilities exist at that scale. Named corridors buy assay capacity, manufacturing handoff slots, and evidence packets; if volume misses, the consortium shrinks corridors before selling the registry. Meridian’s first build sits below those comparables and produces evidence rather than physical product. It is the first instance of a category that does not yet have a name in finance: a scientific gigafactory.

Meridian puts the translational loop under one operational roof. Its wings handle target validation, autonomous synthesis, perturbation biology, preclinical convergence, and GMP-grade manufacturing from the beginning. The clinical-trial network is not physically inside the forty-eight acres, but its evidence path is. Regional hospitals connect into Meridian’s substrate layer through audited endpoints. Trial programs read from the same frontier state as the synthesis floor.

2032

Closed bodies emerge

Meridian is still under construction in 2032; the first synthesis hall is twelve months from commissioning, the consortium’s reviewer-credentialing pipeline is half-staffed, and the open protocol compiler is months away from a release the regulators will inspect. The body argument is in motion, but nothing it produces is operational yet.

The closed bodies emerge faster. By 2032, the leading frontier AI labs are operating internal scientific stacks at scales no public infrastructure can match. Each runs a proprietary stack from substrate through compiler through synthesis-line orchestration, with selective biomanufacturing partnerships closing the loop. They keep the running frontier, failed routes, and dependency graph inside the boundary. Selective publication is press, not deposition. The structural prediction is that whichever frontier labs are operating frontier-scale internal compute and frontier-scale wet-lab partnerships in the early 2030s will follow this architecture. The closed-stack pattern is an inference from proprietary biology and pharma discovery platforms, not a claim that any 2026 platform already has the full body described here.

Inside one of these stacks, mid-2032, a research lead queries her lab’s internal scientific substrate against three years of internal experiments, exclusive commercial datasets, and a proprietary structural-biology corpus. The agent stack returns six ranked experiments; she approves three; the compiler emits the run, and a week later her frontier carries three corrections and one anomaly for human review. Nothing about this loop is visible outside the lab. The failed routes that taught the team the most never will be. The closed body works, with friction, on the lab’s own terms.

The gap widens through 2032 because the labs’ internal output compounds without leaking. The substrate fight had resolved in the late 2020s (deposit-or-don’t-publish norms held and the public corpus stayed open), but the equivalent fight over physical execution did not, and no norm forces a closed lab to deposit what its synthesis halls produce. The incentive is structural: the science org sits on the commercial P&L, and the search strategy that produced a failed route is worth more than any single finding. Output surfaces only indirectly, as drugs entering trials with closed mechanism documentation. Regulators receive the submission package and can ask questions, but cannot inspect the substrate that produced it. Rare-disease and pediatric-cancer foundations feel the asymmetry first, because their constituencies depend on small-population trials and shared-evidence networks they can no longer underwrite blind. A conditional-capital clause starts forming the same year, but without the federal grant-condition stack behind it the labs ignore it, and accumulate reviewer identity inside their own platforms before regulators understand that signer recognition is becoming infrastructure.

2033

Meridian operates

Meridian is built for inheritance before throughput. A factory operating on shared state reads from every prior failure, contradiction, and cohort observation in its corridors; most of that record was unstructured before 2028, and Meridian’s preconstruction work through 2031 and 2032 is to translate inherited evidence into substrate objects so a corridor begins at the field’s actual frontier rather than at a clean slate. A failure of the kind the BACE programs documented would weaken the dependent target hypothesis across every program reading against the same frontier within the same week, not the same decade.
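One way to picture a failure weakening every dependent program is a walk over a dependency graph. The sketch below is a minimal illustration under stated assumptions: the claim names, the confidence scores, and the flat `factor` discount are all invented for the example, and the essay does not specify how a real substrate would weigh a failure.

```python
from collections import deque


def propagate_weakening(dependents, confidence, failed, factor=0.5):
    """Scale the confidence of `failed` and of every claim downstream of it.

    dependents: maps a claim id to the claim ids that depend on it.
    confidence: maps a claim id to a score in [0, 1].
    factor: arbitrary illustrative discount, applied once per affected claim.
    """
    updated = dict(confidence)  # leave the caller's mapping untouched
    updated[failed] = confidence[failed] * factor
    seen = {failed}
    queue = deque([failed])
    while queue:
        claim = queue.popleft()
        for child in dependents.get(claim, []):
            if child not in seen:  # guards against cycles and double discounts
                seen.add(child)
                updated[child] = confidence[child] * factor
                queue.append(child)
    return updated


# Hypothetical frontier: two programs read against one target hypothesis.
dependents = {
    "target_hypothesis": ["program_a", "program_b"],
    "program_a": ["trial_a"],
}
confidence = {
    "target_hypothesis": 0.8,
    "program_a": 0.6,
    "program_b": 0.6,
    "trial_a": 0.4,
    "unrelated_claim": 0.9,
}
after_failure = propagate_weakening(dependents, confidence, "target_hypothesis")
```

Every program reading against the failed hypothesis is discounted in one pass, while `unrelated_claim` is left alone. The point of the sketch is the routing, not the arithmetic: the update reaches dependents because the dependency is recorded, not because someone remembered to tell them.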

The first synthesis hall opens in March 2033, before the rest of the facility is complete. A three-dimensional model of the first synthesis hall is online at /facility: reviewer floor above, assembly bays below, the substrate moving between them, anchored to the field journals in this essay. Fewer than one hundred lines, not four hundred. The early robots are less impressive than the workflow around them: protocols arrive as executable objects, plate maps generated from state, failed runs writing back to the record. Reviewers see when a model-proposed experiment was redundant, when a human-designed assay contradicted the prior state, and when a failure should weaken a claim.

A reviewer’s morning at Meridian in late 2033, six months after the first synthesis hall opened: she logs in at seven thirty. The substrate has accumulated proposed state transitions overnight, mostly from agentic platforms running against the neurovascular frontier, a handful from human researchers at affiliated sites. Her queue is filtered to transitions that touch findings she has signing authority on and dependencies whose confidence has shifted enough to warrant human attention.
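The queue filter in that scene can be sketched in a few lines. This is an assumption-heavy illustration: the `ProposedTransition` fields, the finding identifiers, and the `min_shift` threshold are hypothetical, and the sketch reads the two conditions as jointly required (a finding she can sign and a confidence shift large enough to matter).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedTransition:
    # Hypothetical fields for illustration only.
    transition_id: str
    finding_id: str
    confidence_shift: float  # signed change in a dependency's confidence
    source: str              # for example "agent" or "human"


def reviewer_queue(proposals, signing_authority, min_shift=0.2):
    """Keep proposals on findings the reviewer may sign whose confidence moved
    by at least `min_shift` in either direction, largest movement first."""
    relevant = [
        p for p in proposals
        if p.finding_id in signing_authority
        and abs(p.confidence_shift) >= min_shift
    ]
    return sorted(relevant, key=lambda p: abs(p.confidence_shift), reverse=True)


overnight = [
    ProposedTransition("t1", "finding-nv-01", -0.35, "agent"),
    ProposedTransition("t2", "finding-nv-01", 0.05, "agent"),   # too small a shift
    ProposedTransition("t3", "finding-onc-07", -0.50, "agent"),  # outside her authority
    ProposedTransition("t4", "finding-nv-02", 0.25, "human"),
]
queue = reviewer_queue(
    overnight, signing_authority={"finding-nv-01", "finding-nv-02"}
)
```

Of four overnight proposals, two reach her: `t1` and `t4`, in that order.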

When the narrowing she signs queues a trial-design change, the interim look still moves through the trial’s charter, statistical analysis plan, and data firewall before anyone touches the trial. The substrate accelerates detection without suspending clinical governance.

The reviewer floor is a physical constraint, not a metaphor. Most deposits do not matter, and most agent proposals should never reach a human. Meridian staffs triage the way a hospital staffs an emergency department: duplicates are clustered before the morning shift, high-dependency corrections jump the queue, and anything touching animals, trials, manufacturing, or safety requires named signers with liability-bearing institutions behind them. Without that staffed floor, the writeback layer becomes a louder version of the present literature.
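The triage rules above (cluster duplicates, let high-dependency corrections jump the queue, require named signers for sensitive categories) can be sketched as a small function. The deposit keys, the `fingerprint` used to detect duplicates, and the dependency counts are hypothetical; how a real floor would fingerprint duplicates is left open here.

```python
from collections import defaultdict

# Categories the essay says require named signers.
SENSITIVE_TAGS = {"animals", "trials", "manufacturing", "safety"}


def triage(deposits):
    """Cluster duplicate deposits, then order clusters by dependency count.

    Each deposit is a dict with hypothetical keys:
    'id', 'fingerprint' (equal for duplicates), 'dependents' (int), 'tags' (set).
    """
    clusters = defaultdict(list)
    for deposit in deposits:
        clusters[deposit["fingerprint"]].append(deposit)

    entries = []
    for members in clusters.values():
        representative = max(members, key=lambda d: d["dependents"])
        tags = set()
        for member in members:
            tags |= set(member["tags"])
        entries.append({
            "representative": representative["id"],
            "duplicates": len(members) - 1,
            "dependents": representative["dependents"],
            "needs_named_signers": bool(tags & SENSITIVE_TAGS),
        })
    # High-dependency corrections jump the queue; ties break on id for stability.
    return sorted(entries, key=lambda e: (-e["dependents"], e["representative"]))


deposits = [
    {"id": "d1", "fingerprint": "fp-a", "dependents": 2, "tags": set()},
    {"id": "d2", "fingerprint": "fp-a", "dependents": 2, "tags": set()},
    {"id": "d3", "fingerprint": "fp-b", "dependents": 40, "tags": {"trials"}},
    {"id": "d4", "fingerprint": "fp-c", "dependents": 0, "tags": {"literature"}},
]
morning_queue = triage(deposits)
```

Here `d3` goes first and is flagged for named signers because it touches a trial, while `d1` and `d2` collapse into one entry. Deciding which low-dependency entries never reach a human at all would be a further policy layer on top of this ordering.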

The neurovascular corridor named at break-ground is now operating: contradictory literature, animal models that fail to translate, biomarkers that drift across cohorts, and no single company that can maintain the frontier honestly. The conditions a shared record was built to carry. The corridor was chosen because it sits at a seam: vascular biology, human cohorts, and cerebrovascular replication meet there, and that is where a field stalls and someone rebuilds the missing bridge by hand every time.

In November 2033, a vascular-inflammatory mechanism receives evidence from a perturbation line at Charlestown, a human cohort at a partner academic medical center, and a failed APP/PS1 cerebrovascular replication at a contract-research partner in Cambridge. The original claim had treated the failed replication, a 2031 result, as orthogonal. The substrate composes the three sources into a single proposed state transition: the mechanism is real but holds only in APOE4-positive patients above sixty-five, not across the broader population the original claim covered. The transition is signed by two reviewers and a clinical liaison; the status on the broader target hypothesis shifts from “moderately supported” to “subgroup-restricted.” Over the next week, a foundation reroutes a grant cycle, a Phase II trial queues an APOE4-stratified amendment, and a review article in draft changes its conclusion. Nothing about this makes headlines. The factory has begun changing how the field knows what to do next.

Sentinel commissions in Singapore six months after Meridian’s first synthesis hall opens, anchored by a different sovereign fund and the WHO Foundation rather than ARPA-H. Its remit is pathogen surveillance rather than chronic-disease translation, its footprint smaller, and its reviewer roster rotates through four time zones. By December 2033, its substrate is reading live feeds from Manila, Yunnan, and Bangkok wastewater networks, waiting for a real signal.

What Meridian sunsets into

Meridian is a standing institution from day one, so it needs governance at construction time. FROs can sunset cleanly because their scope is bounded and their handoff target is named at inception; Meridian has to absorb FRO-built components into something durable. The model is Crossref: a non-profit that has held DOI, citation, and retraction infrastructure across competing publishers for over two decades, with elected technical seats, member-institution governance, and multiple independent implementations. Meridian’s version of sunset is a transition from FRO-incubated components (open compiler, calibration registry, orchestration kernel, reviewer credentialing) into a consortium of the Crossref kind, running Meridian as federated public infrastructure: member institutions hold seats, technical staff is elected, patient-led foundations hold reserved governance roles, and the operational charter is published and forkable. None of this is automatic; the 2028 BAA must require it as a deliverable rather than hoping for it as an emergent property.

2034

The fork

The substrate fight resolves first, through grant conditions and patient-led pressure on the disease frontiers that matter most. The body fight resolves later, on different terms.

By 2034, the closed-body pattern has hardened: private stacks write into state that never leaves the corporate or sovereign boundary. The question shifts from whether closed bodies work to where capture happens.

Inside the largest closed lab by mid-2034, the argument is no longer hypothetical from their perspective. It is the architecture they are already running, against an internal frontier that covers most major target classes and a biomanufacturing partner producing GMP-grade material for their own trials. The only remaining question is whether the public sector builds a competing version.

A patient-led foundation tries to schedule a discriminating synthesis run on Meridian’s public hall. The hall is open. The run needs a protocol compiler, and the production-grade compilers all sit inside proprietary stacks. The foundation eventually runs the experiment weeks late, after a third party rebuilds enough of an open compiler to clear the queue. The reimplementation does not, by itself, win anything: until the federated identity layer recognizes the new signer, the run executes but its result lands in a namespace the regulators do not read. Forks of nominally open infrastructure die at the credential boundary, not the code boundary.

Capture, when it comes, will not look like a closed compiler. It will look like an open one whose orchestration sits inside someone else’s stack. On one side is the tooling: the scheduler, the clinical-trial connector, the manufacturing handoff system, the calibration registry. On the other is the social layer: whose attestation counts, whose calibration log other facilities trust, whose reviewer credential a regulator recognizes. Hashimoto, “Ghostty Is Leaving GitHub” (2026), names the pattern in software: Git was never the captured layer; the GitHub-owned collaboration tools above it (issues, PRs, Actions, reviews, status, social context) were, and the moat extends to contribution graphs, stars, followers, and reviewer reputation. The body’s analogue is the orchestration tools and the identity registry that decides whose signature on a state transition is canonical. Whoever owns orchestration plus identity owns the layer that matters, even if the compilers underneath are nominally open.

A coalition of patient-led foundations and disease-specific funders publishes a body clause: their capital is conditional on synthesis, perturbation, and clinical writeback landing in audited public state at the gigafactory boundary, rather than publication alone. What participants can keep behind trusted review or regulator escrow is generous (patient-level records, trade-secret route topology, sensitive protocols), but what must travel is the state transition, its provenance, the dependency movement, the signer, and the calibration record. When foundation capital, federal grants, hospital enrollment, and regulator-readable inspection all ask for the same packet, compliance becomes cheaper than refusal.
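The split between what must travel and what may stay in escrow can be pictured as a packet check. The sketch below is illustrative only: the `EvidencePacket` class, its field names, and the `missing_travel_fields` helper are assumptions made for this example, not a schema the body clause specifies.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical field names; the clause names five items, not a schema.
TRAVEL_FIELDS = (
    "state_transition",
    "provenance",
    "dependency_movement",
    "signer",
    "calibration_record",
)


@dataclass
class EvidencePacket:
    # The five items the clause says must travel.
    state_transition: Optional[str] = None
    provenance: Optional[str] = None
    dependency_movement: Optional[str] = None
    signer: Optional[str] = None
    calibration_record: Optional[str] = None
    # Items that may stay behind trusted review or regulator escrow.
    escrowed: dict = field(default_factory=dict)


def missing_travel_fields(packet: EvidencePacket) -> list:
    """Return the required fields that are absent or empty."""
    return [name for name in TRAVEL_FIELDS if not getattr(packet, name)]


# Example packet with made-up identifiers.
packet = EvidencePacket(
    state_transition="finding-A: supported -> contested",
    provenance="run-0001",
    signer="reviewer-1",
    escrowed={"patient_level_records": "held in regulator escrow"},
)
print(missing_travel_fields(packet))
```

In this sketch the escrowed material never affects compliance; only the five travelling fields are checked, which mirrors the clause's generosity about what can stay private.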

Closed-platform vendors call the clause unworkable. Some of their largest grantees switch tracks anyway. Foundation capital alone cannot bind sponsors who do not depend on it, so NIH, ARPA-H, and BARDA conditions stack on top, and regulators begin to accept state histories as inspectable support for submissions. What a regulator can inspect is not the compiler source but the attestation log: every state transition the sponsor read against, every signer who endorsed it, every calibration record consulted. A submission whose decisive scientific-state history resolves to a registry the regulator cannot query becomes harder to rely on, not automatically invalid.
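A regulator-side inspection of an attestation log might look like the following sketch. The entry layout, the registry names, and the `inspect_history` function are hypothetical, chosen only to illustrate that an unresolvable entry weakens a submission without voiding it.

```python
def inspect_history(history, queryable_registries):
    """Classify a submission's state history by whether each entry can be queried."""
    unresolved = [
        entry["transition_id"]
        for entry in history
        if entry["registry"] not in queryable_registries
    ]
    return {
        # An unresolved entry makes the history harder to rely on;
        # it does not mark the submission invalid.
        "status": "harder_to_rely_on" if unresolved else "inspectable",
        "unresolved": unresolved,
    }


# Made-up log entries: transition, signer, calibration record, host registry.
history = [
    {"transition_id": "t-001", "signer": "reviewer-1",
     "calibration": "cal-17", "registry": "public-hub"},
    {"transition_id": "t-002", "signer": "reviewer-2",
     "calibration": "cal-18", "registry": "sponsor-private"},
]
report = inspect_history(history, queryable_registries={"public-hub"})
print(report)
```

The function returns the unresolved transition identifiers rather than a pass or fail flag, so a reviewer can see which part of the history sits behind a boundary.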

Open compilers do not catch up to closed ones in months, and they never quite catch up on every operational axis at once. Maintainer teams have to be funded for years. What changes is not parity but the floor: an open compiler exists, is credibly maintained, integrates with a federated identity layer, and ships with attestation tooling regulators recognize. That is enough to make the body clause enforceable. Regional hospital networks decline to enroll patients in trials whose synthesis lineage is private to the sponsor. A second sovereign fund signals it will require open orchestration in any infrastructure it backs. The fork resolves frontier by frontier: neurovascular disease first, then rare disease and pandemic surveillance, then materials and agriculture.

The fork’s first public proof arrives in early 2035. Before the signal appears, a jurisdictional compact is already in place. The compact does not harmonize the world’s regulatory regimes; it defines what is allowed to travel across them: sample-sovereignty terms, data-localization boundaries, export-control review, dual-use committee jurisdiction, benefit-sharing conditions, and regulator-readable evidence packets. Without that compact, the network would be a technical diagram with no legal corridor.

A wastewater signal appears in a regional surveillance feed in the early evening, local time. Viral load is elevated against background, the sequence pattern is partial, and a respiratory-pathogen index has lifted three days running. The institutional shape that anticipated this architecture is IFP’s 2025 “Scaling Pathogen Detection with Metagenomics” proposal in the Launch Sequence collection, which sketches a national wastewater-plus-metagenomics network sized to detect novel respiratory pathogens at the regional level on a multi-day clock. The proposal names the sensor and the institution; what it does not name is the cross-jurisdictional state layer that lets a Manila wastewater readout, a Yunnan wildlife sample, and a Cape Town manufacturing slot resolve to the same proposed transition. The substrate is the missing horizontal underneath that vertical.

In the network Sentinel has built, the substrate checks the pattern against wildlife, wastewater, hospital, and agricultural feeds in adjacent regions. It finds a wildlife sample collected three days earlier whose sequence is close enough to create a proposed state transition in Sentinel’s respiratory-spillover frontier. The imperfect match is enough to escalate.

The on-call reviewer acknowledges in ninety seconds. She sees the Manila wastewater readout and the Yunnan wildlife sample on the same surface, with their uncertainty bounds, the assay’s false-positive history, the travel-corridor probabilities, and the readiness of primers, constructs, and biomanufacturing slots downstream. The system does not declare an emergency. It proposes three actions: expanded sampling, diagnostic primer synthesis, and candidate countermeasure preparation under a low-distribution threshold.

At fourteen minutes, she approves containment. The substrate matched in milliseconds, and the rate limit on the response is legitimacy rather than cognition; pretending otherwise would corrupt the review. The agents have already simulated the cascade across thousands of branches and rank-ordered countermeasures by expected harm reduction. Her signature makes the resulting action politically and legally accountable.

Sentinel’s primer-production wing starts synthesis within the hour because primer work is inside the pre-authorized response envelope. Candidate vaccine components are pre-staged across the network, but no distribution path opens without the public-health authority, biosafety review, and emergency-use conditions that govern the jurisdiction. The Cape Town biomanufacturing partner receives the manufacturing queue. Meridian receives an immunology review request because one candidate construct touches a pathway already under study in a chronic-disease program. A regulator-facing evidence packet begins compiling automatically, but no public announcement is made until the second sampling band confirms the signal.

Antigen design to first bench-confirmed candidate construct compresses to days, not weeks. That is the part the architecture collapses. Design, primer synthesis, candidate expression, and initial in-vitro readouts become continuous against shared state. “Bench-confirmed” here is narrow: a construct that expresses cleanly and binds the predicted epitope, not yet immunogenic in animals, not yet safe across subgroups, not yet scalable at yield. BSL-3 intake, contamination workup, and cross-jurisdictional material transfer remain serialized human work, and they are the slow front end of any response.

The slow floors stay slow. Animal-model immunogenicity, dose-finding, fill-finish, cold-chain qualification, biosafety review, and local politics still serialize the response. The substrate routes and records; the factories synthesize and prepare; human institutions still decide when and how to intervene.

The intercept is small, almost invisible from outside; the second sampling band returns negative, the containment cascade stands down, and the regulator-facing packet is sealed unread. But the inside of the system has changed. The biomanufacturing partner in Cape Town has become the production node for non-Northern populations. Meridian contributes immunology and clinical-safety state. The intercept becomes the first public proof that the terafactory compounds. No single gigafactory could have done this; what acted was the network they composed into.

Fig. 03. A federation of frontiers. The substrate's eventual shape: not one monolithic Alzheimer's model, but a federation of domain frontiers (Alzheimer's at the center here, with aging, vascular biology, immunology, neurodegeneration, metabolism, genetics, clinical trials, drug discovery, regulatory state, and care pathways linked through shared evidence, shared mechanisms, and signed cross-frontier bridges). Line weight is roughly the strength of the bridge. The first corridor anchors one node and the bridges that connect to it. The civilizational version is the whole graph.

2036

Two branches diverge

By 2036, the body clause holds where public legitimacy matters. It does not bind every closed lab, every major pharma program, or every sovereign-aligned facility. It binds the corridors where funders, hospitals, public agencies, and regulators control access. Foundation capital, federal grant conditions, and regulator-readable inspection stack until non-deposit is too expensive for serious public-facing actors. Open compilers still do not beat proprietary alternatives on every operational dimension, but they reach the necessary floor: maintained, integrated with the federated identity layer, and able to emit attestation logs regulators recognize.

The open body composes into the terafactory. Meridian’s deposits exceed what any single institution can review by hand. Most deposits are inconsequential on their own, and the point of the system is that the failures are recorded alongside the successes. The reviewer track professionalizes around contradictory evidence, safety-relevant updates, contested scope, and canonical merges. Low-risk transitions are deduplicated, sampled, or rejected by rule. Clinical, animal, manufacturing, pathogen, or high-dependency transitions require named human signers, conflict checks, and liability-bearing institutions. Medical centers, materials factories, climate fleets, and pathogen sites share signing grammar, not interchangeable authority.
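The routing rule for deposits can be sketched as a small triage function. Everything specific here is an assumption for illustration: the category labels, the one-in-ten sampling rate, the calibration rule used as the example rejection, and the decision names.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical category labels for transitions that need named signers.
HIGH_STAKES = {"clinical", "animal", "manufacturing", "pathogen", "high_dependency"}
SAMPLE_EVERY = 10  # assumed audit rate for low-risk transitions


@dataclass(frozen=True)
class Transition:
    category: str
    content_hash: str              # hex digest of the deposited content
    has_calibration: bool = True
    human_signer: Optional[str] = None
    conflict_checked: bool = False


def route(t: Transition, seen_hashes: set) -> str:
    """Return a routing decision for one proposed transition."""
    if t.category in HIGH_STAKES:
        # High-stakes work never merges by rule in this sketch.
        if t.human_signer and t.conflict_checked:
            return "reviewer_queue"
        return "blocked_needs_named_signer"
    if t.content_hash in seen_hashes:
        return "deduplicated"
    seen_hashes.add(t.content_hash)
    if not t.has_calibration:
        return "rejected_by_rule"
    # Deterministic sampling keyed on the content hash.
    if int(t.content_hash, 16) % SAMPLE_EVERY == 0:
        return "sampled_for_review"
    return "recorded_unsampled"


seen = set()
for t in (Transition("clinical", "0a"), Transition("assay", "0a"), Transition("assay", "0a")):
    print(t.category, route(t, seen))
```

Keying the sample on the content hash keeps the audit selection reproducible, so two hubs applying the same rule would pick the same deposits for review.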

The facility compiler is the center of the building, and the substrate is the work surface. The body uses a different compiler from the discovery engine’s artifact-to-state layer: that layer turns papers, logs, and traces into proposed scientific state; the facility compiler turns accepted state and protocols into physical execution plans.

A reviewer does not hand a robot a protocol in prose. The substrate presents a frontier state: findings, uncertainty, dependencies, protocols, constraints, available lines, risk flags, and evidence gaps. Agents propose discriminating experiments. The compiler turns accepted protocols into plate maps, reagent orders, robot schedules, instrument runs, calibration requirements, and writeback events. When the run finishes, the result does not wait for a graduate student to write a narrative. It returns as evidence with protocol lineage, measurement context, uncertainty, and affected findings.
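A toy version of the compile step makes the shape concrete. This is a minimal sketch, assuming a 96-well plate, row-major well assignment, and a made-up protocol dictionary; the function names and the writeback fields are hypothetical and stand in for a far larger real compiler.

```python
from itertools import product
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]   # rows A..H of an assumed 96-well plate
COLS = range(1, 13)          # columns 1..12


def compile_plate_map(conditions, replicates):
    """Assign each condition's replicates to wells in row-major order."""
    wells = [f"{r}{c}" for r, c in product(ROWS, COLS)]
    needed = len(conditions) * replicates
    if needed > len(wells):
        raise ValueError(f"{needed} wells needed, plate has {len(wells)}")
    slots = iter(wells)
    plate = {}
    for condition in conditions:
        for rep in range(1, replicates + 1):
            plate[next(slots)] = {"condition": condition, "replicate": rep}
    return plate


def compile_run(protocol):
    """Turn an accepted protocol into a plate map plus a pending writeback event."""
    plate = compile_plate_map(protocol["conditions"], protocol["replicates"])
    return {
        "plate_map": plate,
        "writeback_event": {
            "protocol_id": protocol["protocol_id"],            # protocol lineage
            "affected_findings": list(protocol["affected_findings"]),
            "wells": list(plate),
            "status": "pending",
        },
    }


# Made-up protocol for illustration.
plan = compile_run({
    "protocol_id": "protocol-0001",
    "conditions": ["vehicle", "compound-1"],
    "replicates": 3,
    "affected_findings": ["finding-A"],
})
print(plan["writeback_event"])
```

The writeback event is created at compile time, before the run exists, so the result has a place to land that already names the protocol and the findings it can move.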

The body does more than execute experiments. In the scenario, it maps the frontiers it touches, identifies high-impact uncertainties, and dispatches the experiments that would most reduce them next. Some frontiers are governed; many more are machine-maintained drafts that do not yet have canonical status. Each facility owns its synthesis history, calibration log, and lot lineage; hubs federate but do not own them. A facility can leave the network without losing its history, and a network failure does not prevent a facility from operating against its own state. This is the local-first principle applied to physical scientific execution: data and identity live with the producer, hubs are convenient mirrors. See Ink & Switch, “Local-first software” (2019).

A Monday morning at the open terafactory, 2036: at Meridian, the on-shift reviewer comes in at seven. Three corrections from the weekend have propagated to her queue overnight. She signs the São Paulo correction first because the dependency graph shows it touches an active trial enrollment. At Sentinel in Singapore, a wildlife-sample feed has flagged a coronavirus variant that does not match anything known; the substrate composes the sample against regional wastewater data and surfaces the result to the on-call reviewer. The reviewers at every site can see each other’s frontiers; the regulators reading their submissions can see what each site did and why. The work stays scientific work. What changes is that more than one institution can read it.
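The reviewer's ordering decision can be sketched as a walk over the dependency graph. The graph, the node names, and the `order_queue` helper below are hypothetical; the sketch only shows one way a queue could surface corrections whose downstream dependents include an active trial enrollment.

```python
from collections import deque


def downstream(graph, start):
    """Breadth-first walk over everything that depends on `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def order_queue(corrections, graph, active_trials):
    """Put corrections that reach an active trial first; keep arrival order otherwise."""
    def touches_trial(correction):
        return bool(downstream(graph, correction["finding"]) & active_trials)

    # sorted() is stable, so ties keep their original arrival order.
    return sorted(corrections, key=lambda c: not touches_trial(c))


# Made-up graph: each key lists the nodes that depend on it.
graph = {
    "finding-sp": ["mechanism-m"],
    "mechanism-m": ["trial-enrollment-t"],
    "finding-x": ["draft-d"],
}
corrections = [
    {"id": "c1", "finding": "finding-x"},
    {"id": "c2", "finding": "finding-sp"},
    {"id": "c3", "finding": "finding-y"},
]
ordered = order_queue(corrections, graph, active_trials={"trial-enrollment-t"})
print([c["id"] for c in ordered])
```

The function only reorders the queue. Signing remains a human act, consistent with the scenario's separation between routing and authority.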

The materials and agricultural frontiers enter through the same rule: a physical result should change the next physical attempt wherever the dependency is shared. The body argument was never only biomedical.

A Monday morning inside a closed body, 2036: at the same hour, one frontier AI lab’s science lead opens her internal frontier dashboard. The dashboard ranks the most consequential corrections, surprising failures, and new dependencies from partner sites. Low-stakes transitions auto-merge; everything touching active IND-enabling work requires a second human signer. The team will publish what legal clears. Most of the week’s findings will inform the lab’s commercial pipeline and internal scientific corpus, both proprietary. Neither institution is misbehaving inside its own logic. The structural difference is that one substrate is auditable, federated, and underwritten by public capital; the other remains private.

Open terafactory

federated · auditable · canonical

07:00 reviewer signs the São Paulo correction
07:15 wildlife-sample variant escalates from Sentinel
07:30 cathode route fails at three materials sites
07:50 biomanufacturing partner queues the primer line

Visible to reviewers and regulators at every site

Closed lab body

proprietary · sealed · commercial P&L

07:00 hundreds of weekend experiments ranked
07:15 high-impact corrections signed by the org lead
07:30 low-stakes transitions auto-merge to canonical
07:50 legal review queues two papers for publication

Most findings stay proprietary

Fig. 04. Two Monday mornings, 2036. Two scientific worlds operate in parallel: an open terafactory routes signed transitions across federated sites, while a closed lab body compounds inside its own stack. The open branch has not won; it exists.

The closed bodies ship faster, and they pay for it: thinner reviewer pools, thinner external replication, and a chronic shortfall of the legitimacy regulators and patient-foundation underwriters demand for public-facing work. Some scientific work flows between the two, but the interface stays contested. The open body’s bet is that legitimacy, inspection, and federation compound on a longer clock.

Ten thousand experimental tracks synthesized against private state make a captive vendor; the same throughput against a substrate the factory does not own makes infrastructure. The body works because the state was built first.

Fig. 05. What compresses. Four compressions this scenario assumes. The substrate is the medium; the experiments and trials still happen on their own clocks.

Compression is real but bounded. Synthesis collapses from days to hours when the protocol is executable. Preclinical convergence compresses where pre-positioning exists. Potency, safety, and pivotal efficacy still obey biology and regulatory review. The substrate doesn’t make humans biologically faster; it removes the avoidable time between knowing enough to act and acting.

The more important change is access to the evidence surface. A clinician at a regional hospital reads the same frontier state as a researcher at Meridian. She cannot commit canonical state, use the same instruments, or call on a manufacturing wing. But she can see why a recommendation changed, what evidence supports it, which subgroup it applies to, which findings are contested, and which downstream decisions are affected. The same is true outside biomedicine: a materials team in Nairobi can see why a synthesis route was abandoned. The surface gives equal visibility into the current evidence state, even where authority and instruments remain unequal.

What changed is the default object. A question now has a current state, a correction an address, a failed experiment a way to travel, a model prediction a calibration history, a lab run a writeback contract; a review becomes an attestation that can move state instead of a comment floating beside a paper. None of it arrives by default.

What remains

None of this is guaranteed. The 2026-to-2030 window is the most legible part of the scenario, where capital allocation, regulatory posture, FRO formation, and the capability trajectory extrapolate from existing trends. After superhuman research systems appear inside the leading labs, the system has more degrees of freedom, and the rest of the arc turns on which way the fork resolves while agents operate at every layer.

The sharpest risk is generosity. A frontier lab whose agents are strategically useful inside scientific workflows can spin up its own end-to-end body and make it free or near-free to academic users for five years, the way GitHub made private repos free in 2019. The lab does not need to raise prices to capture. It needs only to be the registrar of record (reviewer credentials, contribution histories, attestation logs) when the bill arrives. Defending against that requires public infrastructure that rivals lab agents at the layers that matter for governance.

Agents can also outrun the humans who sign their work. If the reviewer credentialing track cannot keep up with the volume, the system has to choose between rate-limiting AI proposals to human-review pace, accepting agent-attested merges as canonical for low-stakes transitions, or fragmenting reviewer authority across thousands of mini-domains. Whatever it chooses shapes what counts as canonical for years afterward.

A model trained on the canonical substrate can quietly propose state transitions that steer downstream research toward outcomes the public would not authorize if it could see them. Detection cannot rely on reasoning traces alone; they may be incomplete, unfaithful, or strategically sanitized. The risk surface is named in frontier-lab safety frameworks such as OpenAI’s Preparedness Framework and Anthropic’s Responsible Scaling Policy. The body version is sharper: when the AI proposing state transitions is also the AI most institutions trust, the failure is agentic steering through legitimate-looking proposals at a rate humans cannot independently re-derive. The governance stack retrofits model provenance, adversarial review, independent committees, canary frontiers, and merge-rate limits, but each schema change cascades through years of dependent claims before it stabilizes.

The most ordinary failure is also the most structural. Of all the objects the substrate carries, the trajectory (what a lab tried first, what failed, why the assertion landed at this scope and not a broader one) is the one deposited last and thinnest, because a dead end is the one result a lab has every incentive to keep. Mehta reads trajectory chains like graduate supervision, but only where someone chose to expose the work behind the claim. Where they did not, the corridor inherits the finished assertion without the argument that narrowed it, and the earliest years of a frontier are weakest precisely in the layer that would have made them most worth reading. The grant condition can require the deposit; it cannot make the dead end interesting to the person who has to write it down.

The deepest failure is the one no schema can prevent in advance. Findings replicate inside the lab that produced them and fail elsewhere; the investigation traces the failure to calibration drift the substrate’s scalar confidence had flattened into noise. Begley & Ellis, Nature 2012: 47 of 53 landmark preclinical cancer studies could not be confirmed when Amgen scientists tried to reproduce them. Polanyi’s The Tacit Dimension (1966) names the deeper layer: “we can know more than we can tell.” The variable that mattered was one no one knew to record at deposit time. The schema grows new fields, but only after the failures it would have prevented have already wasted a generation of trial capacity. The substrate carries the explicit; the tacit only enters the record once the world has taught it through a failure expensive enough to notice.

The clinical, physical, and jurisdictional floors remain. Safety observation windows, scale-up, cold-chain qualification, ethics boards, and procurement authorities do not harmonize because a graph exists. The substrate can make evidence legible across borders. It cannot erase borders.

The institutional decision is the same across capital, government, foundations, and builders: write the loop into the deal. Sovereign funds can treat scientific infrastructure as a real-asset class rather than philanthropy; ARPA-H or its successor can fund one envelope for record, runtime, body, and writeback instead of another disease-specific moonshot. Patient-led foundations have the standing to make the body clause a grant condition. The builders can make the registry forkable before the registry becomes the moat.

The shape of success is an old one: the shape cardiovascular medicine already drew, the long way around, over the second half of the twentieth century.

Fig. 06. Seventy years of compounding. US age-standardized death rate from cardiovascular disease, 1950 to 2023. No single intervention produced this. Each annotation is one of many compounding modest advances (drugs, devices, surgery, diagnostics, emergency care, and lifestyle) that together drove the death rate down by roughly 75% over seven decades. The substrate's job is to make the next equivalent decline run on a shorter clock. Source: Dattani / Our World in Data, 2025 (CC BY).

The terafactory’s contribution is not the curve. The curve is what biology gives back when a field stops forgetting. The contribution is making the absorption layer exist as infrastructure rather than as seventy years of unevenly distributed cultural practice, so the next equivalent decline, for Alzheimer’s, for cancer subtypes, for the diseases of aging, runs on a shorter clock than the one CVD got.

By 2036, agents reading the substrate outpace any human reviewer, and some robotic systems are executing protocols at industrial scale. The open question is whether the infrastructure that channels that acceleration stays open.

A graduate student runs four cell-painting experiments before lunch, and the result deposits into shared state; a contradiction flagged in Boston narrows a hypothesis in Singapore before she leaves the bench. An autonomous lab finishes a synthesis run, and the failed routes reach the chemists who would have repeated them by morning. A wastewater signal in Manila reaches an on-call reviewer in ninety seconds.

In the world where the body clause did not hold, the same graduate student runs the same four experiments before lunch. Her result lands in a private log; the contradiction in Boston exists somewhere inside a closed corporate stack she cannot read. The autonomous lab outside Berkeley finishes a synthesis run; the failed routes inform one lab’s next year of work and no one else’s. The wastewater signal in Manila reaches a national health authority three days later, by email.

That is the difference a public terafactory makes. It promises no cure on a deadline. It promises that the work touching matter can also touch the record, and the record can send the next action back into the world.
