Insights

Field notes from the evidence pipeline.

Clinical accreditation is being reshaped by forces that look different from the position of the evidence pipeline than they do from the position of any single program, accreditor, or device review. This series writes from that pipeline-side view. The lead piece takes on where AI-use evidence becomes visible. The briefs that follow take up the visibility gap between cycles, the voluntary-standards erosion question, the cost of manual chart abstraction, and what changes architecturally when monitoring goes continuous. None of these forces are the pipeline's to decide; it simply sees them clearly from where it sits. The four briefs describe what is visible from that position, written for readers who run programs, write Standards, or review applications.

Lead essay

Where AI-use evidence becomes visible.

A four-part argument about the structural position from which override patterns, threshold clusters, and AI-suggested-versus-final-reading deltas are observable, and why making them visible is a different act from deciding what they mean.

Editorial diagram with two stacked horizontal layers labeled in small caps, frameworks-level governance above, pre-market device review below, separated by an empty band annotated in the margin as in-practice use.

Two layers, one band in between

Two structural positions in the lifecycle of clinical AI are already well-occupied. Frameworks-level governance bodies set assurance expectations: what an organization is supposed to have in place before it deploys an AI tool. Pre-market regulators review devices before those devices reach a clinician. Between those two layers sits a third question that neither layer is positioned to answer: how the tool is being used after it is installed, inside a specific program, on the actual case mix that walks through the door.

Editorial small-multiples panel of three side-by-side hairline charts, an override-rate trend, a threshold-cluster scatter, and an AI-suggested-versus-final-reading delta histogram.

What in-practice use looks like

In-practice use is observable. When a clinician overrides an AI-suggested measurement, that override is a trace. When thresholds cluster at a particular value rather than the one the device was cleared at, that is a trace. When AI-suggested values and final human readings diverge in one direction across thousands of cases, that is a trace. These are not abstractions. They are events that get logged by the same clinical systems that already log the rest of the procedure.

Override patterns are concrete evidence of how the tool actually lands in care.
Editorial synthesis spread with a hairline pipeline diagram running from clinical systems on the left through a labeled compliance evaluation node to a tabular pattern register on the right.

Why the same pipeline

The evidence pipeline that supports compliance evaluation reads the same clinical records. It pulls structured measurements. It records which physician finalized the report. It captures equipment service events. The cost of also surfacing override and threshold patterns is incremental, not architectural. The substrate already has the rails; the patterns ride on them. That is why these patterns are visible from this position, even though answering what they mean is a different question.
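A minimal sketch of what "incremental, not architectural" means here, assuming the pipeline already yields one record per case with the AI-suggested value, the finalized human reading, and the threshold the tool was operated at. The field names and the comparison against a single cleared threshold are illustrative, not a description of any specific device or rule pack.

```python
from statistics import mean

def surface_patterns(cases: list[dict], cleared_threshold: float) -> dict:
    """Aggregate the three traces named above: overrides, threshold
    clustering, and suggested-versus-final deltas."""
    if not cases:
        return {"cases": 0}
    overrides = [c for c in cases if c["final_value"] != c["ai_suggested_value"]]
    deltas = [c["final_value"] - c["ai_suggested_value"] for c in cases]
    off_cleared = [
        c for c in cases
        if abs(c["threshold_in_use"] - cleared_threshold) > 1e-9
    ]
    return {
        "cases": len(cases),
        "override_rate": len(overrides) / len(cases),
        "mean_suggested_vs_final_delta": mean(deltas),
        "off_cleared_threshold_rate": len(off_cleared) / len(cases),
    }
```

The aggregation is the cheap part; every input it consumes is an event the clinical systems already log.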

Editorial brief cover with a tabular pattern register at the top and a teal hairline arc passing the patterns out toward an unlabeled standards-body silhouette in the margin.

Visible here, decided elsewhere

The substrate makes the patterns visible. It does not decide what they mean. Whether a clustering of overrides is a calibration signal, a workflow issue, a sign that a tool is not fit for that case mix, or simply how clinical judgment is supposed to interact with an automated suggestion, those are questions for the bodies that already define what good practice looks like, written into the Standards that programs are accredited against. The pipeline is descriptive infrastructure. The reading is theirs.

Descriptive infrastructure, not prescriptive judgment.

Brief 2

The visibility gap between cycles.

Most modality-level accreditation runs on a three-year cycle. A program submits an application, a sampled set of cases gets reviewed, and a decision lands. Between that decision and the next renewal, the program is on its own. Reports get written. Equipment drifts out of calibration. New physicians join and old ones leave. None of it surfaces to the accreditor until the next application window opens, often two or three years later, when gaps have to be reconstructed from records that were not collected for that purpose.

The shape of the problem is structural, not procedural. The Standards have not failed. The application process has not failed. The window between applications is simply not instrumented. Programs that are doing well between cycles have no way to demonstrate it. Programs that are quietly accumulating gaps have no way to see them. The renewal is the first time anyone looks, and by then the gap has been accumulating for years.

There is published evidence that closing the gap is tractable. A study in the Journal of Nuclear Medicine (2018) compared facilities that used a publicly available quality-improvement tool published by their accrediting body with facilities that did not. Facilities that used the tool had a deficiency rate of 49.1 percent at application time; facilities that did not had a deficiency rate of 73.0 percent. That is a gap of nearly 24 percentage points, on the same Standards, with the same reviewers, separated only by whether the program had been watching its own evidence between cycles.

The intervention in that study was a paper tool, a checklist a program could work through on its own. The same logic applied to live clinical data, pulled continuously from the systems where it already exists, produces the same effect at much lower marginal cost. The accreditor still reviews. Peer reviewers still exercise judgment. Sampled cases are still sampled. What changes is that the program no longer arrives at its renewal blind to its own posture.

How the substrate compiles evidence between cycles →

Brief 3

The voluntary-standards erosion question.

Not every accreditation program is a regulatory requirement. A meaningful share of modality-level accreditation is voluntary, a quality signal a program elects to carry because peer review, standards from the societies that write them, and publicly recognized seals add up to a credible mark of clinical seriousness. In at least one major modality-level accreditor's portfolio, six of ten programs are voluntary. The voluntary base is not a side market. It is the base.

Voluntary participation is sensitive to friction. When the cost of compiling an application rises (more documentation, more abstraction, more re-submissions on technicalities), the marginal program decides the seal is not worth the effort and lets accreditation lapse. The Standards did not get harder. The act of demonstrating compliance did. The erosion happens at the edge first: smaller programs, single-modality practices, facilities without dedicated quality staff. They are also the programs that benefit most from external review.

Lowering compilation cost is therefore a structural question, not a convenience one. If the evidence required for an application can be assembled from clinical systems rather than from a months-long manual abstraction effort, the friction tax drops and the voluntary base holds. The Standards stay where they are. Peer review stays where it is. What changes is whether a small program can afford to participate at all.

How the substrate reduces compilation cost →

Brief 4

What manual abstraction costs the field.

Chart abstraction is the unglamorous backbone of nearly every quality program in healthcare. A trained abstractor opens a record, reads the report, decides whether the indication on the order matches the procedure performed, checks the impression against the findings, notes whether the report addressed the questions raised by the referring clinician, and logs the result against a measure definition. Published estimates for this kind of work typically run on the order of 55 to 75 minutes per case for mature measure sets, with substantial variation by case complexity, measure breadth, and abstractor training.

Multiply that by application sample sizes (three to ten cases per application for many modality-level programs, hundreds to thousands of cases for registry-grade work) and the labor model becomes the binding constraint on how much evidence a program can afford to compile. The cost is paid twice: once by the program, in staff hours that might otherwise go to clinical care, and once by the field, in measures that get narrower, sample sizes that get smaller, and quality questions that go unasked because the labor to answer them does not exist.
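A rough illustration, using the midpoint of the published range: at 65 minutes per case, a ten-case application represents roughly 11 hours of abstraction, and a 1,000-case registry submission represents more than 1,000 hours, on the order of half a full-time abstractor's year.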

FHIR-native ingest changes the binding constraint. When indications, impressions, finalized reports, signing physicians, and equipment service events are pulled directly from clinical systems as structured data, the abstraction step does not disappear (clinical nuance still requires human judgment on sampled cases), but the completeness check, the volume tally, the credential currency check, and the timeliness measurement all become computed rather than abstracted. The expensive labor gets re-routed to the work it was meant to do.
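What "computed rather than abstracted" looks like, as a minimal sketch: the checks below assume FHIR R4 DiagnosticReport resources already fetched as plain dicts. The field names follow the R4 resource; the specific checks and the 24-hour turnaround threshold are illustrative, not any accreditor's rule pack.

```python
from datetime import datetime

def parse_ts(value: str) -> datetime:
    """Parse a FHIR instant/dateTime string, e.g. 2026-01-05T14:30:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def completeness_and_timeliness(report: dict, max_hours: float = 24.0) -> dict:
    """Compute the checks a manual abstractor would otherwise perform."""
    issues = []
    if report.get("status") != "final":
        issues.append("report not finalized")
    if not report.get("conclusion"):
        issues.append("missing impression/conclusion")
    if not report.get("resultsInterpreter"):
        issues.append("no signing physician recorded")
    if not report.get("basedOn"):
        issues.append("no order/indication linked")

    turnaround_hours = None
    if report.get("effectiveDateTime") and report.get("issued"):
        delta = parse_ts(report["issued"]) - parse_ts(report["effectiveDateTime"])
        turnaround_hours = delta.total_seconds() / 3600.0

    return {
        "report_id": report.get("id"),
        "complete": not issues,
        "issues": issues,
        "turnaround_hours": turnaround_hours,
        "timely": turnaround_hours is not None and turnaround_hours <= max_hours,
    }
```

None of this replaces the sampled-case review; it only removes the clerical portion of the abstractor's hour.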

How FHIR-native ingest changes the labor model →

Brief 5

The transition to continuous monitoring.

Continuous monitoring is sometimes framed as a software question. It is more usefully framed as an architectural question about what stays the same and what changes when evidence stops being compiled in batches and starts accumulating in flow.

What stays the same: the Standards themselves, written by the societies that have refined them for decades. Peer review, where reviewers exercise clinical judgment on cases that are too nuanced for any rule pack to evaluate. The sampling of cases for narrative review, because the right answer to questions about interpretive quality is still another physician reading the same images. The accreditation decision, which sits with the accrediting body, not the substrate.

What becomes possible: between-cycle visibility, so a program knows where it stands without waiting for the next application window. Pre-validated applications, so when the window does open, the completeness work is already done. Tracked remediation, so when a finding is surfaced, the closure of that finding is observable rather than asserted. Findings that cite the clause, the rule version, the metric, and the source record, so that when a program disputes a finding, the dispute can be resolved at the level of evidence rather than the level of memory.
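A minimal sketch of what a finding that cites the clause, the rule version, the metric, and the source record might carry; the field names and example values are illustrative, not a published schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Finding:
    """One evaluated finding, citable down to its evidence."""
    standard_clause: str                 # e.g. "4.2.b" in whichever Standards apply
    rule_version: str                    # version of the rule pack that evaluated it
    metric: str                          # the measure the finding is stated against
    observed_value: float
    expected_value: float
    source_record_ids: tuple[str, ...]   # clinical records the value was computed from
    evaluated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Hypothetical example: a turnaround finding a program could dispute by
# pointing at the cited records rather than at memory.
finding = Finding(
    standard_clause="4.2.b",
    rule_version="2026.1",
    metric="report-turnaround-hours",
    observed_value=31.5,
    expected_value=24.0,
    source_record_ids=("DiagnosticReport/123", "ServiceRequest/456"),
)
```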

The transition is not from human review to automated review. It is from periodic compilation to continuous compilation. The reviewers do not go away. The Standards do not move. What changes is that the evidence catches up with the care, instead of the care being reconstructed from records that were not collected for the purpose.

How continuous evaluation stays auditable →

How the briefs cluster

Four structural transitions, one substrate view.

The five pieces above are not five disconnected topics. Each tracks a structural transition the field is already in the middle of, and each is visible from the position of the evidence pipeline for the same reason: the pipeline already reads the data the transition is moving through.

  1. Visibility. The gap between cycles, and what published quality-improvement tools have already demonstrated about closing it.

  2. Labor. Manual chart abstraction as the binding constraint on measure breadth, and what structural ingest changes.

  3. Governance. The position from which in-practice AI use is visible, and the voluntary-standards erosion question.

  4. Architecture. What stays the same when monitoring goes continuous, and what becomes possible that was not possible before.

The thesis

The Standards are not the bottleneck. The evidence pipeline is.

Periodic surveys are the labor model the field has inherited. They were what was possible when chart abstraction took an hour per case, when clinical data lived in paper records, and when AI deployment in care was rare enough to be treated as an exception. None of those constraints hold any more.

The pieces in this series describe what the substrate sees from where it sits. The decisions about what to do with what is visible, which patterns matter, which findings warrant remediation, which voluntary programs deserve protection, are not the substrate's to make. They belong to the bodies that already define what good practice looks like.

Figure 3.1, Where the substrate sees from

Editorial marginalia composition titled FOUR STRUCTURAL TRANSITIONS, with four labeled clusters arranged in a quadrant, visibility, labor, governance, architecture, connected by hairlines to a central node labeled evidence pipeline.

Figure 4.1

Four transitions, one substrate view

Editorial small-multiples figure titled FOUR TRANSITIONS, ONE SUBSTRATE VIEW, four side-by-side panels labeled Visibility, Labor, Governance, Architecture, each rendering a hairline trend line over a 2018-to-2026 x-axis with a muted-teal accent at the terminal point.
Each panel sketches one of the four transitions the series tracks. The diagonal is the same in each, from sampled batch compilation toward continuous flow, but each tells a different story about which forces move first.

Read more

Adjacent sections.

Use Cases

Pain-to-solution scenarios for the structural transitions described in this series, at the program and facility level.

Explore use cases →

Trust

Deterministic evaluation, engineered separation of PHI and PII, peer-reviewer tooling, and the audit-trail discipline that backs every finding in this series.

See the trust architecture →

About

An infrastructure company writing about the field it builds for. Self-funded, two-org structure, field experience deploying clinical systems at medical centers in Central Asia.

Read about Regain →

Talk to the people writing the briefs.

We will walk through how the substrate implements the structural arguments in this series, in the context of a specific program and a specific Standards framework.

Request a demo

Footnotes

  1. Lead essay framing, the two-layer model (frameworks-level governance bodies; pre-market device review) and the in-practice-use band between them, restates a structural argument developed in private briefings in Q2 2026. The categories are described by what they do, not by name.
  2. Brief 2 cites a study published in the Journal of Nuclear Medicine in 2018 comparing facilities that used a publicly available quality-improvement tool published by their accrediting body with facilities that did not. Reported deficiency rates: 49.1 percent among tool users versus 73.0 percent among non-users at application time.
  3. Brief 3 references a portfolio composition figure (six of ten programs voluntary) drawn from public materials of a major modality-level accreditor. The argument holds for any accreditor whose voluntary base is a meaningful share of the portfolio; the specific ratio is illustrative.
  4. Brief 4 chart-abstraction time estimates (55 to 75 minutes per case for mature measure sets) are composite figures from published registry and quality-measure operations literature. The range varies by case complexity, measure breadth, and abstractor training.
  5. Brief 5 distinguishes what stays the same (the Standards, peer review, sampled-case narrative review, the accreditation decision) from what becomes architecturally possible (between-cycle visibility, pre-validated applications, tracked remediation, clause- and version-cited findings) when compilation moves from batch to continuous.