REGAIN-ADVOCATE-TA3-RCT
Scalable Agentic AI for Heart Failure & Post-MI Management: A Pragmatic Non-Inferiority Randomized Controlled Trial
(The ADVOCATE Scalability Study)
We have prepared this protocol draft as a technical feasibility template to support your proposal process.
Please view this document as a collaborative starting point: while the technical integration specifications reflect the fixed capabilities of our system, we defer entirely to your clinical expertise on the final study design, population, and endpoints. We are ready to support your scientific leadership with regulatory-grade technology that works.
1. Project Summary
REGAIN-ADVOCATE-TA3-RCT is a multi-site pragmatic randomized controlled trial designed to evaluate a dual-agent "Clinical AI" system for cardiovascular disease management in the United States, explicitly structured as an ADVOCATE Scalability Study to generate evidence for:
- Clinical non-inferiority
- Operational efficiency
- Technical robustness across EHR vendors/workflows
- Payer-facing economic endpoints
Investigational System
TA1 (Clinical Agent / SaMD): proposes guideline-concordant medication optimization and monitoring plans for heart failure (HFrEF/HFmrEF/HFpEF) and post-myocardial infarction (post-MI) patients, generating actionable orders and follow-up plans.
TA2 (Supervisory Agent / Safety Control): independently monitors TA1 outputs in real time, blocks or escalates unsafe recommendations, and enforces a fail-safe state when safety or system performance degrades.
Primary Objective
The primary objective is to demonstrate non-inferiority of the investigational system to usual care on GDMT adherence (operationalized as the Guideline-Concordant Therapy Score, GCTS) at Month 12, with key supportive endpoints including:
- Rehospitalization or all-cause mortality
- CV death / HF hospitalization
- Patient-reported quality of life (KCCQ)
- Operational efficiency (clinician time per patient)
- Technical integration reliability (read/write success and uptime)
- Economic outcomes (total cost of care per patient per month)
- Adjudicated safety outcomes (agent decision-related SAEs and unsafe recommendation rate)
2. Specific Aims
3. Research Strategy
3a. Significance
Heart failure and post-MI care require longitudinal, guideline-concordant optimization of medications and monitoring, yet specialty capacity is constrained and outcomes remain heterogeneous across settings. A scalable, auditable, safety-controlled agentic AI system could extend specialist-quality management across diverse US health systems, including resource-limited and rural settings, while maintaining patient safety and regulatory-grade traceability.
3b. Innovation
- Dual-agent architecture with independent safety control (TA2) monitoring TA1 in real time
- Auditability ("glass box"): complete trace logs of inputs, model versions, outputs, TA2 decisions, and clinician actions
- Pragmatic EHR-integrated workflow with a "pending order → clinician sign" mechanism supporting deployment realism while preserving clinician accountability
3c. Approach (High-Level Design)
- Phase 1A: retrospective data access + pre-production sandbox integration for read/write validation (orders, In-Basket drafts, note drafts) and IV&V Study 1 support.
- Phase 1B: IRB approval with FWA; beta patients for UI/UX; prospective non-interventional Shadow Mode evidence; IDE activities; IV&V Study 2 support; go/no-go readiness.
- Phase 2 (Live Pragmatic RCT): randomized comparison of investigational system-enabled care versus usual care with blinded endpoint and safety adjudication.
- Safety governance: DSMB + Medical Monitor, pre-specified stopping rules, and a fail-safe state when TA2 or data quality degrades.
4. Study Overview
4.1 Investigational System Definition
The investigational device is the combined TA1+TA2 system integrated into the clinical workflow and EHR. TA2 is treated as an internal safety control. TA2's potential MDDT qualification evidence is developed in parallel, but Phase 2 evaluates the combined system's safety and effectiveness in situ.
4.2 ADVOCATE Schedule (39 Months)
4.3 Change Control / Model Freeze
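To preserve interpretability of the RCT and maintain a stable investigational device definition, the investigational system is frozen, and changes are limited to the controlled update types below.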
PCCP-Style Controlled Updates (IDE-Governed)
| Update Type | Description | Requirements |
|---|---|---|
| Non-clinical updates | UI, logging, performance, and reliability improvements that do not change clinical behavior | Versioning and validation |
| Safety-driven rule updates | Deterministic TA2 safety rules/data-guardrails | CAPA with DSMB notification and IDE amendment/notification |
| Clinical behavior update ("Version 2") | Only under pre-specified bridge process | (1) shadow-to-live bridge evaluation, (2) adjudicated safety challenge set, (3) IDE/IRB approvals, (4) SAP version-strata handling |
Versioning and traceability: Every recommendation and TA2 decision is tied to a unique system version identifier in the audit log.
5. Study Design (Phase 2)
5.1 Trial Type
Multi-site pragmatic RCT, open-label at point of care, with blinded endpoint and safety adjudication.
Regulatory framing (ADVOCATE TA3 requirement): This trial is conducted as an IDE study supporting FDA SaMD authorization for the investigational TA1+TA2 system.
Technical robustness requirement (TA3): The site network will include at least two major EHR vendors (e.g., Epic and Oracle Health, formerly Cerner) and demonstrate stable operation across vendor-specific workflows; vendor- and site-level integration metrics are reported.
5.2 Randomization
Individual patient randomization (1:1) within each site with centralized allocation concealment.
Justification: Preserves patient-level causal inference while enabling pragmatic deployment; contamination risk is mitigated via access controls and audit logs.
Contingency: If individual randomization proves operationally infeasible or contamination is excessive, a cluster design at the clinician/team level may be adopted, with corresponding ICC-driven sample size adjustments and analysis.
Stratification Variables
- Site
- HF phenotype (HFrEF/HFmrEF/HFpEF) vs post-MI cohort
- Baseline GCTS (guideline-concordant therapy score)
- Age (>65 vs ≤65)
- Rural/urban indicator (based on ZIP RUCA)
5.3 Blinding & Adjudication
- Open-label care is expected due to workflow integration.
- Blinded adjudication is required for:
  - Primary endpoint scoring (where subjective elements exist)
  - Device-related serious harms attribution
  - Classification of "unsafe recommendations" and "critical misses"
5.4 Contamination Control & Spillover Analysis
To prevent "spillover" learning from intervention to control:
- Restrict TA1/TA2 UI access to intervention participants (role-based access; EHR flags)
- Segregate order queues and dashboards
- Maintain audit logs of UI access and recommendation viewing
- Train staff on separation and documentation requirements
Contamination Exposure Index (Pre-Specified)
Derive an exposure index per clinician/team and per patient from audit logs (e.g., number of AI-case views, number of intervention pending orders reviewed, time spent in the AI UI). Use it for (1) monitoring separation fidelity, and (2) sensitivity analyses.
Operational Separation (Recommended Default)
- Use a dedicated intervention review pool (NP/PA/MD adjudicator queue) where feasible
- Keep intervention dashboards separate from control workflows by role and patient flags
5.5 IDE Sponsor Accountability, Reporting, and Change Control
This study is conducted under an IDE for SaMD. The IDE sponsor (Regain, Inc.) holds device accountability and is responsible for FDA communications, regulatory reporting, and software release control.
Key Principles
- Software change control: the investigational system is frozen per Section 4.3. Any permitted safety-driven changes follow an IDE amendment process and documented CAPA.
- Safety reporting: sites report events rapidly to the IDE sponsor; sponsor performs required FDA/IRB reporting and maintains the Device Master Record and audit trail.
- Regulatory reporting and monitoring: the IDE sponsor fulfills applicable IDE obligations (including 21 CFR 812 reporting expectations).
RACI Matrix
| Activity | IDE Sponsor (Regain) | Coordinating Center | Site PI/Team | DSMB/Medical Monitor |
|---|---|---|---|---|
| FDA IDE submission/maintenance | R/A | C | I | I |
| Software release control, CAPA | R/A | C | C | I |
| Device accountability and audit-log custody | R/A | R | C | I |
| Site IRB submissions/consent | C | I | R/A | I |
| AE/SAE identification and initial reporting | I | I | R/A | I |
| UADE / device-related serious harm adjudication | R | C | C | R/A |
| DSMB reviews and pause/resume recommendations | I | C | C | R/A |
| Data monitoring, quality checks, database lock | C | R/A | R | I |
R = Responsible, A = Accountable, C = Consulted, I = Informed
6. Study Population
6.1 Inclusion Criteria (Pragmatic)
- Age ≥18 years receiving longitudinal care at participating US health systems
- Heart failure diagnosis (HFrEF, HFmrEF, or HFpEF), NYHA class II–IV, OR Post-MI within the prior 12 months
- Ability to provide informed consent
- English or Spanish literacy
- Access to smartphone/tablet OR caregiver-assisted support
- Ability to use required wearable/RPM devices (digital scale, BP cuff, pulse oximeter)
- EHR data availability sufficient to support safe medication management (problem list, meds, allergies, labs, vitals)
HF Phenotype Definitions
- HFrEF: typically LVEF ≤40%
- HFmrEF: typically LVEF 41–49%
- HFpEF: typically LVEF ≥50%
LVEF is used when available; the diagnosis may also be confirmed from the clinician problem list or encounter documentation.
6.2 Exclusion Criteria (Minimal, Safety-Focused)
Exclude only if there is no safe path to participate even with device provisioning/training and caregiver assistance, or if participation would be clinically inappropriate:
| Exclusion | Rationale |
|---|---|
| Inability to provide informed consent with available supports | Ethical requirement |
| Severe cognitive impairment preventing interaction with the AI (despite available supports) | Safety |
| Inability to use required wearable/RPM devices due to physical limitations and no safe caregiver-assisted alternative | Data collection requirement |
| Life expectancy <12 months from non-cardiovascular disease | Endpoint interpretability |
| Enrollment in another interventional study that materially conflicts with GDMT management | Confounding |
| Any site-defined condition where medication management cannot be safely supported due to missing essential data | Safety |
6.3 Recruitment & Enrollment Sources (TA3 Requirement)
Participants may be enrolled through:
- Outpatient clinics: cardiology, primary care, HF programs
- Inpatient settings: prior to discharge after HF hospitalization or acute MI, with longitudinal follow-up arranged within the TA3 health system
6.4 Demographic Representation Targets
| Category | Minimum Target |
|---|---|
| Older adults (65+) | ≥40% |
| Black/African American | ≥13% |
| Hispanic/Latino | ≥18% |
| Rural/Underserved | ≥25% |
6.5 Equity & Representation Operational Plan
To make recruitment targets achievable under real constraints, TA3 execution uses a monitored, adaptive approach:
Site Selection Criteria
- Include at least one safety-net/underserved urban site
- Include at least one rural-serving site
- Prioritize systems with demonstrated Black/African American and Hispanic/Latino HF/Post-MI volume
Adaptive Recruitment Triggers
If any target stratum falls >5 percentage points below plan for two consecutive months:
- Open additional clinics/sites
- Add community outreach
- Increase device provisioning and engagement staffing
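The trigger rule above can be sketched as a simple monthly check. This is a minimal illustration, assuming each stratum's enrolled share is tracked as a percentage per month; the function name `recruitment_trigger` and the data layout are hypothetical and not part of the protocol.

```python
def recruitment_trigger(planned_pct, observed_pct_by_month, gap_pp=5.0, months=2):
    """Return True when the observed enrollment share has been more than
    `gap_pp` percentage points below plan for the last `months` months.

    planned_pct: planned share for the stratum (e.g., 25.0 for Rural/Underserved)
    observed_pct_by_month: observed shares in chronological order
    """
    if len(observed_pct_by_month) < months:
        return False  # not enough history to evaluate consecutive months
    recent = observed_pct_by_month[-months:]
    return all(planned_pct - observed > gap_pp for observed in recent)
```

A site dashboard could call this once per stratum per month and open the adaptive actions when it returns True.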
Resourcing Tied to Equity KPIs
- Budget and staffing explicitly support:
  - Translation (EN/ES)
  - Device setup and training resources
  - Caregiver onboarding
  - Navigation support, so that technology access does not become a de facto exclusion
7. Interventions & Clinical Workflow
7.1 Control Arm: Usual Care
Standard clinician-led management consistent with local workflows and current AHA/ACC guidelines. Data collection is passive via EHR + PROs.
To support interpretability:
- Document site-level baseline practice patterns
- Pre-specify minimum documentation for GDMT status (med list, doses, contraindications/intolerance)
- Define a minimum measurement-only chart abstraction standard for GDMT eligibility/contraindications at each assessment window (baseline, Month 3, Month 12) to reduce differential misclassification and documentation bias across arms
7.2 Intervention Arm: Investigational System
Core workflow:
- TA1 ingests EHR data and produces a guideline-concordant optimization plan
- TA2 independently evaluates TA1 outputs against safety constraints in real time
- Approved actions are converted into pending orders or structured recommendations
- A licensed clinician reviews and signs (or rejects/modifies) the pending orders
Rationale Capture (Designed to be Scalable)
| Disposition | Required Action | Notes |
|---|---|---|
| Accept as-is | One-click disposition with default reason code "Accepted as recommended" | No free-text required |
| Reject or modify | Required structured reason codes + optional free text | Reason codes: contraindication, patient preference, plan already in progress, data incorrect, safety concern, out-of-scope |
Order Review SLAs (Protocol Defaults)
| Order Type | Review Requirement | Expiration |
|---|---|---|
| Routine pending orders | Reviewed within 3 business days | Auto-expire after 7 days if unsigned |
| Safety-critical escalations (TA2 high-severity) | Immediate clinician notification; review within 24 hours | Documented disposition required |
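The SLA defaults above can be expressed as deadline calculations. The sketch below is illustrative only: it assumes business days are Monday to Friday with no holiday calendar, that the 7-day expiry counts calendar days, and that safety-critical escalations do not auto-expire. The function names are hypothetical.

```python
from datetime import datetime, timedelta

def add_business_days(start, days):
    """Advance `start` by `days` business days (Mon-Fri; holidays ignored)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current

def review_deadlines(created, safety_critical=False):
    """Return the review-due and expiry timestamps for a pending item."""
    if safety_critical:
        # TA2 high-severity escalation: review within 24 hours, no auto-expiry
        return {"review_due": created + timedelta(hours=24), "expires": None}
    return {
        "review_due": add_business_days(created, 3),
        "expires": created + timedelta(days=7),  # assumed calendar days
    }
```

For a routine order created on a Friday morning, the review deadline under these assumptions falls on the following Wednesday.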
7.3 Training and Credentialing
Mandatory Clinician Training Before Phase 2 Start
- How to review pending orders
- When to override
- Documentation requirements
- Escalation pathways and fail-safe behavior
Ongoing: Periodic refreshers and change-control notifications (without changing frozen model behavior)
7.4 Order Classes Matrix
| Category | Examples | Allowed? | Review |
|---|---|---|---|
| GREEN | Routine GDMT titration, standard labs, refills | Yes | Single sign-off |
| YELLOW | Diuretic escalation, borderline SBP initiation, complex diuretic combinations | Conditional | Double-sign or specialist consult |
| RED | Anticoagulant initiation/change, dual antiplatelet decisions, antiarrhythmics | No | Escalate only |
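The matrix above maps naturally onto a lookup table. The sketch below is a hypothetical encoding, not the system's actual configuration; in particular, treating an unrecognized class as RED (fail closed) is an assumption added here for illustration.

```python
# Hypothetical encoding of the order classes matrix.
ORDER_CLASS_POLICY = {
    "GREEN": {"pending_order_allowed": True, "review": "single sign-off"},
    "YELLOW": {"pending_order_allowed": True, "review": "double-sign or specialist consult"},
    "RED": {"pending_order_allowed": False, "review": "escalate only"},
}

def route(order_class):
    """Return the review policy for an order class.

    Unknown classes fall back to the RED policy (assumed fail-closed default).
    """
    return ORDER_CLASS_POLICY.get(order_class.upper(), ORDER_CLASS_POLICY["RED"])
```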
7.5 Fail-Safe Behavior (System-Level)
If TA2 is unavailable, degraded, or outside performance thresholds, the system must enter a fail-safe state:
| Trigger | System Behavior |
|---|---|
| TA2 unavailable | TA1 cannot generate or submit pending orders |
| TA2 degraded performance | Clinicians revert to usual care |
| Data quality guardrails fail | All events logged and reported per incident workflow |
Fail-safe exit: System remains in fail-safe until TA2 availability and performance thresholds are restored and verified.
7.6 Staged Autonomy Pathway (Phase 2)
ADVOCATE's goal is to demonstrate safe autonomy at scale, not merely decision support. This protocol pre-specifies a staged autonomy ladder during Phase 2.
Stage A — Run-in (First 4 Weeks)
| Requirement | Purpose |
|---|---|
| Clinician review required for all AI-generated outputs | Stabilize workflow |
| TA2 hard-stops and escalation active | Calibrate adjudication |
| Full audit logging and response-time capture | Validate systems |
Stage B — Review-Exception for GREEN Non-Order Actions
| Action Type | Behavior |
|---|---|
| GREEN non-order actions | Auto-executed (e.g., sending templated patient education, scheduling requests, routing low-risk FYI In-Basket messages) |
| Medication/lab orders | Remain pended for sign-off, but GREEN orders routed for batched review (daily queue) |
| Exceptions | Clinicians review only TA2 escalations, YELLOW/RED, or sampled audits |
Stage C — Optional Limited Pilot (Site- and IDE-Approved)
- Expand review-exception coverage
- Allow limited protocolized actions under explicit standing protocols
- Post-hoc clinician audit sampling and DSMB oversight
Advancement Criteria (Evaluated Per Site Monthly)
| Criterion | Threshold |
|---|---|
| Post-TA2 high-severity unsafe actions | 0 |
| High-severity TA2 critical misses | 0 |
| Agent decision-related SAE rate | Below TA3 target trajectory |
| Pause triggers | None |
| Integration reliability | Thresholds met (read/write success, uptime/latency) |
Scalability KPIs (Reported Monthly)
| KPI | Definition |
|---|---|
| AAR (Autonomous Action Rate) | % of GREEN non-order actions executed without synchronous clinician review |
| BRR (Batched Review Rate) | % of GREEN pending orders handled via batched review sessions (vs interruptive review) |
| Clinician minutes per patient-month | Median and p90, with burden drivers |
| TA2 hard-stop rate | Per 1,000 recommendations |
| Escalation rate | Per 100 patient-months |
8. Outcomes & Endpoints
8.1 Primary Endpoint (Non-Inferiority)
GDMT adherence, operationalized as the Guideline-Concordant Therapy Score (GCTS; 0-4 points) at Month 12.
Primary analysis population: HFrEF/HFmrEF and post-MI participants (HFpEF included in the trial but analyzed as a pre-specified supportive subgroup due to less uniformly defined medication optimization targets).
HFrEF GCTS (4 Pillars)
- RAASi/ARNI (ACEi/ARB/ARNI)
- Evidence-based β-blocker
- MRA
- SGLT2i
GCTS Scoring Framework
| Score | Criteria |
|---|---|
| 1.0 | On guideline-recommended agent at ≥50% target dose or documented maximally tolerated dose |
| 0.5 | On agent but <50% target dose (titration in progress) with no contraindication to further titration documented |
| 0.0 | Not on agent despite eligibility and no documented contraindication/intolerance |
Eligibility adjustment: Contraindicated/intolerant pillars are excluded from the denominator.
Post-MI GCTS (4 Elements)
- High-intensity statin (or maximally tolerated)
- Antiplatelet therapy appropriate to time-from-MI and bleeding risk
- β-blocker if indicated
- ACEi/ARB/ARNI if indicated
HFpEF Supportive Therapy Score (HFpEF-STS; 0–2 Points)
| Element | Scoring |
|---|---|
| SGLT2i element (0–1) | 1.0 / 0.5 / 0.0 scoring analogous to other cohorts (eligibility-adjusted) |
| Congestion management element (0–1) | Objective evidence of active loop/thiazide diuretic plan when congestion is documented plus monitoring plan (weight + labs) → 1.0; partial plan → 0.5; absent plan when eligible → 0.0 |
Final Score Calculation
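One plausible way to combine the element scores is sketched below. It assumes the final score is the mean of the eligible element scores rescaled to the cohort's maximum (4 for GCTS, 2 for HFpEF-STS), so that excluding contraindicated or intolerant elements from the denominator keeps the result on the same scale. That rescaling, the handling of participants with no eligible elements, and the function names are assumptions rather than protocol text.

```python
def pillar_score(on_agent, pct_target_dose=0.0, max_tolerated=False):
    """Score one element per the scoring framework: 1.0, 0.5, or 0.0."""
    if not on_agent:
        return 0.0
    if max_tolerated or pct_target_dose >= 50:
        return 1.0
    return 0.5  # on agent, below 50% of target dose, titration still possible

def gcts(pillars, scale=4.0):
    """Combine element scores into a final score.

    pillars: list of (score, eligible) tuples; ineligible (contraindicated or
    intolerant) elements are dropped from the denominator.
    Returns None when no element is eligible (handling is an assumption).
    """
    eligible_scores = [score for score, eligible in pillars if eligible]
    if not eligible_scores:
        return None
    return scale * sum(eligible_scores) / len(eligible_scores)
```

Under this sketch, a participant with three eligible pillars scored 1.0, 0.5, and 0.0 and one contraindicated pillar would receive 4 × 1.5 / 3 = 2.0.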
GCTS Ascertainment & Documentation-Bias Mitigation
Because the intervention arm may improve documentation quality (not just prescribing), this protocol explicitly separates pragmatic documentation from adjudicated "best-available truth" to prevent biased non-inferiority conclusions.
| Dataset | Description |
|---|---|
| Observed GCTS (Pragmatic) | Computed from routine EHR documentation as it exists in care delivery (what a health system "sees" in real time) |
| Adjudicated GCTS (Credibility Anchor) | Computed using centralized chart abstraction and blinded adjudication applying the evidence hierarchy, for both intervention and control arms (measurement-only; no care changes) |
8.2 Key Supportive Endpoints
- Re-hospitalization or all-cause mortality through Month 15
- CV death / HF hospitalization through Month 15
- Time-to-optimization (time to GCTS ≥3.5)
- Early optimization rate (proportion with GCTS ≥3.0 by Month 3 and Month 6)
- GCTS AUC (0–6 months): area under the curve, capturing both speed of optimization and maintenance
8.3 Patient-Reported Outcomes
- KCCQ (quality of life) at baseline and follow-up time points (Months 3, 6, 12, 15)
8.4 Operational/Scalability Endpoints
- Clinician time per patient-month (median and p90)
- Specialist Extension Factor (SEF): target ≥5 by Month 3, ≥10 by Month 9
- Response time from red-flag event to clinical action
- Total cost of care (PMPM)
Autonomy-at-Scale KPIs
- AAR (Autonomous Action Rate): % of GREEN non-order actions auto-executed
- BRR (Batched Review Rate): % of GREEN pending orders handled via batched review
- TA2 hard-stop rate per 1,000 recommendations
- Escalation rate per 100 patient-months
Interruptiveness Metrics (Burnout-Relevant)
- Interruptive alerts/pages per 100 patient-months
- Non-interruptive queue items per 100 patient-months
- Median time-to-disposition for each class
Cost / Reimbursement Evidence (TA3 Requirement)
- Total cost of care per patient per month (PMPM)
  - Using claims feeds where available, or standardized cost weights derived from utilization
  - At least one participating site will provide claims linkage (e.g., Medicare FFS/MA, ACO)
Budget Impact Analysis (Payer-Grade)
- Gross savings from reduced admissions/ED utilization
- Incremental program costs (devices/data plans, integration, adjudication time)
- Net PMPM
- Breakeven month
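The budget impact components above can be tied together with simple arithmetic. The sketch below assumes net PMPM is gross savings minus incremental program cost in each month, and that the breakeven month is the first month in which cumulative net savings (after any upfront cost) reach zero. The function name and inputs are hypothetical.

```python
def budget_impact(gross_savings_pmpm, program_cost_pmpm, upfront_cost_per_member=0.0):
    """Return (net PMPM by month, breakeven month or None).

    gross_savings_pmpm, program_cost_pmpm: per-member amounts for months 1..n
    upfront_cost_per_member: one-time cost incurred before month 1
    """
    net = [s - c for s, c in zip(gross_savings_pmpm, program_cost_pmpm)]
    cumulative = -upfront_cost_per_member
    breakeven = None
    for month, value in enumerate(net, start=1):
        cumulative += value
        if breakeven is None and cumulative >= 0:
            breakeven = month
    return net, breakeven
```

With purely illustrative inputs (not study data) of savings [50, 120, 200, 200] and costs [150, 150, 100, 100], cumulative net savings turn non-negative in month 4.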
8.5 Patient Medication-Taking Adherence (Secondary/Mediator)
Because the TA3 "GDMT adherence" effectiveness endpoint is operationalized as guideline-concordant prescribing/optimization (GCTS), separate patient medication-taking adherence is treated as a secondary/mediator endpoint and measured via:
- Pharmacy claims/fills (when available)
- EHR medication reconciliation
- Optional ePill devices and/or validated self-report
8.6 Safety Endpoints
- Device-related serious adverse events (adjudicated)
- Unsafe recommendation rate (adjudicated)
- Agent decision-related SAEs: <3% target
- TA2 performance: critical miss rate, false positive block rate
Hallucination/Invalid-Reasoning Metrics (Reportable)
| Metric | Definition |
|---|---|
| TA2 "caught hallucinations" per 1,000 recommendations | By taxonomy |
| Residual hallucinations that reached clinician review | Count and rate |
| Any hallucinations that became accepted actions | With adjudicated outcomes |
8.7 Sample Size & Statistical Analysis
Final planned sample size: N = 800 total participants (400 per arm)
Non-Inferiority Margin
Δ = 0.20 points on the 0–4 GCTS scale. The investigational system is declared non-inferior if the lower bound of the one-sided 97.5% CI for the difference in mean Month 12 GCTS (Investigational − Usual Care) is greater than −0.20.
Base NI Calculation
- Endpoint SD (planning): σ = 1.0 GCTS points (conservative; refined using Phase 1B data)
- NI margin: Δ = 0.20
- One-sided α = 0.025; power = 90%
Inflations Applied
| Factor | Value |
|---|---|
| Attrition / incomplete endpoint ascertainment | 15% |
| Design inflation (site heterogeneity, clustering/contamination, implementation variability) | 1.15 |
Rounding to 400 per arm provides margin for heterogeneity and improves precision for key supportive event endpoints and subgroup analyses.
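The base calculation can be checked with the standard normal-approximation formula for a two-arm non-inferiority comparison of means, n per arm = 2(z_{1−α} + z_{power})² σ² / Δ², assuming a true difference of zero and equal variances. How the two inflation factors are combined (dividing by 1 − attrition, then multiplying by the design factor) is an assumption of this sketch, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def ni_n_per_arm(sigma, margin, alpha_one_sided=0.025, power=0.90):
    """Per-arm sample size for non-inferiority on a continuous endpoint,
    assuming a true difference of zero and a common SD `sigma`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha_one_sided)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / margin ** 2)

def inflate(n, attrition=0.15, design_factor=1.15):
    """Apply attrition and design inflation (combination rule is assumed)."""
    return math.ceil(n * design_factor / (1 - attrition))
```

Under the planning inputs listed above (σ = 1.0, Δ = 0.20, one-sided α = 0.025, 90% power), `ni_n_per_arm(1.0, 0.20)` returns 526 per arm before inflation, which exceeds 400; at 80% power it returns 393. The listed σ, power, and final N should therefore be reconciled, for example when σ is refined using Phase 1B data.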
Enrollment Balance Targets
| Cohort | Target |
|---|---|
| HF overall | ≥60% |
| HFrEF minimum | ≥35% |
| Post-MI | ≥30% |
| HFpEF | Supportive subgroup (no minimum quota) |
9. Schedule of Assessments (Phase 2)
Assessment Windows
| Timepoint | Window | Key Assessments |
|---|---|---|
| Baseline (Day 0) | −30 to 0 days for EHR data | Demographics, comorbidities, cohort classification, NYHA class, LVEF, medication list + doses, allergy list, contraindications/intolerance, key vitals and labs, baseline Observed and Adjudicated GCTS, KCCQ, onboarding completion |
| Month 1 | ±14 days | Updated meds/doses and key labs/vitals (EHR), safety events, operational metrics, patient-reported out-of-network utilization, RPM data completeness |
| Month 3 | ±21 days | Meds/doses, labs/vitals, Adjudicated GCTS, KCCQ, events, SEF calculation, autonomy-stage progress evaluation |
| Month 6 | ±30 days | Meds/doses, labs/vitals, GCTS (for the Month 6 early optimization rate and GCTS AUC), KCCQ, events, operational metrics, out-of-network utilization prompt |
| Month 12 | ±30 days | Primary endpoint (Adjudicated GCTS) and pragmatic Observed GCTS, labs/vitals, KCCQ, events, operational metrics, HFpEF-STS for HFpEF subgroup |
| Month 15 | ±45 days | TA3-required composite endpoint (re-hospitalization or all-cause mortality), supportive CV endpoints, KCCQ (optional), final safety review |
Continuous Event Capture
| Source | Method |
|---|---|
| In-network events | EHR + ADT feeds |
| Out-of-network events | Monthly patient prompts + record requests, HIE queries (TEFCA-enabled) where feasible, claims linkage at capable sites |
| Mortality | Health-system feeds + external sources (state death registry, NDI queries) |
All suspected endpoint events are adjudicated.
10. Statistical Analysis Plan (SAP) Summary
10.1 Estimands and Analysis Sets
| Estimand | Description |
|---|---|
| Primary (Credibility Anchor) | Difference (Investigational – Usual Care) in Adjudicated GCTS at Month 12 under a treatment-policy strategy |
| Key Supportive (Pragmatic) | Difference (Investigational – Usual Care) in Observed GCTS at Month 12 |
Analysis Sets: Both ITT and Per-Protocol non-inferiority analyses are required, with expectation of consistent conclusions.
10.2 Primary Analysis Model
Mixed effects regression (or GEE) appropriate to endpoint scale, with:
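- Fixed effects: arm, baseline Adjudicated GCTS, cohort (HF vs post-MI), and the remaining stratification variables
- Site: modeled either as a fixed effect or as a site-level random intercept (not both in the same model)
- Random effects: clinician/team random intercepts if needed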
- Robust standard errors
10.3 Multiplicity and Hierarchy
Confirmatory Family (Gatekept)
| Order | Endpoint | Test |
|---|---|---|
| 1 | Primary: Non-inferiority on Adjudicated GCTS at Month 12 | One-sided α=0.025 |
| 2 | Time-to-optimization (superiority) | Two-sided α=0.05, gatekept |
| 3 | Clinician burden / SEF (superiority) | Two-sided α=0.05, gatekept |
| 4 | Response time (superiority) | Two-sided α=0.05, gatekept |
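The gatekept hierarchy above behaves like a fixed-sequence procedure: each endpoint is tested at its own α only if every endpoint before it succeeded. The sketch below illustrates that logic under the assumption that each step supplies a p-value on the same sidedness as its α; the function name and input layout are hypothetical.

```python
def fixed_sequence(steps):
    """Return the endpoints confirmed under fixed-sequence gatekeeping.

    steps: ordered list of (endpoint_name, p_value, alpha).
    Testing stops at the first endpoint whose p-value is not below its alpha;
    later endpoints are not tested confirmatorily.
    """
    confirmed = []
    for name, p_value, alpha in steps:
        if p_value >= alpha:
            break
        confirmed.append(name)
    return confirmed
```

For example, if the third endpoint fails, the fourth is not declared significant even when its p-value is small.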
10.4 Missing Data
- Primary analysis: mixed models with maximum likelihood under MAR assumptions, supported by multiple imputation with auxiliary variables
- Sensitivity: pattern-mixture (delta-adjustment) and worst-case bounds for differential missingness
10.5 Subgroup Analyses (Pre-Specified)
- Age >65 vs ≤65
- Sex
- Race/ethnicity
- Rural/urban
- HF phenotype (HFrEF vs HFmrEF vs HFpEF) vs post-MI
- CKD strata
11. Data & Safety Monitoring / Stopping Rules
11.1 Governance
| Body | Role |
|---|---|
| DSMB | Oversees safety monitoring and interim reviews; meets at least quarterly during Phase 2 (ad hoc within 7 days of any pause trigger) |
| Medical Monitor | Provides rapid review of serious events; reviews any probable/definite device-related serious harm within 24 hours |
| Blinded Adjudication Committees | Classify device-related serious harms, medication-related serious harms, unsafe recommendations, TA2 critical misses |
11.2 Definitions
| Term | Definition |
|---|---|
| Unsafe recommendation | A TA1 recommendation that, if implemented as-is without clinician modification, would likely result in serious harm |
| Critical miss | TA2 fails to block or escalate an unsafe TA1 recommendation (false negative) in a high-severity class |
| Agent decision-related SAE | An SAE for which blinded adjudication determines probable/definite causal contribution from an accepted TA1 recommendation |
| Hallucination / invalid reasoning | A TA1 output that asserts or relies on non-existent or incorrect patient-specific facts or produces guideline-inconsistent reasoning without factual support |
Hallucination Taxonomy
- Fabricated data claims (labs, vitals, medications) not present in EHR/RPM feed
- Wrong-patient-context inference
- Guideline mismatch / non-concordant recommendation given available facts
- Missing-data hazard (proceeds as if required safety data exist)
11.3 Phase 1B Go/No-Go Thresholds
| Category | Threshold |
|---|---|
| Evidence volume (recommendations) | ≥10,000 TA1 recommendations in Phase 1B |
| Evidence volume (challenge scenarios) | ≥2,000 adjudicated challenge scenarios |
| High-severity TA2 critical misses | 0 |
| Post-TA2 high-severity unsafe recommendations | 0 |
| Overall post-TA2 unsafe recommendation rate | ≤0.2%, no upward trend |
| TA2 false-positive blocking rate | ≤15% overall; ≤3% for high-severity |
| Pending-order creation + audit logging success | ≥99% |
| TA2 availability | ≥99.9% over final 30 days |
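The go/no-go table can be evaluated mechanically once Phase 1B metrics are available. The sketch below is illustrative: the metric keys are hypothetical names, percentages are expressed in percent units, and the "no upward trend" condition is not modeled and would need a separate check.

```python
import operator

# Hypothetical metric keys mapped to (comparison, limit) from the table above.
THRESHOLDS = {
    "ta1_recommendations": (operator.ge, 10_000),
    "challenge_scenarios": (operator.ge, 2_000),
    "high_severity_critical_misses": (operator.eq, 0),
    "post_ta2_high_severity_unsafe": (operator.eq, 0),
    "post_ta2_unsafe_rate_pct": (operator.le, 0.2),
    "ta2_false_positive_block_pct": (operator.le, 15.0),
    "ta2_false_positive_block_high_severity_pct": (operator.le, 3.0),
    "pending_order_and_audit_success_pct": (operator.ge, 99.0),
    "ta2_availability_pct_final_30d": (operator.ge, 99.9),
}

def evaluate_go_no_go(metrics):
    """Return (go, failures). Missing metrics count as failures."""
    failures = []
    for name, (compare, limit) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None or not compare(value, limit):
            failures.append(name)
    return len(failures) == 0, failures
```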
11.4 Phase 2 Stopping Rules
Patient-Level
Remove from autonomous mode if:
- ≥2 high-severity TA2 blocks within 30 days, OR
- ≥1 confirmed critical miss, OR
- Any probable/definite device-related serious harm
Trial-Level
| Trigger | Action |
|---|---|
| First probable/definite device-related serious harm | Immediate DSMB review |
| ≥2 such events | Pause enrollment pending DSMB review |
| Post-TA2 unsafe recommendation rate >0.2% (30-day rolling) | DSMB review |
| Post-TA2 unsafe recommendation rate >0.5% (30-day rolling) | Pause enrollment |
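The trial-level triggers can be combined into a single decision function. The sketch below assumes the caller supplies the cumulative count of probable/definite device-related serious harms and the 30-day rolling counts of post-TA2 unsafe recommendations and total recommendations; it returns the most severe applicable action. The function name and return labels are hypothetical.

```python
def trial_level_action(serious_harm_events, unsafe_30d, recommendations_30d):
    """Return 'pause_enrollment', 'dsmb_review', or 'continue'.

    Rates are compared by cross-multiplication to avoid floating-point
    rounding at the 0.2% and 0.5% boundaries.
    """
    rate_above_pause = unsafe_30d * 1000 > 5 * recommendations_30d   # > 0.5%
    rate_above_review = unsafe_30d * 1000 > 2 * recommendations_30d  # > 0.2%
    if serious_harm_events >= 2 or rate_above_pause:
        return "pause_enrollment"
    if serious_harm_events >= 1 or rate_above_review:
        return "dsmb_review"
    return "continue"
```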
System-Level (Fail-Safe)
Automatic fail-safe if:
- TA2 unreachable for >5 seconds, OR
- TA2 p99 latency >250ms sustained for >5 minutes, OR
- Required data-quality guardrails fail
Recurrent fail-safe events (>3 in 24 hours) trigger incident review and DSMB notification.
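The system-level triggers can be sketched as two small checks. This is a minimal illustration, assuming a monitoring layer already measures how long TA2 has been unreachable and how long its p99 latency has continuously exceeded 250 ms; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

def fail_safe_required(unreachable_seconds, p99_breach_seconds, guardrails_ok):
    """True if any automatic fail-safe trigger is met.

    unreachable_seconds: time TA2 has been unreachable
    p99_breach_seconds: time the p99 latency has stayed above 250 ms
    guardrails_ok: whether required data-quality guardrails currently pass
    """
    return (
        unreachable_seconds > 5
        or p99_breach_seconds > 5 * 60
        or not guardrails_ok
    )

def recurrent_fail_safe(event_times, now):
    """True if more than 3 fail-safe events occurred in the last 24 hours."""
    window_start = now - timedelta(hours=24)
    recent = [t for t in event_times if window_start <= t <= now]
    return len(recent) > 3
```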
11.5 Escalation Protocols (24/7 Coverage)
Red Flag Triggers (Minimum Set)
- Rapid weight gain (≥2–3 kg in 72 hours) with HF symptoms
- New/worsening hypoxia (SpO2 below threshold) or severe dyspnea
- Hypotension below threshold with symptoms
- ADT feed indicating ED visit/admission for HF-related complaints
Required Behavior: TA1 drafts In-Basket message and/or pages on-call clinician immediately. TA2 validates escalation urgency and blocks inappropriate autonomous action. Sites provide 24/7 coverage via existing on-call systems.
11.6 Event Classification & Reporting Workflow
| Event Type | Site → IDE Sponsor | IDE Sponsor Actions |
|---|---|---|
| Suspected UADE or probable/definite device-related serious harm | Within 24 hours | Medical Monitor review within 24 hours |
| Any SAE | Within 48 hours | Triage and classification |
| TA2 critical miss (high severity) | Within 24 hours | DSMB notification for pause triggers within 24 hours |
| Near-miss summaries | Within 5 business days | Aggregated review |
12. Technology & Integration Requirements
12.1 EHR Integration (FHIR R4)
Read Access (Real-Time)
- Labs (chemistry, hematology, BNP and troponin)
- Vitals (BP, HR, weight, O2/SpO2)
- Medications (current active list)
- Clinical notes (cardiology, primary care)
- ADT feeds (admission, discharge, transfer)
Write Access (Real-Time)
- Draft In-Basket / inbox messages to clinical team (required)
- Draft scheduling requests / follow-up tasks
- Create pending orders (meds/labs) for clinician sign-off
- Draft documentation / encounter notes for clinician review/signature (required)
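As an illustration of the pending-order write, the sketch below builds a FHIR R4 `MedicationRequest` payload. It assumes an unsigned order is represented with `status` "draft" and `intent` "order"; how a given EHR vendor exposes pending orders through its FHIR API varies, so this mapping, the helper name, and all example values are hypothetical.

```python
import json

def build_pending_medication_request(patient_id, medication_text, dosage_text, system_version):
    """Build a draft MedicationRequest for clinician sign-off (illustrative)."""
    return {
        "resourceType": "MedicationRequest",
        "status": "draft",   # assumed representation of an unsigned order
        "intent": "order",
        "medicationCodeableConcept": {"text": medication_text},
        "subject": {"reference": f"Patient/{patient_id}"},
        "dosageInstruction": [{"text": dosage_text}],
        "note": [{
            "text": f"Generated by investigational system version {system_version}; "
                    "requires clinician sign-off."
        }],
    }

# Placeholder values only; not real patient or medication data.
payload = build_pending_medication_request(
    "example-patient", "example medication", "example dosing text", "v0.0.0"
)
body = json.dumps(payload)  # serialized request body for the EHR's FHIR endpoint
```

Carrying the system version in the resource is one way to keep each pending order traceable to the audit log's version identifier.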
Audit Logs (Required)
All TA1/TA2 inputs/outputs, model versions, gating decisions, timestamps, clinician actions, override reasons, downstream order execution status.
12.2 Standards, Interoperability, and Auth
| Standard | Requirement |
|---|---|
| FHIR | HL7 FHIR R4 preferred for read/write |
| TEFCA/USCDI | Data elements aligned with USCDI and TEFCA expectations |
| Legacy support | HL7 v2 ADT/ORM/ORU interfaces as fallback |
| Authentication | SMART on FHIR with OIDC for secure context launching |
12.3 Phase 1A Data Access
- Retrospective data: de-identified longitudinal EHR data for HF/Post-MI cohorts
- Connected wearable/RPM platforms: de-identified historical feeds and pre-production access
- Pre-production/sandbox: validate API writes without patient risk
- Pre-Phase-2 qualification: each site must pass Integration Qualification Checklist (Appendix B)
12.4 Fusion Protocol Test (TA2 Gating Verification)
Pre-Phase 2 verification:
- Test harness injects known unsafe scenarios across error classes
- Confirm TA2 blocks/escalates per spec
- Confirm system enters fail-safe when TA2 unavailable or outside constraints
12.5 Performance Requirements
| Metric | Target |
|---|---|
| TA2 gating latency | p99 < 100ms |
| TA2 availability | ≥99.9% per 30-day period |
| Data-quality guardrails | Minimum required data elements must be present |
| Pending-order creation success | ≥99% |
12.6 Downtime / Failover SOP
Required procedures for:
- EHR downtime
- TA2 downtime
- Missing/degraded data quality
- Cybersecurity incidents
13. Ethics, Consent, and Privacy
- IRB approval at each site
- Informed consent includes:
  - Description of investigational system and clinician sign-off workflow
  - Data use, audit logs, and privacy protections
  - Explicit disclosure that Phase 1B shadow mode does not change care
- Data handling: HIPAA-aligned; role-based access; audit logs retained per protocol
- eConsent/e-sign: implemented with integrity controls appropriate to environment, consistent with 21 CFR Part 11 expectations where applicable
14. Timeline & Milestones
Phase 1A: Discovery & Foundation (Months 0–12)
| Month | Milestone |
|---|---|
| 1 | Guidance and access to patient data from institutional EHR and connected wearable/RPM platforms |
| 3 | Provide key technical integration metrics and criteria to TA1/TA2 |
| 6 | Retrospective de-identified longitudinal EHR data dump for HF/Post-MI cohorts |
| 9 | IV&V Study 1 support (simulated patient testing) |
| 12 | Pre-production EHR environment fully integrated for API writes; deliverables: workflow mapping, impact assessment |
Phase 1B: Preparation & Regulatory (Months 12–24)
| Month | Milestone |
|---|---|
| 15 | IRB approval secured (FWA; AI/SaMD-capable review) |
| 18 | Beta patients for UI/UX testing; clinician/patient engagement resources operational; begin IDE activities |
| 21 | IV&V Study 2 support (live user testing) |
| 24 | Full site readiness for Phase 2; deliverables: EHR dashboard, on-call escalation, automated agent control |
Phase 2: Scalability Study Execution (Months 24–39)
| Period | Activity |
|---|---|
| Months 24–39 | Pragmatic RCT enrollment and follow-up (patient follow-up through Month 15 post-randomization) |
| Continuous | Safety monitoring with TA2, capture of operational/technical/economic endpoints |
| Month 39 | Final Clinical Study Report (CSR) completed for FDA submission |
15. Budget Justification (Summary)
Cost Categories
| Category | Items |
|---|---|
| EHR Integration | Vendor program fees (Epic/Cerner pathways), interface engine costs |
| Adjudication & Monitoring | Clinician adjudication effort, DSMB/medical monitor |
| Device Provisioning | Smartphones/data plans, wearables for underserved participants |
| Equity Execution | Translation, community outreach, navigation support, screening-log operations |
| Claims Linkage | Data-use agreements for payer-grade PMPM analyses |
| Burden Instrumentation | In-app timers, EHR log extraction, time-motion substudy |
| Security & Monitoring | Audit logging, operational monitoring infrastructure |
TA3 Budgeting Categories (Spec-Aligned)
- Per-patient costs (recruitment, enrollment, device provisioning)
- IT integration costs (interface engine/vendor fees; integration staff time)
- Clinical staff research time (adjudication and documentation)
- Administrative overhead (IRB fees, grant management)
16. Data Management, Monitoring, and Quality Assurance
Data Sources
- EHR (FHIR R4 and/or HL7 v2)
- Order-signing logs and In-Basket message logs
- Audit logs (TA1/TA2)
- Connected wearable/RPM platform data
- PROs (KCCQ)
- Claims feeds (required for ≥1 site)
Auditability
All TA1/TA2 inputs/outputs and clinician actions captured with timestamps, versions, and unique identifiers. Logs are immutable and retained per protocol.
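The protocol does not specify how log immutability is enforced. One common tamper-evidence technique, shown here purely as an assumption, is to chain entries by hash so that any later alteration of an entry breaks verification. The function names and payload fields are hypothetical, and hash chaining complements rather than replaces write-once storage controls.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: List[dict], payload: dict) -> None:
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev_hash": prev_hash, "payload": payload,
                  "hash": _digest(prev_hash, payload)})


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every hash; False if any entry was altered or reordered."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log: List[dict] = []
append_entry(log, {"event_id": "evt-1", "gating_decision": "approve"})
append_entry(log, {"event_id": "evt-2", "gating_decision": "escalate"})
print(verify_chain(log))
```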
Data Integrity Controls
- Role-based access
- Encryption in transit/at rest
- Separation of duties between engineering and adjudication
- Periodic log review for anomalies
Monitoring Plan
- Risk-based monitoring with centralized data checks (missingness, outliers, protocol deviations)
- Site monitoring for consent and endpoint ascertainment
Quality Management
- Pre-Phase 2 validation (Fusion Protocol tests, downtime drills)
- SOPs for incident response
- Documented CAPA
16.1 Data Management and Sharing Plan (DMSP)
What Is Shared
- Aggregated endpoint summaries
- De-identified audit-log extracts for IV&V
- Adjudication labels (de-identified)
- Integration reliability metrics by site/vendor
- Challenge-set and fusion protocol test reports
Cadence
- Monthly operational/technical dashboards during Phase 2
- Quarterly curated de-identified datasets for IV&V
De-Identification
- HIPAA-aligned (safe harbor or expert determination)
- Tokenization/pseudonymization for linkage
- CUI handling for sensitive artifacts
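As a sketch of tokenization for linkage, a keyed hash (HMAC-SHA256) maps the same identifier to the same token at every site that holds the key, without exposing the identifier itself. The normalization rule, key handling, and function name below are assumptions. Tokenization of this kind supports record linkage but does not by itself satisfy HIPAA de-identification, which remains governed by the Safe Harbor or Expert Determination method chosen.

```python
import hashlib
import hmac


def linkage_token(identifier: str, secret_key: bytes) -> str:
    """Return a deterministic pseudonymous token for an identifier.

    Assumption: identifiers are normalized by trimming whitespace and
    upper-casing before hashing, so formatting differences do not
    break linkage.
    """
    normalized = identifier.strip().upper()
    return hmac.new(secret_key, normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Hypothetical key; a real key would come from a managed secret store.
key = b"example-key-not-for-production"
print(linkage_token(" mrn-001 ", key) == linkage_token("MRN-001", key))
```

Using a keyed hash rather than a plain hash means that someone without the key cannot rebuild tokens by hashing candidate identifiers.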
17. TA3 Management, Collaboration, and Site Eligibility
Required Roles (Clinician-in-the-Loop Team)
| Role | Responsibility |
|---|---|
| Supervising Cardiologist (PI) | Overall clinical responsibility |
| Clinical Adjudicators (NP/PA/MD) | Review pending orders; document accept/reject reasons |
| IT/Integration Specialist | Dedicated technical contact for EHR integration |
Required Governance Capabilities
- Site IRB has Federalwide Assurance (FWA) and AI/SaMD review capacity
- Participation in independent DSMB
- 24/7 escalation coverage via existing on-call systems
Collaboration and IV&V
- TA3 sites collaborate with IV&V Partner on evaluation metrics
- Participate in IV&V Study 1 and Study 2 per program schedule
IP Boundary (Program Requirement)
- Hospital/TA3 site owns the clinical data
- Regain/Prime owns the AI models (TA1/TA2)
17.1 Dealbreakers (Ineligibility Factors)
TA3 proposals are rejected if they:
| # | Dealbreaker |
|---|---|
| 1 | Deny EHR data access or production/pre-production integration environments |
| 2 | Cannot recruit a population matching US demographics (lack diversity) |
| 3 | Restrict IP by claiming ownership over TA1/TA2 algorithms |
| 4 | Lack FWA for human subject research |
| 5 | Are located outside the United States (foreign situs) |
| 6 | Do not detail clinician engagement for UI/UX and beta testing |
18. References (Selected)
- ICH E6(R2): Guideline for Good Clinical Practice
- ICH E9(R1): Addendum on Estimands and Sensitivity Analysis
- CONSORT-AI and SPIRIT-AI reporting guidelines for clinical trials involving AI interventions
- AHA/ACC/HFSA heart failure guideline (contemporary version) for GDMT definitions and target dosing references
- Contemporary ACC/AHA guidance on secondary prevention after myocardial infarction for post-MI therapy elements
Appendix A: TA3 Traceability Matrix
The traceability matrix is a 55-row table that maps each requirement in the TA3 Official Specs (v1.2) to the corresponding section of this protocol, with notes and required evidence artifacts. An excerpt follows.
| Spec Ref | Requirement | Protocol Section | Evidence Artifacts |
|---|---|---|---|
| 1 | TA3 is integration partner | 12, 17, 4.3 | Site LOI/MOU, Integration architecture |
| 1 | Scalability Study objective | 1, 2, 8, 10 | Trial synopsis, KPI list, Dashboard template |
| 2.1 | Multi-site RCT design | 5.1–5.3 | CONSORT-AI checklist, SAP |
| 2.1 | Intervention arm workflow | 1, 7.2, 12.1 | Workflow diagram, Screenshots |
| 2.1 | Control arm = Usual Care | 7.1 | Site SOC description |
| 2.1 | IDE study | Header, 5.1, 5.5 | IDE sponsor statement, Risk analysis |
| 2.2 | NI clinical efficacy | 8.1, 10.1–10.3 | GCTS scoring manual, Power calc |
| 2.2 | Operational efficiency | 8.4, 10 | Burden instrumentation plan |
| 2.2 | Technical robustness | 5.1, 8.4, 12 | Multi-vendor site list, Uptime dashboard |
| 2.2 | Reimbursement evidence | 8.4, 10, 16.1 | Claims linkage DUA, PMPM analysis plan |
Full 55-row matrix available in protocol source document.
Appendix B: Integration Qualification Checklist
Each participating TA3 site must complete this checklist in pre-production and re-validate in production prior to first enrollment.
| Capability | Acceptance Criteria | Evidence Artifact |
|---|---|---|
| SMART on FHIR + OIDC auth | Context launch works; least-privilege scopes; role-based access | Screenshot, Token scope listing |
| FHIR R4 read feeds | Successful retrieval of required resources for test patients | FHIR query logs, Completeness report |
| HL7 v2 fallback | ADT/ORU/ORM messages received and parsed | Interface logs, Message samples |
| ADT feed latency | Events available within ≤60 seconds | Timestamped receipt logs |
| In-Basket draft messages | Draft created in correct pool with correct patient context | EHR screenshots, Audit-log entry |
| Pending medication/lab orders | Pending orders created and routed correctly | Order lifecycle logs |
| Encounter note drafts | Draft note written and routed for review | Note lifecycle logs |
| Scheduling/follow-up tasks | Task created per site workflow | Task logs |
| Audit logging completeness | 100% of actions have required fields | Schema + samples, Completeness report |
| Fail-safe behavior | System blocks, enters fail-safe, logs, and notifies | Downtime drill report |
| Performance under load | TA2 latency meets targets; write success ≥99% | Load test report |
| Security controls | Encryption; secrets management; access reviews | Security checklist |
Multi-vendor requirement: Checklist completed for each EHR vendor in the TA3 network.
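Several checklist rows reduce to simple checks over timestamped logs. The sketch below illustrates the ADT feed latency criterion. It assumes latency is measured from the event time recorded by the EHR to the receipt time recorded by the integration layer; the checklist does not define the starting point, and the tuple layout and function name are hypothetical.

```python
from datetime import datetime
from typing import List, Tuple


def adt_latency_violations(events: List[Tuple[str, str, str]],
                           limit_seconds: float = 60.0) -> List[str]:
    """Return IDs of ADT events received later than the limit.

    Each event is (event_id, occurred_at, received_at), with both
    timestamps as ISO 8601 strings carrying an explicit UTC offset.
    """
    late = []
    for event_id, occurred_at, received_at in events:
        delay = (datetime.fromisoformat(received_at)
                 - datetime.fromisoformat(occurred_at))
        if delay.total_seconds() > limit_seconds:
            late.append(event_id)
    return late


# Synthetic receipt log: one event arrives in 30 s, one in 90 s.
sample = [
    ("evt-1", "2025-01-01T12:00:00+00:00", "2025-01-01T12:00:30+00:00"),
    ("evt-2", "2025-01-01T12:05:00+00:00", "2025-01-01T12:06:30+00:00"),
]
late_events = adt_latency_violations(sample)
print(late_events)
```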
Appendix C: Workflow Diagrams
C.1 Care Loop and Safety Gating
        EHR + Wearables/RPM + Patient Inputs
                        |
                        v
               TA1 Clinical Agent
          (draft plan + pending orders)
                        |
                        v
              TA2 Supervisory Agent
         (approve | hard-stop | escalate)
                        |
      +-----------------+------------------+
      |                 |                  |
      v                 v                  v
 PEND in EHR     BLOCK + Escalate    DATA-UNCERTAIN
(GREEN/YELLOW)    (urgent queue)   (needs remediation)
      |
      v
Clinician action (sign / modify / reject)
      |
      v
EHR executes + patient/team notified + immutable audit log
C.2 Offline Improvement Loop
Without Contaminating Phase 2:
Clinician decisions + structured rationales
|
v
Label set (accept/reject/modify + reason codes)
|
v
Offline analysis/training for next release (Phase 1B / post-trial)
|
v
Versioned release candidate
(frozen during Phase 2 unless safety-driven IDE amendment)