- On February 24, 2026, a study by researchers from 10 organizations, including Professor Kate Kellogg (MIT Sloan) and Danielle Bitterman (Harvard Medical School), highlighted the gap between expectations and reality in deploying agentic AI in clinical healthcare.
- Agentic AI refers to systems of AI agents that can complete multi-step workflows with a high degree of autonomy.
- The research focused on an AI system for detecting side effects in cancer patients undergoing immunotherapy by analyzing unstructured electronic health records.
- The system can process hundreds of notes in minutes, a task that typically takes hours or days.
- The system's side-effect detection was as accurate as, or more consistent than, the standard process carried out by clinical research coordinators.
- However, less than 20% of the deployment effort was spent on prompt engineering and model development; over 80% was dedicated to “sociotechnical” work.
- For every 1 hour spent optimizing the model, the organization required approximately 4 hours for real-world implementation.
The 5 Main “Burdens”:
- Data Integration: Requires stable data pipelines and infrastructure; if data is not standardized, the agentic system will suffer a “chain failure.”
- Model Validation: Not just checking outputs but ensuring AI agents adhere to policies, maintain full logs, and only access authorized tools.
- Ensuring Economic Value: ROI is difficult to calculate as costs fluctuate based on workflow complexity and coordination levels between agents.
- Drift Monitoring (Model/Data): Requires “adaptive monitoring” to continuously track multiple dynamic metrics instead of static “if-then” thresholds.
- Governance: Clarifying responsibilities, legal risks, security, and accountability mechanisms when errors occur.
- The study emphasizes that the high-risk nature of healthcare requires stricter controls, but every industry needs a playbook for these five factors.
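
The contrast drawn under Drift Monitoring, between a static “if-then” threshold and “adaptive monitoring,” can be illustrated with a minimal sketch. This is not the study's method: the class `AdaptiveMonitor`, the function `static_alert`, the metric name `agreement_rate`, and every number below are hypothetical, chosen only to show the idea of comparing each new value against a rolling baseline rather than a fixed line.

```python
from collections import deque
from statistics import mean, stdev


class AdaptiveMonitor:
    """Flags a metric value that deviates from its own recent history.

    Hypothetical helper for illustration only; not taken from the study.
    """

    def __init__(self, window=20, k=3.0, min_history=5, min_tolerance=0.01):
        self.history = deque(maxlen=window)  # rolling baseline of recent values
        self.k = k                           # allowed deviation, in standard deviations
        self.min_history = min_history       # observations needed before alerting
        self.min_tolerance = min_tolerance   # floor so a flat history is not oversensitive

    def observe(self, value):
        """Record a value and return True if it deviates from the baseline."""
        alert = False
        if len(self.history) >= self.min_history:
            baseline = mean(self.history)
            spread = stdev(self.history)
            tolerance = max(self.k * spread, self.min_tolerance)
            alert = abs(value - baseline) > tolerance
        # The value joins the baseline either way, so the baseline keeps adapting.
        self.history.append(value)
        return alert


def static_alert(value, threshold=0.80):
    """Static "if-then" rule: alert only when the value falls below a fixed line."""
    return value < threshold


# One monitor per tracked metric (assumed metric name, made-up values).
monitors = {"agreement_rate": AdaptiveMonitor()}
series = [0.95, 0.94, 0.96, 0.95, 0.94, 0.95, 0.86]

for value in series:
    adaptive = monitors["agreement_rate"].observe(value)
    print(f"{value:.2f} static={static_alert(value)} adaptive={adaptive}")
```

On this made-up series the static rule never fires, because 0.86 is still above the fixed 0.80 line, while the adaptive monitor flags the final value as a sharp departure from the recent baseline. Whether a flagged value should also be allowed into the baseline, as it is here, is a design choice a real deployment would need to decide.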
Conclusion: Deploying agentic AI is not just an algorithmic challenge but an organizational transformation: over 80% of the effort lies in infrastructure, governance, and data integration. While the system can process hundreds of clinical notes in minutes with high accuracy, every hour of model optimization demands roughly four hours of practical implementation. Success hinges on managing five key burdens, especially in high-stakes environments like healthcare.

