AI in Healthcare Shifts to Physician-Led Development, Raising Security Risks
Healthcare artificial intelligence is entering a distinct new phase, split between two divergent realities. On one front, physicians are using agentic AI tools to build custom clinical applications without traditional engineering teams. On the other, security leaders warn that AI-generated code introduces vulnerabilities that could expose patient data to sophisticated attacks.
The shift toward doctor-led software development emerged during a webinar hosted by Anthropic. Physicians demonstrated how Claude Code—an agentic coding assistant that reads codebases, edits files, and runs commands—enables care teams to build features and automate development tasks. The tool relies on the company's Opus 4.7 and Sonnet 4.6 AI models.
Dr. Graham Walker, an emergency medicine physician and cofounder of clinical decision tool MDCalc, posed a pointed question during the session: "Why aren't we letting our physicians build these tools?" The sentiment reflects growing frustration with electronic health record systems that often feel disconnected from actual clinical workflows.
Dr. Michał Nedoszytko, an interventional cardiologist and AI developer who placed third at Anthropic's hackathon earlier this year, took the concept further. "If the EHR is a problem, maybe just create your own," he said. The statement captures a broader industry tension between rigid institutional systems and the practical needs of clinicians at the point of care.
The physical reality of this shift matters. Instead of navigating eight screens to complete one task, physicians can now describe a workflow and watch code assemble itself. The friction of traditional software procurement—months of vendor negotiations, compliance reviews, and IT ticket queues—disappears. (This is both liberating and terrifying for hospital CIOs.)
Security concerns loom large. TrustedSec CEO Dave Kennedy, a former NSA analyst, told Forbes that novice developers won't spot flaws in AI-generated code, "introducing serious defects." The concern intensifies with Anthropic's latest frontier model, Claude Mythos, which can detect system vulnerabilities. The same capability that helps build secure tools could help attackers exploit them.
Privacy regulations add another layer of complexity. Claude Code can set up VPNs and distribute tools on public servers, but HIPAA and Europe's General Data Protection Regulation create compliance hurdles. Nedoszytko noted that although his hackathon post-patient visit tool was built with HIPAA pathways from the outset, physicians still need engineers for production-ready code.
"It's one thing creating something on your computer, but another thing is actually running it with live data of patients, especially if you're within an institution," Nedoszytko said. "This always needs to be run through your team." The distinction between prototype and production remains critical in healthcare, where a bug can directly impact patient safety.
Healthcare IT News reported that Anthropic is working on regulatory plug-ins beyond current HIPAA audit skills. At present, using the HIPAA compliance audit review skill could make compliance audits faster and less costly, according to Walker.
Broader industry movements support this transition. The American Hospital Association submitted formal comments to the Department of Health and Human Services in February 2026, urging policy frameworks that balance innovation with patient safety. The AHA recommended removing regulatory barriers while ensuring clinicians remain in the decision loop for algorithms impacting care delivery.
The AHA letter also emphasized the need for post-deployment standards to ensure ongoing integrity of AI tools. This reflects growing recognition that AI implementation isn't a one-time event but a continuous process requiring monitoring and adjustment.
Major EHR vendors are responding with native AI capabilities. Epic Systems plans over 150 AI features in 2026, including conversational search and AI agents designed as a digital workforce. athenahealth announced athenaAmbient, a free ambient documentation solution launching in February 2026. The U.S. Department of Veterans Affairs is expanding ambient AI scribes to all medical centers nationwide.
The VA expansion represents the largest government healthcare AI rollout in the United States. Ambient scribes listen to appointments and automatically generate clinical notes, allowing doctors to maintain eye contact with patients instead of focusing on computers. The physical experience changes: less typing, more listening, more human connection.
Cost remains a barrier. Multiple studies put implementation costs above $200,000, and the MIT NANDA report found a 95% pilot failure rate. The pattern is consistent: pilots succeed on paper, then stall in practice. A model can perform beautifully in aggregate and still fail in the very moments clinicians need it.
At the Johns Hopkins Research Symposium on Engineering in Healthcare in December, Andrew Menard, executive director for radiology strategy and innovation, described evaluating a breast cancer AI tool. He asked radiologists one question: "Do you sleep better at night?" The answer was overwhelmingly yes. That answer mattered more than utilization curves or sensitivity charts.
The industry is entering what some call the "prove-it phase II." Phase I demanded transparency—show your model card, your data, your validation studies. Phase II asks tougher questions: What changed because of your tool? Did it reduce harm? Free up even five minutes of a clinician's day? If the answer is no, it doesn't matter how elegant your ROC curve looks.
Whether physicians building their own tools will scale beyond individual departments remains uncertain. Whether security teams can audit AI-generated code fast enough to prevent vulnerabilities is another open question. The technology is ready. The infrastructure and governance are catching up. Whether users actually pay for it remains the real question.
Artūras Malašauskas is an AI Systems Integrator with 20+ years of production-grade web engineering experience. He has designed, shipped, and scaled enterprise Python/PHP systems for logistics, SaaS, and public-sector clients. For the past year, he has focused exclusively on AI integrations: deploying open-source LLMs, building generative media pipelines (image, audio, video), and engineering multi-agent workflows for real production environments. His standard: reproducibility, security, cost-efficient inference, no vaporware. He documents and evaluates emerging AI tooling, separating verified capabilities from marketing noise. Technical editor at: muza-ai.eu, ai-verslas.lt, ai-naujinos.lt