At Vega Health, we've assembled a network of collaborators — team members, advisors, investors, and allies — with decades of experience making technology actually work for clinicians and operators. Vega Conversations is a series highlighting their perspectives: what brought this group together, what we're doing differently, and how we believe AI can structurally improve healthcare.

Dr. Robert Califf is a cardiologist and clinical researcher who served as Commissioner of the Food and Drug Administration (FDA) from 2016 to 2017 and from 2022 to 2025. He founded the Duke Clinical Research Institute, served as head of medical strategy at Alphabet, Inc., and is a tenured professor at Duke University.

For years as a journalist, I’d wanted to interview Dr. Rob Califf. His second stint as FDA Commissioner under President Biden straddled the launch of ChatGPT and kicked off a regulatory reckoning with how to oversee the burgeoning use of generative AI in healthcare. It was a central topic I was covering, first for Inside Health Policy and then Fierce Healthcare. I pored over every public remark he made on the subject to understand where AI regulation was heading.

I didn't know then that I'd eventually sit down with him not as a reporter, but as a colleague drawn to the same question: where is AI actually making a difference in healthcare?

That shared interest is part of what brought us both to Vega Health. Early one morning last month, I finally got my chance to ask him the questions I'd been saving up — this time, with Vega Health at the center of the conversation.

Califf recognizes that AI will revolutionize industries, including healthcare. The transformation, though, will be dictated in part by who has the most influence over shaping it.

“My skepticism is about who’s going to control the purpose of AI,” Califf said. “And to me, that’s the most important issue that we’re dealing with.”

Will healthcare and technology incumbents decide who in healthcare gets access to AI, and how and when? Given the deep inequities that already run through the healthcare system, Califf hopes that the distorted incentives that put profits over patient outcomes will not dictate AI development, procurement, and implementation.

Take the most basic question that clinicians and healthcare executives ask of any AI vendor: “Does this AI model work?”

The answer to that question depends fundamentally on the purpose of the model or solution. The goal of one AI model may be to improve the financial performance of a service line. Another may be built to improve patient and family well-being. Califf asserted that the two may both work, but they will achieve inherently different outcomes and should be used with the understanding of what they are designed to achieve.

Health systems should keep that distinction in mind when purchasing AI tools.

For too many healthcare delivery organizations, though, this information is hard to come by. Health systems lack credible ways to assess what outcomes AI tools are achieving in real-world settings and almost always must rely on how vendors define performance.

Califf described independent evaluation as a critical need in what he called the “wild west” phase of AI development.

That is one reason he was drawn to Vega Health’s business model. Rather than simply selling an opaque, AI-powered product, Vega Health’s goal is to bring best-in-class models to community health systems through a curated marketplace. Models on the Vega Health Marketplace are licensed from leading institutions and are required to have achieved real outcomes in a live clinical setting.

Our conversations with customers begin not by showcasing the range of solutions we have available, but instead by asking our potential customers what outcomes or priorities they are trying to achieve. Our success depends on helping them accomplish their goals.

The Need for Local Validation and Continuous Performance Monitoring

Previous validation frameworks for healthcare technology fail when confronted with generative AI, and one topic Califf grappled with at the FDA was how to regulate a constantly evolving technology.

In real healthcare settings, model behavior changes with the environment: patient populations differ, workflows vary, practice patterns are inconsistent, and generative systems can produce different outputs depending on the context in which they are used.

That means validation cannot be treated as a one-time exercise completed before implementation. In Califf’s view, monitoring must be continuous, local, and grounded in real-world use, with systems in place to track performance as the model interacts with actual patients and clinicians.

Califf argued that benchmark performance is at best a starting point, not proof that an AI system is ready for clinical use.

This has major implications for health systems. If model performance shifts by setting, then healthcare needs infrastructure for ongoing monitoring, feedback, and reassessment. In other words, evaluation is not a side function; it should be part of the product, he argued.

Vega Health can bring multiple models designed to solve the same problem to a customer and test which one will work best in their environment. While we help health systems evaluate the best model for them, we also tune the monitoring metrics to fit that system’s goals.

But AI evaluation is still hindered by health systems’ limited ability to track outcomes. A sticking point for Califf in his clinical research has been the difficulty of following up with patients across their life spans to understand how a clinical intervention affected their long-term health and mortality.

He argued that if the outcomes that matter most are whether people live longer, function better, and stay healthy, then the system needs complete follow-up, and how to achieve that follow-up remains an open question for health researchers, health systems, AI vendors, and payors to solve.

A Parallel for Regulating Health AI: The Food Industry?

On regulation, Califf offered a framework that feels more practical than the usual binary debate between heavy oversight and laissez-faire experimentation. He argued that the FDA should not try to regulate every AI technology in healthcare in the traditional device-by-device way. The scope is too large, and the technology is changing too quickly. Instead, he suggested something closer to the Food Safety Modernization Act model: establish high-level principles, require organizations to operationalize them, and hold health systems and vendors accountable for having transparent plans, following them, and correcting them when problems are found.

Food safety does not depend on federal inspectors checking every farm every day. It depends on a system of standards, documented operating practices, and periodic inspection against those practices. Califf’s argument is that healthcare AI needs something similar. The organizations deploying AI should be responsible for defining how they will ensure safety, effectiveness, and accountability. Regulators should then evaluate whether those operating practices are credible and maintained in the real world.

He did make one important exception: AI embedded in high-risk medical devices should still be regulated more traditionally. But for decision support tools and many operational AI systems, he sees local accountability as the more realistic path.

Tuning for Equity

Califf was blunt that market-driven solutions rarely produce equitable outcomes. The health gap is also a wealth gap, Califf said, quoting a popular healthcare ad. AI can just as easily widen that divide by helping organizations better target profitable lines of work. He suggested that if equity were treated as a foundational operating principle, it would materially change how healthcare organizations design and govern AI.

Healthcare organizations often optimize solutions for financial performance. Califf strongly cautioned health systems against prioritizing only financial outcomes when designing an AI strategy.

The healthcare industry needs better evaluation, clear accountability, and explicit equity standards. Without those, AI may still scale—but it will scale the logic of the system we already have.