Artificial intelligence (AI) is really good at some things, and not so good at others. Most software can handle tasks, requests, and actions that are fairly black-and-white—ones with relatively little context to identify and interpret.

Therapy is messier than that.

What happens in-session between a provider and client carries a lot of nuance, ambiguity, and themes that will never fit neatly under any one category or label. Ask five different therapists to define the same moment, and you’ll probably get five different answers—which is why training AI to assess behavioral healthcare isn’t for the faint of heart.

In a recent episode of the No Notes podcast, host Denny Morrison, PhD, Chief Clinical Officer at Eleos, sat down with Natalia Szapiro, Clinical Team Lead at Eleos, and Samuel Jefroykin, Director of Data and AI Research at Eleos, to talk about what it really takes to build AI that aligns with the realities of behavioral health.

Together, they dug into why ambiguity isn’t a problem to solve, but a fact of life—and how building useful AI means listening, translating, and constantly iterating on what “good enough” looks like. In some ways, progress in building intelligent behavioral health tech is a lot like progress in therapy: you have to learn from the grey areas and focus on finding what works instead of what’s perfect.

Want to hear their full conversation? Watch or listen to the podcast episode here.

Two Worlds, One AI Goal

At Eleos, building AI is a constant back-and-forth process, with engineers and practicing clinicians working side by side from start to finish. “I think it’s fair to say that we balance each other,” Szapiro said. “But more than that, we kind of bounce off each other and improve each other.”

Natalia Szapiro, Clinical Team Lead at Eleos, shares how her team works hand-in-hand with Eleos’ tech team to design smarter, more effective mental health tools—all while continuing to treat clients in their own practices.

That close collaboration is essential, because behavioral health data is unique—even in the healthcare world. Therapy sessions are built on conversation, connection, and a kind of under-the-surface subtlety you just don't find in a diagnostic test, lab report, or prescription.

That’s exactly why, as Jefroykin put it, “There’s no one single project [at Eleos] that an AI researcher is doing that a clinical expert didn’t work on.”

Sometimes, even basic clinical concepts don’t translate easily into code. Take the “Golden Thread,” for example—that elusive link tying together treatment plans, progress notes, and interventions. “[The] Golden Thread is making sure that there is one thread aligning throughout the whole therapy episode, and that the treatment plan is aligned with the notes and what comes up in the sessions,” Szapiro explained. “We go into a process of really defining it in terms that the AI would understand.”
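For the technically curious, here is a minimal, hypothetical sketch of how a Golden Thread check might be framed once it has been "defined in terms the AI would understand." The data structures, field names, and simple string matching below are purely illustrative assumptions, not how Eleos actually implements this.

```python
# Hypothetical sketch only: illustrates the idea of checking alignment between
# a treatment plan and a progress note. Not Eleos' actual data model or logic.
from dataclasses import dataclass


@dataclass
class TreatmentPlan:
    goals: list[str]                     # e.g., "reduce panic symptoms"
    planned_interventions: list[str]     # e.g., "CBT", "breathing exercises"


@dataclass
class ProgressNote:
    goals_addressed: list[str]           # goals the session actually touched on
    interventions_documented: list[str]  # interventions the note mentions


def golden_thread_gaps(plan: TreatmentPlan, note: ProgressNote) -> dict[str, list[str]]:
    """Flag plan goals the note never addresses, and documented interventions the plan never names."""
    return {
        "goals_not_addressed": [g for g in plan.goals if g not in note.goals_addressed],
        "interventions_not_in_plan": [
            i for i in note.interventions_documented if i not in plan.planned_interventions
        ],
    }


plan = TreatmentPlan(
    goals=["reduce panic symptoms"],
    planned_interventions=["CBT", "breathing exercises"],
)
note = ProgressNote(
    goals_addressed=["reduce panic symptoms"],
    interventions_documented=["breathing exercises", "journaling"],
)
print(golden_thread_gaps(plan, note))
# {'goals_not_addressed': [], 'interventions_not_in_plan': ['journaling']}
```

Even this toy version shows why clinicians have to be in the room: deciding what "counts" as addressing a goal is a clinical judgment, not a string match.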

Natalia Szapiro, Clinical Team Lead at Eleos, shares how her team helps bridge the gap between therapy language and AI development.

Want to see how AI helps the Golden Thread shine in every progress note? Download our Compliant Progress Note Examples here.

It’s messy, iterative work—and it can only happen when both clinical and technical experts are collaborating closely in the same proverbial “room.”

The Build Process: What Good Behavioral Health AI Looks Like

So what does it actually mean for AI to be “good enough” for use in a behavioral health setting? For Jefroykin, Szapiro, and their teams, it comes down to an uncompromising focus on safety, usefulness, and continual improvement.

“We don’t expect the machine to be more accurate than a group of clinicians,” Szapiro explained. “So if we have disagreements between us, we would expect that the machine also wouldn’t be 100%.”

Disagreements are unavoidable, but they serve as a reminder of just how complex real clinical work is. Even experienced clinicians can interpret the same session or note in different ways, so it would be unrealistic—or even risky—to expect an AI tool to always "get it right."

Samuel Jefroykin, Director of Data and AI Research at Eleos, shares how every new product feature begins with close collaboration across disciplines.

That’s why model testing and validation continues well after launch. “Usually when it’s a new feature, we monitor almost on a daily basis,” Jefroykin said. “Then we’ll ask Natalia’s team to go over the notes, the AI output, and do an analysis of what’s good, what we’re missing, and what we can improve.” In practice, that means establishing regular feedback loops, facilitating frequent direct reviews by clinicians, and making ongoing tweaks as the system learns from real-world use.

Or as Szapiro put it: “We call it a qualitative analysis…looking at where the gaps are between what we provide and what our clients are using.”
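For readers wondering what that kind of feedback loop could look like under the hood, here is a simplified, hypothetical sketch: sample a small batch of recent AI outputs, route them to clinician reviewers, and tally where the gaps are. The function names and review categories are invented for illustration and are not taken from Eleos' actual pipeline.

```python
# Hypothetical sketch of a post-launch feedback loop. All names and categories
# here are illustrative assumptions, not Eleos' real monitoring system.
import random
from collections import Counter

# Invented review categories a clinician reviewer might assign
REVIEW_CATEGORIES = ("accurate", "missing_clinical_content", "needs_rewording")


def sample_for_review(recent_outputs: list[dict], sample_size: int = 20) -> list[dict]:
    """Pull a small daily sample of de-identified AI outputs for clinician review."""
    return random.sample(recent_outputs, min(sample_size, len(recent_outputs)))


def summarize_reviews(ratings: list[str]) -> Counter:
    """Aggregate clinician ratings into a simple gap report."""
    return Counter(r for r in ratings if r in REVIEW_CATEGORIES)


# Example: three clinician ratings from one day's sample
print(summarize_reviews(["accurate", "missing_clinical_content", "accurate"]))
# Counter({'accurate': 2, 'missing_clinical_content': 1})
```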

Launching a new AI feature in behavioral health isn’t a “set it and forget it” process. Here, Samuel Jefroykin, Director of Data and AI Research at Eleos, breaks down how his team monitors usage data, clinician interactions, and documentation quality in real time.

At Eleos, no product-related project happens without a clinical expert involved—and the process is always ongoing. Every new feature launches with a plan to iterate, revisit, and keep closing the gap between what technology can do and what behavioral health providers and organizations actually need.

A Word of Caution on ChatGPT

Generative AI tools are seemingly everywhere, so it's easy to wonder why therapists can't just drop session notes into something like ChatGPT and be done with it. Jefroykin and Szapiro have strong feelings about that.

Jefroykin doesn’t mince words about the data risks: “If you don’t have a pro account and you didn’t sign an NDA and you didn’t make sure you are out of their training system, you can’t use [general AI],” he said. In other words, unless you have a formal business arrangement and explicit data protections (which most clinicians don’t), using a public tool like ChatGPT for anything containing client information is off-limits.

Privacy is core to trust, and trust is core to the therapeutic relationship. Behavioral health notes are so packed with protected health information (PHI) that using a general AI tool like ChatGPT that isn’t HIPAA-compliant could put clients and organizations at serious risk.

Samuel Jefroykin, Director of Data and AI Research at Eleos, explains why you can’t just upload therapy sessions into public AI models like ChatGPT and offers real-world examples of how clinicians can use AI to streamline administrative work without risking PHI exposure.

But even if privacy weren’t a concern, generic AI models can’t reliably capture the information that is truly meaningful from a clinical standpoint.

“We’ve been through a whole iterative process to make sure that our products give something that is really written in good clinical language…and knows how to identify intervention,” Szapiro explained.

At this point, pretty much any generative AI tool can produce a summary of a transcript, but therapy notes need to reflect not just what was said, but also the context, the clinical significance, and the reasoning behind the interventions provided.

Natalia Szapiro, Clinical Team Lead at Eleos, explains why clinicians shouldn’t expect perfection from AI on the first try—and why that’s part of the process.

“AI can create a really good and nice summary of a conversation, but if you want to make it more adapted to clinical work, you need to analyze what’s happening behind the scenes,” Szapiro said. A progress note is a tool for tracking client progress, supporting compliance, and communicating with the entire care team—it’s so much more than a simple conversation recap.

AI Privacy and Security: Guardrails that Matter

But let’s go back to privacy for a moment—because every time AI is used in behavioral health, client privacy is at stake. Jefroykin and Szapiro both know that protecting sensitive data means drawing firm boundaries at every step.

“We make sure we’re using a third party that is HIPAA compliant…that we’re out of any data retention system,” Jefroykin said. “Even for one minute, we don’t want it.”

That means any time Eleos works with external vendors or tools, they take extra steps to ensure client data isn’t stored or exposed—ever.

Get a detailed breakdown of all the data safeguards in place at Eleos. Download our ultimate guide to behavioral health AI privacy and security.

Of course, there are times when developers and data scientists need to work with clinical material to test new features. That’s where mock data comes in. “Even today, I sat down with one of my clinicians to record a mock session to be used for testing, to be used for making sure that our products are working,” Szapiro shared. These simulated sessions are carefully constructed, de-identified examples that mimic real cases without risking anyone’s privacy.

To ensure AI tools are safe, ethical, and effective, Eleos relies heavily on mock data rather than real patient conversations. Natalia Szapiro, Clinical Team Lead, shares how her team creates realistic, imaginary therapy sessions to test product functionality and performance.

Szapiro’s team has leaned more heavily into this approach as the Eleos platform has grown. The goal is simple: limit access to real-world client data as much as possible, instead relying on fake data to stress-test new ideas, features, and enhancements.

“Nowadays, we’re doing a big-scale project of making sure we have enough mock data to keep everyone else’s data as safe as possible,” Szapiro said. 
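As a rough illustration, a mock session record used for testing might look something like the sketch below. Every field, identifier, and line of dialogue is invented for this example; the structure is a hypothetical assumption, not Eleos' actual data model.

```python
# Hypothetical sketch of a mock (synthetic) session record used for testing.
# Everything below is invented; no real client information is involved.
from dataclasses import dataclass


@dataclass
class MockSession:
    session_id: str                     # synthetic identifier, not tied to any real client
    transcript: list[str]               # scripted exchange acted out by staff
    expected_interventions: list[str]   # what a correct AI output should detect


mock = MockSession(
    session_id="mock-0001",
    transcript=[
        "Therapist: Last week we talked about using breathing exercises when panic starts.",
        "Client: I tried it twice, and it helped a little.",
    ],
    expected_interventions=["breathing exercises"],
)
```

Because the "right answer" is known in advance for each mock session, the team can check whether a new feature catches what it should without ever touching real client data.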

The Delicate Balance Between Practice and Product

When the team at Eleos sits down to build or refine their AI tools, they bring their clinical backgrounds with them, sometimes right into the test data. “There are many cases where we come to the conclusion that some things could be ambiguous,” Szapiro said. “[For example], even within the team, we couldn’t agree on the threshold to count something as a mindfulness practice.” Ongoing debates like this happen because therapy isn’t always clear-cut, even among experienced clinicians.

Therapy is full of gray areas—and building AI that supports clinicians means embracing that ambiguity. Natalia Szapiro, Clinical Team Lead at Eleos, shares how her team navigates tough questions: What counts as a mindfulness practice? When is an intervention actually happening? And how do you decide what makes it into a progress note?

Everyone on the clinical team still actively sees clients outside of Eleos, so the models are guided by people who know what real therapy is like. They’re not just working off textbook examples or transcripts. This close connection to actual care means the tools have to reflect the messy, in-between nature of real sessions.

The team’s goal is to build tools that truly support therapists—not ones that make clinical decisions for them. In addition to lifting some of the administrative burden off their shoulders, the AI often surfaces details or topics the clinician missed, offers another perspective on what happened in-session, or jogs the provider’s memory.

“I often speak to people about what Eleos does in a therapy session as kind of like having a co-therapist in the room,” Morrison shared. “You will get a different opinion from the Eleos app…to stimulate a clinician to think about things in ways that they may not have thought of [on their own].”

The Breakneck Pace of AI Advancement

For Jefroykin, Szapiro, and all their colleagues at Eleos, keeping up with new technology is a routine part of the job. “What you are building today, what you put in production today, you need to understand that probably in three months we’ll erase it and replace it with something new,” Jefroykin said, adding that AI models and methods change so fast that the best solution today could be old news by the next quarter.

Morrison also emphasized how quickly the field is moving—and how the team at Eleos is working at the leading edge: “There are so many new features and facets of AI coming out that you guys are using as the fundamentals of what you do,” he said.

The pace of innovation in behavioral health AI means today’s breakthrough can become tomorrow’s standard.

Samuel Jefroykin, Director of Data and AI Research at Eleos, breaks down what it takes to build cutting-edge AI tools in the mental health space.

Jefroykin described the work as running on two parallel tracks. “You have two tracks…how we can understand better with the data we have today and how we can mix that with new AI advances happening in the market.” The team is always working from both directions at once, refining their existing tools based on feedback from providers—and at the same time, staying open to new developments that could move things forward even faster.

Throughout this conversation, one idea shone through again and again: behavioral health AI only works when it’s built in partnership with real-world clinicians. Tools have to be shaped by on-the-ground experience and flexible enough to handle the ambiguity that’s always present in therapy. The best results come from ongoing collaboration, where technical and clinical teams learn from each other and keep refining what they create together.

Szapiro put it this way: “We are here, we are involved, we’re hands-on. We have a team of clinicians—who are actually practicing clinicians—involved hands-on inside the product.” This approach keeps the technology honest and focused on what really matters in care.

“What I like to say about Eleos is that it’s an AI from the therapist, for the therapist,” Jefroykin added.

At the end of the day, this process is about creativity and respect for the work clinicians do. “Your job is to reverse engineer the brain of a therapist,” Morrison summarized.

Want to learn more? Tune into the full episode here for more stories, insights, and behind-the-scenes details on how clinicians and data scientists are coming together to create the next generation of behavioral health AI.