By Allison Proffitt
March 22, 2021 | It’s too soon to know the full impact of the pandemic on suicide rates in the United States, but even before 2020, suicide rates in the US had been climbing. The CDC reports that from 1999 through 2018, the suicide rate increased 35%. Suicide prevention begins with risk identification, and the standard of care remains face-to-face screening and routine clinical interaction. But it is impossible to screen every individual within a healthcare system. There must be a better way.
Interestingly, studies show suicide attempts or deaths from suicide often follow healthcare encounters—sometimes the very same day, explains Colin Walsh, an assistant professor of Biomedical Informatics, Medicine and Psychiatry at Vanderbilt University Medical Center (VUMC). But those healthcare encounters are not necessarily related to mental health. “Clinicians and health professionals might find themselves discussing suicide risk in settings far removed from mental health specialty care,” Walsh says. “We want to scale risk detection to the varied settings in health systems without asking every provider to ask the same questions to every patient at every visit. Automated screening is one step in that direction.”
Walsh proposed that artificial intelligence could streamline screening—flagging individuals for further questions and identifying those at risk. He theorized that a model reliant solely on routine, passively collected clinical data, such as medication and diagnostic data available in the EHR, might scale to any clinical setting regardless of screening practices.
Walsh is quick to emphasize that this is only a screen. “Our approach intends to add a new data point to patient and clinician decision-making. We do not seek to replace the judgment of smart providers with algorithms nor would we recommend it in the high stakes and critically important space of suicide prevention,” he says.
Walsh originally created the algorithm, now called the Vanderbilt Suicide Attempt and Ideation Likelihood (VSAIL) model, with colleagues now at Florida State University. In a new paper in JAMA Network Open, he and his Vanderbilt colleagues validated the algorithm using EHR data from the Vanderbilt University Medical Center (DOI: 10.1001/jamanetworkopen.2021.1428).
Over 11 consecutive months beginning June 2019, the team let the VSAIL model run in the background at VUMC, ingesting information from electronic health records (EHRs) to calculate 30-day risk of return visits for suicide attempt among adult patients. The model required no data in addition to what is routinely collected in an EHR and did not require face-to-face screenings.
The VSAIL model predictors include demographic data, diagnostic data, medication data, past healthcare facility visits, and Area Deprivation Indices, which rank neighborhoods by socioeconomic disadvantage and were applied here by zip code. At registration for inpatient, emergency department, or ambulatory surgery visits, the modeling pipeline used 5 years of historical data to build a vector of predictors. The predictive model then generated a probability of a subsequent suicide attempt within 30 days.
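To make the registration-time pipeline concrete, here is a minimal sketch of the two steps described above: building a predictor vector from a 5-year lookback window, then turning it into a probability. Everything in it is an assumption for illustration. The `Event` record, the count-based features, the exclusion of same-day events, and the logistic scorer with caller-supplied weights are hypothetical stand-ins, not the VSAIL model or its actual predictors.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from math import exp

# Assumption: "5 years" approximated as 5 * 365 days.
LOOKBACK = timedelta(days=5 * 365)


@dataclass
class Event:
    """Hypothetical EHR event record (not the real VUMC schema)."""
    when: date
    kind: str   # e.g. "diagnosis", "medication", "visit"
    code: str


def build_predictors(history, registration_day, vocabulary):
    """Count events per (kind, code) that fall inside the lookback window.

    Events dated on or after the registration day are excluded, so only
    historical data contributes to the vector.
    """
    window_start = registration_day - LOOKBACK
    counts = {key: 0 for key in vocabulary}
    for event in history:
        key = (event.kind, event.code)
        if window_start <= event.when < registration_day and key in counts:
            counts[key] += 1
    # Keep the vocabulary order so positions in the vector stay stable.
    return [counts[key] for key in vocabulary]


def predict_probability(vector, weights, intercept):
    """Placeholder logistic scorer; the weights are illustrative only."""
    score = intercept + sum(w * x for w, x in zip(weights, vector))
    return 1.0 / (1.0 + exp(-score))
```

In a setup like this, the vector would be rebuilt at each registration, so a patient's score can change from visit to visit as events enter or leave the window.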
During the validation period, the system assigned risk scores to 77,973 adult patients and stratified them into eight groups. Then Walsh and his colleagues manually reviewed any subsequent entries to the EHR coded as suicidal behaviors and compared those findings to the predicted risk scores. The top risk group alone accounted for more than one-third of all suicide attempts documented in the study, and approximately half of all cases of suicidal ideation. As documented in the EHR, one in 23 individuals in this highest-risk group went on to report suicidal thoughts, and one in 271 went on to attempt suicide.
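The "one in N" figures are easier to compare as percentages. The short sketch below only converts the two ratios reported for the highest-risk group; the function name is made up for illustration, and no other study data is assumed.

```python
def one_in_n_to_percent(n: float) -> float:
    """Convert a 'one in n' ratio to a percentage."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 100.0 / n


# Ratios reported for the highest-risk group in the article.
reported = {"suicidal ideation": 23, "suicide attempt": 271}

for outcome, n in reported.items():
    print(f"{outcome}: 1 in {n} = {one_in_n_to_percent(n):.2f}%")
# suicidal ideation: 1 in 23 = 4.35%
# suicide attempt: 1 in 271 = 0.37%
```

Read the other way, the same numbers describe screening workload: in the highest-risk group, about 271 people would be flagged for each person who later returned for a suicide attempt.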
“Primary findings include accuracy at scale regardless of face-to-face screening in nonpsychiatric settings,” Walsh and his coauthors write. “For every 271 people identified in the highest predicted risk group, one returned for treatment for a suicide attempt,” Walsh explained in a press release about the work.
The authors caution that this ratio applies only to nonpsychiatric specialty settings in a large clinical system, and they note that future work should include development and validation of models that will be “site aware” in deployment. Even so, the current VSAIL model significantly narrows the pool of patients for whom personal screening is best conducted. “Assuming that face-to-face screening takes, on average, 1 minute to conduct, automated screening for the lowest quantile alone would release 50 hours of clinician time per month,” the authors calculate.
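The authors' estimate can be unpacked with simple arithmetic: at 1 minute per screening, 50 hours corresponds to 3,000 face-to-face screenings per month. The sketch below shows that conversion; the function names are hypothetical, and the only inputs taken from the article are the 1-minute assumption and the 50-hour figure.

```python
def screenings_for_hours(hours_released: float, minutes_per_screen: float = 1.0) -> float:
    """Number of screenings that add up to the given clinician hours."""
    return hours_released * 60 / minutes_per_screen


def hours_for_screenings(screenings: float, minutes_per_screen: float = 1.0) -> float:
    """Clinician hours spent on the given number of screenings."""
    return screenings * minutes_per_screen / 60


print(screenings_for_hours(50))    # 3000.0 screenings per month at 1 minute each
print(hours_for_screenings(3000))  # 50.0 hours
```

If a real screening takes longer than a minute, the same 3,000 screenings would represent proportionally more clinician time, so the estimate is sensitive to that per-screen assumption.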
During this validation phase, the model did not trigger any EHR alerts or deliver messaging to healthcare providers, but in the future, Walsh expects that to change. “We anticipate prompts to respond to detected risk would be given at the time the risk calculation is made,” he says. “For example, in our study, risk was calculated within seconds of registering for a given visit. The next time a clinician opens that chart to deliver care during that same encounter, they might see a prompt to consider the risk in their decision-making.”
Delivering such a prompt, possibly while in the room with a patient, is a sensitive task. Walsh and the team have been “working closely with diverse collaborators on how best to deliver that information in an actionable and minimally disruptive way.” Validation of VSAIL is important, of course, but “the real test begins when we deliver new information to practicing clinicians in the context of real-world care,” Walsh says.
In the paper, Walsh and his coauthors next recommend careful pairing with low-cost, low-harm preventive strategies in a pragmatic trial of effectiveness in preventing future suicidality. They are planning to launch such a study this year in a small set of clinics to gather data and move to a larger scale study based on those findings.
“What we sought to show in this study does not just apply to suicide prevention. It applies to predictive models and AI broadly in healthcare,” Walsh says. “Even the most rigorous study in the lab does not replace the need to test algorithms like these in real-world settings on the road to full clinical implementation.”