Latest News

New AI Tool Addresses Data Accuracy and Fairness

By Irene Yeh 

October 24, 2025 | Modern healthcare algorithms are trained on large volumes of data and can recognize patterns and correlate them with outcomes, enabling earlier diagnoses. However, this can be a double-edged sword. While these algorithms are helpful for identifying patterns and predicting symptoms, their behavior may not be fully aligned with what healthcare providers are looking for. In particular, bias against underrepresented groups persists, even when the algorithms are trained on diverse populations.

To address this issue, researchers from the Icahn School of Medicine at Mount Sinai have developed AEquity, a tool that identifies and reduces bias in the datasets used to train machine learning algorithms. Bias in healthcare has been documented for decades, and models trained on biased datasets can reproduce health inequities. From an ethical and regulatory perspective, identifying and remediating potential bias in datasets is urgent, especially given the executive order that places liability for algorithmic bias on developers and deployers (JMIR Publications, DOI: 10.2196/71757).

Analyzing Bias in Data 

“Just looking around us—in day-to-day life and studies done over the last couple of decades—the delivery of health care and outcomes of patients who experience the U.S. healthcare system are not always equal or equitable,” says Ashwin Sawant, M.D., attending physician of internal medicine at Mount Sinai and one of the study’s authors.  

The team used publicly available datasets from previous studies that had investigated bias among patients of various ethnic groups with small sample sizes. In one example, they used x-ray datasets with a balanced subset of 512 Black patients and 512 White patients; AEquity was then applied to detect underdiagnosis bias and to help curate a more equitable dataset that mitigates it (JMIR Publications, DOI: 10.2196/71757). In a second example, AEquity calculated the risk category (high risk or low risk) for each of three outcomes: total costs, avoidable costs, and active chronic conditions. When risk was calculated from total costs or avoidable costs, there was a 95% difference between Black and White patients, leading to unequal resource allocation for high-risk Black patients. When risk was calculated from active chronic conditions, there was no difference between high-risk Black and White patients, indicating that the two groups were similar under that measure. As such, AEquity could guide the choice of a better outcome measure for identifying high-risk patients and mitigate label bias.
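As a rough illustration of that second example, the sketch below compares how candidate outcome labels assign patients to the high-risk group, which is one way a cost-based label can be checked against a clinical one. It is not the published AEquity code; the column names, threshold, and dataset are hypothetical.

```python
# Hypothetical sketch, not the authors' implementation: compare candidate
# outcome labels by the racial gap they produce among "high-risk" patients.
import pandas as pd

def high_risk_gap(df: pd.DataFrame, label_col: str, quantile: float = 0.97) -> float:
    """Gap in the share of Black vs. White patients flagged as high risk,
    where 'high risk' means exceeding the given quantile of the chosen label."""
    cutoff = df[label_col].quantile(quantile)
    flagged = df[df[label_col] >= cutoff]
    shares = flagged["race"].value_counts(normalize=True)
    return abs(shares.get("Black", 0.0) - shares.get("White", 0.0))

# Example usage with a hypothetical claims dataset:
# df = pd.read_csv("claims.csv")  # columns: race, total_cost, avoidable_cost, active_chronic_conditions
# for label in ["total_cost", "avoidable_cost", "active_chronic_conditions"]:
#     print(label, round(high_risk_gap(df, label), 3))
```

A label that shows a much smaller gap, such as active chronic conditions in the study, would be the safer choice for identifying high-risk patients.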

The team looked at a range of ethnic groups for the study, but there was not enough data on patients of other backgrounds, such as those of Asian descent, for the algorithm to draw strong conclusions to report in the main paper, Sawant explains. The team also investigated two other types of bias: age and healthcare cost scenario, meaning an individual's circumstances surrounding the ability to afford and access healthcare.

AEquity can work with both small and large sample sizes, and it generalizes across dataset sizes, modalities, and model architectures. Because it analyzes the dataset itself, “it can be applied to different kinds of models or different architectures,” according to Sawant. In other words, the tool is flexible enough to learn the distribution of the data and the relationships between labels or outcomes in the dataset, rather than depending on any particular downstream model.
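The article does not include the tool's code, but a dataset-level, architecture-agnostic check of this kind can be sketched roughly as follows; the probe model, metric, and function name here are assumptions for illustration, not AEquity's actual method.

```python
# Rough sketch (assumed approach, not AEquity itself): estimate how learnable
# each subgroup's outcomes are from the dataset alone, using a simple probe
# model; large gaps between subgroups can flag where the data under-serves a group.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def subgroup_learnability(X: np.ndarray, y: np.ndarray, groups: np.ndarray) -> dict:
    """Mean cross-validated AUC of a probe classifier within each subgroup."""
    scores = {}
    for g in np.unique(groups):
        mask = groups == g
        probe = LogisticRegression(max_iter=1000)
        scores[g] = cross_val_score(probe, X[mask], y[mask], cv=5, scoring="roc_auc").mean()
    return scores

# Because the check operates on (features, labels, group) arrays rather than on
# a trained model, it can sit in front of any downstream architecture.
```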

Limitations and Next Steps 

While AEquity was able to detect bias, the tool still has limitations. For one, data collection is an expensive and difficult process. “The availability of high-quality data is limited,” explains Sawant. As such, building a machine learning model for a particular condition would require enough patient data, driving up costs. This could be alleviated by reducing those costs, for example by easing approval processes through regulatory organizations.

Second, compared with algorithms that use other forms of regularization and are multimodal, AEquity in its current form mainly focuses on classification tasks, although it remains agnostic to model architecture. Sawant recalls how AEquity reduced, but did not completely eliminate, bias against underserved patients. In another case, the team found that the model’s performance lagged not because of a discrepancy or added complexity in a particular subgroup but because the task itself was more complicated.

Third, if AEquity identifies systematic label distortion, new gold-standard labels must be created, which is another expensive and tedious process. Finally, AEquity cannot pinpoint the underlying causes of dataset bias. It can help characterize bias as primarily complexity-related, but without explicit information on lack of access to care, it cannot confirm whether that lack of access is the cause.

But these limitations do not diminish AEquity’s potential. Sawant and his team are excited to continue developing the algorithm. The code is also publicly available for the community to improve upon, and the team is now working with a handful of researchers within Mount Sinai to gather feedback.

“What we’ve found is that the system needs to be made easier to use,” says Sawant. “We need to make our code a little more modular and plug-and-play for researchers who may not be familiar with our particular layout and organization.”  

The team also plans to make AEquity available as a more usable library and has released the code without any license or patent restrictions. According to Sawant, they are not planning to commercialize AEquity.
