Building Fairer AI for Health Care
Researchers propose a framework for more equitable health care AI systems.
As the use of AI in health care grows, there are concerns that AI models trained on real-world clinical data could perpetuate or amplify existing disparities between patient populations. Clinical AI systems must be designed not only for accuracy but also to ensure equitable treatment for all patients.
In a new study, researchers from UHN and North York General examined how AI models can predict race based on clinical notes written by health care providers and how these models can be designed to perform more equitably across race, sex, and age. This research provides insight into developing fair and accurate clinical AI systems.
Racial discrimination in health care plays a significant role in patient outcomes and health care utilization, underscoring the need for equity-focused research and care that reflects diverse communities. As digital health tools like AI are increasingly used to support clinical care and research, these systems must perform consistently across diverse patient populations. If not carefully designed, they may amplify existing inequities.
One challenge in addressing bias in clinical AI is the inconsistency of racial data in medical records. Information about a patient's race is often missing or incomplete in electronic health records, making it difficult for researchers to evaluate the performance of AI models for different patient groups.
To better understand these challenges, the research team evaluated how well AI models could predict race from health care providers’ clinical notes. They compared several advanced language models, including a widely used transformer-based system, which analyzes a note as one continuous stretch of text, and another type of model, called a hierarchical convolutional neural network, which is designed to better reflect the layered structure of clinical notes.
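For readers who want a concrete picture, the sketch below shows one way a hierarchical model of this kind can be structured: each sentence is encoded first, and the sentence representations are then combined into a note-level representation used for prediction. This is an illustrative toy example in PyTorch; the class name, layer sizes, and two-level design are assumptions, not the architecture used in the study.

```python
# Illustrative sketch only: a toy hierarchical text classifier in the spirit of the
# model described above, NOT the study's implementation. Layer sizes, names, and
# the two-level (sentence -> note) design are assumptions.
import torch
import torch.nn as nn

class HierarchicalNoteClassifier(nn.Module):
    """Encodes each sentence with a 1-D CNN, then pools sentence vectors
    into a single note-level representation for prediction."""
    def __init__(self, vocab_size=5000, embed_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Word-level convolution: captures local phrases within a sentence.
        self.word_conv = nn.Conv1d(embed_dim, 128, kernel_size=3, padding=1)
        # Sentence-level convolution: captures patterns across sentences.
        self.sent_conv = nn.Conv1d(128, 128, kernel_size=3, padding=1)
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, notes):
        # notes: (batch, n_sentences, n_words) integer token IDs
        b, s, w = notes.shape
        x = self.embed(notes.view(b * s, w))                   # (b*s, w, embed)
        x = self.word_conv(x.transpose(1, 2)).relu()           # (b*s, 128, w)
        sent_vecs = x.max(dim=2).values.view(b, s, -1)         # one vector per sentence
        y = self.sent_conv(sent_vecs.transpose(1, 2)).relu()   # (b, 128, s)
        note_vec = y.max(dim=2).values                         # one vector per note
        return self.classifier(note_vec)

# Example: a batch of 4 notes, each with 6 sentences of 20 tokens.
logits = HierarchicalNoteClassifier()(torch.randint(1, 5000, (4, 6, 20)))
```

In contrast, a flat transformer-style encoder would concatenate the whole note into one token sequence, which is why the hierarchical design is described as better matching how clinical notes are organized.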
They also applied specific rules to these models to optimize fairness—the ability of AI algorithms to make decisions without prejudice against individuals or groups based on characteristics like race, gender, or age.
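One common way such fairness rules are implemented, sketched below purely for illustration, is to add a penalty to the training loss that shrinks gaps in model predictions between demographic groups. The penalty form, the `lam` weight, and the function name are assumptions; the study's actual fairness procedure may differ.

```python
# Illustrative sketch only: a fairness-penalized training loss, NOT the study's method.
import torch
import torch.nn.functional as F

def fairness_penalized_loss(logits, labels, group_ids, lam=1.0):
    """Cross-entropy plus a penalty on the gap in mean predicted probability
    between demographic groups (a demographic-parity-style term)."""
    ce = F.cross_entropy(logits, labels)
    probs = logits.softmax(dim=1)[:, 1]                # predicted probability of class 1
    group_means = [probs[group_ids == g].mean()        # average score per group
                   for g in group_ids.unique()]
    gap = torch.stack(group_means).max() - torch.stack(group_means).min()
    return ce + lam * gap                              # lam trades accuracy against the gap

# Example with random data: 8 patients, 2 classes, 2 demographic groups.
logits = torch.randn(8, 2, requires_grad=True)
loss = fairness_penalized_loss(logits, torch.randint(0, 2, (8,)), torch.randint(0, 2, (8,)))
loss.backward()
```

Tuning the penalty weight is exactly the accuracy-versus-fairness trade-off the study observed: a stronger penalty can reduce bias across groups but may also lower overall accuracy.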
The study found that the AI model designed to mirror the structure of clinical notes was more accurate and fairer than the other models. Additionally, including fairness rules helped some AI models perform with less bias across groups, but in others, it reduced accuracy. This showed that fairness tools did not work the same way for every AI system.
The researchers also found that many of the disparities in AI performance seen across patient groups were linked to how the health care providers wrote their notes. This showed that bias in how information is documented can also impact how AI systems interpret and use information.
The study highlights that fairness can be built into clinical AI systems, but approaches must be carefully matched to the model. These findings offer a practical framework for developing more equitable language-based AI tools in health care and underscore the need to address systemic gaps in how health information is recorded.
Dr. Rawan Abulibdeh is a Postdoctoral Researcher at UHN and first author of the study.
Dr. Ervin Sejdić is an Affiliate Scientist at KITE Research Institute and a Professor in the Rogers Sr. Department of Electrical and Computer Engineering at the University of Toronto. He is the Research Chair in Artificial Intelligence for Health Outcomes at North York General, and he is the corresponding and co-senior author of the study.
Dr. Karen Tu is a Clinician Scientist at UHN and a Professor in the Department of Family and Community Medicine at the University of Toronto. She is also a Research Scientist at North York General and a co-senior author of the study.
This work was supported by the Canadian Institutes of Health Research, the North York General TD Smart Technologies for Early Prediction and Prevention (STEPP) Lab funded by TD Bank Group, the University of Toronto, the National Institutes of Health, the National Science Foundation, and UHN Foundation.
Dr. Tu is a Chair in Family and Community Medicine Research in Primary Care at UHN.
Abulibdeh R, Lin Y, Ahmadi S, Sejdić E, Celi LA, Zhao Q, Tu K. Integration of fairness-awareness into clinical language processing models. Commun Med (Lond). 2026 Feb 24;6(1):178. doi: 10.1038/s43856-026-01433-9