Three models (ChatGPT, Grok, and Gemini) were asked to rate their agreement with an editorial on the Critical Race Framework Study. ChatGPT drafted the essay, which was then refined using feedback from the other models until every model rated its agreement at 9.5 out of 10 or higher. Dr. Christopher Williams is the principal investigator of the CRF study. Final agreement ratings: ChatGPT (10/10), Gemini (10/10), and Grok (9.5/10).
For decades, American public health institutions have committed themselves to the pursuit of equity. Researchers have documented racial disparities in mortality, chronic disease, and access to care. Federal agencies have called racism a public health crisis. Journals have published special issues on structural inequity. But amid all this activity, a basic question has gone largely unasked: What does the science of race in public health actually rest on?
A new doctoral dissertation by Christopher Williams offers a bracing answer. His Critical Race Framework Study: Standardizing Critical Evaluation for Research Studies That Use Racial Taxonomy concludes that much of the research using race as a variable in public health rests on shaky scientific ground. That doesn’t mean it lacks good intentions. It means it often lacks good methods.
Williams’ study introduces the Critical Race (CR) Framework, a tool designed to critically assess how researchers define, measure, and interpret race. It focuses on four standard domains of scientific evaluation — reliability, validity, internal validity, and external validity — and applies them to race-based analysis in public health literature. The goal is not to remove race from research, but to treat it with the same level of scientific scrutiny as any other construct.
He put this tool to the test. In a systematic review of twenty highly cited studies in health disparities and behavioral health, Williams and his research team found that fifteen of the twenty studies (75%) earned "high" or "moderate" quality ratings on 25% or fewer of the framework’s criteria. Only one study earned such ratings on more than 70% of them. The conclusion? Although the studies varied in strength, the overwhelming majority offered low-quality discussion, or none at all, of how race was used or justified.
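To make the scoring arithmetic concrete, here is a minimal sketch of how a per-study share of "high" or "moderate" ratings might be computed. The rating labels, the number of items, and the example ratings are all invented for illustration; only the fifteen-of-twenty figure comes from the editorial, and the study's actual scoring procedure may differ.

```python
# Labels counted as acceptable quality. These names are assumptions,
# not the framework's documented rating scale.
ACCEPTABLE = {"high", "moderate"}


def share_high_or_moderate(ratings):
    """Return the fraction of framework items rated 'high' or 'moderate'."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    hits = sum(1 for rating in ratings if rating in ACCEPTABLE)
    return hits / len(ratings)


# Invented example study: 20 rated items, 4 of them high or moderate.
example_study = (
    ["high"] * 1 + ["moderate"] * 3 + ["low"] * 10 + ["no discussion"] * 6
)
example_share = share_high_or_moderate(example_study)

# Figure reported in the editorial: 15 of 20 studies at or below 25%.
studies_at_or_below_quarter = 15 / 20

print(f"Example study share: {example_share:.0%}")
print(f"Studies at or below 25%: {studies_at_or_below_quarter:.0%}")
```

Under these assumptions the invented study lands at 20%, which would place it among the fifteen studies at or below the 25% mark.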
This is not merely a statistical concern — it is a structural one. Williams argues that the persistent use of racial categories in research, without conceptual clarity or valid measurement, reflects long-standing institutional norms rather than scientific rigor. Public health researchers often treat “race” as a self-evident proxy for other constructs — culture, biology, place, or discrimination — with little transparency or consistency. The result, as Williams describes, is a field in which symbolic equity often displaces analytical precision.
Yet the study is no polemic. It is a contribution grounded in empirical evidence, conceptual analysis, and methodological care. Williams explicitly acknowledges the perspectives of scholars who argue that race is necessary for identifying structural inequities. Indeed, he cites work that defends racial classification as essential for addressing historical injustice and monitoring disparities. But his response is clear: if race is to be used for these ends, it must be scientifically defensible.
The CR Framework offers a replicable, rigorously developed tool that journals, funders, and institutions can adopt to improve the quality of research involving race. It was tested in three phases with national public health experts and demonstrated strong content validity and moderate to high interrater agreement. And crucially, it encourages researchers to specify what race is doing in their models — a marker of inequality? A cultural construct? A social experience? — and to justify that use based on evidence.
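For readers unfamiliar with interrater agreement, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic for two raters. The editorial does not say which statistic the study used, so treating kappa as the measure is an assumption, and the ratings shown are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must rate the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement if each rater assigned labels independently,
    # at their own observed label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented ratings of six framework items by two hypothetical raters.
rater_a = ["high", "low", "low", "moderate", "low", "high"]
rater_b = ["high", "low", "moderate", "moderate", "low", "low"]
kappa = cohens_kappa(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.3f}")
```

Kappa ranges up to 1 for perfect agreement, with 0 meaning agreement no better than chance. Studies with more than two raters or ordinal scales often use other statistics, such as Fleiss' kappa or a weighted kappa.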
This work comes at a critical moment. Public trust in science is fragile. Discussions of equity and race are politically charged. Health researchers must not only defend their findings, but also the foundations on which those findings rest. Williams’ study does not call for abandoning race as a category — it calls for building a more rigorous and transparent science around it.
Some will resist this challenge, fearing it undermines decades of disparities research. But the opposite is true. Williams’ work strengthens the field by asking it to live up to its own standards. His study points toward a public health discipline that is not just morally committed, but methodologically coherent.
As Williams writes, poor research practices around race do more than cloud our understanding — they waste resources, misguide interventions, and ultimately compromise the very communities researchers aim to support. This is not a condemnation. It is an invitation. Public health now has a tool to reflect, recalibrate, and reform.
It’s time to use it.