Use the CRF Critical Appraisal Tool
By Christopher Williams, PhD
I recently ran the same prompt through three AI models, Grok (Think), ChatGPT, and Claude (Sonnet 4), to generate a list of normalized yet ethically questionable practices in the US. The prompt read: "Name the 100 most ethically questionable things in the US in 2025 that appear to be normal to the vast majority of people that are not normal. Here we mean things that more than 80% of Americans would not question" (Prompt 1). You can read the results here.
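For readers who want to reproduce the exercise, the sketch below shows one way to send Prompt 1 to all three vendors' APIs. This is a minimal illustration, not the exact setup used for this article; the model identifiers are placeholder assumptions, and the xAI base URL reflects that vendor's published OpenAI-compatible endpoint.

```python
# Minimal sketch: send the same prompt to ChatGPT, Claude, and Grok.
# Model names are illustrative assumptions; substitute whatever
# versions you actually have access to.
import os
from openai import OpenAI
from anthropic import Anthropic

PROMPT_1 = (
    "Name the 100 most ethically questionable things in the US in 2025 "
    "that appear to be normal to the vast majority of people that are "
    "not normal. Here we mean things that more than 80% of Americans "
    "would not question"
)

def ask_chatgpt(prompt: str) -> str:
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    resp = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumed model name
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def ask_grok(prompt: str) -> str:
    # xAI exposes an OpenAI-compatible endpoint, so the same client works.
    client = OpenAI(api_key=os.environ["XAI_API_KEY"],
                    base_url="https://api.x.ai/v1")
    resp = client.chat.completions.create(
        model="grok-3",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    for name, ask in [("ChatGPT", ask_chatgpt),
                      ("Claude", ask_claude),
                      ("Grok", ask_grok)]:
        print(f"=== {name} ===\n{ask(PROMPT_1)}\n")
```

Because xAI's endpoint is OpenAI-compatible, the same client library can serve both ChatGPT and Grok; only Claude needs the separate Anthropic SDK.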
The results differed considerably, and those differences reveal important insights about the models themselves. After compiling all of the responses to Prompt 1, I asked each model to describe the approaches to, and recognition of, structural racism, classism, and income inequality across the three sets of results. Their analyses follow.
Artificial intelligence systems—particularly large language models like OpenAI’s ChatGPT, Anthropic’s Claude, and xAI’s Grok—offer insights not just through what they say, but in how they structure and prioritize knowledge. When asked to enumerate ethically questionable practices in the United States that are normalized by the majority of Americans, all three models generated compelling, overlapping yet distinct lists. A comparative analysis of these outputs reveals meaningful divergences in their approach to race, racism, classism, and income inequality—four interconnected pillars of ethical concern in American life.
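Qualitative comparison of this kind can be spot-checked with a crude lexical tally. The sketch below is an illustration of that idea, not the method used in this analysis; the term lists and file names are hypothetical.

```python
# Illustrative sketch: count explicitly racial vs. economic vocabulary
# in each model's saved list. The term lists are illustrative
# assumptions, not a validated coding scheme.
import re

RACIAL_TERMS = ["race", "racial", "racism", "segregation", "profiling"]
ECONOMIC_TERMS = ["wage", "wealth", "poverty", "class", "inequality"]

def term_counts(text: str, terms: list[str]) -> int:
    """Total case-insensitive whole-word-prefix matches across all terms."""
    return sum(len(re.findall(rf"\b{t}\w*\b", text, re.IGNORECASE))
               for t in terms)

def profile(name: str, list_text: str) -> None:
    print(f"{name}: racial={term_counts(list_text, RACIAL_TERMS)}, "
          f"economic={term_counts(list_text, ECONOMIC_TERMS)}")

# Usage: load each model's saved list (file names are hypothetical).
for name, path in [("ChatGPT", "chatgpt_list.txt"),
                   ("Claude", "claude_list.txt"),
                   ("Grok", "grok_list.txt")]:
    with open(path, encoding="utf-8") as f:
        profile(name, f.read())
```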
ChatGPT

Race and Racism:
ChatGPT explicitly devotes an entire category (Section V) to “Race & Structural Inequality,” identifying 10 issues that span education, health, policing, media, and environmental racism. These include “residential segregation,” “racial wealth gap,” “voter suppression in communities of color,” “racial profiling,” and “disparities in health outcomes dismissed as ‘natural.’” By isolating these topics, ChatGPT shows a clear recognition that racism is not incidental but structural—embedded into laws, institutions, and cultural norms. There’s a deliberate emphasis on normalization and invisibility of racism, especially to dominant groups.
Classism & Income Inequality:
ChatGPT’s response links class to labor practices, housing, healthcare, and governance. For example, it critiques “wage stagnation,” “cash bail,” “gig economy normalization,” and “corporate tax avoidance.” Notably, it flags “school funding tied to property taxes” and “mass layoffs to increase stock prices,” highlighting how corporate and state systems reinforce class divides. Its inclusion of “environmental racism treated as coincidental” suggests an intersectional approach that ties racial and class inequities together.
Approach:
ChatGPT’s framing is policy- and institution-oriented, drawing from well-established public discourse and sociological theory. It frequently uses systemic language—“structural,” “disproportionate,” “racial wealth gap,” “underfunding”—indicating a transdisciplinary, data-informed critique. ChatGPT clearly draws from left-leaning social justice frameworks but maintains a tone of professional neutrality.
Claude

Race and Racism:
Claude’s list is striking for what it does not do: it does not create a dedicated category on race or structural racism. Instead, racial and ethnic disparities are diffused across categories—implied but not always named. For instance, while Claude lists “for-profit prisons,” “school-to-prison pipeline,” and “civil asset forfeiture,” it doesn't specifically highlight the racialized nature of these issues. Racial justice issues such as “police brutality” or “racial profiling” are absent in explicit terms. Claude appears to treat ethical issues more universally, sometimes at the expense of naming race-based harms.
Classism & Income Inequality:
Claude is particularly strong in naming economic injustices, but its orientation leans toward individual burden and consumer-facing injustices: “unpaid internships,” “medical bankruptcy,” “student loan debt,” “CEO pay,” “credit scores determining housing.” The emphasis is on the way capitalism directly impacts the quality of life for ordinary Americans, especially in healthcare, education, and labor. However, Claude tends to avoid ideological or systemic language in favor of personal and ethical tones.
Approach:
Claude’s tone is moralistic, centered on well-being, fairness, and personal impact. It often highlights where systems create emotional, psychological, or physical harm (e.g., "active shooter drills traumatizing children," "hustle culture glorifying overwork"). While this can create powerful emotional resonance, it risks underplaying racialized or structural dimensions by universalizing harm.
Grok

Race and Racism:
Grok’s approach is notable for how rarely it names race or racism directly: issues like "sentencing disparities," "racial justice disparities," "education inequality," and "environmental justice issues" appear, but mostly in brief or indirect terms. Grok often critiques phenomena that disproportionately affect racialized communities, such as "prison labor," "gig economy exploitation," and "access barriers in education," without foregrounding their racial dimensions. In doing so, Grok may reflect a libertarian or techno-agnostic perspective, more focused on power and inequality than identity politics.
Classism & Income Inequality:
Grok's critique of income inequality is sharper and more direct than its take on race. It calls out “wealth concentration,” “economic exploitation,” “prison labor products,” and “education funding gaps.” Grok is comfortable labeling the symptoms of classism but stops short of naming capitalism or white supremacy as root causes. Instead, it seems to treat these issues as failures of culture and incentives rather than ideology.
Approach:
Grok leans into a skeptical, realist tone, often bordering on techno-pessimism. Its framing centers on systemic inertia, corporate overreach, and cultural decay rather than race or justice. It frequently critiques technology, consumerism, and digital behavior (“algorithmic bias,” “filter bubbles,” “smartphone use while driving”), implicitly suggesting that ethical blindness is reinforced by culture and design, not just policy.
| Dimension | ChatGPT | Claude | Grok |
| --- | --- | --- | --- |
| Race & Racism | Explicit, systemic framing with intersectionality | Implicit, underdeveloped | Indirect, avoids explicit language |
| Classism | Structural (housing, education, labor systems) | Personal/ethical harm (debts, wages) | Systemic, but framed via cultural/market failures |
| Income Inequality | Data-driven focus on disparities, linked to policy | Human-impact driven | Cynical view of elite hoarding, systemic inertia |
| Tone | Analytical, institutional, scholarly | Ethical, moralistic, emotional | Critical, realist, techno-cultural |
| Structural Analysis | High | Moderate | Moderate (avoids ideology but critiques structures) |
This comparative exercise reveals how AI-generated responses reflect the underlying epistemology and values embedded in each model:
ChatGPT embodies a more scholarly, social justice-informed lens, articulating inequity with explicit reference to racialized, economic, and institutional dynamics. It leans most heavily into naming power structures.
Claude represents a humanist-moralist model, emphasizing personal ethics and harm. While deeply sympathetic to injustice, it often sidesteps racial and systemic specificity, opting instead for universal human concern.
Grok, perhaps reflecting Elon Musk’s techno-libertarian influences, critiques institutions and cultural norms while keeping a distance from identity-based frameworks. It favors market critiques over race-based structural analysis.
Though tasked with the same prompt, these three AI models approach the ethically questionable landscape of 2025 America through divergent lenses of justice, harm, and causality. ChatGPT emerges as the most explicitly structural and race-conscious. Claude emphasizes individual harm and fairness while downplaying race. Grok critiques inequality as a cultural failing but keeps race and class analysis largely implicit.
These differences are not just academic—they reflect ideological orientations and limitations of their developers, training corpora, and safety filters. As AI plays a growing role in shaping public understanding, these divergences are not trivial. They matter for how ethical debates, especially on race and class, are framed and acted upon in the years to come.
When tasked with identifying America's most ethically questionable normalized practices, three prominent AI models—ChatGPT, Claude, and Grok—revealed markedly different approaches to addressing systemic inequality. Their responses illuminate not just different priorities, but fundamentally different frameworks for understanding how racism, classism, and economic inequality operate in American society. While all three acknowledge these issues exist, they differ dramatically in their emphasis, specificity, and willingness to name structural racism as a driving force.
ChatGPT's response stands apart for its explicit focus on structural inequality and its willingness to directly name racism as a systemic force. Unlike the other models, ChatGPT dedicates an entire section (Section V) specifically to "Race & Structural Inequality," treating these issues as deserving distinct analytical attention rather than folding them into broader categories.
The model demonstrates sophisticated understanding of how racial inequality operates across multiple domains. It identifies "residential segregation" (#41), the "racial wealth gap normalized" (#42), and "voter suppression in communities of color" (#43) as interconnected phenomena rather than isolated problems. Perhaps most significantly, ChatGPT explicitly calls out "environmental racism treated as coincidental" (#50), demonstrating awareness that seemingly race-neutral policies can have racially disparate impacts that are then dismissed as natural or accidental.
ChatGPT's treatment of economic inequality is similarly systematic. It connects wage stagnation to rising productivity (#1), links the gig economy's growth to the erosion of worker protections (#6), and identifies how corporate tax avoidance (#7) and excessive CEO compensation (#8) represent normalized forms of class exploitation. The model recognizes that these aren't separate issues but part of an interconnected system that concentrates wealth upward while leaving workers increasingly vulnerable.
The model's approach to education reveals particular insight into how class and race intersect. By identifying that "school funding tied to property taxes" (#24) creates systematic disadvantages, ChatGPT recognizes how seemingly neutral policies perpetuate inequality across generations. Similarly, its inclusion of "textbook whitewashing of history" (#30) shows awareness of how educational institutions actively participate in maintaining racial hierarchies.
Claude's response takes a markedly different approach, focusing primarily on individual behaviors and consumer choices rather than structural systems. While Claude identifies important issues like medical bankruptcy (#13), student debt (#14), and CEO compensation disparities (#16), it frames these more as unfortunate outcomes of current systems rather than as products of deliberate structural design.
The model's treatment of racial issues is notable for its absence of explicit racial analysis. Claude mentions "food deserts in low-income areas" (#25) and a "legal system favoring wealthy defendants" (#69), but it doesn't connect these patterns to historical or ongoing racial discrimination. This approach effectively deracializes issues that have strong racial dimensions, treating them as class problems that happen to correlate with race rather than as products of racialized systems.
Claude's focus on technology and privacy issues (#1-10) reflects a more individualistic framework. While these concerns about surveillance and data collection are valid, their prominence suggests a worldview where personal privacy violations are more immediately concerning than systematic exclusion from economic opportunity or political power. This emphasis on individual privacy over collective justice represents a distinctly different set of priorities.
The model's attention to consumer culture and lifestyle issues—from "ultra-processed foods marketed to children" (#21) to "celebrity worship culture" (#33)—reveals an approach that places significant responsibility on individual choice. While these issues matter, their prominence alongside, but without clear connection to, structural inequality suggests a framework where personal responsibility and structural forces are given roughly equal weight.
Grok's response is perhaps most notable for what it doesn't emphasize. While the model identifies numerous problematic practices, it consistently avoids explicit discussion of how race and class structure these problems. Instead, Grok tends to present issues as either technological problems or as unfortunate but inevitable outcomes of complex systems.
When Grok mentions "racial justice disparities" (#48) and "sentencing disparities" (#55), it treats these as isolated flaws rather than as products of systematically biased institutions. The model's language consistently frames problems as emerging from complexity or competing interests rather than from deliberate exclusion or exploitation. This approach depoliticizes inequality by treating it as a technical problem rather than as a product of power relations.
Grok's emphasis on technological and environmental issues—from "algorithmic bias" (#3) to "single-use plastics" (#31)—reflects a technocratic worldview where problems are primarily understood as efficiency failures rather than as products of conflicting interests between different groups. While algorithmic bias certainly affects racial minorities disproportionately, Grok's framing obscures these racial dimensions.
The model's treatment of economic inequality exemplifies this approach. Items like "wealth gap" (#66) and "economic inequality" (#69) are mentioned but presented almost as natural phenomena rather than as outcomes produced by specific policy choices that benefit some groups at the expense of others. This framing makes inequality appear inevitable rather than changeable through different institutional arrangements.
The differences between these models reflect three distinct approaches to understanding social problems. ChatGPT employs what might be called a "structural analysis" that emphasizes how institutions systematically advantage some groups while disadvantaging others. This approach sees racism and classism as organizing principles of American society that require explicit attention and systematic remedies.
Claude operates from what could be termed an "individualistic reform" perspective that acknowledges serious problems but focuses more on individual choices and specific policy fixes rather than systematic transformation. This approach tends to see inequality as an unfortunate byproduct of otherwise legitimate systems rather than as a core feature requiring fundamental change.
Grok represents a "systems complexity" approach that recognizes multiple problems but avoids attributing them to systematic bias or deliberate exclusion. This perspective treats inequality as emerging from the complexity of modern systems rather than from power relations between different groups.
These different approaches have significant implications for how problems are understood and what solutions might be pursued. ChatGPT's structural analysis suggests that addressing racial and economic inequality requires systematic institutional change—reforming criminal justice, restructuring educational funding, and actively dismantling segregated housing patterns. This approach implies that good intentions and individual choices are insufficient without institutional transformation.
Claude's individualistic focus suggests that problems can be addressed through better personal choices, improved regulations, and reforms to specific policies without necessarily restructuring fundamental institutions. This approach places significant faith in the possibility of reform within existing systems.
Grok's complexity framework implies that problems emerge from the difficulty of managing complex systems rather than from systematic exclusion, suggesting that solutions lie in better management and technological innovation rather than in redistributing power or resources.
Perhaps most significantly, these differences reveal how AI systems can embody different political perspectives even when tasked with apparently objective analysis. The choice to explicitly name structural racism, to focus on individual responsibility, or to emphasize system complexity reflects underlying assumptions about how society works and what kinds of changes are both possible and necessary.
The contrast is particularly stark in how these models treat the relationship between individual agency and structural constraint. ChatGPT consistently emphasizes how individual outcomes are shaped by institutional arrangements, while Claude and Grok place greater emphasis on individual choice and personal responsibility. This difference has profound implications for understanding both the causes of inequality and the types of interventions that might address it.
These AI responses ultimately reflect broader debates in American society about the nature and causes of inequality. They demonstrate that even apparently neutral analytical tasks inevitably embody particular perspectives on how society works and what aspects of social reality deserve attention. As AI systems become more influential in shaping public discourse, understanding these different approaches becomes increasingly important for recognizing the political assumptions embedded in seemingly objective analysis.
The original responses from ChatGPT, Claude, and Grok, which list normalized yet ethically questionable practices in American society by 2025, reveal varying degrees of attention to race, racism, classism, and stark income inequality. While all three models tackle these issues, their approaches differ in explicitness, structure, and focus. This essay explores the extent to which these topics were named in each model’s response and how their treatments varied, shedding light on their unique perspectives on systemic inequality.
ChatGPT: ChatGPT explicitly prioritizes race and racism, featuring a dedicated category titled "Race & Structural Inequality." This section includes 10 items, such as residential segregation, the racial wealth gap, voter suppression in communities of color, racial profiling, and environmental racism. By creating a standalone category, ChatGPT underscores race as a central ethical issue, weaving it into discussions of housing, education, healthcare, and justice. This directness highlights a clear intent to address systemic racism as a pervasive societal flaw.
Claude: Claude takes a less explicit approach, lacking a specific category for race or racism. Instead, it indirectly references racial issues through items like "voter suppression tactics" and "gerrymandering" under "Political & Civic," practices known to disproportionately impact communities of color. While these entries suggest an awareness of racial inequality, Claude does not label them as race-specific, indicating a preference for focusing on mechanisms over their racial implications.
Grok: Grok adopts a balanced stance, explicitly naming racial concerns within broader categories. Under "Political and Legal Systems," it lists "racial justice disparities" and "sentencing disparities," and under "Healthcare," it includes "unequal healthcare access" and "healthcare disparities" tied to race. Though not as comprehensive as ChatGPT’s dedicated category, Grok’s explicit mentions signal a recognition of racism’s role in justice and health systems, albeit without the same breadth or prominence.
Comparison: ChatGPT’s explicit category for race contrasts sharply with Claude’s indirect approach and Grok’s selective integration of racial issues into other domains. ChatGPT treats racism as a foundational concern, while Claude embeds it subtly, and Grok addresses it in targeted contexts.
ChatGPT: ChatGPT addresses classism across multiple categories, including "criminalization of poverty" and "surveillance of marginalized communities" under "Law, Justice & Incarceration," and issues like wage stagnation and lack of public housing under "Economy & Labor" and "Housing & Urbanism." This diffuse approach illustrates how classism permeates various systems, though it lacks a unified category to tie these threads together.
Claude: Claude offers a more focused critique of classism, emphasizing its role in justice and education. Items like "cash bail creating a two-tiered justice system" and "legal system favoring wealthy defendants" under "Criminal Justice & Legal," alongside "school funding based on local property taxes" under "Education & Youth," pinpoint how wealth dictates access and fairness. This institutional lens sharpens Claude’s analysis of class-based inequities.
Grok: Grok explicitly tackles classism within its "Economic Inequality" category, listing "wealth gap," "social mobility decline," and "low-wage exploitation." By framing these as economic issues, Grok directly critiques the systems that sustain class divides, offering a broader structural perspective compared to Claude’s institutional focus.
Comparison: ChatGPT spreads classism across domains, Claude zooms in on justice and education, and Grok ties it to economic structures. Each approach reveals classism’s reach, but Grok’s explicit category and Claude’s specificity stand out against ChatGPT’s broader integration.
ChatGPT: ChatGPT addresses income inequality through practices like "wage stagnation," "exorbitant CEO pay," and "corporate tax avoidance" under "Economy & Labor," and housing disparities under "Housing & Urbanism." These entries critique the mechanisms driving wealth gaps but do not consolidate them into a single narrative, presenting income inequality as one of many systemic issues.
Claude: Claude similarly highlights income inequality with items like "CEO compensation 300+ times average worker pay," "stock buybacks prioritized over worker wages," and "unpaid internships in profitable companies" under "Economic & Labor." Like ChatGPT, it embeds these concerns within larger categories, focusing on specific practices rather than naming income inequality as a standalone ethical failure.
Grok: Grok takes a bolder stance, dedicating a category to "Economic Inequality" with items like "wealth gap," "wealth concentration," and "low-wage exploitation." This explicit framing positions income inequality as a systemic outcome of capitalism, offering a structural critique that contrasts with the more practice-specific approaches of ChatGPT and Claude.
Comparison: Grok’s direct categorization of income inequality as a core issue sets it apart from ChatGPT and Claude, which address it through scattered examples. Grok critiques the economic system itself, while the others focus on its symptoms.
The models’ differing emphases reveal their analytical priorities:
ChatGPT foregrounds race and racism with a dedicated category, integrating classism and income inequality across a wide-ranging critique of systemic injustice.
Claude zeroes in on classism in justice and education, addressing race indirectly and income inequality through labor practices, suggesting a focus on institutional mechanisms.
Grok centers economic inequality as a structural issue, weaving in racial disparities in targeted areas like justice and healthcare, reflecting a broader systemic lens.
Despite these differences, all three models acknowledge the normalization of race, class, and income inequality in American society, highlighting their ethical implications through varied lenses.
ChatGPT, Claude, and Grok each address race, racism, classism, and stark income inequality, but their methods diverge significantly. ChatGPT leads with an explicit focus on race, Claude emphasizes class-based institutional flaws, and Grok prioritizes economic inequality as a systemic concern. Together, they offer a multi-dimensional view of these entrenched issues, underscoring the need for critical reflection on their normalization in American society by 2025.