Artificial Intelligence · Bearish · Impact 8/10
Fair outputs, Biased Internals: Causal Potency and Asymmetry of Latent Bias in LLMs for High-Stakes Decisions
cs.AI updates on arXiv.org
✦ AI Analysis
A recent study finds that instruction-tuned language models can produce fair outputs in high-stakes decisions such as mortgage underwriting while still retaining biased internal representations that are causally potent, meaning they can significantly shift outcomes. This latent bias is asymmetric and can be exploited, for example through adversarial prompt engineering, which points to the need for audits that assess both outputs and internal representations as part of AI governance.
Key Topics
language models · mortgage underwriting · AI governance · adversarial prompt engineering
Originally reported by cs.AI updates on arXiv.org.