Rift: A Conflict Signature for Deception in Language Models

A result relevant to eliciting latent knowledge (ELK). A model that lies while knowing the truth is in a measurably different internal state than a model that is simply wrong. The difference is large, per-example, and label-free.

arXiv: 2606.17229
GitHub: omibranch/rift
Demo: Omibranch/rift-demo

Key results

Result	Value
Label-free lie identification	100% (GPT-2 small, 3 seeds + natural Qwen)
Length-controlled AUC (Qwen2.5-1.5B)	1.000, orientation 20/20, p ≈ 9.5e-7
Phi-3-mini AUC lie vs honest	1.000 (34/34, p ≈ 6e-11)
Phi-3-mini AUC lie vs hallucination	1.000
Cross-lingual AUC (ru/zh/es/de, LC)	1.000 in all 4 languages
Cross-family AUC (3 architectures)	0.933 mean, all 6 pairs p < 0.001
Strategic self-constructed deception	AUC 1.0, 24/24 Qwen-7B
Concealment countermeasure	Fails — conceal-AUC 1.0
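
The p-values quoted next to the orientation counts are numerically consistent with a one-sided sign test (20/20 gives 0.5^20 ≈ 9.5e-7, and 34/34 gives 0.5^34 ≈ 6e-11), although the table does not name the test. Under that assumption, the sketch below shows how a pairwise AUC and such a p-value could be computed from per-example scores. The function names and the toy scores are hypothetical and are not taken from the Rift repository.

```python
from math import comb

def pairwise_auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def sign_test_p(successes, trials):
    """One-sided binomial tail P(X >= successes) under a fair coin (assumed test)."""
    tail = sum(comb(trials, j) for j in range(successes, trials + 1))
    return tail / 2 ** trials

# Toy scores, made up for illustration: every "lie" score exceeds every "honest" score.
lie_scores = [0.42, 0.39, 0.45]
honest_scores = [0.21, 0.25, 0.19]

auc = pairwise_auc(lie_scores, honest_scores)
p_20 = sign_test_p(20, 20)   # all 20 of 20 comparisons in the expected direction
p_34 = sign_test_p(34, 34)   # all 34 of 34 comparisons in the expected direction
print(auc, p_20, p_34)
```

An AUC of 1.0 here only means the two toy score sets do not overlap. With small sample counts, the sign-test p-value is bounded below by 0.5 raised to the number of trials.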

Method

Residual rank is the fraction of singular-value mass that lies outside the top 8 singular values of the hidden-state matrix. It is elevated when a model maintains conflicting representations (the truth and a false output) simultaneously.

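import torch

def residual_rank(H, k=8):
    # H: 2-D matrix of hidden states. Returns the share of singular-value mass beyond the top k.
    _, s, _ = torch.linalg.svd(H.float(), full_matrices=False)
    return 1.0 - s[:k].sum() / s.sum()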
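
To make the behavior of this quantity concrete, here is a minimal sketch that applies the same formula to two synthetic matrices. The matrices are illustrative stand-ins, not real hidden states, and the variable names are hypothetical. The sketch shows only how the measure responds to how widely a matrix's singular-value mass is spread. It does not reproduce any result from the paper.

```python
import torch

def residual_rank(H, k=8):
    # Share of singular-value mass outside the k largest singular values.
    _, s, _ = torch.linalg.svd(H.float(), full_matrices=False)
    return 1.0 - s[:k].sum() / s.sum()

# Rank-1 matrix: essentially all mass sits in one singular value, so the score is near 0.
rank_one = torch.ones(10, 16)

# 16 orthogonal directions of equal strength: half the mass lies beyond the top 8.
spread = torch.eye(16)

low = residual_rank(rank_one).item()
high = residual_rank(spread).item()
print(low, high)
```

The first value should be close to 0 and the second close to 0.5, since the identity matrix has 16 equal singular values and k=8 covers half of them.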

License

Dual-licensed: PolyForm Noncommercial 1.0 for academic/research use. Commercial use requires a separate license — see LICENSE-COMMERCIAL.md.
