A single forward pass of the frozen Qwen-2.5-7B model plus a lightweight classifier reaches 0.866 ± 0.011 AUC on the full TruthfulQA-MC2 benchmark. No adapters. No fine-tuning. No extra parameters on the backbone.
To our knowledge, this is the strongest hidden-state truthfulness detector reported on the benchmark to date.
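For readers who want a feel for the recipe, here is a minimal sketch of a probe on frozen features, scored as mean AUC plus or minus a standard deviation. It is not the repository's code. The features are synthetic stand-ins for hidden states, and the helper names (`make_synthetic_features`, `train_probe`, `cross_validated_auc`), the logistic-regression probe, and the 5-fold split are all assumptions made for illustration. In a real run, each row would be a hidden-state vector taken from the frozen model's forward pass.

```python
import math
import random
import statistics


def make_synthetic_features(n, dim, shift, rng):
    """Hypothetical stand-in for frozen hidden states: label-1 rows are shifted by `shift`."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        rows.append([rng.gauss(shift * y, 1.0) for _ in range(dim)])
        labels.append(y)
    return rows, labels


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_probe(rows, labels, epochs=200, lr=0.1):
    """Logistic-regression probe trained by batch gradient descent. The backbone is never touched."""
    dim, n = len(rows[0]), len(rows)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(score(w, b, x)) - y
            for j in range(dim):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(rows, labels, folds=5, seed=0):
    """Mean and standard deviation of held-out AUC across folds (an assumed protocol)."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    aucs = []
    for k in range(folds):
        test_idx = idx[k::folds]
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        w, b = train_probe([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        fold_scores = [score(w, b, rows[i]) for i in test_idx]
        aucs.append(auc(fold_scores, [labels[i] for i in test_idx]))
    return statistics.mean(aucs), statistics.stdev(aucs)


rng = random.Random(0)
rows, labels = make_synthetic_features(n=200, dim=8, shift=0.6, rng=rng)
mean_auc, std_auc = cross_validated_auc(rows, labels)
print(f"AUC: {mean_auc:.3f} +/- {std_auc:.3f}")
```

The printed number only reflects the synthetic data and says nothing about TruthfulQA-MC2. The point is the shape of the pipeline: fixed features in, a small trained classifier on top, and a spread reported over held-out splits.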
The same latent features that the SRT-NLA-AV-v1 demo reads out as coherent natural-language verbalizations turn out to be rich enough to support production-grade auditing for honesty versus hallucination. The internal semiotic infrastructure we have been exploring in public is already information-dense enough to solve hard downstream problems with almost trivial overhead.
You can watch the underlying latent geometry in action right here:
RiverRider/srt-nla-av-v1-demo
Full code, artifacts, and reproduction steps are in the repository:
https://github.com/space-bacon/SRT
Try the Glass Box
RiverRider/srt-nla-demo