MARS: Enabling Autoregressive Models Multi-Token Generation Paper • 2604.07023 • Published Apr 8 • 38
FastVLM Collection Efficient Vision Encoding for Vision Language Models • 8 items • Updated Mar 2 • 113
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models Paper • 2604.00688 • Published Apr 1 • 14