UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper • 2605.00658 • Published May 1 • 84
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR Paper • 2509.23808 • Published Sep 28, 2025 • 47
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Paper • 2506.18882 • Published Jun 23, 2025 • 89
DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation Paper • 2505.18078 • Published May 23, 2025 • 6
Soulstyler: Using Large Language Model to Guide Image Style Transfer for Target Object Paper • 2311.13562 • Published Nov 22, 2023 • 1
ZhuJiu: A Multi-dimensional, Multi-faceted Chinese Benchmark for Large Language Models Paper • 2308.14353 • Published Aug 28, 2023
DRiVE: Diffusion-based Rigging Empowers Generation of Versatile and Expressive Characters Paper • 2411.17423 • Published Nov 26, 2024
Idea23D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs Paper • 2404.04363 • Published Apr 5, 2024
FLUX.1 [Schnell] Space • 5.07k • Generate images from text prompts with FLUX.1-schnell
TRELLIS Space • 4.78k • Scalable and Versatile 3D Generation from images
Make It Animatable Space • 98 • Authoring Animation-Ready 3D Characters with One Click
raulc0399/flux_dev_openpose_controlnet Text-to-Image • 0.7B • Updated Sep 26, 2024 • 858 • 52