Audio
Qwen/Qwen2-Audio-7B-Instruct Audio-Text-to-Text • 8B • Updated Jan 12, 2025 • 594k • 544
Video/CV
LanguageBind/MoE-LLaVA-Phi2-2.7B-4e Text Generation • 6B • Updated Feb 1, 2024 • 136 • 40
LanguageBind/LanguageBind_Video_FT Zero-Shot Image Classification • Updated Feb 1, 2024 • 5.29k • 7
stabilityai/stable-video-diffusion-img2vid-xt Image-to-Video • Updated Jul 10, 2024 • 215k • 3.33k
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Paper • 2406.04325 • Published Jun 6, 2024 • 75