Video-Text-to-Text
Transformers
Safetensors
qwen2_5_omni
text-to-audio
multimodal
video-captioning
audio-visual
ugc
Instructions to use openinterx/UGC-VideoCaptioner with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openinterx/UGC-VideoCaptioner with Transformers:
# Load model directly
from transformers import AutoProcessor, AutoModelForTextToWaveform

processor = AutoProcessor.from_pretrained("openinterx/UGC-VideoCaptioner")
model = AutoModelForTextToWaveform.from_pretrained("openinterx/UGC-VideoCaptioner")
- Notebooks
- Google Colab
- Kaggle
- Xet hash: f59fce72f616172554e519584834dd7b49bdc28f94a018b533f17d49de8d78d1
- Size of remote file: 11.4 MB
- SHA256: 8441917e39ae0244e06d704b95b3124795cec478e297f9afac39ba670d7e9d99
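The SHA256 listed above can be used to check that a local copy of the file downloaded intact. The following is a minimal sketch using only the Python standard library. The page does not name the file, so the path is left to the caller, and the helper names `sha256_of_file` and `matches_expected` are hypothetical, introduced here for illustration.

```python
import hashlib

# SHA256 copied from the file listing above.
EXPECTED_SHA256 = "8441917e39ae0244e06d704b95b3124795cec478e297f9afac39ba670d7e9d99"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected` with the path of the downloaded file returns `True` only when its digest equals the published SHA256. A mismatch usually means the download was truncated or a different file was fetched.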