Min VRAM?

by scraper01 - opened Jul 30, 2024

Tried to load the model onto a 4060 mobile with 8 GB VRAM.

Not up to it - inference time was way over 25 min. Flash attention is disabled because of Windows.

If I want this to run on Windows, how much VRAM do I need to get reasonable inference times, circa 15-20 s?

Regards,

Andy.
