Muratcan Laloğlu

muratcanlaloglu

AI & ML interests

None yet

Organizations

None yet

Recent Activity

liked a model about 1 month ago

openbmb/VoxCPM2

Text-to-Speech • Updated Apr 16 • 188k • 1.32k

liked a dataset about 2 months ago

merve/vlm_test_images

Viewer • Updated Apr 2 • 31 • 99.3k • 9

reacted to Hellohal2064's post with 🔥 4 months ago

🚀 Excited to share: The vLLM container for NVIDIA DGX Spark!

I've been working on getting vLLM to run natively on the new DGX Spark with its GB10 Blackwell GPU (SM121 architecture). The results? 2.5x faster inference compared to llama.cpp!

📊 Performance Highlights:
• Qwen3-Coder-30B: 44 tok/s (vs 21 tok/s with llama.cpp)
• Qwen3-Next-80B: 45 tok/s (vs 18 tok/s with llama.cpp)
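
As a quick check on the figures quoted above, the per-model speedup can be computed directly from the two throughput numbers. This is a minimal sketch: the dictionary layout and the `speedup` helper are illustrative names, not part of the original post, and the only data used are the tok/s values listed in the bullets.

```python
# Throughput figures (tokens per second) copied from the post's bullet list.
# The dictionary layout and key names are assumptions made for illustration.
benchmarks = {
    "Qwen3-Coder-30B": {"vllm": 44, "llama_cpp": 21},
    "Qwen3-Next-80B": {"vllm": 45, "llama_cpp": 18},
}


def speedup(vllm_tps: float, llama_cpp_tps: float) -> float:
    """Return how many times faster vLLM is than llama.cpp."""
    return vllm_tps / llama_cpp_tps


# Ratio per model, rounded to two decimal places.
speedups = {
    model: round(speedup(tps["vllm"], tps["llama_cpp"]), 2)
    for model, tps in benchmarks.items()
}

for model, ratio in speedups.items():
    print(f"{model}: {ratio:.2f}x")
```

On these numbers the 2.5x figure corresponds to Qwen3-Next-80B, while Qwen3-Coder-30B comes out closer to 2.1x.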

🔧 Technical Challenges Solved:
• Built PyTorch nightly with CUDA 13.1 + SM121 support
• Patched vLLM for Blackwell architecture
• Created custom MoE expert configs for GB10
• Implemented TRITON_ATTN backend workaround

📦 Available now:
• Docker Hub: docker pull hellohal2064/vllm-dgx-spark-gb10:latest
• HuggingFace: huggingface.co/Hellohal2064/vllm-dgx-spark-gb10

The DGX Spark's 119GB unified memory opens up possibilities for running massive models locally. Happy to connect with others working on the DGX Spark Blackwell!

4 replies

liked 2 models 5 months ago

moondream/md3p-int4

Updated Dec 19, 2025 • 359 • 7

cyankiwi/Qwen3-VL-32B-Instruct-AWQ-4bit

Image-Text-to-Text • 7B • Updated Oct 21, 2025 • 4.92k • 5

liked a Space 6 months ago

DeepSeek OCR 2 Demo

Try out DeepSeek-OCR-2 on your PDFs or images

liked 6 models 6 months ago

thelamapi/next-4b

Image-Text-to-Text • 4B • Updated Mar 1 • 177 • 8

thelamapi/next-1b

Text Generation • 1.0B • Updated Mar 1 • 601 • 27

vafipas663/Qwen-Edit-2509-Upscale-LoRA

Image-to-Image • Updated Nov 17, 2025 • 12.9k • 226

cyankiwi/Magistral-Small-2509-AWQ-4bit

5B • Updated Oct 14, 2025 • 271 • 3

jeffcookio/Mistral-Small-3.2-24B-Instruct-2506-awq-sym

5B • Updated Jul 4, 2025 • 10.1k • 12

unsloth/Mistral-Small-3.2-24B-Instruct-2506-bnb-4bit

Image-Text-to-Text • 25B • Updated Jun 23, 2025 • 304k • 10

liked 2 models 7 months ago

mistralai/Magistral-Small-2509-GGUF

24B • Updated Sep 18, 2025 • 1.4k • 73

PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated 18 days ago • 11.2k • 1.6k

liked a model 8 months ago

moondream/moondream3-preview

Image-Text-to-Text • 9B • Updated Apr 9 • 141k • 635

liked a model 9 months ago

unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF

Image-Text-to-Text • 24B • Updated Aug 26, 2025 • 68.5k • 173

liked 3 models 10 months ago

BAAI/bge-m3

Sentence Similarity • Updated Jul 3, 2024 • 26.6M • 3.01k

sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

Sentence Similarity • 0.1B • Updated Jan 28 • 48.1M • 1.23k

merve/smol-vision

Image-Text-to-Text • Updated Nov 5, 2025 • 193

liked a model about 1 year ago

intfloat/multilingual-e5-large-instruct

Feature Extraction • 0.6B • Updated Jul 10, 2025 • 1.33M • 622