Nathan Habib PRO
AI & ML interests
Evals
Recent Activity
new activity about 3 hours ago
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4: Add evaluation results (GPQA, MMLU-Pro, SWE-bench Verified, HLE)
new activity about 3 hours ago
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16: Add evaluation results (GPQA, MMLU-Pro, SWE-bench Verified, HLE)
liked a model about 3 hours ago
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4
Organizations
benchmarks
RULER Datasets Falcon-H1-3B-Base
RULER Datasets Lamma3-Instruct
RULER Datasets Qwen2.5-Instruct
RULER Datasets Qwen-3-Instruct
RULER Datasets Qwen-3
agents
Agents resources
All the resources I found or used when getting up to speed with agents.