AI & ML interests
We are dedicated to advancing the field of natural language processing, in collaboration with the open-source community, through bleeding-edge research and a commitment to symbiotic development.
Papers
Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation
Targeted Neuron Modulation via Contrastive Pair Search
The Hermes 3 Series of Models
- Hermes 3 Technical Report
  Paper • 2408.11857 • Published • 60
- NousResearch/Hermes-3-Llama-3.1-405B
  Text Generation • 406B • Updated • 723 • 268
- NousResearch/Hermes-3-Llama-3.1-70B
  Text Generation • 71B • Updated • 2.22k • 126
- NousResearch/Hermes-3-Llama-3.1-8B
  Text Generation • 8B • Updated • 252k • 449
Nous' Flagship LLM Series
- NousResearch/Hermes-2-Theta-Llama-3-70B
  Text Generation • 71B • Updated • 1.37k • 82
- NousResearch/Hermes-2-Pro-Llama-3-70B
  Text Generation • 71B • Updated • 97 • 35
- NousResearch/Hermes-2-Theta-Llama-3-8B
  Text Generation • 8B • Updated • 11.4k • 206
- NousResearch/Hermes-2-Pro-Llama-3-8B
  Text Generation • 8B • Updated • 19.2k • 450
allenai's Dolma dataset as native Hugging Face datasets
Evals from the Hermes-4 Technical Report
- NousResearch/eval-Hermes-4-405B-reasoning
  Viewer • Updated • 103k • 2.23k • 6
- NousResearch/eval-Hermes-4-405B-nonreasoning
  Viewer • Updated • 102k • 587 • 3
- NousResearch/eval-Hermes-4-70B-reasoning
  Viewer • Updated • 103k • 1.32k • 4
- NousResearch/eval-Hermes-4-14B-nonreasoning-old
  Viewer • Updated • 103k • 789 • 3
Preview models of hybrid reasoner Hermes series
- NousResearch/DeepHermes-3-Mistral-24B-Preview
  Text Generation • 24B • Updated • 290 • 122
- NousResearch/DeepHermes-3-Llama-3-8B-Preview
  Text Generation • 8B • Updated • 1.58k • 362
- NousResearch/DeepHermes-3-Llama-3-3B-Preview
  Text Generation • 3B • Updated • 725 • 39
- NousResearch/DeepHermes-3-Mistral-24B-Preview-GGUF
  24B • Updated • 301 • 34
A collection of experimental artifacts created with Atropos, Nous' RL Environments framework: https://github.com/NousResearch/Atropos
- NousResearch/SWE-smith-oracle
  Viewer • Updated • 10.2k • 69 • 5
- NousResearch/DeepHermes-ToolCalling-Specialist-Atropos
  Reinforcement Learning • 8B • Updated • 51 • 18
- NousResearch/DeepHermes-Financial-Fundamentals-Prediction-Specialist-Atropos
  Text Generation • 8B • Updated • 73 • 16
- NousResearch/DeepHermes-Egregore-v2-RLAIF-8b-Atropos
  Reinforcement Learning • 8B • Updated • 37 • 7
Extension of Llama 2 to 128k context windows
- YaRN: Efficient Context Window Extension of Large Language Models
  Paper • 2309.00071 • Published • 85
- NousResearch/Yarn-Mistral-7b-128k
  Text Generation • Updated • 5.09k • 571
- NousResearch/Yarn-Solar-10b-64k
  Text Generation • Updated • 866 • 16
- NousResearch/Yarn-Llama-2-70b-32k
  Text Generation • Updated • 1.11k • 38