Activity Feed

AI & ML interests

None defined yet.

Recent Activity

Locutusqueย 
posted an update 10 days ago
view post
Post
235
๐Ÿš€ Introducing Esmeralda-Llama-3.1-8B-control
The first release in the Esmeralda model family by Locutusque.

This model is intentionally small and experimental โ€” a control/baseline proof-of-concept designed to answer one question:

ยซโ€œHow strong is my new "Locutusque/esmeralda-agentic" dataset before scaling to larger runs?โ€ยป

Training Details

- Base: Llama 3.1 8B
- Training precision: bf16 mixed precision
- Chat template: modified ChatML
- Dataset size: ~37k examples
- Examples actually used for this run: ~5k

The dataset includes:

- multi-turn agentic traces
- reasoning traces
- structured assistant behavior
- generalist instruction data

Benchmark Results

Compared against:

- Llama 3.1 8B Instruct
- Hermes-3-Llama-3.1-8B

HumanEval

57.3 โ€” Esmeralda
56.1 โ€” Llama 3.1 Instruct
52.4 โ€” Hermes-3

MBPP

53.2 โ€” Esmeralda
56.8 โ€” Llama 3.1 Instruct
48.2 โ€” Hermes-3

GPQA Diamond

15.7 โ€” Esmeralda
15.7 โ€” Llama 3.1 Instruct
18.2 โ€” Hermes-3

EQ-Bench

59.2 โ€” Esmeralda
61.1 โ€” Llama 3.1 Instruct
63.1 โ€” Hermes-3

EQ-Bench Parseable (Syntax Stability)

๐Ÿ”ฅ 100.0% โ€” Esmeralda
92.4% โ€” Llama 3.1 Instruct
91.2% โ€” Hermes-3

Here Be Dragons ๐Ÿ‰

I also experimented with a new TruthfulQA free-generation evaluation setup.

- Responses were judged by Gemma 4 26B A4B
- The judge compared generations directly against ground-truth answers
- Models were evaluated in 8-bit quantized form to speed up inference

TruthfulQA (LLM Judge)

0.682 โ€” Esmeralda-Llama-3.1-8B-control
0.587 โ€” Hermes-3-Llama-3.1-8B (reported MC2 score; methodology differs)

For a lightweight control run trained on only a fraction of the dataset, Iโ€™m pretty encouraged by the results.

The model is released under the standard Llama 3.1 license, and Iโ€™d genuinely love feedback from people testing it in real workflows.

Model: Locutusque/Esmeralda-Llama-3.1-8B-control

Dataset: Locutusque/esmeralda-agentic

Tonicย 
posted an update 22 days ago
view post
Post
2822
๐Ÿ™‹๐Ÿปโ€โ™‚๏ธ Hey there folks ,

Turns out : if we predict ๐ŸŒ earth we can save a lot of time looking for interesting things and less time looking at things that we expect to see.

Sentinel-2 imagery ๐Ÿ›ฐ๏ธbasically takes a long time to download towards earth. so our "near real time" systems are quite far from that in practical terms.

meanwhile , if we "predict" what we will see , based on what we do see , we can send down much less data in a timely way , and prioritize ๐Ÿ“กearth-bound response .

I'm talking about illegal fishing , logging , mining or building in nature reserves , the more of that we predict early the more we're able to stop it on time.

At least that's the concept !

check out the blog : https://huggingface.co/blog/Tonic/save-patagonia-by-predicting-earth


- Collection: https://huggingface.co/collections/NuTonic/earth-observation-with-temporal-and-general-understanding
- Code: https://github.com/Josephrp/Nutonic
- Dataset: NuTonic/sat-vl-sft-training-ready-v1
- Model: NuTonic/lspace
- Training: NuTonic/lspace-trackio
- Evals: NuTonic/Patagonia_Eval
  • 2 replies
ยท
blanchonย 
posted an update 23 days ago
view post
Post
2619
I'm releasing OpenCS2 a 11TB dataset of around 5000 hours of counter strike gameplay recording.
- HD resolution - 1280ร—720 ยท 32 fps
- For each frame keyboard and mouse + world state (player position, velocity, weapon ...)
- HD Stereo audio
- All 10 players perspective

https://huggingface.co/collections/blanchon/opencs2
  • 1 reply
ยท
Aurelien-Morganย 
posted an update about 1 month ago
view post
Post
1086
@retrain-pipelines v0.2.0 is out !
I'm at Station F at My booth with GOSIM Paris 2026 today & tomorrow.
Come meet me for a live in-person demo and a chat !
  • 1 reply
ยท
Tonicย 
posted an update about 1 month ago
view post
Post
4296
๐Ÿ™‹๐Ÿปโ€โ™‚๏ธ Hey there folks,

since everyone liked my previous announcement post ( https://huggingface.co/posts/Tonic/338509028435394 ) so much , i'm back with more high quality proceedural datasets in the Geospacial domain for SFT training !

Check this one out :
NuTonic/sat-bbox-metadata-sft-v1

the goal is to be able to train vision models on multiple images for remote sensing analysis with one shot .

hope you like it ! ๐Ÿš€
  • 2 replies
ยท
Tonicย 
posted an update about 1 month ago
view post
Post
3643
๐Ÿ™‹๐Ÿปโ€โ™‚๏ธ Hey there folks ,

I'm sharing huggingface's largest dataset of annotated statelite images today.

check it out here : NuTonic/sat-image-boundingbox-sft-full

I hope you like it , the idea is to be able to use this with small vision models ๐Ÿš€
Aurelien-Morganย 
posted an update about 2 months ago
view post
Post
228
Launching a workweek of @retrain-pipelines wheels.

Day #1 : Compose
  • 4 replies
ยท