Hugging Face
Tommaso Cerruti
Cerru02
1 follower · 12 following
https://tommasocerruti.github.io/
tommasocerruti
AI & ML interests
AI safety and evaluation
Recent Activity
Upvoted an article 3 days ago: Safety Evals Should Project Test-Time Compute
Published an article 3 days ago: Safety Evals Should Project Test-Time Compute
New activity 13 days ago in evaleval/EEE_datastore: Update HELM to schema version v0.2.2
Organizations
Cerru02's activity
Upvoted an article 3 days ago: Safety Evals Should Project Test-Time Compute (Cerru02 • 3 days ago • 3)
Upvoted an article 15 days ago: AI evals are becoming the new compute bottleneck (evaleval • 15 days ago • 26)
Upvoted a paper 15 days ago: CocoaBench: Evaluating Unified Digital Agents in the Wild (Paper • 2604.11201 • Published Apr 13 • 36)