Ashish Mishra

ashbuilds

AI & ML interests

None yet

Recent Activity

liked a Space 12 days ago

HiDream-ai/HiDream-O1-Image

liked a model about 1 month ago

LiquidAI/LFM2.5-VL-450M

liked a model about 1 month ago

google/gemma-4-31B-it

Organizations

None yet

upvoted a paper about 2 months ago

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published Apr 6 • 114

upvoted a paper 3 months ago

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published Feb 12 • 62

upvoted a paper 5 months ago

WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

Paper • 2512.19678 • Published Dec 22, 2025 • 32

upvoted an article 5 months ago

Codex is Open Sourcing AI models

Article • burtenshaw, evalstate • Dec 11, 2025 • 82

upvoted a paper 6 months ago

What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards

Paper • 2512.00425 • Published Nov 29, 2025 • 53

upvoted an article 8 months ago

Smol2Operator: Post-Training GUI Agents for Computer Use

Article • A-Mahla, merve, sergiopaniego, reach-vb, lewtun • Sep 23, 2025 • 138

upvoted a collection 8 months ago

Granite Docling

Models for parsing complex PDFs and structured documents, designed to complement Docling. • 4 items • Updated 24 days ago • 64

upvoted an article 8 months ago

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

Article • ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez • Sep 11, 2025 • 188

upvoted an article 11 months ago

Creating custom kernels for the AMD MI300

Article • ror, seungrokj • Jul 9, 2025 • 54

upvoted a collection 12 months ago

Holo1

Vision-Language Action Model for use in Surfer-H web navigation agent • 6 items • Updated Jun 10, 2025 • 49

upvoted an article 12 months ago

Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H

Article • Hcompany • Jun 3, 2025 • 71

upvoted a collection about 1 year ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 10 items • Updated Mar 2 • 561

upvoted 3 papers about 1 year ago

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Paper • 2503.04724 • Published Mar 6, 2025 • 72

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

Paper • 2502.15027 • Published Feb 20, 2025 • 7

upvoted 2 papers over 1 year ago

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9, 2025 • 55

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Paper • 2501.06282 • Published Jan 10, 2025 • 53

upvoted a collection over 1 year ago

DeepSeek-V3

4 items • Updated Nov 27, 2025 • 284

upvoted a paper over 1 year ago

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 48