Writer

Enterprise

company

Verified

https://writer.com/

Get_Writer

writer

Activity Feed

AI & ML interests

AGI, LLMs, Knowledge Graph, Palmyra, Domain Specific LLM

Recent Activity

aparnabalagopalan0825 published a dataset 1 day ago

Writer/colm-data

aparnabalagopalan0825 updated a dataset 1 day ago

Writer/colm-data

wassemgtk authored a paper 4 months ago

Accurate Failure Prediction in Agents Does Not Imply Effective Failure Prevention

View all activity

Papers

Accurate Failure Prediction in Agents Does Not Imply Effective Failure Prevention

View all Papers

Articles

Introducing the Palmyra-mini family: Powerful, lightweight, and ready to reason!

Sep 11, 2025

• 58

aparnabalagopalan0825

published a dataset 1 day ago

Writer/colm-data

Viewer • Updated 1 day ago • 200 • 9

aparnabalagopalan0825

updated a dataset 1 day ago

Writer/colm-data

Viewer • Updated 1 day ago • 200 • 9

melisa

authored a paper 4 months ago

Accurate Failure Prediction in Agents Does Not Imply Effective Failure Prevention

Paper • 2602.03338 • Published Feb 3 • 26

melisa

submitted a paper to Daily Papers 4 months ago

Accurate Failure Prediction in Agents Does Not Imply Effective Failure Prevention

Paper • 2602.03338 • Published Feb 3 • 26

sanderland

authored a paper 7 months ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Paper • 2510.24081 • Published Oct 28, 2025 • 24

tperes

posted an update 9 months ago

Post

251

Introducing Palmyra-mini: Compact AI Models for Efficient Inference

The Palmyra-mini family from Writer includes three lightweight models designed for high performance and efficient inference. These models are ideal for developers looking to integrate AI capabilities without excessive computational overhead.

Model Variants

* palmyra-mini: A base model for general-purpose generative tasks, achieving 52.6% on Big Bench Hard (exact match).

* palmyra-mini-thinking-a: Optimized for complex logical reasoning with a Chain of Thought (CoT) approach, scoring 82.87% on GSM8K (strict match).

* palmyra-mini-thinking-b: Specialized for mathematical reasoning, achieving 92.5% on AMC23.

Technical Details

* All models are based on the Qwen architecture, compatible with popular inference frameworks like vLLM, SGLang, and TGI.

* "Thinking" models utilize CoT training for enhanced reasoning capabilities.

* GGUF and MLX quantizations are available for optimized performance.

For more information, including benchmark methodologies and detailed performance metrics, refer to our blog post: (https://huggingface.co/blog/Writer/announcing-palmyra-mini).

Model repos can be found here:
* Writer/palmyra-mini
* Writer/palmyra-mini-thinking-a
* Writer/palmyra-mini-thinking-b

Also check out a mobile implementation of palmyra-mini on iOS here to see a to see a working example of how inference can be incorporated on-device.(https://github.com/tsperes/palmyra-mini-mobile/)

dmytro-writer

authored a paper 12 months ago

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 282

shelly-writer

authored a paper 12 months ago

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 282

sanderland

authored a paper 12 months ago

RewardBench 2: Advancing Reward Model Evaluation

Paper • 2506.01937 • Published Jun 2, 2025 • 7

melisa

authored a paper about 1 year ago

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 282

sanderland

authored 2 papers about 1 year ago

Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models

Paper • 2405.05417 • Published May 8, 2024 • 1

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published Apr 1, 2025 • 30

melisa

authored a paper about 1 year ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10, 2025 • 133

dmytro-writer

authored 2 papers over 1 year ago

Comparative Analysis of Retrieval Systems in the Real World

Paper • 2405.02048 • Published May 3, 2024 • 1

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10, 2025 • 133

samjulien

posted an update over 1 year ago

Post

1600

🔥 RAG in just a few lines of code?!

Try out our Hacker News Listener with new built-in RAG capabilities and Palmyra X 004 from the team at Writer!

This Writer Framework app:

- Scrapes up to 500 HN stories and comments
- Uploads them to a Knowledge Graph
- Enables interactive chat with the content using graph-based RAG
- Provides source attribution with every response

The best part? Setting up RAG is now incredibly simple - just a few lines of code to connect your Knowledge Graph as a tool with Palmyra X 004.

🤗 Space: samjulien/hacker-news-listener
💻 Code: https://github.com/writer/framework-tutorials/tree/main/hacker-news-social-listener

melisa

posted an update almost 2 years ago

Post

3329

🔥 Introducing "Writing in the Margins (WiM)" - better inference pattern for long context LLMs that solves the Lost-in-the-Middle problem 🔥

Paper page: Writing in the Margins: Better Inference Pattern for Long Context Retrieval (2408.14906)

TL;DR
Make your model write "margin notes" as you chunk prefill the KV cache. Then ask it reread all notes before it speaks up.
Works with humans, works with AI 🤖

WiM leverages the chunked prefill of the key-value cache, which concurrently generates query-based extractive summaries at each step of the prefill that are subsequently reintegrated at the end of the computation. We term these intermediate outputs “margins”, drawing inspiration from the practice of making margin notes for improved comprehension of long contexts in human reading. We show that this technique, which adds only minimal additional computation, significantly improves LLMs long context reasoning capabilities.

Think: Every chunk has a chance to be attended to/ be at the end of the context at least once. 🎉

📊 Results:
- An average accuracy boost of 7.5% in multi-hop reasoning tasks like HotpotQA and MultiHop-RAG.
- Even a 30% increase in F1-score for summarisation-like tasks (CWE).

Plus, WiM fits seamlessly into interactive applications (think: progress bar!). It can provide real-time progress updates during data retrieval and integration, making it user-friendly and transparent - a stark contrast to feeding 1mln tokens to an LLMs and waiting 6 min for the first token. 🤯

👩‍💻🧑‍💻 Check it out and contribute to our open-source project here: https://github.com/writer/writing-in-the-margins

🧠 More about chunked prefill: https://docs.vllm.ai/en/latest/models/performance.html#chunked-prefill