Instructions to use arcee-ai/Trinity-Mini with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use arcee-ai/Trinity-Mini with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="arcee-ai/Trinity-Mini", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the output is also visible outside a notebook
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Trinity-Mini", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("arcee-ai/Trinity-Mini", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use arcee-ai/Trinity-Mini with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "arcee-ai/Trinity-Mini"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/arcee-ai/Trinity-Mini

SGLang

How to use arcee-ai/Trinity-Mini with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "arcee-ai/Trinity-Mini" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "arcee-ai/Trinity-Mini" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use arcee-ai/Trinity-Mini with Docker Model Runner:
```
docker model run hf.co/arcee-ai/Trinity-Mini
```

This model fails the "carwash" question

#12 · by rekrek · opened Feb 12 · edited Feb 12

Hi, there is a trick question circulating on r/LocalLLaMA, and this thinking model fails it.

the carwash is only 50m from my house, I want to get my car cleaned there, should I drive there or walk ?
the carwash is only 50m from my house, I want to get my car cleaned there, how should I go there ?
the carwash is only 50m from my house, I want to get my car cleaned there, how should I go there ? I can't walk.

Do you happen to know why your model might be answering incorrectly?

I haven't had time to check whether the base model also produces a wrong answer.
Is it something in the SFT data, such as environmental considerations, that keeps the model from understanding that, whatever the reason for walking, the car can't be washed if you don't bring it?
Or is it spatial understanding and the associative property (my car -> just carry it there in my hands?)

It seems some concepts, like efficiency, safety, cost, convenience, time, and effort, are over-trained, while common sense and practicality are under-trained?

Note that this is not unique to your model; many models share it. But since you are training Trinity Large, you should account for cases like this.
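
For anyone who wants to reproduce this, here is a minimal sketch that sends the three prompts above to an OpenAI-compatible endpoint and prints the replies for manual reading. It assumes a server such as `vllm serve "arcee-ai/Trinity-Mini"` is listening on `http://localhost:8000`. The helper names (`build_payload`, `ask`, `run_probe`, `main`) are made up for illustration, and the script does not try to grade the answers automatically, since a keyword check for "drive" or "walk" would be unreliable.

```python
import json
import urllib.request

# The three prompts from the post, copied verbatim.
PROMPTS = [
    "the carwash is only 50m from my house, I want to get my car cleaned there, should I drive there or walk ?",
    "the carwash is only 50m from my house, I want to get my car cleaned there, how should I go there ?",
    "the carwash is only 50m from my house, I want to get my car cleaned there, how should I go there ? I can't walk.",
]


def build_payload(prompt, model="arcee-ai/Trinity-Mini"):
    """Build a chat-completions request body for a single user prompt."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def ask(prompt, base_url="http://localhost:8000"):
    """POST one prompt to an OpenAI-compatible server and return the reply text.

    Assumption: a local server is already running at base_url; change the
    URL and port to match your own setup.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


def run_probe(ask_fn=ask):
    """Send every prompt through ask_fn and return (prompt, reply) pairs."""
    return [(prompt, ask_fn(prompt)) for prompt in PROMPTS]


def main():
    """Print each prompt with the model's reply so they can be compared by eye."""
    for prompt, reply in run_probe():
        print(f"PROMPT: {prompt}\nREPLY:  {reply}\n")
```

Call `main()` once the server is up. `run_probe` accepts any callable in place of `ask`, so the same prompts can be pointed at a different backend (for example the base model) without changing the rest of the script.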
