Instructions to use tim1900/cvx-coder with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use tim1900/cvx-coder with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="tim1900/cvx-coder", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tim1900/cvx-coder", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("tim1900/cvx-coder", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use tim1900/cvx-coder with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "tim1900/cvx-coder"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tim1900/cvx-coder",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
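
The same request can be issued from Python using only the standard library. This is a minimal sketch, not part of the model card: the helper names `build_chat_request` and `ask` are hypothetical, and it assumes the server is reachable at the default address shown above and returns the usual OpenAI-style chat completion body.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000", model="tim1900/cvx-coder"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the assistant's reply text (hypothetical helper)."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the reply sits under choices[0].message.content,
    # as in the OpenAI chat completions response format.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("How to express n-th root of the determinant of a semidefinite matrix in cvx?"))` would print the model's answer. Passing a different `base_url` (for example one ending in port 30000) points the same helper at another OpenAI-compatible server.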

Use Docker

docker model run hf.co/tim1900/cvx-coder

SGLang

How to use tim1900/cvx-coder with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "tim1900/cvx-coder" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tim1900/cvx-coder",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "tim1900/cvx-coder" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tim1900/cvx-coder",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use tim1900/cvx-coder with Docker Model Runner:
```
docker model run hf.co/tim1900/cvx-coder
```

cvx-coder / README.md

tim1900

Update README.md

e172850 verified over 1 year ago


	---
	license: mit

	language:
	- en
	pipeline_tag: text-generation
	tags:
	- nlp
	- code
	inference:
	  parameters:
	    temperature: 0.0
	widget:
	- messages:
	  - role: user
	    content: How to express n-th root of the determinant of a semidefinite matrix in cvx?
	---
	# cvx-coder
	[Github](https://github.com/jackfsuia/cvx-coder) \| [Modelscope](https://www.modelscope.cn/models/tommy1235/cvx-coder)

	## Introduction

	cvx-coder aims to improve the ability of LLMs to write [Matlab CVX](https://cvxr.com/cvx) code and to answer questions about it. It is a [phi-3 model](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) finetuned on a dataset consisting of CVX docs, code, and [forum conversations](https://ask.cvxr.com/) (my cleaned version of them is at [CVX-forum-conversations](https://huggingface.co/datasets/tim1900/CVX-forum-conversations)).

	## Quickstart
	For a quick test, run the following:
	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
	m_path = "tim1900/cvx-coder"
	model = AutoModelForCausalLM.from_pretrained(
	    m_path,
	    device_map="auto",
	    torch_dtype="auto",
	    trust_remote_code=True,
	)
	tokenizer = AutoTokenizer.from_pretrained(m_path)
	pipe = pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	)
	generation_args = {
	    "max_new_tokens": 2000,
	    "return_full_text": False,
	    "temperature": 0,
	    "do_sample": False,
	}
	content = '''my problem is not convex, can i use cvx? if not, what should i do, be specific.'''
	messages = [
	    {"role": "user", "content": content},
	]
	output = pipe(messages, **generation_args)
	print(output[0]['generated_text'])
	```
	For a chat interface in the browser, run the following:
	```python
	import gradio as gr
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
	m_path = "tim1900/cvx-coder"
	model = AutoModelForCausalLM.from_pretrained(
	    m_path,
	    device_map="auto",
	    torch_dtype="auto",
	    trust_remote_code=True,
	)
	tokenizer = AutoTokenizer.from_pretrained(m_path)
	pipe = pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	)
	generation_args = {
	    "max_new_tokens": 2000,
	    "return_full_text": False,
	    "temperature": 0,
	    "do_sample": False,
	}

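	def assistant_talk(message, history):
	    # Rebuild the conversation from Gradio's history, assumed here to be
	    # a list of (user_text, assistant_text) pairs.
	    messages = []
	    for user_text, assistant_text in history:
	        messages += [
	            {"role": "user", "content": user_text},
	            {"role": "assistant", "content": assistant_text},
	        ]
	    # Append the new user message as the final turn.
	    messages.append({"role": "user", "content": message})

	    output = pipe(messages, **generation_args)
	    return output[0]['generated_text']


	gr.ChatInterface(assistant_talk).launch()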
	```
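
The `assistant_talk` callback above reads each history entry as a (user, assistant) pair. Depending on the Gradio version and how `ChatInterface` is configured, history may instead arrive as a list of `{"role": ..., "content": ...}` dictionaries. The sketch below is an assumption-laden illustration, not part of the original project: `history_to_messages` is a hypothetical helper that accepts either shape and returns the message list expected by the pipeline.

```python
def history_to_messages(history, message):
    """Convert Gradio chat history plus the new user message into chat messages."""
    messages = []
    for turn in history:
        if isinstance(turn, dict):
            # Dictionary-style history: each entry already has a role and content.
            messages.append({"role": turn["role"], "content": turn["content"]})
        else:
            # Pair-style history: each entry is (user_text, assistant_text).
            user_text, assistant_text = turn
            messages.append({"role": "user", "content": user_text})
            messages.append({"role": "assistant", "content": assistant_text})
    # The new user message always comes last.
    messages.append({"role": "user", "content": message})
    return messages
```

Inside the callback, `messages = history_to_messages(history, message)` could then replace the manual loop before calling `pipe(messages, **generation_args)`.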