Instructions for using abideen/MulitLoRA-Mistral-Merging with libraries, inference providers, and local apps.
How to use abideen/MulitLoRA-Mistral-Merging with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="abideen/MulitLoRA-Mistral-Merging")

# Or load the model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("abideen/MulitLoRA-Mistral-Merging", dtype="auto")
```
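For a quick smoke test, the pipeline can be called directly; the prompt and sampling settings below are only illustrative:

```python
# Generate a short completion with the pipeline created above (illustrative settings)
result = pipe("Once upon a time,", max_new_tokens=50, do_sample=True, temperature=0.5)
print(result[0]["generated_text"])
```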
How to use abideen/MulitLoRA-Mistral-Merging with vLLM:
Install from pip and serve the model:
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "abideen/MulitLoRA-Mistral-Merging"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "abideen/MulitLoRA-Mistral-Merging",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
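Since the server speaks the OpenAI API, the same endpoint can also be called from Python with the openai client (the api_key value is a placeholder; vLLM does not check it unless one is configured):

```python
from openai import OpenAI

# Point the client at the local vLLM server started above
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="abideen/MulitLoRA-Mistral-Merging",
    prompt="Once upon a time,",
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)
```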
How to use abideen/MulitLoRA-Mistral-Merging with SGLang:
Install from pip and serve the model:
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "abideen/MulitLoRA-Mistral-Merging" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "abideen/MulitLoRA-Mistral-Merging",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
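SGLang's server is OpenAI-compatible as well, so chat-style requests also work; a minimal sketch with the openai client (api_key is again a placeholder):

```python
from openai import OpenAI

# Point the client at the local SGLang server started above
client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="abideen/MulitLoRA-Mistral-Merging",
    messages=[{"role": "user", "content": "Once upon a time,"}],
    max_tokens=512,
    temperature=0.5,
)
print(response.choices[0].message.content)
```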
Use Docker images:

```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "abideen/MulitLoRA-Mistral-Merging" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "abideen/MulitLoRA-Mistral-Merging",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
How to use abideen/MulitLoRA-Mistral-Merging with Docker Model Runner:
```bash
docker model run hf.co/abideen/MulitLoRA-Mistral-Merging
```
# MulitLoRA-Mistral-Merging
MultiLoRA-Mistral-Merge is a multi-LoRA TIES merge of three adapters, created with 🧠 AutoLoRAMerging. The merged adapter can generate SQL statements, provide legal advice, and perform function calling.
## 🧩 Configuration
```yaml
density: 0.2
merging_type: "ties"
weights: [2.0, 0.3, 0.7]
```
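The card does not include the AutoLoRAMerging invocation itself, but a TIES merge with this configuration can be sketched with PEFT's `add_weighted_adapter`; the base model and the three adapter repo names below are hypothetical placeholders, not the actual adapters used:

```python
# Minimal sketch of a TIES merge matching the configuration above.
# NOTE: the base model and adapter repo IDs are hypothetical placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1", device_map="auto")

# Attach the three adapters under distinct names
model = PeftModel.from_pretrained(base, "user/sql-lora", adapter_name="sql")
model.load_adapter("user/legal-lora", adapter_name="legal")
model.load_adapter("user/function-calling-lora", adapter_name="func")

# TIES: trim each adapter to its top 20% of weights (density=0.2),
# resolve sign conflicts, then combine using the per-adapter weights
model.add_weighted_adapter(
    adapters=["sql", "legal", "func"],
    weights=[2.0, 0.3, 0.7],
    adapter_name="merged",
    combination_type="ties",
    density=0.2,
)
model.set_adapter("merged")
```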
## 💻 Usage
```python
!pip install -qU transformers bitsandbytes accelerate peft
```
```python
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model = "abideen/MulitLoRA-Mistral-Merging"

# Load the base model referenced in the adapter's config
config = PeftConfig.from_pretrained(peft_model)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, device_map="auto")

# The adapter's tokenizer may define extra tokens, so resize the embeddings to match
tokenizer = AutoTokenizer.from_pretrained(peft_model)
model.resize_token_embeddings(len(tokenizer))

# Attach the merged LoRA adapter
model = PeftModel.from_pretrained(model, peft_model)

prompt = "Table: Sports; Columns: ['Team', 'Head Coach', 'President', 'Home Ground', 'Location'] Natural Query: Who is the Head Coach of the team whose President is Mario Volarevic? SQL Query:"

messages = [
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# The chat template already inserts the special tokens, so don't add them again
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    top_p=0.95,
    temperature=0.2,
    repetition_penalty=1.2,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0]))
```
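To print only the model's continuation rather than the echoed prompt, slice off the input tokens before decoding:

```python
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```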
