Instructions for using THUMT/mGPT with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use THUMT/mGPT with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="THUMT/mGPT")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("THUMT/mGPT")
model = AutoModelForCausalLM.from_pretrained("THUMT/mGPT")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use THUMT/mGPT with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "THUMT/mGPT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "THUMT/mGPT",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/THUMT/mGPT
```
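The vLLM server above (and the SGLang server below) expose an OpenAI-compatible API, so you can also call them from Python instead of curl. A minimal sketch using the `openai` client package; the base URL and the placeholder API key are assumptions for a default local vLLM server (change the port to 30000 for SGLang):

```python
# pip install openai
from openai import OpenAI

# Point the client at the local server; no real API key is needed by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="THUMT/mGPT",
    prompt="Once upon a time,",
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)
```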
- SGLang
How to use THUMT/mGPT with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "THUMT/mGPT" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "THUMT/mGPT",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "THUMT/mGPT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "THUMT/mGPT",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use THUMT/mGPT with Docker Model Runner:
```shell
docker model run hf.co/THUMT/mGPT
```
mGPT
mGPT is pre-trained on the mC4 dataset using a causal language modeling objective. It was introduced in the paper MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators (Tan et al., 2021; see the BibTeX entry below).
Model description
mGPT is a Transformer-based model pre-trained on massive multilingual data covering 101 languages. Like GPT-2, it was pre-trained on raw text only, with no human labeling. It uses the same tokenization and vocabulary as the mT5 model.
Intended uses
You can use the raw model for text generation, or adapt it to a downstream task with prompting (a prompting sketch follows the example in the next section).
How to use
You can use this model directly with a pipeline for text generation. Here is how to use this model to generate text in PyTorch:
```python
from transformers import MT5Tokenizer, GPT2LMHeadModel, TextGenerationPipeline

tokenizer = MT5Tokenizer.from_pretrained("THUMT/mGPT")
model = GPT2LMHeadModel.from_pretrained("THUMT/mGPT")
pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
text = "Replace me by any text you'd like."
text = pipeline(text, do_sample=True, max_length=1024)[0]["generated_text"]
```
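For prompt-based adaptation to a downstream task, a minimal sketch is to phrase the task as a text prefix and let the model continue it. The prompt format below is illustrative only, not a format prescribed by the paper, and it reuses the `pipeline` object defined above:

```python
# A hypothetical prompt format for a downstream task (here, translation).
# mGPT is a raw language model, so results depend heavily on prompt wording.
prompt = "English: How are you? German:"
output = pipeline(prompt, do_sample=True, max_length=64)[0]["generated_text"]
print(output)
```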
Preprocessing
The texts are tokenized using SentencePiece with a vocabulary size of 250,100. The inputs are sequences of 1,024 consecutive tokens. We use `<extra_id_0>` to separate lines in a document.
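To match this preprocessing at inference time, you would join the lines of a document with the `<extra_id_0>` separator before tokenizing. A minimal sketch; the helper name and the truncation choice are assumptions, not code from the THUMT release:

```python
from transformers import MT5Tokenizer

tokenizer = MT5Tokenizer.from_pretrained("THUMT/mGPT")

def prepare_document(lines):
    # Join document lines with the <extra_id_0> separator used in pre-training.
    text = "<extra_id_0>".join(lines)
    # Truncate to the 1,024-token context length used during pre-training.
    return tokenizer(text, truncation=True, max_length=1024, return_tensors="pt")

inputs = prepare_document(["First line of the document.", "Second line."])
print(inputs["input_ids"].shape)
```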
BibTeX entry and citation info
```bibtex
@misc{tan2021msp,
  title={MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators},
  author={Zhixing Tan and Xiangwen Zhang and Shuo Wang and Yang Liu},
  year={2021},
  eprint={2110.06609},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```