Libraries

How to use autopilot-ai/Indic-sentence-completion with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="autopilot-ai/Indic-sentence-completion")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("autopilot-ai/Indic-sentence-completion")
model = AutoModelForCausalLM.from_pretrained("autopilot-ai/Indic-sentence-completion")
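
Once the pipeline exists, generating a completion is a single call. The sketch below is illustrative only: the helper names `first_completion` and `complete`, the Gujarati prompt in the usage note, and the default `max_new_tokens=20` are assumptions rather than part of the model card. The `transformers` import is deferred so the helpers can be defined without loading the model.

```python
def first_completion(outputs):
    """Return the text of the first candidate in a text-generation pipeline result."""
    # The text-generation pipeline returns a list of dicts,
    # each holding the completed text under "generated_text".
    return outputs[0]["generated_text"]


def complete(prompt, max_new_tokens=20):
    """Hypothetical helper: build the pipeline and complete `prompt`."""
    # Deferred import: loading transformers and the 3B model is slow.
    from transformers import pipeline

    pipe = pipeline("text-generation", model="autopilot-ai/Indic-sentence-completion")
    return first_completion(pipe(prompt, max_new_tokens=max_new_tokens))
```

Calling `complete("હેલો કેમ છો?")` downloads the weights on first use and returns the prompt followed by the generated continuation.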

Local Apps

vLLM

How to use autopilot-ai/Indic-sentence-completion with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "autopilot-ai/Indic-sentence-completion"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "autopilot-ai/Indic-sentence-completion",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
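
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the function names `build_completion_request` and `request_completion` are hypothetical, and it assumes the server is already running and returns an OpenAI-style response with a `choices` list.

```python
import json
import urllib.request


def build_completion_request(prompt, max_tokens=512, temperature=0.5,
                             model="autopilot-ai/Indic-sentence-completion"):
    """Build the JSON body sent by the curl command."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def request_completion(prompt, base_url="http://localhost:8000", **kwargs):
    """POST to /v1/completions and return the text of the first choice."""
    body = json.dumps(build_completion_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # Assumption: OpenAI-style responses list candidates under "choices".
    return payload["choices"][0]["text"]
```

Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 in the same way.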

SGLang

How to use autopilot-ai/Indic-sentence-completion with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "autopilot-ai/Indic-sentence-completion" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "autopilot-ai/Indic-sentence-completion",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "autopilot-ai/Indic-sentence-completion" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use autopilot-ai/Indic-sentence-completion with Docker Model Runner:
```
docker model run hf.co/autopilot-ai/Indic-sentence-completion
```

Indic-Sentence-Completion

license: other

Details

The model may not be used commercially. It is a Bloom-3B model fine-tuned on several Indian languages:

Gujarati
Marathi
Bengali
Punjabi
Kannada
Malayalam
Telugu
Tamil
Hindi

Architecture

The architecture is the same as Bloom-3B: the model is decoder-only.

Motivation behind the model fine-tuning

The model can be fine-tuned for any downstream task that requires the Indian languages listed above.
PEFT with LoRA is advised for this fine-tuning.
The model can be stacked with an encoder for any sequence-to-sequence task that requires these languages.
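
A LoRA setup for this model might look like the sketch below. The hyperparameter values (`rank=8`, `alpha=16`, `dropout=0.05`) and the helper names `lora_settings` and `wrap_with_lora` are illustrative assumptions, not settings used by the model authors. It targets the `query_key_value` projection, which is the name of the fused attention layer in the Transformers BLOOM implementation.

```python
def lora_settings(rank=8, alpha=16, dropout=0.05):
    """Illustrative LoRA hyperparameters (assumed values, not from the model card)."""
    return {
        "r": rank,
        "lora_alpha": alpha,
        "lora_dropout": dropout,
        "bias": "none",
        "task_type": "CAUSAL_LM",
        # BLOOM fuses the query, key and value projections into one linear layer.
        "target_modules": ["query_key_value"],
    }


def wrap_with_lora(model, **overrides):
    """Attach trainable LoRA adapters to an already loaded causal LM."""
    # Deferred import: requires `pip install peft`.
    from peft import LoraConfig, get_peft_model

    config = LoraConfig(**lora_settings(**overrides))
    return get_peft_model(model, config)
```

The wrapped model keeps the base weights frozen and trains only the adapter matrices, which is what makes fine-tuning a 3B-parameter model practical on a single GPU.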

Example of getting inference from the model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hub ID (or local directory) of the model
model_directory = "autopilot-ai/Indic-sentence-completion"
tokenizer = AutoTokenizer.from_pretrained(model_directory)

# Load the causal-LM weights in 8-bit; device_map="auto" places them on the available GPU(s)
model = AutoModelForCausalLM.from_pretrained(
    model_directory,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# Tokenize a Gujarati prompt ("Hello, how are you?") and move it to the model's device
batch = tokenizer("હેલો કેમ છો?", return_tensors="pt").to(model.device)

# Generate up to 10 new tokens under CUDA mixed precision
with torch.autocast("cuda"):
    output_tokens = model.generate(**batch, max_new_tokens=10)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))

To run the above code snippet in 8-bit mode, install the following packages:

pip install accelerate bitsandbytes
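
As a rough guide to why 8-bit loading helps, the arithmetic below estimates the memory taken by the weights alone. It assumes the nominal 3B parameter count of Bloom-3B and ignores activations, the KV cache, and any quantization overhead, so real usage will be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 3e9  # nominal "3B" parameter count; the exact count differs slightly

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # 2 bytes per parameter
int8_gb = weight_memory_gb(N_PARAMS, 8)   # 1 byte per parameter

print(f"F16 weights:   ~{fp16_gb:.1f} GB")
print(f"8-bit weights: ~{int8_gb:.1f} GB")
```

Under these assumptions, 8-bit loading roughly halves the weight footprint, from about 6 GB to about 3 GB.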

Safetensors

Model size: 3B params
Tensor type: F16
