Uchen vs Umê Classifier (DINOv3 ViT-S)
Binary Tibetan script classifier: Uchen (དབུ་ཅན།, headed/printed script) vs Umê (དབུ་མེད།, headless/cursive script). Fine-tuned from DINOv3 ViT-S on ~10,000 manuscript scans from the Buddhist Digital Resource Center (BDRC).
Dataset: openpecha/uchen-ume-classification-dataset
Which checkpoint to use
Pick the variant that matches how you preprocess at inference:
| Your pipeline | Weights | Inference preprocess |
|---|---|---|
| Center-crop whole page (resize short edge → 224, center crop) | `center_crop_all/final_model.pt` | `--preprocess center_crop_whole_page` |
| Raw full manuscript page (no PIL crop before DINO) | `without_preprocess/final_model.pt` | `--preprocess none` |
Do not use with_preprocess/. It was trained and validated on center-cropped pages but evaluated on full-page test images. That train/test preprocessing mismatch is why validation looked ~99% while the test results JSON reported only ~56% accuracy.
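The center-crop preprocessing described above (resize the short edge to 224, then take a centered square) can be sketched with Pillow. This is a minimal illustration, not the code from `inference_uchen_ume.py`: the function name simply mirrors the `--preprocess` flag value, and the bicubic interpolation and rounding behavior are assumptions.

```python
from PIL import Image


def center_crop_whole_page(img: Image.Image, size: int = 224) -> Image.Image:
    """Resize so the short edge equals `size`, then crop a centered square.

    Hypothetical helper: interpolation and rounding are assumptions and may
    differ from what the released inference script does.
    """
    w, h = img.size
    scale = size / min(w, h)
    # Round to the nearest pixel, but never fall below the target size.
    new_w = max(size, round(w * scale))
    new_h = max(size, round(h * scale))
    img = img.resize((new_w, new_h), Image.BICUBIC)
    left = (new_w - size) // 2
    top = (new_h - size) // 2
    return img.crop((left, top, left + size, top + size))


# A wide, pecha-style page: only the middle square survives the crop.
page = Image.new("RGB", (1200, 300), "white")
crop = center_crop_whole_page(page)
print(crop.size)  # (224, 224)
```

On wide manuscript pages this keeps only the middle of the folio, so the model sees a very different input than with `--preprocess none`. That is why the two checkpoints are not interchangeable.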
Best results
Hub split: 9,110 train / 1,000 val / 851 test (work-stratified).
| Variant | Train preprocess | Val preprocess | Test preprocess | Test acc | Test macro-F1 | Val macro-F1 (best) |
|---|---|---|---|---|---|---|
| `center_crop_all/` | center crop | center crop | center crop | 99.3% | 0.983 | 0.996 |
| `without_preprocess/` | none | none | none (full page) | 80.7% | 0.708 | 0.771 |
Test confusion matrices (851 pages)
| Variant | uchen→uchen | uchen→ume | ume→uchen | ume→ume |
|---|---|---|---|---|
| `center_crop_all/` | 94 | 3 | 3 | 751 |
| `without_preprocess/` | 97 | 2 | 165 | 603 |
See confusion_matrix.json and confusion_matrix.png in each variant folder on the Hub.
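The reported accuracy and macro-F1 can be recomputed from the confusion-matrix counts. The sketch below assumes each column reads as true label → predicted label; the `metrics` helper is illustrative and not part of the repository.

```python
def metrics(uu: int, um: int, mu: int, mm: int) -> tuple[float, float]:
    """Accuracy and macro-F1 from a 2x2 confusion matrix.

    uu: uchen→uchen, um: uchen→ume, mu: ume→uchen, mm: ume→ume
    (assumed to mean true label → predicted label).
    """
    total = uu + um + mu + mm
    accuracy = (uu + mm) / total
    # Per-class F1 = 2*TP / (2*TP + FP + FN).
    f1_uchen = 2 * uu / (2 * uu + mu + um)
    f1_ume = 2 * mm / (2 * mm + um + mu)
    return accuracy, (f1_uchen + f1_ume) / 2


center_crop = metrics(94, 3, 3, 751)
full_page = metrics(97, 2, 165, 603)

print(f"center_crop_all:    acc={center_crop[0]:.3f} macro-F1={center_crop[1]:.3f}")
print(f"without_preprocess: acc={full_page[0]:.3f} macro-F1={full_page[1]:.3f}")
# center_crop_all:    acc=0.993 macro-F1=0.983
# without_preprocess: acc=0.807 macro-F1=0.708
```

For the full-page variant, almost all errors are Ume pages predicted as Uchen (165 of 167 mistakes), which pulls the Uchen F1 down and explains the gap between accuracy and macro-F1.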
Training data
| Class | Train | Validation | Test | Total |
|---|---|---|---|---|
| Uchen | ~3,124 | ~340 | ~290 | ~3,754 |
| Ume | ~5,986 | ~660 | ~561 | ~7,207 |
| Total pages | 9,110 | 1,000 | 851 | 10,961 |
Splits are partitioned at the work level — all pages from the same manuscript stay in one split only.
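A work-level partition can be sketched as follows. This is an illustrative grouping split, not the script used to build the dataset: the `split_by_work` helper, the fractions, and the work IDs are all hypothetical, and the sketch does not balance classes across splits.

```python
import random
from collections import defaultdict


def split_by_work(pages, val_frac=0.1, test_frac=0.1, seed=0):
    """Assign whole works to train/val/test so no work spans two splits.

    pages: iterable of (work_id, page_path) pairs (hypothetical layout).
    """
    by_work = defaultdict(list)
    for work_id, page in pages:
        by_work[work_id].append(page)

    # Shuffle works, not pages, so every page follows its work.
    works = sorted(by_work)
    random.Random(seed).shuffle(works)
    n_test = int(len(works) * test_frac)
    n_val = int(len(works) * val_frac)

    splits = {"train": [], "val": [], "test": []}
    for i, work_id in enumerate(works):
        if i < n_test:
            name = "test"
        elif i < n_test + n_val:
            name = "val"
        else:
            name = "train"
        splits[name].extend((work_id, p) for p in by_work[work_id])
    return splits


# Toy data: 20 works with 5 pages each.
pages = [(f"W{w:02d}", f"W{w:02d}/page_{p:03d}.jpg") for w in range(20) for p in range(5)]
splits = split_by_work(pages)
print({name: len(items) for name, items in splits.items()})
```

Splitting by work rather than by page prevents near-duplicate pages from the same manuscript leaking between training and evaluation.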
Architecture
- Backbone: DINOv3 ViT-S/16 (21M params)
- Head: LayerNorm → Dropout(0.1) → Linear(384, 128) → GELU → Dropout(0.1) → Linear(128, 2)
- Training stages: A (head only) → B (head + last 2 backbone blocks) → C (head + last 4 backbone blocks)
- Balancing: WeightedRandomSampler + class-weighted cross-entropy
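The head listed above can be written as a small PyTorch module. This is a sketch under assumptions: the class name `ClassifierHead` is hypothetical, and the card does not state which backbone output (CLS token or pooled patch features) feeds the head, so the input here is a random stand-in for a 384-dimensional ViT-S embedding.

```python
import torch
from torch import nn


class ClassifierHead(nn.Module):
    """LayerNorm → Dropout → Linear(384, 128) → GELU → Dropout → Linear(128, 2)."""

    def __init__(self, embed_dim=384, hidden_dim=128, num_classes=2, p_drop=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(embed_dim),
            nn.Dropout(p_drop),
            nn.Linear(embed_dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)


head = ClassifierHead().eval()
features = torch.randn(4, 384)  # stand-in for backbone embeddings (assumption)
with torch.no_grad():
    logits = head(features)

n_params = sum(p.numel() for p in head.parameters())
print(logits.shape)  # torch.Size([4, 2])
print(n_params)      # 50306
```

At roughly 50K parameters the head is tiny next to the 21M-parameter backbone, which is consistent with training it alone in stage A before any backbone blocks are trained.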
Quick start
Center-crop pipeline (recommended if you crop pages)
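```python
from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint from the Hub (cached locally after the first call).
path = hf_hub_download(
    "openpecha/uchen-ume-classifier",
    "center_crop_all/final_model.pt",
    repo_type="model",
)
# weights_only=False unpickles arbitrary Python objects; only load checkpoints you trust.
ckpt = torch.load(path, map_location="cpu", weights_only=False)
```

```bash
python inference_uchen_ume.py \
  --image page.jpg \
  --weights center_crop_all/final_model.pt \
  --preprocess center_crop_whole_page
```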
Full-page pipeline
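```python
from huggingface_hub import hf_hub_download

# Download the full-page checkpoint from the Hub.
path = hf_hub_download(
    "openpecha/uchen-ume-classifier",
    "without_preprocess/final_model.pt",
    repo_type="model",
)
```

```bash
python inference_uchen_ume.py \
  --image page.jpg \
  --weights without_preprocess/final_model.pt \
  --preprocess none
```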
Repo layout
```
center_crop_all/            ← center_crop_whole_page at inference (~99% test)
  final_model.pt
  model_card.json
  results.json              ← includes confusion_matrix
  confusion_matrix.json
  confusion_matrix.png
without_preprocess/         ← full pages (~81% test)
  final_model.pt
  model_card.json
  results.json
  confusion_matrix.json
  confusion_matrix.png
```
Limitations
- Inference preprocessing must match training. A model trained on center crops reaches only ≈ 56% accuracy on full pages, and the full-page model expects uncropped input.
- Trained on BDRC digitised manuscripts; may underperform on photos or non-BDRC scans.
- Access requirement: DINOv3 is gated. Accept the terms for facebook/dinov3-vits16-pretrain-lvd1689m on the Hub, then run `huggingface-cli login`.
Citation
```bibtex
@misc{karma2026uchenume,
  title     = {Uchen-Ume Classifier: Binary Tibetan Script Classification with DINOv3},
  author    = {Karma Tashi and Elie Roux},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/openpecha/uchen-ume-classifier},
  note      = {Funded by Khyentse Foundation. Images sourced from the Buddhist Digital Resource Center (BDRC).}
}
```
Acknowledgements
Developed by Dharmaduta for the Buddhist Digital Resource Center (BDRC) Etext Corpus project, with funding from the Khyentse Foundation. Annotation guidelines by Pentsok Rtsang.