lilt-only-base
Layout-only pretrained checkpoint from the official LiLT repository.
This is not a complete model: it contains only the 2D spatial (layout) encoder and no text encoder. It is intended as a building block to be combined with any RoBERTa-like text encoder.
What is this?
LiLT (Language-Independent Layout Transformer) decouples text and layout understanding into two separate encoders. lilt-only-base contains exclusively the layout encoder weights, pretrained on document layout understanding (IIT-CDIP dataset).
This allows combining it with any RoBERTa-compatible text encoder to produce a language-specific document understanding model.
Usage
Use gen_weight_roberta_like.py from the official repository to combine with your text encoder of choice:
python gen_weight_roberta_like.py \
--lilt lilt-only-base/pytorch_model.bin \
--text your-roberta-model/pytorch_model.bin \
--config your-roberta-model/config.json \
--out lilt-your-language-base
Compatible text encoders: any RoBERTa-like model (roberta-base, camembert-base, microsoft/infoxlm-base, etc.)
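If you merge several text encoders, it can help to build the command programmatically. The following is a minimal sketch, not part of the official repository: the helper name build_merge_command is hypothetical, and it only assembles the same flags shown in the command above, assuming each model directory contains pytorch_model.bin and config.json.

```python
from pathlib import Path


def build_merge_command(lilt_dir, text_dir, out_dir, script="gen_weight_roberta_like.py"):
    """Assemble the argv list for the merge script (hypothetical helper).

    Only the flags shown in the command above are used; nothing is executed here.
    """
    lilt_dir, text_dir = Path(lilt_dir), Path(text_dir)
    return [
        "python", script,
        "--lilt", str(lilt_dir / "pytorch_model.bin"),
        "--text", str(text_dir / "pytorch_model.bin"),
        "--config", str(text_dir / "config.json"),
        "--out", str(out_dir),
    ]


cmd = build_merge_command("lilt-only-base", "your-roberta-model", "lilt-your-language-base")
print(" ".join(cmd))

# To actually run the merge (requires a checkout of the official repository
# and its dependencies), pass the list to subprocess:
# import subprocess
# subprocess.run(cmd, check=True)
```

Returning a list rather than a shell string avoids quoting problems when directory names contain spaces.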
Files
| File | Description |
|---|---|
| model.safetensors | Layout encoder weights (safetensors format) |
| pytorch_model.bin | Layout encoder weights (PyTorch format) |
| config.json | Model configuration (model_type: liltrobertalike) |
Note on model type
This checkpoint uses model_type = liltrobertalike, a custom type defined in the original LiLT repository. It cannot be loaded directly with AutoModel from HuggingFace transformers without first combining it with a text encoder via the procedure above.
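Before trying to load a downloaded checkpoint, you can check which model type its config.json declares. The sketch below is illustrative only: the helper name declares_custom_lilt_type is hypothetical, and it reads nothing but the model_type field, so it does not verify what the weight files contain.

```python
import json
import tempfile
from pathlib import Path

# Custom model type named in this card's config.json.
CUSTOM_LILT_TYPE = "liltrobertalike"


def declares_custom_lilt_type(config_path):
    """Return True if the config file declares the custom LiLT model type."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return config.get("model_type") == CUSTOM_LILT_TYPE


# Demo on a throwaway config file so the snippet runs without any download.
with tempfile.TemporaryDirectory() as tmp:
    demo_config = Path(tmp) / "config.json"
    demo_config.write_text(json.dumps({"model_type": "liltrobertalike"}), encoding="utf-8")
    print(declares_custom_lilt_type(demo_config))
```

In practice, point the function at lilt-only-base/config.json and run the combination procedure above before attempting to load the model.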
License
MIT — following the original jpwang/lilt repository.
Acknowledgements
- LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding — Wang et al., 2022
- Original weights: jpwang/lilt
Note: This is not an official HuggingFace release from the original authors.