Forge-SID-Model

This repository contains a pre-trained RQVAE (Residual Quantized Variational Autoencoder) model designed for SID (Semantic Identifier) generation tasks. It is part of the FORGE ecosystem, introduced in the paper FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets.

The model weights are stored in final_sid_rq_model.pth.

Usage

1. Download the Model

You can download the model files locally using the huggingface_hub library:

import os
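# Optional: use a mirror for faster downloads in some regions (e.g., China).
# HF_ENDPOINT must be set before huggingface_hub is imported.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
# Optional: works around the duplicate OpenMP runtime error seen in some environments.
os.environ["KMP_DUPLICATE_LIB_OK"] = "True"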

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id='AL-GR/Forge-SID-Model',
    local_dir='./Forge-SID-Model',  # Replace with your desired local path
    local_dir_use_symlinks=False,  # Deprecated in recent huggingface_hub releases; omit it if your version warns or errors
)
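
Before moving on to inference, it can help to confirm that the checkpoint actually landed in the download directory. The following is a minimal sketch using only the standard library; find_checkpoint is a hypothetical helper (not part of huggingface_hub or al_sid), and it assumes the weights keep the file name final_sid_rq_model.pth stated above.

```python
from pathlib import Path


def find_checkpoint(local_dir: str, filename: str = "final_sid_rq_model.pth") -> Path:
    """Return the path of `filename` under `local_dir`; raise if missing or empty.

    Hypothetical helper for illustration only.
    """
    root = Path(local_dir)
    # rglob also covers the case where the file sits in a subdirectory
    matches = sorted(p for p in root.rglob(filename) if p.is_file())
    if not matches:
        raise FileNotFoundError(f"{filename} not found under {root}")
    ckpt = matches[0]
    if ckpt.stat().st_size == 0:
        raise ValueError(f"{ckpt} is empty; the download may be incomplete")
    return ckpt
```

Calling find_checkpoint('./Forge-SID-Model') after the download should return the path to use as CKPT_PATH in the next step. The size check only catches empty files; it does not verify the integrity of the weights.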

2. Run Inference

To use this model for inference, you need to update the checkpoint path in the official inference script provided by the al_sid repository.

Step 1: Clone or download the inference code: https://github.com/selous123/al_sid/blob/main/SID_generation/infer_SID.py

Step 2: Open infer_SID.py and locate Line 23.

Step 3: Modify the CKPT_PATH variable to point to your downloaded .pth file:

# Original line:
# CKPT_PATH = 'output_model/checkpoint-7.pth'

# Update to (example):
CKPT_PATH = './Forge-SID-Model/final_sid_rq_model.pth' 

Note: Ensure the path matches the actual location where you saved the final_sid_rq_model.pth file.
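
If you prefer not to edit the script by hand, the change in Step 3 can also be applied programmatically. This is a rough sketch, assuming infer_SID.py defines CKPT_PATH as a single top-level assignment like the original line shown above; set_ckpt_path is a hypothetical helper and is not part of the al_sid repository.

```python
import re

# Matches a top-level assignment such as: CKPT_PATH = 'output_model/checkpoint-7.pth'
# Commented-out or indented occurrences are deliberately not matched.
_CKPT_LINE = re.compile(r"^CKPT_PATH\s*=.*$", flags=re.MULTILINE)


def set_ckpt_path(source: str, new_path: str) -> str:
    """Return `source` with its first CKPT_PATH assignment pointing at `new_path`.

    Hypothetical helper for illustration only.
    """
    replacement = f"CKPT_PATH = {new_path!r}"
    # A function replacement keeps re.sub from interpreting backslashes in the path
    patched, count = _CKPT_LINE.subn(lambda _m: replacement, source, count=1)
    if count == 0:
        raise ValueError("no top-level CKPT_PATH assignment found")
    return patched
```

To apply it, read infer_SID.py with pathlib.Path.read_text, pass the text through set_ckpt_path together with your checkpoint path, and write the result back with Path.write_text. Matching on the variable name rather than on a line number keeps the patch working if the script's layout shifts.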


For more details about the training setup or the FORGE framework, please refer to the main repository: AL-GR/FORGE.

Citation

If you find this work helpful, please cite:

@article{fu2025forge,
  title={FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets},
  author={Fu, Kairui and Zhang, Tao and Xiao, Shuwen and Wang, Ziyang and Zhang, Xinming and Zhang, Chenchi and Yan, Yuliang and Zheng, Junjun and others},
  journal={arXiv preprint arXiv:2509.20904},
  year={2025}
}