---
license: apache-2.0
tags:
  - biology
---

OneGenome-Rice (OGR)

OGR is a generative genomic foundation model for AI-driven precision breeding and functional genomics in rice. It uses a Mixture-of-Experts (MoE) architecture with 1.25B total parameters and processes DNA sequences of up to 1 million base pairs (1 Mb). It was pre-trained on a curated corpus of 422 rice genomes spanning cultivated and wild Oryza diversity.

For instructions, details, and examples, see the project's GitHub repository (OGR GitHub).

The table below summarizes the model scale and key architecture hyperparameters.

| Model Specification | OneGenomeRice (OGR) |
| --- | --- |
| **Model Scale** | |
| Total Parameters | 1.25B |
| Activated Parameters | 0.33B |
| **Architecture** | |
| Architecture | MoE |
| Number of Experts | 8 |
| Selected Experts per Token | 2 |
| Number of Layers | 12 |
| Attention Hidden Dimension | 1024 |
| Number of Attention Heads | 16 (GQA, 8 KV groups) |
| MoE Hidden Dimension (per Expert) | 4096 |
| Vocabulary Size | 128 (padded) |
| Context Length | up to 1 Mb |
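
As a rough sanity check, the parameter counts in the table can be approximated from the listed hyperparameters. The sketch below is not taken from the OGR codebase. It assumes a head dimension of hidden size divided by head count, a gated (SwiGLU-style) feed-forward block with three weight matrices per expert, a linear router per layer, untied input and output embeddings, and it ignores normalization weights and biases. The class and function names (`OGRConfig`, `estimate_params`, and so on) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OGRConfig:
    """Hyperparameters copied from the specification table."""
    num_layers: int = 12
    hidden_dim: int = 1024
    num_heads: int = 16
    num_kv_groups: int = 8
    num_experts: int = 8
    experts_per_token: int = 2
    expert_hidden_dim: int = 4096
    vocab_size: int = 128


def attention_params(cfg: OGRConfig) -> int:
    """Weights of one grouped-query attention block (no biases)."""
    head_dim = cfg.hidden_dim // cfg.num_heads  # assumption: 1024 / 16 = 64
    q_proj = cfg.hidden_dim * cfg.num_heads * head_dim
    kv_proj = 2 * cfg.hidden_dim * cfg.num_kv_groups * head_dim  # K and V
    o_proj = cfg.num_heads * head_dim * cfg.hidden_dim
    return q_proj + kv_proj + o_proj


def expert_params(cfg: OGRConfig) -> int:
    """Weights of one expert, assuming gate, up, and down projections."""
    return 3 * cfg.hidden_dim * cfg.expert_hidden_dim


def estimate_params(cfg: OGRConfig) -> tuple[int, int]:
    """Return (total, activated-per-token) parameter estimates."""
    router = cfg.hidden_dim * cfg.num_experts
    embeddings = 2 * cfg.vocab_size * cfg.hidden_dim  # assumption: untied
    shared = cfg.num_layers * (attention_params(cfg) + router) + embeddings
    per_expert = expert_params(cfg)
    total = shared + cfg.num_layers * cfg.num_experts * per_expert
    active = shared + cfg.num_layers * cfg.experts_per_token * per_expert
    return total, active


total, active = estimate_params(OGRConfig())
print(f"total ~ {total / 1e9:.3f}B, activated ~ {active / 1e9:.3f}B")
```

Under these assumptions the estimate comes to about 1.25B total and about 0.34B activated parameters, close to the 1.25B and 0.33B listed in the table. The small gap in the activated count suggests the actual layer layout differs slightly from the assumptions made here.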