Update README.md
README.md
CHANGED
|
@@ -3,29 +3,23 @@ license: mit
|
|
| 3 |
tags:
|
| 4 |
- biology
|
| 5 |
---
|
| 6 |
-
|
| 7 |
<div align="center">
|
| 8 |
-
<
|
| 9 |
-
<!-- <img src="YOUR_IMAGE_URL" width="100%" alt="OneGenome-Rice (OGR)" /> -->
|
| 10 |
-
*(Banner / architecture figure: add URL, then uncomment the line above.)*
|
| 11 |
</div>
|
| 12 |
|
| 13 |
-
#
|
| 14 |
|
| 15 |
OGR is a foundational model for AI-driven precision breeding and functional genomics in rice. It is a generative genomic foundation model trained to process DNA sequences up to **1 million** base pairs in length, with **1.25B** total parameters and a **Mixture-of-Experts (MoE)** architecture. It was pre-trained on a curated corpus of **422** rice genomes spanning cultivated and wild *Oryza* diversity.
|
| 16 |
|
| 17 |
-
For instructions, details, and examples, see the project repository
|
| 18 |
-
|
| 19 |
-
The table below summarizes training scale and key hyperparameters. **Trained Tokens** follows the **Training Process** section (sequence curriculum and CPT).
|
| 20 |
|
| 21 |
-
|
| 22 |
|
| 23 |
-
| Model Specification |
|
| 24 |
| --- | --- |
|
| 25 |
| **Model Scale** | |
|
| 26 |
| Total Parameters | 1.25B |
|
| 27 |
| Activated Parameters | 0.33B |
|
| 28 |
-
| Trained Tokens | ~490B (sequence curriculum) + ~104B (CPT) |
|
| 29 |
| **Architecture** | |
|
| 30 |
| Architecture | MoE |
|
| 31 |
| Number of Experts | 8 |
|
|
|
|
| 3 |
tags:
|
| 4 |
- biology
|
| 5 |
---
|
|
|
|
| 6 |
<div align="center">
|
| 7 |
+
<img src="https://cdn-uploads.huggingface.co/production/uploads/65a9e8563b9e1f0f308378b7/H2qI2OOSl-KqOlg01fRGR.png" width="100%" />
|
|
|
|
|
|
|
| 8 |
</div>
|
| 9 |
|
| 10 |
+
# OneGenomeRice (OGR)
|
| 11 |
|
| 12 |
OGR is a foundational model for AI-driven precision breeding and functional genomics in rice. It is a generative genomic foundation model trained to process DNA sequences up to **1 million** base pairs in length, with **1.25B** total parameters and a **Mixture-of-Experts (MoE)** architecture. It was pre-trained on a curated corpus of **422** rice genomes spanning cultivated and wild *Oryza* diversity.
|
| 13 |
|
| 14 |
+
For instructions, details, and examples, see the project repository: [OGR GitHub](https://github.com/zhejianglab/OneGenomeRice).
|
|
|
|
|
|
|
| 15 |
|
| 16 |
+
The table below summarizes training scale and key hyperparameters.
|
| 17 |
|
| 18 |
+
| Model Specification | OGR |
|
| 19 |
| --- | --- |
|
| 20 |
| **Model Scale** | |
|
| 21 |
| Total Parameters | 1.25B |
|
| 22 |
| Activated Parameters | 0.33B |
|
|
|
|
| 23 |
| **Architecture** | |
|
| 24 |
| Architecture | MoE |
|
| 25 |
| Number of Experts | 8 |
|