---
license: apache-2.0
tags:
- biology
---
<div align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/65a9e8563b9e1f0f308378b7/H2qI2OOSl-KqOlg01fRGR.png" width="50%" />
</div>

# OneGenome-Rice (OGR)

OGR is a generative genomic foundation model for AI-driven precision breeding and functional genomics in rice. Built on a **Mixture-of-Experts (MoE)** architecture with **1.25B** total parameters, it processes DNA sequences up to **1 million** base pairs in length and was pre-trained on a curated corpus of **422** rice genomes spanning cultivated and wild *Oryza* diversity.

For instructions, details, and examples, see the project repository [OGR GitHub](https://github.com/zhejianglab/OneGenome-Rice).

The table below summarizes the model's scale and key architectural hyperparameters.

<div align="center">

<table>
  <thead>
    <tr>
      <th align="center"><strong>Model Specification</strong></th>
      <th align="center"><strong>OneGenomeRice (OGR)</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center" colspan="2"><strong>Model Scale</strong></td>
    </tr>
    <tr>
      <td align="center">Total Parameters</td>
      <td align="center">1.25B</td>
    </tr>
    <tr>
      <td align="center">Activated Parameters</td>
      <td align="center">0.33B</td>
    </tr>
    <tr>
      <td align="center" colspan="2"><strong>Architecture</strong></td>
    </tr>
    <tr>
      <td align="center">Architecture</td>
      <td align="center">MoE</td>
    </tr>
    <tr>
      <td align="center">Number of Experts</td>
      <td align="center">8</td>
    </tr>
    <tr>
      <td align="center">Selected Experts per Token</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">Number of Layers</td>
      <td align="center">12</td>
    </tr>
    <tr>
      <td align="center">Attention Hidden Dimension</td>
      <td align="center">1024</td>
    </tr>
    <tr>
      <td align="center">Number of Attention Heads</td>
      <td align="center">16 (GQA, 8 KV groups)</td>
    </tr>
    <tr>
      <td align="center">MoE Hidden Dimension (per Expert)</td>
      <td align="center">4096</td>
    </tr>
    <tr>
      <td align="center">Vocabulary Size</td>
      <td align="center">128 (padded)</td>
    </tr>
    <tr>
      <td align="center">Context Length</td>
      <td align="center">up to 1Mb</td>
    </tr>
  </tbody>
</table>

</div>
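To make the MoE figures in the table concrete (8 experts, 2 selected per token, why only 0.33B of 1.25B parameters are active), here is a minimal NumPy sketch of top-2 expert routing. It is an illustration of the general technique, not OGR's actual implementation; the dimensions are shrunk for readability (the table lists hidden dim 1024 and per-expert FFN dim 4096), and all weights are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes for illustration; OGR's table lists hidden_dim=1024,
# 8 experts with per-expert FFN dim 4096, and top-2 routing.
hidden_dim, num_experts, top_k, expert_dim = 64, 8, 2, 256

# Router: a linear layer scoring each token against every expert.
router_w = rng.standard_normal((hidden_dim, num_experts)) * 0.02

# Each expert: a small two-layer feed-forward block (placeholder weights).
experts = [
    (rng.standard_normal((hidden_dim, expert_dim)) * 0.02,
     rng.standard_normal((expert_dim, hidden_dim)) * 0.02)
    for _ in range(num_experts)
]

def moe_layer(tokens: np.ndarray) -> np.ndarray:
    """Route each token to its top-2 experts and gate-mix their outputs."""
    logits = tokens @ router_w                     # (n_tokens, num_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of the top-2 experts
    out = np.zeros_like(tokens)
    for i, tok in enumerate(tokens):
        scores = logits[i, top[i]]
        gates = np.exp(scores - scores.max())
        gates /= gates.sum()                       # softmax over the selected experts
        for gate, e in zip(gates, top[i]):
            w1, w2 = experts[e]
            out[i] += gate * (np.maximum(tok @ w1, 0) @ w2)  # ReLU FFN
    return out

y = moe_layer(rng.standard_normal((4, hidden_dim)))
print(y.shape)  # each token keeps its hidden dimension: (4, 64)
```

Because each token only passes through 2 of the 8 expert FFNs, the per-token compute scales with the activated parameter count (0.33B) rather than the total (1.25B), which is what makes long 1Mb contexts tractable.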