---
tags:
- biology
- DNA
- genomics
---
This is the official pre-trained model introduced in [DNA language model GROVER learns sequence context in the human genome](https://www.nature.com/articles/s42256-024-00872-0).
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Import the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained("PoetschLab/GROVER")
model = AutoModelForMaskedLM.from_pretrained("PoetschLab/GROVER")
```
Preliminary analysis shows that re-tokenizing a sequence with Byte Pair Encoding (BPE) changes the resulting tokens significantly if the sequence is shorter than 50 nucleotides. For sequences longer than 50 nucleotides, you should still be careful with the sequence edges.

We advise adding 100 nucleotides at the beginning and end of every sequence to guarantee that your sequence is represented with the same tokens as in the original tokenization.

We also provide the tokenized chromosomes with their respective nucleotide mappers (available in the folder tokenized chromosomes).
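
The flanking advice above can be sketched in plain Python. This is a minimal illustration, not part of the GROVER code base: the helper name `extract_with_flanks`, its 0-based half-open coordinates, and the toy chromosome string are all assumptions made for this example. It extracts a region together with up to 100 nucleotides of context on each side, and reports how much context was added so the core region can be located again afterwards.

```python
def extract_with_flanks(chromosome: str, start: int, end: int, flank: int = 100):
    """Return (padded_sequence, left_pad, right_pad) for the region [start, end).

    Hypothetical helper: coordinates are 0-based and half-open. The flanks are
    clamped at the chromosome boundaries, so left_pad and right_pad can be
    smaller than `flank` near the ends.
    """
    if not 0 <= start < end <= len(chromosome):
        raise ValueError("region must satisfy 0 <= start < end <= len(chromosome)")
    padded_start = max(0, start - flank)
    padded_end = min(len(chromosome), end + flank)
    padded = chromosome[padded_start:padded_end]
    return padded, start - padded_start, padded_end - end


# Toy 400-nucleotide "chromosome" (an assumption; use your reference sequence instead).
chromosome = "ACGT" * 100

# A 60-nucleotide region of interest, padded with 100 nucleotides on each side.
padded, left, right = extract_with_flanks(chromosome, 150, 210)

# The original region can be recovered from the padded sequence.
core = padded[left:len(padded) - right]
print(len(padded), left, right, core == chromosome[150:210])
```

The padded sequence is what would be passed to the tokenizer. Mapping the resulting tokens back to the core region depends on the token boundaries and is not shown here.
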
### BibTeX entry and citation info

```bibtex
@article{sanabria2024dna,
  title={DNA language model GROVER learns sequence context in the human genome},
  author={Sanabria, Melissa and Hirsch, Jonas and Joubert, Pierre M and Poetsch, Anna R},
  journal={Nature Machine Intelligence},
  pages={1--13},
  year={2024},
  publisher={Nature Publishing Group UK London}
}
```