Welcome to the PixelBytes repository! This project features models that generate text and images simultaneously, pixel by pixel, using a unified embedding. The weights published here are for testing only.
The PixelByte model generates mixed sequences of text and images, handling transitions with line breaks and maintaining image dimension consistency.
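To make the idea of a mixed sequence concrete, here is a minimal sketch of how text and a small image could be flattened into one token stream that shares a single vocabulary. It is not the PixelBytes tokenizer: the function names, the use of palette indices as pixel tokens, and the choice of `"\n"` as the row separator are all assumptions made for illustration.

```python
NEWLINE = "\n"  # assumed separator: ends text lines and image rows alike


def flatten_image(rows):
    """Flatten a 2D grid of pixel tokens, closing every row with NEWLINE."""
    width = len(rows[0])
    tokens = []
    for row in rows:
        # Every row must have the same width, or the image cannot be rebuilt.
        if len(row) != width:
            raise ValueError("all image rows must have the same width")
        tokens.extend(row)
        tokens.append(NEWLINE)
    return tokens


def build_sequence(text, image_rows):
    """Concatenate text characters and image pixels into one sequence."""
    return list(text) + [NEWLINE] + flatten_image(image_rows)


def build_vocab(tokens):
    """Give each distinct token (character or pixel value) one shared id."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def split_rows(tokens):
    """Recover rows from a flat sequence by splitting on NEWLINE."""
    rows, current = [], []
    for tok in tokens:
        if tok == NEWLINE:
            rows.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        rows.append(current)
    return rows


# A caption followed by a 2x2 image of palette indices (illustrative data).
sequence = build_sequence("Pika", [[0, 1], [1, 0]])
vocab = build_vocab(sequence)
ids = [vocab[tok] for tok in sequence]
```

In this sketch, characters and pixel values are looked up in the same table, so a single embedding layer could serve both modalities. Because the same separator closes text lines and image rows, `split_rows` can be run on a generated sequence to check that all image rows have equal length.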
We use the PixelBytes-Pokemon dataset, available on Hugging Face as ffurfaro/PixelBytes-Pokemon. It contains text and image sequences of Pokémon for training our model.
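As a starting point, the dataset can probably be fetched with the Hugging Face `datasets` library, as sketched below. The repository ID is the one listed on Hugging Face. The helper name `load_pixelbytes_pokemon` is hypothetical, and no split or column names are assumed here, so inspect the returned object before relying on any of them.

```python
DATASET_ID = "ffurfaro/PixelBytes-Pokemon"  # repository ID as listed on Hugging Face


def load_pixelbytes_pokemon():
    """Download the dataset and return it (hypothetical helper name)."""
    # Imported lazily so this module can be read without `datasets` installed.
    from datasets import load_dataset

    # No split is requested, so every available split is returned.
    return load_dataset(DATASET_ID)


# Example usage (requires `pip install datasets` and network access):
#   dataset = load_pixelbytes_pokemon()
#   print(dataset)  # lists the available splits and columns
```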
Furfaro, F. (2024). PixelBytes: A Unified Multimodal Representation Learning Project. (https://github.com/fabienfrfr/PixelBytes)
Thank you for exploring PixelBytes! We hope this model aids your multimodal generation projects.