| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Multi-backend support (non-CUDA backends)","local":"multi-backend-support-non-cuda-backends","sections":[{"title":"Alpha Release","local":"alpha-release","sections":[],"depth":2},{"title":"Benchmarks","local":"benchmarks","sections":[{"title":"Intel","local":"intel","sections":[{"title":"Inference (CPU)","local":"inference-cpu","sections":[],"depth":4},{"title":"Fine-Tuning (CPU)","local":"fine-tuning-cpu","sections":[],"depth":4}],"depth":3}],"depth":2}],"depth":1}"> | |
| <link href="/docs/bitsandbytes/pr_1385/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/entry/start.00028fac.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/scheduler.852ec091.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/singletons.06cb70fc.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/index.268e315a.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/paths.e1fa8378.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/entry/app.3ae45f63.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/index.28275fd3.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/nodes/0.b7be74bf.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/nodes/12.f634aa83.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/Tip.9f398c59.js"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/pr_1385/en/_app/immutable/chunks/EditOnGithub.582011f0.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Multi-backend support (non-CUDA backends)","local":"multi-backend-support-non-cuda-backends","sections":[{"title":"Alpha Release","local":"alpha-release","sections":[],"depth":2},{"title":"Benchmarks","local":"benchmarks","sections":[{"title":"Intel","local":"intel","sections":[{"title":"Inference (CPU)","local":"inference-cpu","sections":[],"depth":4},{"title":"Fine-Tuning (CPU)","local":"fine-tuning-cpu","sections":[],"depth":4}],"depth":3}],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="multi-backend-support-non-cuda-backends" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#multi-backend-support-non-cuda-backends"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Multi-backend support (non-CUDA backends)</span></h1> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white 
dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1mb3sz">If you feel these docs need additional information, please consider submitting a PR or respectfully requesting it in one of the GitHub discussion spaces linked below.</p></div> <p data-svelte-h="svelte-vyk3k9">As part of a recent refactoring effort, we will soon offer official multi-backend support. Currently, this feature is available as a preview alpha release, which lets us gather early feedback from users, improve the functionality, and identify bugs.</p> <p data-svelte-h="svelte-12ne44y">At present, the Intel CPU and AMD ROCm backends are considered fully functional. The Intel XPU backend has limited functionality and is less mature.</p> <p data-svelte-h="svelte-1wpsr8f">Please refer to the <a href="./installation#multi-backend">installation instructions</a> for details on installing the backend you intend to test (and hopefully provide feedback on).</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-ztwrs6">Apple Silicon support is planned for Q4 2024. We are actively seeking contributors to help implement this, develop a concrete plan, and create a detailed list of requirements. Due to limited resources, we rely on community contributions for this implementation effort. To discuss further, please share your thoughts in <a href="https://github.com/bitsandbytes-foundation/bitsandbytes/discussions/1340" rel="nofollow">this GitHub discussion</a> and tag <code>@Titus-von-Koeller</code> and <code>@matthewdouglas</code>. 
Thank you!</p></div> <h2 class="relative group"><a id="alpha-release" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#alpha-release"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Alpha Release</span></h2> <p data-svelte-h="svelte-njzoox">As we are currently in the alpha testing phase, bugs are expected, and performance might not meet expectations. 
However, this is exactly what we want to discover from <strong>your</strong> perspective as the end user!</p> <p data-svelte-h="svelte-5kenpg">Please share and discuss your feedback with us here:</p> <ul data-svelte-h="svelte-7ea0dg"><li><a href="https://github.com/bitsandbytes-foundation/bitsandbytes/discussions/1339" rel="nofollow">GitHub Discussion: Multi-backend refactor: Alpha release ( AMD ROCm ONLY )</a></li> <li><a href="https://github.com/bitsandbytes-foundation/bitsandbytes/discussions/1338" rel="nofollow">GitHub Discussion: Multi-backend refactor: Alpha release ( Intel ONLY )</a></li></ul> <p data-svelte-h="svelte-16dyb3n">Thank you for your support!</p> <h2 class="relative group"><a id="benchmarks" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#benchmarks"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Benchmarks</span></h2> <h3 class="relative group"><a id="intel" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#intel"><span><svg class="" xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Intel</span></h3> <p data-svelte-h="svelte-1qmjl0d">The following performance data was collected on an Intel 4th Gen Xeon (SPR) platform. The tables compare speed-up and memory usage across different data types for <a href="https://huggingface.co/meta-llama/Llama-2-7b-chat-hf" rel="nofollow">Llama-2-7b-chat-hf</a>.</p> <h4 class="relative group"><a id="inference-cpu" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#inference-cpu"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 
0-79.196z" fill="currentColor"></path></svg></span></a> <span>Inference (CPU)</span></h4> <table data-svelte-h="svelte-1wzn4ht"><thead><tr><th>Data Type</th> <th>BF16</th> <th>INT8</th> <th>NF4</th> <th>FP4</th></tr></thead> <tbody><tr><td>Speed-Up (vs BF16)</td> <td>1.0x</td> <td>0.6x</td> <td>2.3x</td> <td>0.03x</td></tr> <tr><td>Memory (GB)</td> <td>13.1</td> <td>7.6</td> <td>5.0</td> <td>4.6</td></tr></tbody></table> <h4 class="relative group"><a id="fine-tuning-cpu" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#fine-tuning-cpu"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Fine-Tuning (CPU)</span></h4> <table data-svelte-h="svelte-zjxg1q"><thead><tr><th>Data Type</th> <th>AMP BF16</th> <th>INT8</th> <th>NF4</th> <th>FP4</th></tr></thead> <tbody><tr><td>Speed-Up (vs AMP BF16)</td> <td>1.0x</td> <td>0.38x</td> <td>0.07x</td> <td>0.07x</td></tr> <tr><td>Memory (GB)</td> <td>40</td> <td>9</td> <td>6.6</td> <td>6.6</td></tr></tbody></table> <a class="!text-gray-400 !no-underline text-sm flex items-center not-prose mt-4" 
href="https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/docs/source/non_cuda_backends.mdx" target="_blank"><span data-svelte-h="svelte-1kd6by1"><</span> <span data-svelte-h="svelte-x0xyl0">></span> <span data-svelte-h="svelte-1dajgef"><span class="underline ml-1.5">Update</span> on GitHub</span></a> <p></p> | |
| <script> | |
| { | |
| __sveltekit_ojtpg9 = { | |
| assets: "/docs/bitsandbytes/pr_1385/en", | |
| base: "/docs/bitsandbytes/pr_1385/en", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/bitsandbytes/pr_1385/en/_app/immutable/entry/start.00028fac.js"), | |
| import("/docs/bitsandbytes/pr_1385/en/_app/immutable/entry/app.3ae45f63.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 12], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |