<meta charset="utf-8" /><meta name="hf:doc:metadata" content="{&quot;title&quot;:&quot;FSDP-QLoRA&quot;,&quot;local&quot;:&quot;fsdp-qlora&quot;,&quot;sections&quot;:[{&quot;title&quot;:&quot;Quantized data storage&quot;,&quot;local&quot;:&quot;quantized-data-storage&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2},{&quot;title&quot;:&quot;Training&quot;,&quot;local&quot;:&quot;training&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2},{&quot;title&quot;:&quot;Resources&quot;,&quot;local&quot;:&quot;resources&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2}],&quot;depth&quot;:1}">
<link href="/docs/bitsandbytes/pr_1532/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/entry/start.59d1f88d.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/scheduler.852ec091.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/singletons.9e001e7c.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/index.268e315a.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/paths.75544b5a.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/entry/app.08e5acae.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/index.28275fd3.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/nodes/0.569c3c91.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/each.e59479a4.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/nodes/8.1dfbb38c.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/Tip.9f398c59.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/CodeBlock.c3366071.js">
<link rel="modulepreload" href="/docs/bitsandbytes/pr_1532/en/_app/immutable/chunks/EditOnGithub.582011f0.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{&quot;title&quot;:&quot;FSDP-QLoRA&quot;,&quot;local&quot;:&quot;fsdp-qlora&quot;,&quot;sections&quot;:[{&quot;title&quot;:&quot;Quantized data storage&quot;,&quot;local&quot;:&quot;quantized-data-storage&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2},{&quot;title&quot;:&quot;Training&quot;,&quot;local&quot;:&quot;training&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2},{&quot;title&quot;:&quot;Resources&quot;,&quot;local&quot;:&quot;resources&quot;,&quot;sections&quot;:[],&quot;depth&quot;:2}],&quot;depth&quot;:1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="fsdp-qlora" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#fsdp-qlora"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>FSDP-QLoRA</span></h1> <p data-svelte-h="svelte-5l5q4j">FSDP-QLoRA combines data parallelism (FSDP enables sharding model parameters, optimizer states, and gradients across GPUs), 4-bit quantization, and LoRA to train LLMs up to 70B 
parameters on a dual 24GB GPU system. This technique was released by <a href="https://www.answer.ai/posts/2024-03-06-fsdp-qlora" rel="nofollow">Answer.AI</a> in collaboration with bitsandbytes to make training LLMs more efficient and accessible for everyone.</p> <p data-svelte-h="svelte-1naqdkg">This guide briefly explains how bitsandbytes supports storing quantized weights to enable FSDP-QLoRA, and how to run training with the Hugging Face libraries.</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-10tshw4">Other changes required for bitsandbytes to support FSDP-QLoRA, such as reconstructing the weights from the quantization metadata and preventing already quantized weights from being quantized again when they’re moved from a CPU to a GPU, are documented in this <a href="https://github.com/TimDettmers/bitsandbytes/pull/970" rel="nofollow">Pull Request</a> and described in the <a href="https://www.answer.ai/posts/2024-03-14-fsdp-qlora-deep-dive" rel="nofollow">Enabling 70B Finetuning on Consumer GPUs</a> blog post. 
We highly recommend reading these resources for a better understanding of FSDP-QLoRA!</p></div> <h2 class="relative group"><a id="quantized-data-storage" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#quantized-data-storage"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Quantized data storage</span></h2> <p data-svelte-h="svelte-r065g8">FSDP only supports sharding float data types which can be problematic because quantized weights are typically stored as integer data types (uint8). bitsandbytes doesn’t have this problem because it uses <code>StoreChar</code> to read and write quantized weights regardless of the data type storage. This makes it simple to add a <code>quant_storage</code> parameter to the <a href="/docs/bitsandbytes/pr_1532/en/reference/nn/linear4bit#bitsandbytes.nn.Linear4bit">Linear4bit</a> and <a href="/docs/bitsandbytes/pr_1532/en/reference/nn/linear4bit#bitsandbytes.nn.Params4bit">Params4bit</a> classes and set it to <code>torch.uint8</code> to maintain backward compatibility with the codebase. 
With the <code>quant_storage</code> parameter, you can select any FSDP-supported data type, such as bfloat16, float16, or float32, to shard <a href="/docs/bitsandbytes/pr_1532/en/reference/nn/linear4bit#bitsandbytes.nn.Linear4bit">Linear4bit</a> with.</p> <p data-svelte-h="svelte-2kgr0m">You’ll typically access and configure this option from <a href="https://huggingface.co/docs/transformers/main/en/main_classes/quantization#transformers.BitsAndBytesConfig" rel="nofollow">transformers.BitsAndBytesConfig</a> by setting the <code>bnb_4bit_quant_storage</code> parameter. It is very <strong>important</strong> that the <code>quant_storage</code> data type matches the data types used throughout the model because FSDP can only wrap layers and modules that have the <em>same floating data type</em>. Making sure the data types are aligned will ensure the model is correctly sharded.</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1xdoh7b">The <code>compute_dtype</code> is the data type used for computation inside the CUDA kernel, where the 4-bit quantized weights are unpacked from the data type in <code>quant_storage</code> and dequantized to <code>compute_dtype</code>. 
We recommend using torch.bfloat16 (if available on your hardware) for better numerical stability.</p></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> BitsAndBytesConfig, AutoModelForCausalLM
<span class="hljs-keyword">import</span> torch
bnb_config = BitsAndBytesConfig(
load_in_4bit=<span class="hljs-literal">True</span>,
bnb_4bit_quant_type=<span class="hljs-string">&quot;nf4&quot;</span>,
bnb_4bit_compute_dtype=torch.bfloat16,
<span class="hljs-comment"># must match the torch_dtype passed to from_pretrained below</span>
bnb_4bit_quant_storage=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
<span class="hljs-string">&quot;meta-llama/Llama-2-70b&quot;</span>,
quantization_config=bnb_config,
torch_dtype=torch.bfloat16,
)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1mcsrsf">Check out this <a href="https://hf.co/docs/peft/main/en/accelerate/fsdp#use-peft-qlora-and-fsdp-for-finetuning-large-models-on-multiple-gpus" rel="nofollow">section</a> of the PEFT documentation for the config file and training code to run FSDP-QLoRA training.</p> <h2 class="relative group"><a id="training" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#training"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Training</span></h2> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1k9ae9o">FSDP is a distributed training framework that needs to be launched as a distributed training job with a library like <a href="https://hf.co/docs/accelerate/index" rel="nofollow">Accelerate</a> or <a href="https://pytorch.org/docs/stable/elastic/run.html" rel="nofollow">torchrun</a>. 
The launch command provided in this section uses Accelerate to launch the training script.</p></div> <p data-svelte-h="svelte-11niy1x">bitsandbytes is deeply integrated with the Hugging Face ecosystem, making it easy to use with libraries like <a href="https://hf.co/docs/transformers" rel="nofollow">Transformers</a>, <a href="https://hf.co/docs/peft" rel="nofollow">PEFT</a>, and <a href="https://hf.co/docs/trl" rel="nofollow">TRL</a>.</p> <p data-svelte-h="svelte-eapz72">PEFT provides a configuration file (<a href="https://github.com/huggingface/peft/blob/main/examples/sft/configs/fsdp_config_qlora.yaml" rel="nofollow">fsdp_config_qlora.yaml</a>), launch command (<a href="https://github.com/huggingface/peft/blob/main/examples/sft/run_peft_qlora_fsdp.sh" rel="nofollow">run_peft_qlora_fsdp.sh</a>), and training script (<a href="https://github.com/huggingface/peft/blob/main/examples/sft/train.py" rel="nofollow">train.py</a>) for running FSDP-QLoRA. To learn more, check out the <a href="https://huggingface.co/docs/peft/main/en/accelerate/fsdp#use-peft-qlora-and-fsdp-for-finetuning-large-models-on-multiple-gpus" rel="nofollow">Use PEFT QLoRA and FSDP for finetuning large models on multiple GPUs</a> documentation. 
This section briefly covers the steps to run FSDP-QLoRA training.</p> <p data-svelte-h="svelte-1x35k4g">Before you begin, make sure you have the latest libraries installed.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->pip install -U bitsandbytes accelerate transformers peft trl<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-2os12x">The important change that enables FSDP-QLoRA training is the <code>bnb_4bit_quant_storage</code> parameter in the <a href="https://huggingface.co/docs/transformers/main/en/main_classes/quantization#transformers.BitsAndBytesConfig" rel="nofollow">BitsAndBytesConfig</a> class. 
This allows you to set the storage data type of the quantized weights to a float data type.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> BitsAndBytesConfig
<span class="hljs-keyword">import</span> torch
bnb_config = BitsAndBytesConfig(
load_in_4bit=<span class="hljs-literal">True</span>,
bnb_4bit_quant_type=<span class="hljs-string">&quot;nf4&quot;</span>,
bnb_4bit_compute_dtype=torch.bfloat16,
bnb_4bit_use_double_quant=<span class="hljs-literal">True</span>,
bnb_4bit_quant_storage=torch.bfloat16,
)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1robkbi">Pass the <a href="https://huggingface.co/docs/transformers/main/en/main_classes/quantization#transformers.BitsAndBytesConfig" rel="nofollow">BitsAndBytesConfig</a> to a model to set it up for FSDP-QLoRA. You should set the <code>torch_dtype</code> parameter to match <code>bnb_4bit_quant_storage</code> so that the <a href="/docs/bitsandbytes/pr_1532/en/reference/nn/linear4bit#bitsandbytes.nn.Linear4bit">Linear4bit</a> layers are wrapped identically to the <code>Linear</code> layers. If the storage types do not match, then each <a href="/docs/bitsandbytes/pr_1532/en/reference/nn/linear4bit#bitsandbytes.nn.Linear4bit">Linear4bit</a> layer is wrapped individually.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> torch
<span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
<span class="hljs-string">&quot;meta-llama/Llama-2-70b&quot;</span>,
quantization_config=bnb_config,
torch_dtype=torch.bfloat16,
)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1c6aae">Configure the <code>~peft.LoraConfig</code> class for QLoRA training by setting <code>target_modules=&quot;all-linear&quot;</code>.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> peft <span class="hljs-keyword">import</span> LoraConfig
peft_config = LoraConfig(
lora_alpha=<span class="hljs-number">16</span>,
lora_dropout=<span class="hljs-number">0.1</span>,
r=<span class="hljs-number">64</span>,
bias=<span class="hljs-string">&quot;none&quot;</span>,
task_type=<span class="hljs-string">&quot;CAUSAL_LM&quot;</span>,
target_modules=<span class="hljs-string">&quot;all-linear&quot;</span>,
)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1yk5yum">Now you can pass everything to the <a href="https://huggingface.co/docs/trl/main/en/sft_trainer#trl.SFTTrainer" rel="nofollow">SFTTrainer</a> for training.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> trl <span class="hljs-keyword">import</span> SFTTrainer
<span class="hljs-comment"># dataset, tokenizer, and training_arguments are defined elsewhere (see the PEFT example train.py)</span>
trainer = SFTTrainer(
model=model,
train_dataset=dataset,
peft_config=peft_config,
processing_class=tokenizer,
args=training_arguments,
)
trainer.train()<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="resources" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#resources"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Resources</span></h2> <p data-svelte-h="svelte-ijbsql">To learn more about FSDP and QLoRA, check out the following resources:</p> <ul data-svelte-h="svelte-47gauf"><li>The <a href="https://github.com/AnswerDotAI/fsdp_qlora" rel="nofollow">AnswerDotAI/fsdp_qlora</a> repository.</li> <li>The introductory <a href="https://www.answer.ai/posts/2024-03-06-fsdp-qlora.html" rel="nofollow">You can now train a 70b language model at home</a> blog post by Answer.AI.</li> <li>For an introduction to FSDP, read the <a href="https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api" rel="nofollow">Introducing PyTorch Fully Sharded Data Parallel (FSDP) API</a> blog post.</li> <li>For more details about QLoRA, take a look at the <a href="https://huggingface.co/blog/4bit-transformers-bitsandbytes" rel="nofollow">Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA</a> blog post.</li></ul> <a 
class="!text-gray-400 !no-underline text-sm flex items-center not-prose mt-4" href="https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/docs/source/fsdp_qlora.md" target="_blank"><span data-svelte-h="svelte-1kd6by1">&lt;</span> <span data-svelte-h="svelte-x0xyl0">&gt;</span> <span data-svelte-h="svelte-1dajgef"><span class="underline ml-1.5">Update</span> on GitHub</span></a> <p></p>
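<p>With the config, model, and trainer set up, the last step is to launch the script as a distributed job. A minimal launch sketch, assuming PEFT's example files <code>fsdp_config_qlora.yaml</code> and <code>train.py</code> referenced in the Training section; the full set of script arguments is defined in <code>run_peft_qlora_fsdp.sh</code>:</p>

```shell
# Sketch only: any flags beyond --config_file depend on the training script,
# so consult PEFT's run_peft_qlora_fsdp.sh for the complete invocation.
accelerate launch --config_file fsdp_config_qlora.yaml train.py
```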
<script>
{
__sveltekit_16rq4e7 = {
assets: "/docs/bitsandbytes/pr_1532/en",
base: "/docs/bitsandbytes/pr_1532/en",
env: {}
};
const element = document.currentScript.parentElement;
const data = [null,null];
Promise.all([
import("/docs/bitsandbytes/pr_1532/en/_app/immutable/entry/start.59d1f88d.js"),
import("/docs/bitsandbytes/pr_1532/en/_app/immutable/entry/app.08e5acae.js")
]).then(([kit, app]) => {
kit.start(app, element, {
node_ids: [0, 8],
data,
form: null,
error: null
});
});
}
</script>
