LoRA & QLoRA in Practice
LoRA Configuration
Understanding LoRA parameters is crucial for successful fine-tuning. Let's explore each parameter and its impact.
Core LoRA Parameters
The LoraConfig Object
from peft import LoraConfig, TaskType
config = LoraConfig(
r=16, # Rank
lora_alpha=32, # Alpha scaling
target_modules="all-linear", # Which layers
lora_dropout=0.05, # Dropout
bias="none", # Bias training
task_type=TaskType.CAUSAL_LM # Task type
)
Rank (r)
The rank determines the size of the LoRA matrices and directly affects capacity.
Original weight: W ∈ R^(d×k)
LoRA adds: A ∈ R^(d×r), B ∈ R^(r×k)
Output: W' = W + (A × B)
Trainable params per adapted layer = r × (d + k)
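As a quick sanity check, here is a minimal sketch in plain Python of that parameter count for a single adapted layer; the 3072×3072 projection shape is an illustrative assumption, not tied to a specific model.
def lora_param_count(d, k, r):
    # A contributes d*r parameters, B contributes r*k
    return r * (d + k)

# Hypothetical 3072x3072 projection
print(lora_param_count(3072, 3072, r=16))  # 98304
print(lora_param_count(3072, 3072, r=64))  # 393216 -- 4x the rank, 4x the parameters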
Rank Selection Guide
| Rank | Parameters | Memory | Use Case |
|---|---|---|---|
| 4 | Very few | Minimal | Simple style changes |
| 8 | Few | Low | Basic tasks |
| 16 | Moderate | Medium | Default choice |
| 32 | Many | Higher | Complex tasks |
| 64+ | Very many | High | Maximum capacity |
# Low rank for simple tasks
simple_config = LoraConfig(r=8, ...)
# Higher rank for complex domain knowledge
complex_config = LoraConfig(r=32, ...)
Alpha (lora_alpha)
Alpha is a scaling factor that controls how much the LoRA update affects the output.
Scaling = alpha / r
Output = W + (alpha/r) × (A × B)
Common Patterns
# Pattern 1: Alpha = Rank (scaling = 1)
config = LoraConfig(r=16, lora_alpha=16)
# Pattern 2: Alpha = 2×Rank (scaling = 2)
config = LoraConfig(r=16, lora_alpha=32)
# Pattern 3: Fixed alpha (adjust with rank)
config = LoraConfig(r=32, lora_alpha=16) # scaling = 0.5
Rule of thumb: Start with alpha = rank (scaling = 1), then adjust based on training stability.
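To make the scaling concrete, here is a minimal sketch that applies the formula above directly in PyTorch; the dimensions are illustrative, and it mirrors the math rather than PEFT's internal implementation.
import torch

d, k, r, alpha = 3072, 3072, 16, 32
W = torch.randn(d, k)          # frozen base weight, W in R^(d x k)
A = torch.randn(d, r) * 0.01   # LoRA A in R^(d x r), trainable
B = torch.zeros(r, k)          # LoRA B in R^(r x k), trainable, initialized to zero

# Effective weight after the scaled low-rank update
W_prime = W + (alpha / r) * (A @ B)

# B starts at zero, so the adapted model initially matches the base model
print(torch.allclose(W_prime, W))  # True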
Target Modules
Which layers to add LoRA adapters to. This significantly affects both capacity and memory.
Common Options
# All linear layers (recommended for 2025)
config = LoraConfig(target_modules="all-linear")
# Attention only (traditional approach)
config = LoraConfig(target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
# Attention + MLP
config = LoraConfig(target_modules=[
"q_proj", "k_proj", "v_proj", "o_proj",
"gate_proj", "up_proj", "down_proj"
])
Finding Target Modules
import torch
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
# Print all linear layer names
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear):
        print(name)
Output for Llama:
model.layers.0.self_attn.q_proj
model.layers.0.self_attn.k_proj
model.layers.0.self_attn.v_proj
model.layers.0.self_attn.o_proj
model.layers.0.mlp.gate_proj
model.layers.0.mlp.up_proj
model.layers.0.mlp.down_proj
...
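Since the same few projection names repeat in every layer, a small helper (a convenience sketch, not a PEFT utility) can collapse the listing to the unique suffixes you would pass as target_modules, reusing the model loaded above:
# Unique final name components of all Linear layers
linear_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear)
}
print(sorted(linear_names))
# ['down_proj', 'gate_proj', 'k_proj', 'lm_head', 'o_proj', 'q_proj', 'up_proj', 'v_proj']
# lm_head is usually left out of target_modules (the "all-linear" shortcut also skips it)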
Dropout
Regularization to prevent overfitting.
# No dropout (for larger datasets)
config = LoraConfig(lora_dropout=0.0)
# Light dropout (for smaller datasets)
config = LoraConfig(lora_dropout=0.05)
# Higher dropout (for very small datasets or overfitting)
config = LoraConfig(lora_dropout=0.1)
Bias Training
Whether to train the bias terms along with LoRA.
# None (default, recommended)
config = LoraConfig(bias="none")
# All biases
config = LoraConfig(bias="all")
# Only LoRA biases
config = LoraConfig(bias="lora_only")
Recommendation: Keep bias="none" unless you have a specific reason to train biases.
Complete Configuration Examples
General Purpose (Recommended Default)
from peft import LoraConfig
config = LoraConfig(
r=16,
lora_alpha=16,
target_modules="all-linear",
lora_dropout=0.0,
bias="none",
task_type="CAUSAL_LM"
)
Memory Constrained
config = LoraConfig(
r=8,
lora_alpha=16,
target_modules=["q_proj", "v_proj"], # Fewer modules
lora_dropout=0.0,
bias="none",
task_type="CAUSAL_LM"
)
Maximum Capacity
config = LoraConfig(
r=64,
lora_alpha=128,
target_modules="all-linear",
lora_dropout=0.05,
bias="none",
task_type="CAUSAL_LM"
)
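For a rough sense of how these choices trade off, the sketch below estimates adapter sizes for the three configurations, assuming Llama-3.2-3B-style dimensions (hidden size 3072, MLP size 8192, grouped-query key/value width 1024, 28 decoder layers); treat the numbers as back-of-the-envelope estimates, not exact counts.
hidden, inter, kv, layers = 3072, 8192, 1024, 28

# (in_features, out_features) of each adapted projection in one decoder layer
shapes = {
    "q_proj": (hidden, hidden), "k_proj": (hidden, kv), "v_proj": (hidden, kv),
    "o_proj": (hidden, hidden), "gate_proj": (hidden, inter),
    "up_proj": (hidden, inter), "down_proj": (inter, hidden),
}

def adapter_params(r, modules):
    return layers * sum(r * (d + k) for name, (d, k) in shapes.items() if name in modules)

print(adapter_params(16, list(shapes)))            # ~24M  (general purpose)
print(adapter_params(8, ["q_proj", "v_proj"]))     # ~2.3M (memory constrained)
print(adapter_params(64, list(shapes)))            # ~97M  (maximum capacity)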
Applying LoRA to a Model
from transformers import AutoModelForCausalLM
from peft import get_peft_model, LoraConfig
# Load base model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
# Create LoRA config
config = LoraConfig(
r=16,
lora_alpha=16,
target_modules="all-linear",
lora_dropout=0.0,
bias="none",
task_type="CAUSAL_LM"
)
# Apply LoRA
model = get_peft_model(model, config)
# Check trainable parameters
model.print_trainable_parameters()
# Example output (exact numbers vary by model and PEFT version):
# trainable params: ~24M || all params: ~3.2B || trainable%: ~0.75%
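Once training finishes, only the adapter weights need to be saved and shipped; the snippet below uses standard PEFT save/load calls, with "./lora-adapter" as a placeholder path.
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Save just the LoRA adapter (tens of MB, not the full 3B model)
model.save_pretrained("./lora-adapter")

# Later: reload the frozen base model and attach the adapter
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = PeftModel.from_pretrained(base, "./lora-adapter")

# Optionally merge the adapter into the base weights for deployment
merged = model.merge_and_unload()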
Tuning Tips
| Issue | Solution |
|---|---|
| Underfitting | Increase rank, add more target modules |
| Overfitting | Decrease rank, add dropout |
| Memory issues | Decrease rank, fewer target modules |
| Slow training | Lower rank, use QLoRA |
Next, we'll add 4-bit quantization with QLoRA to dramatically reduce memory requirements.