QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation

ICCV 2025
¹Harbin Institute of Technology  ²Li Auto  ³University of Science and Technology of China

Abstract

Existing text-to-image models often rely on parameter-efficient fine-tuning techniques such as Low-Rank Adaptation (LoRA) to customize visual attributes. However, when multiple LoRA models are combined for content-style fusion, unstructured modifications of the weight matrices often lead to undesired feature entanglement between content and style attributes. We propose QR-LoRA, a novel fine-tuning framework that leverages QR decomposition for structured parameter updates which effectively separate visual attributes. Our key insight is that the orthogonal Q matrix naturally minimizes interference between different visual features, while the upper triangular R matrix efficiently encodes attribute-specific transformations. Our approach fixes both the Q and R matrices and trains only an additional task-specific $\Delta R$ matrix. This structured design reduces the number of trainable parameters to half that of conventional LoRA and, owing to the strong disentanglement between $\Delta R$ matrices, supports merging multiple adaptations without cross-contamination. Experiments demonstrate that QR-LoRA achieves superior disentanglement in content-style fusion tasks, establishing a new paradigm for parameter-efficient, disentangled fine-tuning of generative models.
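
To make the structured update concrete, here is a minimal PyTorch sketch written from the description above. One way to reconcile the full formulation $W' = Q(R + \Delta R)$ with the halved parameter count is to restrict $\Delta R$ to its top $r$ rows, so the update collapses to $W_0 + Q_r \Delta R$; the sketch adopts that reading as an assumption, and the class name `QRLoRALinear`, the `rank` hyperparameter, and the zero initialization are illustrative, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QRLoRALinear(nn.Module):
    """Sketch of the structured update W' = Q(R + dR) = W0 + Q_r @ dR,
    where Q, R come from a QR decomposition of the frozen pretrained
    weight and only the task-specific residual dR is trained."""

    def __init__(self, base: nn.Linear, rank: int = 16):
        super().__init__()
        W0 = base.weight.data.clone()        # (out_features, in_features)
        Q, _ = torch.linalg.qr(W0)           # reduced QR: Q has orthonormal columns
        self.register_buffer("W0", W0)       # frozen pretrained weight
        self.register_buffer("Q_r", Q[:, :rank].contiguous())  # fixed orthogonal basis
        self.bias = base.bias                # reused pretrained bias, assumed frozen
        # Only dR is trainable; zero init keeps the layer identical to the
        # pretrained one at step 0, since W0 + Q_r @ 0 = W0.
        self.delta_R = nn.Parameter(torch.zeros(rank, W0.shape[1]))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # With dR nonzero only in its top r rows, Q(R + dR) = W0 + Q_r @ dR.
        W = self.W0 + self.Q_r @ self.delta_R
        return F.linear(x, W, self.bias)
```

Merging a content and a style adaptation then amounts to summing their residuals before the projection, e.g. `W = W0 + Q_r @ (dR_content + dR_style)`, which is where the low cross-similarity between $\Delta R$ matrices pays off.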


Motivation and Analysis of QR-LoRA Methodology

Through extensive matrix-similarity analysis across different fine-tuning tasks, we observe that both the Q matrices and the A matrices exhibit high similarity scores, while the R matrices maintain lower but still significant similarities (>0.75). This observation motivates us to treat the stable Q matrices as fixed orthogonal bases and to introduce a ∆R mechanism for precise feature control. The design yields superior feature disentanglement while preserving model stability, as evidenced by consistently low similarity scores (<0.2) between the ∆R matrices of different tasks.
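
As a small illustration of how such a similarity analysis can be run, the helper below compares two weight matrices by cosine similarity of their flattened entries. This is a sketch under an assumption: the text above reports scalar similarity scores but does not name the metric, and flattened cosine similarity is one standard choice; the variable names are hypothetical.

```python
import torch
import torch.nn.functional as F

def matrix_similarity(M1: torch.Tensor, M2: torch.Tensor) -> float:
    """Cosine similarity between two matrices, flattened to vectors.
    Assumed metric: the analysis above does not specify its exact
    similarity measure."""
    return F.cosine_similarity(M1.flatten(), M2.flatten(), dim=0).item()

# Illustrative usage with trained residuals from two different tasks
# (hypothetical tensors): matrix_similarity(delta_R_content, delta_R_style)
# would be expected to stay low (<0.2), while the Q factors of the two
# tasks remain highly similar.
```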


Qualitative Comparisons on the Content-Style Fusion Generation Task

Comparison of our QR-LoRA against state-of-the-art methods on SDXL, and against a naive merging baseline on SD3 and FLUX.1-dev, demonstrating the model-agnostic nature and superior performance of our framework. Zoom in to view details.