ICLR 2022 · 2021

LoRA: Low-Rank Adaptation of Large Language Models

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen

Microsoft

Abstract summary

Introduces Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning technique that freezes the pretrained weights and trains small low-rank matrices added alongside them. Demonstrates that LoRA matches full fine-tuning performance on multiple benchmarks while updating only 0.1%–1% of parameters. For GPT-3 175B, the paper reports about 3× lower GPU memory use during training and roughly 10,000× fewer trainable parameters than full fine-tuning.

Our summary in our own words — see the canonical source links below for the original abstract.
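
As a rough illustration of the mechanism summarized above, the sketch below adds a low-rank update B·A, scaled by alpha / r, to the output of a frozen weight matrix W0. The toy dimensions, the seed, and the helper names (matvec, lora_forward) are assumptions made for this example; the Gaussian initialization of A and the zero initialization of B follow the paper's description, so the adapted layer starts out identical to the base layer.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W0, A, B, x, alpha):
    """Compute W0 x + (alpha / r) * B (A x). Only A and B would be trained."""
    r = len(A)                          # rank = number of rows in A
    base = matvec(W0, x)                # frozen pretrained path
    update = matvec(B, matvec(A, x))    # low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d_out, d_in, r, alpha = 4, 6, 2, 4.0    # hypothetical toy sizes

W0 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # r x d_in, Gaussian init
B = [[0.0] * r for _ in range(d_out)]                                   # d_out x r, zero init
x = [random.gauss(0, 1) for _ in range(d_in)]

y_base = matvec(W0, x)
y_lora = lora_forward(W0, A, B, x, alpha)
print(y_lora == y_base)  # True: with B all zeros the update contributes nothing
```

Because B starts at zero, training begins from exactly the pretrained behavior, and only the r × (d_in + d_out) entries of A and B need gradients.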

Why we cite this paper

As of 2026, LoRA is the most widely used parameter-efficient fine-tuning method: it is supported by Hugging Face's PEFT library and by the major open-weight LLM serving stacks. Engineering signals around LoRA adapter ecosystems are among the cleanest measures of practical AI-application velocity in our /trend/ai-coding-tools-2026 leaderboard.

Key findings

1. LoRA matches full fine-tuning quality on the paper's benchmarks while updating only 0.1%–1% of base-model parameters.
2. Adapters can be swapped at inference time over a shared frozen base model, enabling multi-tenant LLM serving.
3. GPU memory requirements during training drop by about 3× on GPT-3 175B, and per-task checkpoint storage drops by orders of magnitude because only the adapter weights are saved.
4. Established as the default PEFT method for open-weight models (Llama, Mistral, Qwen, Gemma).
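
The adapter-swapping finding can be sketched as follows: one frozen base matrix is shared, and each tenant supplies its own small (A, B, scale) triple that is applied per request. The tenant names, matrix values, and function names here are hypothetical and chosen only so the arithmetic is easy to follow. The merge helper shows the alternative of folding a single adapter into the base weights, as W0 + scale · B·A, when only one task is served.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adapted_forward(W0, adapter, x):
    """Apply the shared base plus one tenant's low-rank update."""
    A, B, scale = adapter
    base = matvec(W0, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]

def merge(W0, adapter):
    """Return a new matrix W0 + scale * (B A); W0 itself is not modified."""
    A, B, scale = adapter
    BA = matmul(B, A)
    return [[w + scale * u for w, u in zip(w_row, u_row)]
            for w_row, u_row in zip(W0, BA)]

# Shared frozen base (2 x 3), hypothetical values.
W0 = [[1.0, 0.0, 2.0],
      [0.0, 1.0, -1.0]]

# Rank-1 adapters per tenant: A is 1 x 3, B is 2 x 1, plus a scale factor.
adapters = {
    "tenant_a": ([[1.0, 0.0, 0.0]], [[0.5], [0.0]], 1.0),
    "tenant_b": ([[0.0, 1.0, 1.0]], [[0.0], [2.0]], 0.5),
}

x = [1.0, 2.0, 3.0]
outputs = {name: adapted_forward(W0, ad, x) for name, ad in adapters.items()}
merged_a = matvec(merge(W0, adapters["tenant_a"]), x)

print(outputs["tenant_a"])  # [7.5, -1.0]
print(outputs["tenant_b"])  # [7.0, 4.0]
print(merged_a)             # [7.5, -1.0], same as the unmerged tenant_a path
```

Keeping adapters unmerged lets one copy of the base weights serve many tenants, at the cost of the extra low-rank multiply per request; merging removes that cost for a single task.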

Canonical sources

https://arxiv.org/abs/2106.09685
https://www.semanticscholar.org/paper/a8ca46b171467ceb2d7652fbb7e5e2f4631c4ac0

Related glossary terms

Fine-tuning, LoRA (Low-Rank Adaptation), Foundation Model, Open-Weight Model

Frequently Asked Questions

What is LoRA?

LoRA stands for Low-Rank Adaptation, a parameter-efficient fine-tuning method that adds small trainable low-rank matrices to a frozen base model. See /define/lora for the full term definition.

Who created LoRA?

Edward J. Hu and colleagues at Microsoft, published at ICLR 2022 (arXiv:2106.09685).

How many parameters does LoRA update?

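Only 0.1%–1% of the base model's parameters, while matching full fine-tuning quality on the paper's benchmarks. For GPT-3 175B, the paper reports about 3× lower GPU memory use during training, and per-task checkpoints shrink by orders of magnitude because only the adapter weights are stored.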
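
To see where a sub-1% figure can come from, the arithmetic below compares the trainable parameters of a rank-r adapter, r × (d + k), with the d × k entries of the weight matrix it adapts. The 4096 × 4096 shape and r = 8 are hypothetical values picked for illustration, and the function name is made up for this sketch. The whole-model fraction depends on which matrices receive adapters and is usually lower than the per-matrix figure.

```python
def lora_param_fraction(d, k, r):
    """Return (adapter params, full-matrix params, ratio) for one d x k weight."""
    full = d * k          # entries in the frozen weight matrix
    lora = r * (d + k)    # entries in B (d x r) plus A (r x k)
    return lora, full, lora / full

lora, full, frac = lora_param_fraction(d=4096, k=4096, r=8)
print(f"{lora} trainable vs {full} frozen ({frac:.2%})")
# 65536 trainable vs 16777216 frozen (0.39%)
```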

Why is LoRA the default fine-tuning method?

Its parameter efficiency makes specialization cheap, and adapters can be mixed and matched at inference time to enable multi-tenant serving. It is the standard PEFT method for open-weight models like Llama, Mistral, Qwen, and Gemma.

Signed The Data Nerd · pseudonymous narrator · methodology over personality

Read our own methodology paper

Code-Side Sourcing methodology, replicable on the open dataset.

Read /methodology

Other research papers

Attention Is All You Need

Language Models are Few-Shot Learners

Training language models to follow instructions with human feedback

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Constitutional AI: Harmlessness from AI Feedback

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
