Instruction Tuning

Fine-tuning a model on instruction-following examples to improve task adherence.

A form of fine-tuning that improves a generative AI model's ability to follow instructions. Instruction tuning involves training a model on a series of instruction prompts paired with the desired responses, typically covering a wide variety of tasks. The resulting instruction-tuned model then tends to generate useful responses to zero-shot prompts, including prompts for tasks it was not explicitly trained on.
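As a rough illustration of what one training example can look like, the sketch below joins an instruction and its response into a single token sequence and masks the instruction tokens so that only the response contributes to the loss. The prompt template, the `toy_tokenize` helper, and the `Example` class are hypothetical and exist only for this sketch; the value -100 is assumed because it is the default ignore index of PyTorch's cross-entropy loss. Masking the prompt is a common choice, not a requirement.

```python
from dataclasses import dataclass

# Assumption: -100 matches the default ignore_index of PyTorch's cross-entropy loss.
IGNORE_INDEX = -100


@dataclass
class Example:
    """One instruction-tuning record (hypothetical schema)."""
    instruction: str
    response: str


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Hypothetical whitespace tokenizer: each unseen word gets the next free id."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def build_training_pair(
    ex: Example, vocab: dict[str, int]
) -> tuple[list[int], list[int]]:
    """Return (input_ids, labels) with the prompt positions masked in labels."""
    # Illustrative template; real projects define their own prompt format.
    prompt_ids = toy_tokenize(f"Instruction: {ex.instruction} Response:", vocab)
    response_ids = toy_tokenize(ex.response, vocab)
    input_ids = prompt_ids + response_ids
    # Only the response tokens carry a training signal.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels


vocab: dict[str, int] = {}
ex = Example("Translate 'bonjour' to English.", "hello")
input_ids, labels = build_training_pair(ex, vocab)
```

In a real pipeline the toy tokenizer would be replaced by the model's own tokenizer, and the resulting `input_ids` and `labels` would be batched and passed to a standard next-token-prediction loss.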

Compare and contrast with:

parameter-efficient tuning

prompt tuning

Examples

1. Google's FLAN showed that instruction tuning a 137B-parameter language model on 62 NLP datasets substantially improved zero-shot performance on unseen task types, surpassing zero-shot GPT-3 (175B) on 20 of the 25 datasets evaluated and helping establish instruction tuning as a key technique.
2. OpenAI's InstructGPT fine-tuned GPT-3 on roughly 13,000 prompts with human-written demonstrations, followed by reinforcement learning from human feedback (RLHF), to turn a text-completion model into a helpful instruction-following assistant - the predecessor to ChatGPT.
3. Alpaca (Stanford) instruction-tuned LLaMA-7B on 52,000 instruction-following examples generated with OpenAI's text-davinci-003 - demonstrating that a small, openly released model can approach the instruction-following behavior of text-davinci-003 at low cost (under $600 by the authors' estimate).

Real-world uses

Created for this library

1. An LLM team instruction-tunes a base model on labeled (task, instruction, response) triples so that it follows user instructions reliably.
2. An enterprise legal team instruction-tunes its base model on labeled contract-review examples so that the model produces consistent outputs.
3. A SaaS team instruction-tunes a foundation model on internal use-case examples so that the assistant follows the company's preferred response format.
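The uses above all start from a set of labeled examples. A minimal sketch of how such examples might be checked and serialized before training is shown below; the field names (`task`, `instruction`, `response`), the helper functions, and the sample records are all hypothetical and not taken from any particular team's pipeline.

```python
import json

# Hypothetical schema for one labeled example.
REQUIRED_FIELDS = ("task", "instruction", "response")


def is_valid(record: dict) -> bool:
    """A record is usable only if every required field is a non-empty string."""
    return all(
        isinstance(record.get(field), str) and record[field].strip()
        for field in REQUIRED_FIELDS
    )


def to_jsonl(records: list[dict]) -> str:
    """Serialize valid records as JSON Lines, one example per line."""
    lines = [
        json.dumps({field: r[field].strip() for field in REQUIRED_FIELDS})
        for r in records
        if is_valid(r)
    ]
    return "\n".join(lines)


# Illustrative records; the second is dropped because its response is empty.
records = [
    {
        "task": "contract_review",
        "instruction": "List the termination clauses.",
        "response": "Either party may terminate with 30 days' written notice.",
    },
    {
        "task": "contract_review",
        "instruction": "Summarize the indemnity terms.",
        "response": "",
    },
]
jsonl = to_jsonl(records)
```

Filtering incomplete records up front is one simple way to keep the tuned model's outputs consistent, since every surviving example has the same three fields.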

