An Overview of Prompt Learning in Natural Language Processing
This article reviews the evolution of NLP training paradigms, explains why prompt learning is needed, defines its core concepts, and surveys major hard‑template and soft‑template methods such as PET, LM‑BFF, P‑tuning, and Prefix‑tuning, highlighting their advantages for few‑shot and zero‑shot scenarios.
Recent years have seen rapid advances in NLP, moving from traditional machine‑learning models to deep learning, then to pretrained models with fine‑tuning, and finally to the emerging paradigm of pretrained models combined with prompts, which dramatically reduces the amount of labeled data required.
1. NLP Training Paradigms
The field is commonly divided into four stages: (1) traditional models using handcrafted features, (2) deep‑learning models such as word2vec + LSTM, (3) pretrained models with fine‑tuning (e.g., BERT), and (4) pretrained models with prompt‑based prediction, which cuts training data needs.
2. Why Prompt Learning?
Fine‑tuning suffers from a gap between pretraining objectives (auto‑regressive or auto‑encoding) and downstream tasks, leading to poor few‑shot performance and high resource consumption when large models are adapted for specific tasks.
3. What Is Prompt Learning?
Prompt learning leverages the knowledge already stored in pretrained language models by converting downstream tasks into a format similar to the pretraining task, typically using a natural‑language template with masked tokens. This approach relies on three steps: designing a pretraining‑compatible task, crafting input templates (prompt engineering), and mapping model outputs to labels (answer engineering).
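To make the three steps concrete, here is a minimal sketch of recasting sentiment classification as a cloze task. All names are illustrative, and `mask_fill_scores` is a hypothetical stand‑in for a real masked language model's distribution over candidate words at the `[MASK]` position:

```python
def apply_template(text: str) -> str:
    """Prompt engineering: wrap the input in a cloze-style template."""
    return f"{text} Overall, it was [MASK]."

# Answer engineering: map candidate mask fillers to task labels.
VERBALIZER = {"great": "positive", "terrible": "negative"}

def mask_fill_scores(prompt: str, candidates):
    # Stand-in for an MLM's mask-filling scores; here, naive keyword
    # overlap so the sketch runs without a pretrained model.
    return {w: prompt.lower().count(w[:4]) + 0.1 for w in candidates}

def classify(text: str) -> str:
    prompt = apply_template(text)                 # step 2: template
    scores = mask_fill_scores(prompt, VERBALIZER) # step 1: cloze task
    best_word = max(scores, key=scores.get)
    return VERBALIZER[best_word]                  # step 3: label mapping

print(classify("A great film with great acting."))  # -> positive
```

In a real system the scoring step would query a pretrained masked language model; only the template and the verbalizer are task‑specific.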
4. Common Prompt Learning Methods
Hard‑Template Methods
Examples include PET (Pattern‑Exploiting Training) and LM‑BFF, which use manually designed or automatically searched discrete token templates. PET reformulates classification as a cloze task scored with a masked‑language‑model loss (combined with an auxiliary language‑modeling loss during training), while LM‑BFF adds in‑context demonstrations and automatic generation of prompts and label words.
Soft‑Template Methods
Soft‑template approaches replace discrete tokens with learnable embeddings. P‑tuning inserts trainable pseudo‑prompt tokens into the input and encodes them with a small LSTM before optimization. Prefix‑tuning extends this idea by prepending learnable continuous vectors at every transformer layer, keeping the pretrained model frozen. Prompt tuning simplifies further: the base model stays fixed and only a short sequence of prompt embeddings at the input layer is learned.
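The core mechanics shared by these soft‑template methods can be sketched in a few lines (shapes and names are illustrative): continuous prompt vectors are prepended to frozen token embeddings, and only those vectors would receive gradient updates.

```python
import numpy as np

# Sketch of soft prompt tuning: trainable prompt vectors are
# prepended to frozen token embeddings. Dimensions are illustrative.
rng = np.random.default_rng(0)

vocab_size, embed_dim, prompt_len = 100, 16, 5
frozen_embeddings = rng.normal(size=(vocab_size, embed_dim))  # pretrained, never updated
soft_prompt = rng.normal(size=(prompt_len, embed_dim))        # the only trainable parameters

def embed_with_prompt(token_ids):
    """Look up frozen token embeddings and prepend the soft prompt."""
    tokens = frozen_embeddings[token_ids]           # (seq_len, embed_dim)
    return np.concatenate([soft_prompt, tokens])    # (prompt_len + seq_len, embed_dim)

x = embed_with_prompt([3, 41, 7])
print(x.shape)  # (8, 16)
```

Here only `prompt_len × embed_dim` = 80 values are trainable, versus every weight of the model under full fine‑tuning; prefix‑tuning applies the same idea at every transformer layer rather than only at the input.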
5. Summary
Prompt learning consists of prompt templates, verbalizers (label mappings), and pretrained language models. Hard‑template methods rely on discrete token designs, while soft‑template methods optimize continuous prompt embeddings, achieving comparable performance to full fine‑tuning with far fewer trainable parameters.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.