
Language Model as a Service and Black‑Box Optimization: Insights from Prof. Qiu Xipeng’s Talk

Prof. Qiu Xipeng’s talk highlighted how large language models can be offered as a service and efficiently adapted via in‑context learning, lightweight label‑tuning, and gradient‑free black‑box optimization, showcasing a unified asymmetric Transformer (CPT) that handles understanding, generation, aspect‑based sentiment analysis (ABSA), and named‑entity recognition (NER) tasks while reducing resource demands.

Xiaohongshu Tech REDtech

Background: Deep learning has become the standard technique in NLP. On October 15, the REDtech youth technology salon invited Prof. Qiu Xipeng (Fudan University) to present the talk “Language Model as a Service and Black‑Box Optimization”.

Pre‑training vs. Fine‑tuning: Large upstream models (e.g., OpenAI, Google) are trained on massive corpora and exhibit strong few‑shot abilities. However, as model size grows, the traditional pre‑training + fine‑tuning pipeline becomes impractical because the models are not open‑source and downstream users lack the resources to run them.

In‑context learning: A new paradigm where a prompt or a few examples are fed to the frozen model, allowing it to adapt to downstream tasks without parameter updates.
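To make the idea concrete, here is a minimal sketch of how a few-shot prompt is assembled for a frozen, served model. The task, the demonstrations, and the prompt template are all illustrative assumptions, not the talk's examples:

```python
# In-context learning sketch: the frozen model receives a prompt containing a
# few labeled demonstrations plus the new input; no parameters are updated.
def build_few_shot_prompt(examples, query):
    """Concatenate labeled demonstrations and the unlabeled query."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

demos = [
    ("The food was wonderful.", "positive"),
    ("Service was painfully slow.", "negative"),
]
prompt = build_few_shot_prompt(demos, "Great view, terrible coffee.")
# `prompt` would be sent to the served model, which completes the label.
```

The served model conditions on the demonstrations and completes the final `Sentiment:` slot, adapting to the task without any gradient update.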

Language‑Model‑as‑a‑Service (LMaaS): The idea of deploying a large language model as a remote service. Example applications include generating web‑button text, converting natural language to mathematical formulas, etc. Two main challenges are (1) which base model to serve and (2) how to adapt it to specific downstream tasks without access to its parameters.

Unified Pre‑training Model – Unified Foundation Model: A single model that can handle both understanding and generation tasks. The proposed CPT model is an asymmetric Transformer (Encoder‑Decoder) that merges BERT‑style understanding and BART‑style generation, achieving state‑of‑the‑art results on many Chinese benchmarks while being >2× faster in generation.
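A back-of-envelope calculation shows why an asymmetric design speeds up generation (the layer counts below are illustrative assumptions, not CPT's exact configuration). The encoder runs once per input, but an autoregressive decoder runs once per output token, so moving depth from decoder to encoder cuts per-token cost:

```python
# Count transformer layer passes for one generation call: the encoder runs
# once over the source, the decoder runs once per generated token.
def layer_passes(enc_layers, dec_layers, out_tokens):
    return enc_layers + dec_layers * out_tokens

symmetric = layer_passes(6, 6, out_tokens=50)    # balanced 6+6 stack: 306
asymmetric = layer_passes(10, 2, out_tokens=50)  # deep encoder, shallow decoder: 110
speedup = symmetric / asymmetric                 # roughly 2.8x under these assumptions
```

Under these assumed depths the asymmetric stack does under half the layer passes per generated sequence, consistent with the >2× generation speedup claimed for CPT.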

Seq2Seq Masked Language Modeling (T5): Treats many NLP tasks as sequence‑to‑sequence problems. While powerful, some tasks (e.g., fine‑grained aspect‑based sentiment analysis) are difficult to cast directly.

ABSA as Sequence Generation: Reformulates the seven ABSA subtasks into a single sequence‑generation task—outputting aspect term positions, opinion words, and sentiment polarity. A BART‑based encoder‑decoder can handle all subtasks with a simple unified framework.
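A hedged sketch of this framing: each (aspect span, opinion span, polarity) triplet is linearized into token-position indices plus a polarity tag, so one encoder-decoder can emit answers for the subtasks as a flat target sequence. The exact linearization used in the talk's BART-based framework may differ:

```python
# Linearize ABSA triplets into a flat generation target of position indices
# and polarity tags (indices inclusive); the decoder learns to emit this.
def linearize_triplets(triplets):
    """triplets: list of ((aspect_start, aspect_end),
    (opinion_start, opinion_end), polarity)."""
    target = []
    for (a_s, a_e), (o_s, o_e), polarity in triplets:
        target.extend([a_s, a_e, o_s, o_e, polarity])
    return target

tokens = ["The", "battery", "life", "is", "great"]
# aspect "battery life" = tokens 1..2, opinion "great" = token 4, positive
target = linearize_triplets([((1, 2), (4, 4), "POS")])
# target == [1, 2, 4, 4, "POS"]
```

Simpler subtasks (e.g., aspect extraction only) drop fields from the same format, which is what lets one framework cover all seven.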

Unified NER Generation: Extends the seq2seq approach to continuous, nested, and discontinuous named‑entity recognition by generating entity spans and types, achieving strong results on standard NER benchmarks.
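A sketch of the generative NER output format: each entity is emitted as its token indices followed by a type tag. Because the indices are just a list, the same format covers flat, nested, and discontinuous mentions. The example sentence, indices, and type names below are illustrative:

```python
# Linearize entities into a flat generation target; discontinuous entities
# simply list non-adjacent indices, and overlapping/nested entities are
# separate items -- neither is expressible in plain BIO tagging.
def linearize_entities(entities):
    """entities: list of (token_index_list, entity_type)."""
    target = []
    for indices, etype in entities:
        target.extend(indices + [etype])
    return target

# "have pain in left shoulder": discontinuous mention "pain ... shoulder"
# (tokens 1 and 4) overlapping the flat mention "left shoulder" (tokens 3-4)
target = linearize_entities([([1, 4], "SYMPTOM"), ([3, 4], "BODY_PART")])
# target == [1, 4, "SYMPTOM", 3, 4, "BODY_PART"]
```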

Efficient Tuning Algorithms:

𝒴‑Tuning (Label‑Tuning): Keeps the pre‑trained feature extractor frozen and only tunes a lightweight label‑space module, greatly reducing memory and data requirements.
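A toy sketch of the division of labor in 𝒴‑Tuning: the feature extractor stays frozen (a fixed random projection stands in for the pre-trained encoder), and only a small label-space module is fit. Here a nearest-class-mean rule stands in for the learned label representations described in the talk; all shapes and data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
D, C, N = 32, 3, 90
W_frozen = rng.normal(size=(D, D))                    # never updated
labels = rng.integers(0, C, size=N)
X = rng.normal(size=(N, D)) * 0.3 + labels[:, None]   # class-shifted toy inputs

def features(x):
    # frozen feature extractor: no gradients flow here
    return x @ W_frozen / np.sqrt(D)

# the only "tuned" parameters: one lightweight vector per label
F = features(X)
label_emb = np.stack([F[labels == c].mean(axis=0) for c in range(C)])

# classify by proximity to the label vectors in the frozen feature space
dists = ((F[:, None, :] - label_emb[None, :, :]) ** 2).sum(-1)
acc = (dists.argmin(axis=1) == labels).mean()
```

The point of the sketch is the asymmetry: adapting `C × D` label parameters is far cheaper, in memory and data, than updating the encoder's weights.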

Black‑Box Tuning: Treats the frozen model as a black box and optimizes prompts without gradients, converting the problem into a gradient‑free optimization task.

Black‑Box Tuning v2 (BBTv2): Removes the need for prompt pre‑training, introduces deep prompts at every layer, and uses improved random projection, achieving better performance with only ~10k tunable parameters.

Summary of LMaaS Application Modes:

Text prompt – handcrafted textual prompts (feature‑engineering style).

In‑context learning – few‑shot adaptation via examples.

Data generation – generate synthetic data with a large model to train smaller models.

Black‑box optimization – gradient‑free prompt tuning.

Feature‑based learning – use model outputs as features (e.g., 𝒴‑Tuning).

Q&A Highlights:

Large pre‑trained models are widely used in industry, but cost and efficient adaptation remain major challenges.

Generative extraction (e.g., aspect‑based sentiment) still suffers from data scarcity in real‑world scenarios, though future model advances are expected to alleviate this.

Tags: LLM, NLP, pretraining, prompt tuning, language model, black-box optimization, 𝒴‑Tuning
Written by

Xiaohongshu Tech REDtech

Official account of the Xiaohongshu tech team, sharing tech innovations and problem insights, advancing together.
