
How Financial Institutions Are Building Their Own Large Language Models

This article explores how the finance sector is creating specialized large language models—covering the shift from generic to domain‑specific models, training innovations, evaluation methods, and real‑world applications such as marketing, customer service, risk control, and operational analytics.

Data Thinking Notes

From General Large Models to Financial Large Models

The finance industry’s three core traits—data‑driven, knowledge‑intensive, and complex workflows—align closely with the capabilities of large models, making finance a natural arena for domain‑specific AI.

Data‑driven: All decisions rely on data analysis.

Knowledge‑intensive: Specialized terminology (e.g., KS, MOB, COB) and dense sub‑domains require deep expertise.

Complex workflows: Extensive human collaboration and intricate processes.

These characteristics suggest that a tailored financial large model can unlock significant value.

Why Generic Models Fall Short

Generic models struggle with:

Financial knowledge gaps: They may not understand industry‑specific terms.

Capability issues: Hallucinations, accuracy problems, and forgetting hinder data‑critical finance tasks.

Cost concerns: High training and inference expenses motivate the development of smaller, domain‑focused models.

Addressing these challenges justifies building a finance‑specific model.

Cost‑Effective Domain Model Advantages

Specialized models can achieve performance comparable to much larger generic models, reducing training time from months to weeks and cutting inference hardware requirements.

Self‑Developed Financial Large Model Enhancements

Our in‑house model incorporates four key enhancements:

Chinese language augmentation

Financial domain augmentation

Dialogue augmentation

Application‑specific augmentation

Training Technique Innovations

Injecting Professional Financial Knowledge – Enrich pre‑training with extensive financial corpora, expand instruction data for diverse scenarios, and ensure alignment data reflects industry preferences.

Data Preparation – Build a pipeline that extracts, cleans, and validates data, resulting in ~10 TB of general corpus plus 1 TB of financial text.
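The extract–clean–validate pipeline above can be sketched as a minimal deduplicating filter. The regex cleaning rule, the length threshold, and the hash-based exact deduplication are illustrative assumptions; a production pipeline would add fuzzy deduplication, quality scoring, and PII filtering.

```python
import hashlib
import re

def clean(text: str) -> str:
    """Strip markup remnants and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)  # drop HTML tags
    return re.sub(r"\s+", " ", text).strip()

def build_corpus(raw_docs, min_chars=200):
    """Extract -> clean -> validate -> deduplicate."""
    seen, corpus = set(), []
    for doc in raw_docs:
        doc = clean(doc)
        if len(doc) < min_chars:  # validation: reject fragments
            continue
        digest = hashlib.md5(doc.encode()).hexdigest()
        if digest in seen:  # exact-duplicate filter
            continue
        seen.add(digest)
        corpus.append(doc)
    return corpus
```

At the ~10 TB scale described, the same stages would run distributed (e.g. over sharded files), but the per-document logic stays this simple.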

Vocabulary Expansion – Adopt a character‑level extension that adds 7 K Chinese characters, yielding a 39 K‑token vocabulary and improving Chinese text compression by 48 %.
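The compression effect of vocabulary expansion can be illustrated with a toy greedy longest-match tokenizer: adding multi-character domain tokens means the same text encodes in fewer tokens. This is only a sketch of the metric; the actual tokenizer is a learned subword vocabulary, not this greedy scheme.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization; unknown text falls back to single characters."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(8, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

def compression_gain(text, base_vocab, extra_tokens):
    """Relative token-count reduction after extending the vocabulary."""
    before = len(tokenize(text, base_vocab))
    after = len(tokenize(text, base_vocab | extra_tokens))
    return 1 - after / before
```

Fewer tokens per document directly translates into more text per training batch and cheaper inference on Chinese financial corpora.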

Two‑Stage Pre‑training – Stage 1 updates embeddings and decoder layers for 40 B tokens; Stage 2 performs full‑model training on 300 B tokens, balancing Chinese and English data.
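The two-stage schedule can be expressed as a parameter-group plan: a limited set of modules trains on the first 40 B tokens, then everything trains on the full 300 B-token mix. The token budgets come from the text; the module names and the exact stage‑1 selection are simplifying assumptions.

```python
# Hypothetical module names; token budgets are those stated in the article.
SCHEDULE = [
    {"stage": 1, "tokens": 40e9,  "train": {"embeddings", "decoder"}},
    {"stage": 2, "tokens": 300e9, "train": {"embeddings", "decoder", "lm_head"}},
]

def is_trainable(param_name: str, stage: int) -> bool:
    """Whether a parameter (named 'module.rest') updates in the given stage."""
    groups = next(s["train"] for s in SCHEDULE if s["stage"] == stage)
    return param_name.split(".")[0] in groups
```

In a real framework the same plan would be applied by toggling each parameter's gradient flag before building the stage's optimizer.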

Instruction Fine‑tuning (SFT) – Mix generic and finance‑specific instructions in a 4:1 ratio, using automated generation and human rewriting to improve dialogue quality.
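The 4:1 mixing of generic and finance-specific instructions can be sketched as a ratio-preserving sampler. The shuffling and truncation policy here are assumptions; only the ratio is from the text.

```python
import random

def mix_instructions(generic, finance, ratio=(4, 1), seed=0):
    """Combine two instruction pools at a fixed ratio (4:1 per the article),
    truncating the larger pool so the ratio holds exactly, then shuffling."""
    g, f = ratio
    n = min(len(generic) // g, len(finance) // f)
    mixed = generic[: n * g] + finance[: n * f]
    random.Random(seed).shuffle(mixed)
    return mixed
```

Keeping the generic share dominant preserves broad instruction-following while the finance slice injects domain behavior.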

Value Alignment – Apply reinforcement learning with a pairwise reward model and PPO optimization to align the model’s values with human preferences.
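The pairwise reward model described above is typically trained with a Bradley–Terry style objective: the loss shrinks as the reward for the human-preferred response rises above the rejected one. A minimal sketch of that loss (the PPO policy update itself is omitted):

```python
import math

def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)): minimized when the preferred
    response scores higher than the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))
```

The trained reward model then supplies the scalar signal that PPO maximizes during alignment.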

Engineering Optimizations

Memory and compute bottlenecks are tackled by:

Increasing batch size and reducing memory footprint (87 % memory reduction, 3× batch size, 36 % throughput gain).

Adopting FlashAttention and custom kernels (26 % further throughput increase).

Improving distributed communication efficiency, achieving near‑linear scaling up to 640 GPUs with 94 % efficiency.
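The 94 % figure above is a standard scaling-efficiency metric: achieved throughput as a fraction of ideal linear speedup. A one-line sketch of how it is computed:

```python
def scaling_efficiency(throughput_1gpu: float, throughput_ngpu: float, n: int) -> float:
    """Fraction of ideal linear speedup achieved when scaling from 1 to n GPUs."""
    return throughput_ngpu / (throughput_1gpu * n)
```

At 640 GPUs, 94 % efficiency means each GPU still delivers 94 % of its single-GPU throughput despite communication overhead.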

Evaluation Method Innovations

We design a multi‑dimensional evaluation framework:

Horizontal evaluation: Compare different models across tasks.

Vertical evaluation: Track a single model’s progress across versions and optimization stages.

Pre‑training metrics include loss, NLP benchmarks, and perplexity on unseen data.
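Perplexity on unseen data is the exponential of the mean per-token negative log-likelihood, so it tracks the training loss directly; lower is better:

```python
import math

def perplexity(token_nlls):
    """exp(mean per-token negative log-likelihood) over held-out text."""
    return math.exp(sum(token_nlls) / len(token_nlls))
```

Evaluating on text the model has never seen guards against memorization inflating the score.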

Fine‑tuning assessment adds dialogue‑specific tests and extensive human evaluation.

Reinforcement stage checks that capabilities do not regress and that safety, usefulness, and stability improve.

Application Innovations

Key use‑cases demonstrated:

Marketing: Real‑time, personalized content generation boosts acquisition efficiency.

Customer Service: AI‑assisted agents achieve a 25 % efficiency gain; full automation is explored with quality safeguards.

Operations: Data‑driven analytics unify standards, accelerating decision‑making.

Office Assistance: Seamless AI assistants accelerate knowledge acquisition for new employees.

Risk Control: Combining model reasoning with traditional decision‑based risk systems enhances proactive, real‑time risk management.

Summary: Financial Large Model Iteration Path

The evolution of financial large models follows a dual‑track of continuous training and rigorous evaluation, driving iterative improvements much like an employee’s growth cycle—learning, feedback, and ongoing innovation.

Tags: large language models, model training, evaluation, applications, finance AI
Written by

Data Thinking Notes

Sharing insights on data architecture, governance, and middle platforms, exploring AI in data, and linking data with business scenarios.
