BigBang Transformer (BBT): A 1‑Billion‑Parameter Financial Pre‑trained Language Model with Time‑Series‑Text Cross‑Modal Architecture
The BigBang Transformer (BBT) is a 1‑billion‑parameter financial pre‑trained language model that combines text and time‑series data in a cross‑modal Transformer architecture, achieving up to 10% higher downstream accuracy than T5‑scale models and demonstrating strong performance on financial NLP tasks, time‑series forecasting, and multi‑factor investment strategies.
SuperSymmetry Technology announced the BigBang Transformer (BBT), a 1‑billion‑parameter financial pre‑trained language model that integrates textual and time‑series modalities through a novel cross‑modal Transformer architecture. Compared with T5‑scale models, BBT improves downstream task accuracy by nearly 10% and significantly raises the R² score for time‑series prediction.
The article first outlines the limitations of existing large language models such as GPT‑3, LaMDA, and PaLM, which perform well on general tasks but struggle in specialized domains and with non‑textual modalities like time‑series and tabular data.
SuperSymmetry’s solution is a time‑series‑text cross‑modal pre‑training framework that jointly encodes textual and temporal data. The model follows an encoder‑decoder (T5‑like) design, where both modalities are fed into a shared bidirectional Transformer encoder and the decoder produces outputs for NLU, NLG, and time‑series tasks.
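To make the cross-modal setup concrete, the following is a minimal PyTorch sketch of the idea rather than the released BBT implementation: time-series values are projected into the same hidden space as text token embeddings, concatenated into one sequence, and passed through a shared encoder-decoder Transformer. All class names, dimensions, and layer counts here are illustrative placeholders.

```python
# Minimal sketch of a T5-style cross-modal encoder-decoder (illustrative only;
# not the released BBT code). Text tokens and time-series steps are projected
# into the same hidden space and concatenated before the shared encoder.
import torch
import torch.nn as nn

class CrossModalSeq2Seq(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512, ts_dim=1, n_heads=8, n_layers=6):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # text tokens -> vectors
        self.ts_proj = nn.Linear(ts_dim, d_model)          # time-series steps -> vectors
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=n_heads,
            num_encoder_layers=n_layers, num_decoder_layers=n_layers,
            batch_first=True,
        )
        self.lm_head = nn.Linear(d_model, vocab_size)      # decoder states -> token logits

    def forward(self, text_ids, ts_values, decoder_ids):
        # Both modalities are encoded by one shared bidirectional encoder.
        text_h = self.tok_emb(text_ids)                    # (B, L_text, d_model)
        ts_h = self.ts_proj(ts_values)                     # (B, L_ts, d_model)
        src = torch.cat([text_h, ts_h], dim=1)             # joint cross-modal sequence
        tgt = self.tok_emb(decoder_ids)
        out = self.transformer(src, tgt)                   # autoregressive decoder
        return self.lm_head(out)                           # logits for NLU / NLG heads

model = CrossModalSeq2Seq()
logits = model(
    text_ids=torch.randint(0, 32000, (2, 16)),   # toy text batch
    ts_values=torch.randn(2, 32, 1),             # toy univariate series
    decoder_ids=torch.randint(0, 32000, (2, 8)),
)
print(logits.shape)  # torch.Size([2, 8, 32000])
```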
The architecture introduces DWT‑ST2Vec, a universal, model‑agnostic time‑series representation component that decomposes each series into low‑frequency/high‑frequency and global/local components, enabling the model to learn richer temporal patterns.
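As a rough illustration of the wavelet idea (the ST2Vec encoder itself is not public, so this is only a sketch using the PyWavelets library), a discrete wavelet transform can split a price series into a smooth low-frequency trend and high-frequency detail components, which could then be embedded separately:

```python
# Hedged sketch of the decomposition behind a DWT-style time-series encoder:
# split a price series into a low-frequency trend and high-frequency residual.
# (Illustrative only; the actual DWT-ST2Vec component is not described in detail here.)
import numpy as np
import pywt

prices = np.cumsum(np.random.randn(256)) + 100.0    # toy price series

# Multi-level discrete wavelet transform: coeffs = [cA2, cD2, cD1]
coeffs = pywt.wavedec(prices, wavelet="db4", level=2)
approx, details = coeffs[0], coeffs[1:]

# Low-frequency (global trend) part: reconstruct with detail coefficients zeroed.
low_freq = pywt.waverec([approx] + [np.zeros_like(d) for d in details], wavelet="db4")
high_freq = prices - low_freq[: len(prices)]        # residual = local/high-frequency part

print(approx.shape, [d.shape for d in details])
```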
SuperSymmetry also built BBTCorpus, a 300 GB Chinese financial corpus containing over 80 billion tokens drawn from news, announcements, research reports, books, government documents, and social media, which the team describes as the largest and most comprehensive Chinese financial pre‑training corpus to date.
Experimental results show that domain‑specific pre‑training with similarity‑weighted sampling improves average downstream accuracy by roughly 0.7%, and adding source prompts yields a further 3.21% gain. Scaling the model to 1 billion parameters further boosts performance across eight downstream tasks.
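The article does not spell out the weighting formula, but similarity-weighted sampling is commonly implemented by scoring each candidate pre-training document against an in-domain reference set and sampling in proportion to that score. The toy sketch below uses TF-IDF cosine similarity, which is an assumption for illustration rather than the paper's exact method:

```python
# Minimal sketch of similarity-weighted sampling, assuming documents closer to a
# financial reference set should be drawn more often during pre-training.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reference = ["quarterly earnings beat analyst expectations",
             "the central bank raised the benchmark interest rate"]
candidates = ["the striker scored twice in the final",
              "bond yields fell after the inflation report",
              "new restaurant opens downtown",
              "the company issued convertible bonds to refinance debt"]

vec = TfidfVectorizer().fit(reference + candidates)
sim = cosine_similarity(vec.transform(candidates), vec.transform(reference)).max(axis=1)

# Sampling probability proportional to similarity to the financial reference set.
probs = sim / sim.sum()
sampled = np.random.choice(len(candidates), size=2, replace=False, p=probs)
print(probs.round(3), sampled)
```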
The BBT model can generate analyst‑style commentary from stock price series, produce market‑trend reports for e‑commerce, and write understandable operation fault reports for manufacturing, demonstrating its potential beyond finance.
To support the community, SuperSymmetry released BBT‑FinCUGE, the first large‑scale Chinese financial NLP benchmark, covering eight tasks: sentiment analysis, event extraction, causal event extraction, summarization, relation extraction, negative‑news detection, news classification, and event‑subject extraction.
Developers can access eleven APIs (knowledge graph, summarization, sentiment detection, classification, NER, relation extraction, event extraction, causal event extraction, announcement extraction, negative‑news detection, etc.) to build applications in finance and other industries.
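A call to one of these services would presumably look like an ordinary authenticated HTTP request; the endpoint URL, field names, and response schema below are placeholders, since the article does not document the real interface:

```python
# Hypothetical usage sketch only: the actual BBT endpoint URL, authentication scheme,
# and response schema are not given in this article, so every name below is assumed.
import requests

resp = requests.post(
    "https://api.example.com/bbt/sentiment",        # placeholder endpoint, not the real URL
    headers={"Authorization": "Bearer <YOUR_API_KEY>"},
    json={"text": "公司三季度净利润同比增长35%"},      # a financial headline to score
    timeout=10,
)
print(resp.json())                                  # e.g. a sentiment label and confidence
```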
Overall, BBT aims to become the foundational AI model for financial investment, offering a unified framework that bridges numerical time‑series data and human language, with broader applicability to manufacturing, IoT, smart cities, and big‑data analytics.