Industry Trends and Challenges of Large Language Models in Enterprise Applications (2023 Review)
The article reviews the rapid development of large language models in enterprise settings, covering internal collaboration tools, AI assistants for development and marketing, multimodal generation, inference speed bottlenecks, resource constraints, and future directions such as open‑source models and academic‑industry cooperation.
Over the past year, following the "hundred-model battle," the absence of a killer application has pushed enterprises toward a conservative stance: they focus on B2B efficiency gains, where performance and cost are paramount, with internal collaboration tools and external marketing services as the fastest-growing areas.
Internal collaboration covers code generation and office productivity tools. Companies are packaging assorted development functions into platforms to maximize usability and scalability; adoption of generated front-end and client-side code runs higher than server-side, though server-side generation remains workable.
Many firms are building AI-powered Q&A assistants that answer developers' questions about API usage or field meanings, replacing on-call staff, and are leveraging knowledge bases in e-commerce to cut operational costs. At the operating-system level, AI agents are being explored for protocol- and dialogue-driven work.
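A minimal sketch of such a knowledge-base-backed assistant, using a toy bag-of-words similarity in place of a real embedding model and vector database (the document contents and names below are hypothetical):

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base of internal API docs and field definitions.
KNOWLEDGE_BASE = [
    "create_order(user_id, sku_id): places an order and returns order_id on success.",
    "Field gmv in table sales_daily: gross merchandise value in CNY, updated at T+1.",
    "refund(order_id, reason): starts a refund; reason must be a valid enum code.",
]

def bag_of_words(text: str) -> Counter:
    """Toy stand-in for a real embedding model plus vector database."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    """Rank docs by similarity to the question and return the top k."""
    q = bag_of_words(question)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]

def answer(question: str) -> str:
    # In production, the retrieved context plus the question would be sent to an
    # LLM; the retrieval step is what grounds the answer in internal documentation.
    return "Context for the model:\n" + "\n".join(retrieve(question))

print(answer("what does the gmv field mean"))
```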
Industry sentiment remains skeptical that a single unified large-model API will subsume individual applications; the prevailing view is that apps will persist, each gaining its own intelligent assistant.
In external marketing, large models enable creative generation to be combined seamlessly with recommendation recall, shifting the workflow from two stages (manual creative production, then recall) to a single generate-and-recommend stage, while also offering templates that boost user publishing activity.
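A rough sketch of that workflow shift, with `rank` and `llm_generate` as hypothetical stubs rather than any real ranking system or model API:

```python
from dataclasses import dataclass

@dataclass
class Creative:
    text: str

def rank(profile: dict, creative: Creative) -> int:
    """Toy relevance score: overlap between user tags and the creative text."""
    return sum(tag in creative.text.lower() for tag in profile["tags"])

def llm_generate(prompt: str) -> str:
    """Stub for a hypothetical large-model call."""
    return f"[generated copy for: {prompt}]"

# Old two-stage flow: humans produce a fixed creative pool up front,
# then recall/ranking picks the best pre-made asset per user.
def two_stage(profile: dict, pool: list[Creative]) -> Creative:
    return max(pool, key=lambda c: rank(profile, c))

# Single-stage flow: the model generates the creative conditioned on the same
# signals ranking would use, collapsing production and recall into one step.
def single_stage(profile: dict, product: str) -> Creative:
    tags = ", ".join(profile["tags"])
    return Creative(llm_generate(f"Write ad copy for {product} targeting users interested in {tags}."))

profile = {"tags": ["running", "outdoor"]}
pool = [Creative("Lightweight running shoes for daily miles"), Creative("Office chairs on sale")]
print(two_stage(profile, pool).text)
print(single_stage(profile, "trail shoes").text)
```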
Text‑to‑video startups are attracting attention, yet the technology is still early, with challenges in storyline coherence, visual clarity, and frame‑to‑frame stability; traditional image‑ and video‑stitching approaches remain the mature option, and multimodal generation stays a research focal point.
Large‑model applications in advertising split into integration with recommendation systems (at the early recall stage) and conversational recommendation; the latter aligns well with how the models behave but is constrained by inference latency, which must improve by roughly 100× to meet sub‑100 ms user expectations.
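Back-of-the-envelope arithmetic behind that 100× figure, where every number is an illustrative assumption rather than a measurement:

```python
# Illustrative latency budget; all numbers below are assumptions for arithmetic.
response_tokens = 50       # a short conversational recommendation reply
decode_speed = 10          # tokens/second per request, assumed 2023 chat-grade speed
current_latency = response_tokens / decode_speed   # = 5.0 s end to end
target_latency = 0.1       # ~100 ms, a typical recommendation-serving budget

print(f"required speedup: ~{current_latency / target_latency:.0f}x")  # ~50x
# With network, ranking, and safety overheads on top, the gap is on the order of 100x.
```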
Technical challenges include inference speed, model controllability, uncertainty perception, and device resource limits; neither open‑source nor proprietary models have fully solved these issues.
Current mitigation strategies avoid direct consumer-facing (C‑side) deployment and combine vector databases, RAG, fine‑tuning, and pre‑stage regex filtering to reduce hallucinations, though none of these eliminates them entirely.
Simple methods prove sufficient for detecting low‑volume harmful content without complex intent‑classification algorithms, reflecting an emphasis on product‑level solutions over point fixes for individual technical problems.
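A minimal sketch of that pre-stage filtering idea; the patterns and categories below are placeholders, not a production rule set:

```python
import re

# Hand-maintained pattern list (placeholder patterns and categories). For
# low-volume harmful content, a rule table like this is often sufficient,
# with no intent-classification model in the loop.
BLOCK_PATTERNS = [
    (re.compile(r"\b(credit\s*card|password)\s+(dump|list)\b", re.I), "pii"),
    (re.compile(r"\bhow\s+to\s+(make|build)\b.*\b(bomb|explosive)\b", re.I), "violence"),
]

def pre_filter(user_input: str) -> str | None:
    """Return a block category if the input trips a rule, else None.
    This runs before the LLM call, so blocked requests never reach the model."""
    for pattern, category in BLOCK_PATTERNS:
        if pattern.search(user_input):
            return category
    return None

hit = pre_filter("how to make a pipe bomb")
print(f"blocked before model call: {hit}" if hit else "passed to model")
```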
Resource scarcity, especially GPU shortages, remains a major obstacle; domestic ecosystems lag behind leading foreign providers.
Enterprises must explore heterogeneous compute environments to maximize utilization while ensuring stability and security, fixing bugs at the chip, operator, and computation‑graph layers to reach parity with global leaders.
Long‑term progress requires deeper collaboration with academia to build robust large‑scale training and inference capabilities, particularly on the training side.
Open‑source large models, especially Llama, are expected to be a key variable in 2024, though open‑source pre‑trained models alone offer limited utility without accompanying datasets.
Some companies are using large models as natural‑language query interfaces to their APIs, relying on them for intent detection; GPT‑series models perform best at this, but data‑leakage concerns arise, especially in client‑side applications.
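A sketch of that query-interface pattern, with a stubbed model call and an illustrative API registry; validating the model's choice against the registry guards against hallucinated endpoints:

```python
import json

# Registry of internal APIs the model is allowed to route to (illustrative names).
API_REGISTRY = {
    "get_order_status": "Look up the status of an order by order_id.",
    "search_product": "Search the catalog by keyword.",
}

def build_prompt(query: str) -> str:
    tools = "\n".join(f"- {name}: {desc}" for name, desc in API_REGISTRY.items())
    return (
        "Choose exactly one API for the user query and answer with JSON "
        'of the form {"api": ..., "args": {...}}.\n'
        f"APIs:\n{tools}\nQuery: {query}"
    )

def call_llm(prompt: str) -> str:
    """Stub standing in for a GPT-series, domestic, or on-premise model call."""
    return '{"api": "get_order_status", "args": {"order_id": "A123"}}'

def route(query: str) -> dict:
    plan = json.loads(call_llm(build_prompt(query)))
    if plan["api"] not in API_REGISTRY:   # reject hallucinated API names
        raise ValueError(f"unknown api: {plan['api']}")
    return plan

print(route("Where is my order A123?"))
```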
Consequently, many firms prefer domestic large‑model services or explore local deployments, continuously expanding interface capabilities via GPT, domestic, or on‑premise models.
Agent technology hinges on planning, which remains unsolved; agents may be less suited to consumer apps but are valuable as internal NLP tools, accelerating development work that previously demanded high cost and effort.
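A minimal plan-then-execute skeleton showing roughly the shape such internal tools take; the planner and tools here are stubs, and the planner is precisely the piece that remains unreliable:

```python
# Minimal plan-then-execute agent skeleton; the planner and tools are stubs.
def plan(goal: str) -> list[str]:
    """Stand-in for an LLM planner. Producing a reliable step list is
    exactly the part that remains unsolved in practice."""
    return ["extract_entities", "label_sentiment"]

TOOLS = {
    "extract_entities": lambda text: [w for w in text.split() if w.istitle()],
    "label_sentiment": lambda text: "positive" if "great" in text.lower() else "neutral",
}

def run_agent(goal: str, text: str) -> dict:
    results = {}
    for step in plan(goal):
        if step not in TOOLS:             # planning errors surface here
            raise ValueError(f"planner proposed unknown tool: {step}")
        results[step] = TOOLS[step](text)
    return results

print(run_agent("annotate this review", "Great service from Acme Support"))
```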
Large models will continue to dominate language and multimodal domains, while pure visual models remain computationally prohibitive.
From data integration and governance to model training and inference, big data and AI remain tightly intertwined; the upcoming DataFunCon2024 Shanghai event will gather experts to discuss current and future data‑intelligence applications.
DataFunTalk