
Applying Large Language Models for Automated Test Case Generation at KooJiaLe

This article describes how KooJiaLe, a leading 3D design company, built an AI‑powered platform that uses large language models to automate test case generation, detailing its workflow, generation modes, editing features, export options, optimization efforts, results, and remaining challenges.

Continuous Delivery 2.0

Many software companies are now exploring the use of large language models (LLMs) to assist engineers throughout the software development lifecycle. This article reports on KooJiaLe, a leading 3D‑design enterprise, and its internal AI testing group’s attempts to generate test cases using LLMs.

The company believes that a private Retrieval‑Augmented Generation (RAG) system is essential because generic LLMs cannot cover specific business domains.

2.1 Improving test‑case authoring efficiency

Traditional manual test‑case writing consumes significant time. By generating initial test cases automatically with AI and having testers review and refine them, the platform shortens test preparation and boosts overall testing efficiency.

2.2 Test‑case generation methods

Direct generation: Paste the requirement text into the input box on the “Direct Generation” tab and click “Generate Test Cases”.

Image upload: Click “Upload Image” on the “Direct Generation” tab, correct the recognized text if needed, then generate test cases.

Free‑prompt generation: Paste the requirement into the “Free Generation” tab, optionally edit the platform‑provided prompt, and generate test cases.
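The free‑prompt mode above can be sketched as a small prompt‑assembly and output‑parsing step. This is an illustrative sketch, not the platform's actual code: the default template, the pipe‑delimited output format, and both function names are assumptions.

```python
# Sketch of the free-prompt mode: the platform supplies a default prompt
# template that users may edit before generation. The template text and the
# pipe-delimited case format below are illustrative assumptions.

DEFAULT_PROMPT = (
    "You are a senior QA engineer. Read the requirement below and produce "
    "test cases as lines of 'title | precondition | steps | expected result'.\n"
    "Requirement:\n{requirement}"
)

def build_prompt(requirement: str, template: str = DEFAULT_PROMPT) -> str:
    """Assemble the final prompt from the (possibly user-edited) template."""
    return template.format(requirement=requirement.strip())

def parse_cases(llm_output: str) -> list[dict]:
    """Parse pipe-delimited lines returned by the model into structured cases."""
    cases = []
    for line in llm_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 4:  # skip malformed lines instead of failing the task
            cases.append(dict(zip(
                ("title", "precondition", "steps", "expected"), parts)))
    return cases
```

Letting users edit the template (rather than only the requirement text) is what distinguishes this mode from direct generation.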

2.3 Editing, adding, and deleting test cases

The platform supports online editing of generated test cases, including adding and deleting individual cases (see screenshots).

2.4 Test‑case export

Direct import to the internal test‑case management platform for review and test‑plan creation.

Export to XMind.

3. Test‑case generation workflow

The offline generation process includes: requirement storage, scheduled retrieval, preprocessing, prompt assembly, GPT service invocation, test‑case parsing, failure retry, task status update, and finally storing the test cases.
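The pipeline steps above can be sketched as a single task‑driver function. All names here are illustrative stand‑ins for the platform's internal services, and the one‑case‑per‑line output format is an assumption.

```python
# A minimal sketch of the offline pipeline: preprocess a stored requirement,
# assemble the prompt, call the GPT service, parse the result, retry on
# failure, and record the task status. All names are illustrative.

PROMPT = "Generate test cases, one per line, for this requirement:\n{req}"

def preprocess(text: str) -> str:
    """Normalize whitespace in the raw requirement text."""
    return " ".join(text.split())

def run_task(task: dict, llm_call, store, max_attempts: int = 3) -> str:
    """Drive one generation task through the pipeline; return its final status."""
    prompt = PROMPT.format(req=preprocess(task["requirement"]))
    for _ in range(max_attempts):
        try:
            raw = llm_call(prompt)                        # invoke the GPT service
            cases = [l.strip() for l in raw.splitlines() if l.strip()]
            if not cases:
                raise ValueError("empty result")          # treat as a failure
            store(task["id"], cases)                      # persist the test cases
            task["status"] = "done"
            return task["status"]
        except Exception:
            task["status"] = "failed"                     # retried until give-up
    return task["status"]
```

In the real system, the surrounding scheduler would pick up pending requirements and call something like `run_task` on each.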

4. Tool optimization process

Initially, more than 50% of generation tasks failed. The team analyzed the failures and applied a series of optimizations.

4.1 Root‑cause analysis

Service stability: Early reliance on a single GPT service meant that any instability in that service caused task failures.

Input length limits: GPT’s token limit prevented it from handling long requirement documents.

Technical implementation: Front‑end issues caused browser‑level request blocking.

4.2 Handling service instability

Retry mechanism: On failure, the system issues up to two additional requests; the short retry intervals limited the benefit, however.

Introducing alternative models: The team added Wenxin Yiyan and MiniMax as backup engines; when GPT fails, the system switches to one of these models, improving overall reliability.
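The two measures above combine naturally into a retry‑then‑fall‑back loop. This is a hedged sketch of that strategy, not the platform's implementation: the function signature, the `(name, callable)` engine list, and the retry count are assumptions.

```python
# Sketch of retry plus fallback: try the primary engine a few times, then
# switch to backup engines (Wenxin Yiyan, MiniMax in the article). Engine
# names and callables here are illustrative.

def generate_with_fallback(prompt, engines, retries_per_engine=3):
    """Return (engine_name, output) from the first engine call that succeeds."""
    errors = {}
    for name, call in engines:
        for _ in range(retries_per_engine):
            try:
                return name, call(prompt)
            except Exception as err:
                errors[name] = err        # remember the last error per engine
    raise RuntimeError(f"all engines failed: {errors}")
```

Ordering the engines list puts the preferred model first, so backups are only consulted after the primary has exhausted its retries.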

4.3 Handling length restrictions

Combined Wenxin Yiyan (strong at understanding Chinese requirements) with GPT (used for test‑case generation) to work around token limits while leveraging each model’s strengths.
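One plausible shape for this two‑model combination is to condense a long requirement chunk by chunk with the first model, then feed the combined summary to the second model. This is a sketch under stated assumptions: the chunking by character count (standing in for a real token count) and all names are illustrative.

```python
# Sketch of the two-stage pipeline: a summarizer model (Wenxin Yiyan in the
# article) condenses each chunk of a long requirement, and a generator model
# (GPT) produces test cases from the combined summary. chunk_size is a
# character budget standing in for a real token count.

def two_stage_generate(requirement, summarizer, generator, chunk_size=2000):
    """Condense a long requirement in chunks, then generate from the summary."""
    chunks = [requirement[i:i + chunk_size]
              for i in range(0, len(requirement), chunk_size)]
    summary = "\n".join(summarizer(c) for c in chunks)   # stage 1: condense
    return generator(summary)                            # stage 2: generate
```

Because only the condensed summary reaches the second model, the generation prompt stays within the token limit even for requirements far longer than that limit.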

4.4 Other optimizations

Encrypted user input on the front end to guard against XSS attacks.

Opened up a free‑prompt feature that lets users fine‑tune prompts for better results.

5. Summary & Outlook

5.1 Achievements: Over 300 generation tasks have been created, producing more than 2,000 test cases with an 80%+ success rate and noticeably improving tester efficiency.

5.2 Limitations and issues:

Insufficient domain knowledge leads to incomplete or inaccurate test cases.

Limited handling of non‑functional requirements such as performance and security.

Complex scenarios may require deeper understanding beyond the AI’s capability.

Lack of effective evaluation metrics makes it hard to assess the usefulness of generated cases.

Overall, while the AI‑driven platform shows promising results, further research and enhancements are needed to address domain expertise, non‑functional testing, and evaluation challenges.

Tags: AI, Automation, LLM, software testing, R&D, test case generation
Written by Continuous Delivery 2.0

Tech and case studies on organizational management, team management, and engineering efficiency