Integrating Data and AI for Platform Engineering: IDP Practices, Model Fine‑Tuning, and R&D Efficiency at Qunhe Technology
The article details how Qunhe Technology combines big data and AI within an Internal Developer Product (IDP) framework to boost software development efficiency, outlines architectural decisions, presents fine‑tuning pipelines for code‑review models, and shares interview insights from senior technical director Dr. Hu Guanghuan on practical implementations and ROI.
Qunhe Technology’s data‑driven, AI‑enabled approach to platform engineering is backed by strong implementation capability; its Internal Developer Product (IDP) system offers a high‑ROI path to improving the efficiency of software development teams of any size.
The company has built a closed‑loop efficiency‑boosting cycle that spans IDP digitization, productized delivery processes, data‑warehouse/data‑lake pipelines, large‑language‑model (LLM) and multimodal model capabilities, and AI‑assisted code review, deployment, and release.
At the Shenzhen DA Digital Technology Conference (July 25‑26), senior technical director Dr. Hu Guanghuan shared real‑world practice in designing and implementing multiple IDP products and in leveraging big data and AI for high‑ROI efficiency gains.
Technical Vision and Industry Pain Points
Traditional efficiency tools (low‑code platforms, automated testing) often fail to address complex development workflows because they target isolated stages and rely heavily on users’ technical skill. Core bottlenecks include the lack of systematic, data‑driven identification of the highest‑ROI improvement points and the heavy dependence on manual scripting for observability and analysis.
Qunhe’s solution integrates data and AI into platform engineering, treating the entire delivery flow as a special “business line” that reuses the company’s existing big‑data infrastructure and incorporates high‑cohesion, low‑coupling IDP services as OLTP components, enabling data‑driven decision making and large‑scale LLM deployment.
Architecture Design and Technical Practice
Cross‑department collaboration (data, AI, business) is achieved by reorganizing the infrastructure team to include middleware, observability, DevOps, security, and big‑data sub‑teams, each owning one or more IDPs that together form a matrix covering the full delivery pipeline.
Metrics balance technical dimensions (code quality) with business value (delivery speed) using a data‑lake for multi‑granularity indicator calibration.
AI is integrated into code review, requirements analysis, and design, using LLMs for tasks such as document generation, code generation, and automated review, with concrete ROI figures (e.g., a 40% gain in requirements‑analysis efficiency and a 30% reduction in code‑review time).
Fine‑Tuning Data Set Design
The fine‑tuning prompt template uses the following slots:
{type}: "PullRequest", "MergeRequest", "Commit", etc.
{diff content}: the raw git diff (multi‑file diffs supported)
{structure summary}: an automatic summary of the code additions, deletions, and modifications; structural highlights can be extracted with tools such as ast or tree‑sitter
{think}: the reasoning trace (optional)
{output}: overall structural changes, potential risks, optimization suggestions

Example JSONL entry for a Python backend service:
{"input":"<SYSTEM>\n你是一名资深代码评审专家。\n<USER>\n请对以下PullRequest的代码变更,从[整体结构变化]、[潜在风险]、[优化建议]三个维度进行专业评审。\n\n【变更diff】\n<<<\n-defsave_data(data):\n- db.save(data)\n+defsave_data(data):\n+ try:\n+ db.save(data)\n+ except Exception as e:\n+ logger.error(f\"save_datafailed:{e}\")\n>>>\n\n【结构摘要】\n<<<\nsave_data函数负责将数据保存到数据库,是服务的数据存储关键环节。\n>>>\n","think":"首先分析结构变化...","output":"[整体结构变化]..."}Fine‑Tuning Technical Details and Model Selection
Supervised Fine‑Tuning (SFT) using LoRA/QLoRA or P‑Tuning v2 with HuggingFace Transformers + PEFT.
Prompt engineering to highlight structure summaries.
Model choice: open‑source Qwen‑3‑8B (requires 24 GB+ GPU, 1‑3 epochs, 100‑200 steps/hour).
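To make the prompt‑engineering step concrete, here is a minimal sketch of assembling one SFT training record in the JSONL shape shown earlier; the English SYSTEM/USER wording and the helper name are assumptions (the production prompts are not published verbatim), and the resulting records would then feed a LoRA/QLoRA run via Transformers + PEFT:

```python
import json

# Hypothetical English renderings of the prompt template slots.
SYSTEM_PROMPT = "You are a senior code-review expert."
USER_TEMPLATE = (
    "Please review the following {change_type} code change along three "
    "dimensions: [overall structural changes], [potential risks], and "
    "[optimization suggestions].\n\n"
    "[Change diff]\n<<<\n{diff}\n>>>\n\n"
    "[Structure summary]\n<<<\n{summary}\n>>>\n"
)

def build_sft_record(change_type, diff, summary, output, think=None):
    """Assemble one JSONL training line matching the dataset template."""
    record = {
        "input": f"<SYSTEM>\n{SYSTEM_PROMPT}\n<USER>\n"
        + USER_TEMPLATE.format(change_type=change_type, diff=diff, summary=summary),
        "output": output,
    }
    if think is not None:  # the reasoning trace is optional in the template
        record["think"] = think
    return json.dumps(record, ensure_ascii=False)

line = build_sft_record(
    change_type="PullRequest",
    diff="-def save_data(data):\n+def save_data(data):\n+    ...",
    summary="save_data persists records to the database.",
    output="[Overall structural changes] ...",
)
```

Emitting one such line per reviewed change produces a JSONL file ready for a supervised fine‑tuning data loader.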
Model Evaluation and Continuous Optimization
Precision: the proportion of model‑generated review comments that match expert annotations.
Recall: the proportion of actual code issues the model identifies.
F1‑score: the harmonic mean of precision and recall.
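A minimal sketch of computing these metrics, assuming model comments and expert annotations have already been aligned to shared issue IDs (that alignment, which in practice may need fuzzy or LLM‑based matching, is out of scope here):

```python
def review_metrics(predicted: set, actual: set) -> dict:
    """Precision, recall and F1 of model review comments vs. expert labels.

    predicted: issue IDs flagged by the model; actual: expert-annotated
    issues for the same change.
    """
    tp = len(predicted & actual)  # comments that match an expert annotation
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# One PR: the model flags two issues, experts annotated two, one overlaps,
# so precision = recall = F1 = 0.5.
m = review_metrics({"sql-injection", "missing-retry"}, {"sql-injection", "n-plus-one"})
```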
Practical ROI Verification
Requirements analysis & documentation: 40% efficiency gain, 30% fewer omissions.
Code generation & assistance: 30% faster coding, 40% adoption rate for AI‑generated code, 80% less time spent writing unit tests.
Code review: 30% shorter review time, 20% higher PR pass rate.
Security is ensured via traditional DevSecOps rule checks, static analysis, and AI‑code‑review double verification before merge.
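The double‑verification merge gate described above might be sketched as follows; the finding format, the three‑level severity scale, and the function name are illustrative assumptions, and a real gate would run inside CI rather than as a standalone function:

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}

def can_merge(static_findings, ai_findings, threshold="high"):
    """Allow merge only if neither the rule-based/static-analysis pass nor
    the AI reviewer reports a finding at or above the blocking threshold.

    Each finding is a dict like {"severity": "high", ...}.
    """
    blocking = [
        f for f in static_findings + ai_findings
        if SEVERITY_RANK[f["severity"]] >= SEVERITY_RANK[threshold]
    ]
    return not blocking

# A clean PR merges; a single high-severity finding from either
# verification path blocks it.
ok = can_merge([], [{"severity": "low"}])
```

The key design point is that the two verification paths are OR‑ed for blocking: either source alone can stop a merge.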
Industry Outlook
For small teams, adopt cloud‑based IDP services and AI assistants (e.g., Cursor, Copilot) early; scale dedicated IDP teams once the repo count exceeds ~50. The ultimate “AI‑native” R&D model envisions engineers focusing on leveraging LLMs and agents rather than low‑level coding.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.