One-Click Ad Video from Assets + Brief, plus Baidu’s 8B Text-to-Image – An AI Toolbox
The article introduces three open‑source AI tools—a video editor that turns raw footage and a brief into a finished ad, Baidu's 8‑billion‑parameter text‑to‑image model that runs on 24 GB GPUs, and a weekly AI‑developer digest that auto‑generates Chinese reports—detailing their workflows, benchmarks, usage commands, and target users.
01 | agentic-video-editor
Manual editing of raw footage into a 30‑second advertisement typically requires half a day of selecting shots, arranging rhythm, exporting, and reviewing.
The tool replaces this workflow with an AI Agent pipeline consisting of:
Original footage + creative brief
↓
[Pre‑processing] – scene detection, speech‑to‑text, shot indexing
↓
[Director Agent] – AI searches footage, selects shots, creates an edit plan
↓
[Refinement Agent] – fine‑tunes start/end of each shot
↓
[Edit Agent] – FFmpeg renders MP4
↓
[Review Agent] – scores relevance, rhythm, visual quality, viewing experience, overall (0‑1 each)
↓
If overall score < threshold → feedback to Director Agent (max 3 retries)
Running the editor requires a single command:
ave edit \
  --footage-dir /path/to/your/footage \
  --brief '{"product": "My Product", "audience": "Women 25-45", "tone": "authentic", "duration_seconds": 30}' \
  --pipeline pipelines/ugc-ad.yaml \
  --style styles/dtc-testimonial.yaml
The built‑in DTC template follows the hook → problem → solution → social proof → CTA structure; custom YAML pipelines can be authored to combine agents differently.
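The review-and-retry step at the end of the pipeline can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `Review`, `edit_with_retries`, the callables passed in, and the `0.7` threshold are all hypothetical, and "max 3 retries" is read here as one initial pass plus up to three further attempts.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Review:
    overall: float  # 0-1, mirrors the Review Agent's overall score
    feedback: str   # notes handed back to the Director Agent


def edit_with_retries(
    direct: Callable[[Optional[str]], str],
    review: Callable[[str], Review],
    threshold: float = 0.7,  # assumed value; the article does not state one
    max_retries: int = 3,
) -> Tuple[str, Review, int]:
    """Plan and review a cut, feeding feedback back until the score passes."""
    feedback: Optional[str] = None
    for attempt in range(max_retries + 1):  # first pass plus the retries
        cut = direct(feedback)
        result = review(cut)
        if result.overall >= threshold:
            break
        feedback = result.feedback
    return cut, result, attempt


# Stub agents standing in for the real Director and Review Agents.
_scores = iter([0.4, 0.6, 0.8])
final_cut, final_review, retries_used = edit_with_retries(
    direct=lambda fb: f"edit plan (feedback: {fb})",
    review=lambda cut: Review(next(_scores), "tighten the hook"),
)
print(retries_used, final_review.overall)
```

Returning the last cut even when it never clears the threshold is a design choice made for this sketch; a real pipeline might raise an error or flag the result for manual review instead.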
02 | ERNIE‑Image
ERNIE‑Image is Baidu’s open‑source diffusion‑transformer (DiT) model with 8 B parameters, achieving state‑of‑the‑art results among open‑weight text‑to‑image models.
Benchmark scores:
GenEval overall: 0.8856 (higher than Qwen‑Image at 0.8683 and FLUX.2‑klein‑9B at 0.8481)
LongTextBench (Chinese long‑text): 0.9733, slightly below Seedream 4.5 at 0.9882
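To put the gaps in perspective, the margins between the scores quoted above can be computed directly. The helper name `margins` is just for illustration; the numbers are the ones reported in the article.

```python
# Scores as quoted in the article.
geneval = {"ERNIE-Image": 0.8856, "Qwen-Image": 0.8683, "FLUX.2-klein-9B": 0.8481}
long_text = {"ERNIE-Image": 0.9733, "Seedream 4.5": 0.9882}


def margins(scores, baseline="ERNIE-Image"):
    """Return baseline minus each other model's score, rounded to 4 places."""
    base = scores[baseline]
    return {
        name: round(base - score, 4)
        for name, score in scores.items()
        if name != baseline
    }


print(margins(geneval))    # positive values: ERNIE-Image is ahead
print(margins(long_text))  # negative value: ERNIE-Image is behind
```

On these figures the GenEval lead is under four points on a 0-1 scale scaled to 100, and the LongTextBench gap to Seedream 4.5 is about one and a half points.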
Key strengths identified in the source:
Text rendering – long paragraphs, dense typography, layout‑rich images (posters, infographics, UI mockups)
Complex instruction compliance – accurate handling of multi‑object, relational, knowledge‑intensive prompts
Structured generation – posters, comics, storyboards, multi‑panel graphics
Consumer‑grade deployment – runs on a single GPU with 24 GB VRAM
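A back-of-the-envelope calculation suggests why a 24 GB card is plausible. The sketch below assumes exactly 8e9 parameters and counts only the diffusion transformer's weights; the text encoder, VAE, and activations need additional memory, so the real footprint is higher than these figures.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


PARAMS = 8e9  # "8 B" taken as exactly 8 billion (an assumption)

bf16_gib = weight_memory_gib(PARAMS, 2)  # bfloat16: 2 bytes per parameter
fp32_gib = weight_memory_gib(PARAMS, 4)  # float32: 4 bytes per parameter

print(f"bf16 weights: {bf16_gib:.1f} GiB")
print(f"fp32 weights: {fp32_gib:.1f} GiB")
```

In bfloat16 the weights come to roughly 15 GiB, which leaves headroom on a 24 GB GPU, whereas float32 weights alone would already exceed it.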
Two released variants:
ERNIE‑Image (SFT version) – 50 inference steps, guidance scale 4.0
ERNIE‑Image‑Turbo (DMD+RL accelerated) – 8 inference steps, guidance scale 1.0
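One way to keep the two variants' sampling settings straight is a small lookup table, sketched below. The dictionary keys and the `sampling_kwargs` helper are illustrative names only; the step counts and guidance scales are the ones listed above, and the keys are not guaranteed to match the published model identifiers.

```python
# Recommended sampling settings per variant, as listed in the article.
VARIANT_SETTINGS = {
    "ERNIE-Image": {"num_inference_steps": 50, "guidance_scale": 4.0},
    "ERNIE-Image-Turbo": {"num_inference_steps": 8, "guidance_scale": 1.0},
}


def sampling_kwargs(variant: str) -> dict:
    """Return a copy of the sampling settings for the given variant."""
    try:
        return dict(VARIANT_SETTINGS[variant])
    except KeyError:
        known = ", ".join(sorted(VARIANT_SETTINGS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}") from None


sft = sampling_kwargs("ERNIE-Image")
turbo = sampling_kwargs("ERNIE-Image-Turbo")
step_ratio = sft["num_inference_steps"] / turbo["num_inference_steps"]
print(step_ratio)  # ratio of denoising steps, not a measured speedup
```

The Turbo variant uses 6.25 times fewer denoising steps; actual wall-clock savings depend on hardware and on the fixed costs of text encoding and decoding.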
Example usage via HuggingFace:
import torch
from diffusers import ErnieImagePipeline
# Load the SFT checkpoint in bfloat16 and move it to the GPU
pipe = ErnieImagePipeline.from_pretrained(
    "baidu/ERNIE-Image",
    torch_dtype=torch.bfloat16,
).to("cuda")
# 50 steps and guidance scale 4.0 are the SFT variant's settings listed above
image = pipe(
    prompt="a black‑and‑white Chinese countryside dog",
    height=1024, width=1024,
    num_inference_steps=50,
    guidance_scale=4.0,
    use_pe=True,
).images[0]
03 | ai-influence-digest
The tool monitors the public activity of more than 65 AI developers, keeps only the posts that are immediately useful to content creators, and generates a structured weekly briefing in Chinese without relying on the X (Twitter) API.
Core features:
No X API dependency – fully compliant and avoids account bans
Coverage of tools, workflows, tutorials, prompts across 65+ developers
Automatic rendering of Xiaohongshu‑style long‑image screenshots for easy sharing
Markdown‑formatted Chinese summary output
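As a rough illustration of the Markdown output, a report could be assembled by grouping selected posts under the four coverage categories. Everything in this sketch is hypothetical: the `render_weekly_report` function, the post fields (`author`, `category`, `summary`, `url`), and the heading layout are assumptions, not the repository's actual format, and the placeholder text is in English for readability.

```python
from collections import defaultdict

# Categories named in the feature list; the ordering here is an assumption.
CATEGORIES = ["tools", "workflows", "tutorials", "prompts"]


def render_weekly_report(posts, week_ending):
    """Group posts by category and render them as a Markdown document."""
    grouped = defaultdict(list)
    for post in posts:
        grouped[post["category"]].append(post)

    lines = [f"# AI Influence Digest ({week_ending})", ""]
    for category in CATEGORIES:
        if not grouped[category]:
            continue  # skip empty sections
        lines.append(f"## {category.title()}")
        for post in grouped[category]:
            lines.append(f"- **{post['author']}**: {post['summary']} ({post['url']})")
        lines.append("")
    return "\n".join(lines)


sample_posts = [
    {"author": "dev_a", "category": "tools", "summary": "New CLI released", "url": "https://example.com/1"},
    {"author": "dev_b", "category": "prompts", "summary": "Prompt template", "url": "https://example.com/2"},
]
print(render_weekly_report(sample_posts, "2026-04-18"))
```

Posts whose category is not in `CATEGORIES` are silently left out in this sketch; a real tool would more likely route them to a catch-all section.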
Three‑step workflow:
# Step 1: Scan candidate posts
python3 scripts/scan_x_weekly.py \
  --accounts references/accounts_65.txt \
  --days 7 \
  --outdir ./output/ai-influence-digest
# Step 2: Human review and assemble Markdown weekly report
# (filter criteria in references/filters.md)
# Step 3: Render Xiaohongshu‑style report screenshot
bash scripts/render_weekly_screenshots.sh \
  ./output/ai-influence-digest/weekly_report.md \
  ./output/ai-influence-digest/weekly_report.png \
  "2026-04-18"
Summary
agentic-video-editor – automates raw footage editing into ads via an AI Agent pipeline with automatic review and up to three retry cycles.
ERNIE‑Image – 8 B diffusion‑transformer delivering state‑of‑the‑art text‑to‑image generation on a single 24 GB GPU; excels at Chinese text rendering and structured graphics.
ai-influence-digest – continuously tracks 65+ AI developers, filters high‑value updates, and produces a ready‑to‑share Chinese weekly briefing.
All projects are open source. Repository URLs: https://github.com/poseljacob/agentic-video-editor, https://github.com/baidu/ERNIE-Image, https://github.com/koffuxu/ai-influence-digest.
Geek Labs
Daily shares of interesting GitHub open-source projects. AI tools, automation gems, technical tutorials, open-source inspiration.