Tagged articles
129 articles
Alimama Tech
Apr 10, 2024 · Artificial Intelligence

SizeCube: AI‑Driven Arbitrary‑Size Image and Video Outpainting for Advertising

SizeCube uses Stable Diffusion‑based models in a multi‑stage pipeline (quality filtering, feature mining, latent‑space UNet denoising, super‑resolution, and temporal 3D‑U‑Net video processing) to automatically outpaint images and videos to any size, boosting Alibaba advertisers’ creative flexibility, click‑through rates, and asset adaptability across diverse ad placements.

AI · Advertising · Image Outpainting
0 likes · 14 min read
Architects' Tech Alliance
Apr 7, 2024 · Artificial Intelligence

How Sora Is Redefining Text‑to‑Video Generation: Inside the New AI Model

Sora, the newly announced text‑to‑video large model, can generate one‑minute high‑fidelity videos from textual prompts or static images, handling complex scenes, expressive characters, and sophisticated camera motions while also supporting video extension and frame‑filling, positioning it at the forefront of multimodal AI research.

AI model · Multimodal · Sora
0 likes · 6 min read
Architect
Mar 28, 2024 · Artificial Intelligence

Understanding OpenAI's Sora Video Generation Model: Architecture, Workflow, and Core Technologies

This article explains OpenAI's Sora video generation model, detailing its latent diffusion foundation, video compression network, spacetime patch representation, Diffusion Transformer processing, and decoding pipeline, while also reviewing related Stable Diffusion and Transformer concepts that enable high‑quality text‑to‑video synthesis.

AI · Latent Diffusion · Sora
0 likes · 17 min read
DevOps
Mar 26, 2024 · Artificial Intelligence

OpenAI’s Sora: A One‑Minute Text‑to‑Video Diffusion Transformer Model

OpenAI’s newly released Sora model demonstrates one‑minute text‑to‑video generation using a diffusion‑based transformer architecture that operates on spatiotemporal patches, compresses visual data into latent codes, and builds on a wide range of prior video generation research, while the article also advertises a DevOps certification program.

AI · OpenAI · Sora
0 likes · 8 min read
DaTaobao Tech
Mar 25, 2024 · Artificial Intelligence

Survey of AIGC Video Generation Algorithms

Since 2023, AI‑generated video research has expanded across six algorithmic categories—text‑to‑video, image‑to‑video, editing, style transfer, human motion, and long‑video generation—highlighting works such as CogVideo, Imagen Video, MagicVideo, ControlVideo, DCTNet, NUWA‑XL and OpenAI’s Sora, while analysis shows short‑clip diffusion models excel, editing remains costly, style transfer is efficient, and truly long, temporally consistent videos remain an open challenge.

AI · AIGC · Video Editing
0 likes · 13 min read
NewBeeNLP
Mar 22, 2024 · Artificial Intelligence

Unraveling Sora: How OpenAI Might Build Its Text‑to‑Video Engine

This article provides a step‑by‑step technical analysis of OpenAI’s Sora model, examining its possible overall architecture, video encoder‑decoder design, Spacetime Latent Patch mechanism, transformer‑based diffusion process, training strategies, and long‑term consistency techniques, while grounding each speculation in publicly available reports and related research.

AI analysis · Sora · Transformer
0 likes · 50 min read
NewBeeNLP
Mar 20, 2024 · Artificial Intelligence

How Open‑Sora 1.0 Replicates Sora: Architecture, Training Pipeline & Performance Insights

This article provides a comprehensive technical walkthrough of Open‑Sora 1.0, covering its Diffusion‑Transformer architecture, three‑stage training strategy, data‑preprocessing scripts, generation quality, and the Colossal‑AI acceleration that together make Sora‑level video synthesis openly reproducible.

AI Video · Diffusion Transformer · Open-Sora
0 likes · 12 min read
21CTO
Mar 17, 2024 · Artificial Intelligence

What Data Powers OpenAI’s Upcoming Video Model Sora?

OpenAI CTO Mira Murati provided vague answers about Sora’s training data, confirming the use of publicly available, licensed, and Shutterstock content while acknowledging uncertainty about social‑media sources, amid ongoing legal disputes over AI model data usage.

AI training data · OpenAI · Sora
0 likes · 4 min read
Alimama Tech
Mar 14, 2024 · Artificial Intelligence

High-Fidelity Image-to-Video Generation for E-commerce with AtomoVideo and Noise Rectification

Alibaba’s AI team introduced AtomoVideo, a diffusion‑based image‑to‑video generator enhanced by a training‑free Noise Rectification module that adds and corrects controlled noise to eliminate first‑frame errors, enabling merchants to automatically create high‑fidelity 4‑second 720p product videos with strong temporal consistency for e‑commerce advertising.

AI · AIGC · Video Generation
0 likes · 10 min read
Sohu Tech Products
Mar 6, 2024 · Artificial Intelligence

Analysis of OpenAI Sora: Data Engineering, Network Architecture, and World Model Implications

OpenAI’s Sora video model unifies image and video data into latent spacetime patches via a VAE, trains on original resolutions with GPT‑4‑expanded captions, employs a Diffusion Transformer backbone for patch‑wise denoising, and demonstrates 3D‑consistent, long‑term world‑model capabilities that hint at a unified computer‑vision paradigm and steps toward AGI.

AI research · OpenAI Sora · Transformer
0 likes · 9 min read
Architects' Tech Alliance
Feb 25, 2024 · Artificial Intelligence

How Sora Redefined Video Generation: Breakthroughs and Industry Impact

The article provides an in‑depth technical analysis of OpenAI's Sora, highlighting its 60‑second 1080p video generation capability, the novel patch‑vectorization and transformer training pipeline that leverages GPT‑generated prompts for multimodal alignment, and its potential to become a universal video‑generation base model that could reshape the AI industry.

AGI · Large Language Model · Sora
0 likes · 6 min read
CSS Magic
Feb 20, 2024 · Artificial Intelligence

OpenAI’s Sora Video Model Is Hyped—But Here Are the Flaws OpenAI Itself Acknowledges

The article walks through OpenAI’s own admission of Sora’s shortcomings (such as unrealistic physics, misplaced spatial details, and erratic object behavior) by showcasing concrete demo failures, additional observations, and technical notes about its diffusion‑based transformer architecture and metadata embedding.

AI limitations · OpenAI · Sora
0 likes · 7 min read
Rare Earth Juejin Tech Community
Feb 19, 2024 · Artificial Intelligence

Technical Review of OpenAI's Sora Video Generation Model

This article reviews OpenAI's Sora video generation model, summarizing its technical report and key innovations (patch-based visual tokens, compression networks, scaling transformers, and language understanding) and discussing its capabilities, highlights, and current limitations in AI video synthesis.

AI · OpenAI · Sora
0 likes · 9 min read
Architects' Tech Alliance
Feb 18, 2024 · Artificial Intelligence

How OpenAI’s Sora Redefines Video Generation with 3‑D Consistency and World Simulation

OpenAI’s Sora model introduces a diffusion‑transformer approach that generates high‑fidelity, 60‑second videos with consistent 3‑D camera motion, long‑term object persistence, and the ability to simulate interactive digital worlds, backed by a detailed technical report and research paper.

Artificial Intelligence · OpenAI · Sora
0 likes · 9 min read
21CTO
Feb 17, 2024 · Artificial Intelligence

How OpenAI’s Sora Is Pushing Video Generation to New Frontiers

OpenAI’s Sora model demonstrates large‑scale text‑conditional video generation using a diffusion transformer that operates on spatiotemporal patches, supporting variable durations, resolutions, and aspect ratios while showcasing emergent simulation abilities, flexible sampling, and multimodal editing capabilities, though it still has notable limitations.

AI research · Multimodal · Sora
0 likes · 19 min read
NewBeeNLP
Feb 17, 2024 · Artificial Intelligence

How Sora Highlights the Next Leap Toward AGI and Shifts AI Competition

The article analyzes OpenAI's Sora video model, arguing that its integration of large‑language‑model reasoning with diffusion techniques marks a major step toward true world understanding, reshapes creative workflows, widens the AI talent gap, and accelerates the path to artificial general intelligence.

AGI · AI trends · Sora
0 likes · 7 min read
Architect
Feb 16, 2024 · Artificial Intelligence

Can OpenAI’s Sora Redefine Text‑to‑Video Generation? An In‑Depth Technical Review

OpenAI’s newly unveiled Sora model transforms short text prompts into up‑to‑one‑minute high‑definition videos, showcasing an advanced diffusion‑Transformer architecture, improved occlusion handling, and detailed visual fidelity, while the article examines its technical breakthroughs, compares it to earlier models, and discusses emerging safety and misuse concerns.

AI safety · OpenAI · Sora
0 likes · 12 min read
AntTech
Dec 20, 2022 · Artificial Intelligence

Towards Smooth Video Composition: A New Benchmark for GAN‑Based Video Generation

Researchers from multiple institutions propose a GAN‑based video generation framework that explicitly models short‑, medium‑, and long‑range temporal relations, introduces B‑spline motion embeddings and temporal shift modules, and demonstrates substantial quality improvements across several video datasets.

B-spline · GAN · StyleGAN-V
0 likes · 7 min read
Alimama Tech
Oct 26, 2022 · Artificial Intelligence

GPU Utilization Analysis and Optimization for Alibaba's Intelligent Creative Video Service

The article analyzes why Alimama’s intelligent creative video service suffers from low GPU utilization (Python GIL blocking, lack of kernel fusion, and serialized CUDA streams) and details the service‑level changes (separate CPU/GPU processes, shared‑memory queues, priority scheduling) and operator‑level kernel‑fusion techniques (channels‑last layouts, custom pooling, TensorRT conversion) that raise utilization from ~30% to nearly 100% and boost throughput by 75%.

GPU optimization · Python · TensorRT
0 likes · 20 min read
MaGe Linux Operations
Jul 3, 2022 · Backend Development

How to Automate 10,000 Video‑Channel Posts with Python and OCR for Massive Traffic

This guide shows how to use Python to scrape high‑quality chat screenshots, apply OCR, generate silent chat videos, batch‑download matching audio from short‑video platforms, and combine them into thousands of unique WeChat Video Channel clips, leveraging volume to outsmart recommendation algorithms and boost traffic.

OCR · Python · Video Generation
0 likes · 11 min read
Tencent Advertising Technology
Nov 2, 2021 · Artificial Intelligence

Tencent Advertising Multimedia AI Platform: Intelligent Creation, Fine‑grained Understanding, Similar‑Ad Retrieval, and Smart Review

This article presents Tencent's advertising multimedia AI platform, detailing its intelligent video creation engine, fine‑grained ad content understanding, large‑scale similar‑ad retrieval system, and automated ad review pipeline, while also introducing the team and current recruitment opportunities.

Multimedia · Video Generation · ad understanding
0 likes · 22 min read
Yanxuan Tech Team
Apr 19, 2021 · Artificial Intelligence

How AI Powers Personalized Ad Creatives: From Templates to Automated Video

This article explains how algorithmic "smart creative" technology automates personalized advertising by using data‑driven templates, image and video synthesis, and aesthetic scoring to generate high‑click‑through ad content while reducing manual production costs.

AI-generated creatives · Video Generation · image composition
0 likes · 7 min read
DataFunTalk
Nov 22, 2020 · Artificial Intelligence

Short Video Analysis in Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI-driven short video analysis workflow, covering industry trends, multi‑label video classification, intelligent cover selection, and video generation techniques, while discussing challenges, model building, label expansion, continuous data iteration, and future outlook for video AI in local services.

AI · Meituan · Video Generation
0 likes · 16 min read
DataFunSummit
Nov 5, 2020 · Artificial Intelligence

Short Video Analysis for Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI‑driven short‑video analysis pipeline for local‑life scenarios, covering industry trends, multi‑label classification, intelligent cover selection, and video generation, and discusses model construction, label‑system expansion, continuous data iteration, and practical applications in restaurant and hotel domains.

AI · Meituan · Video Generation
0 likes · 16 min read