Tagged articles

129 articles

Apr 10, 2024 · Artificial Intelligence

SizeCube: AI‑Driven Arbitrary‑Size Image and Video Outpainting for Advertising

SizeCube leverages Stable Diffusion‑based models and a multi‑stage pipeline (quality filtering, feature mining, latent‑space UNet denoising, super‑resolution, and temporal 3D‑U‑Net video processing) to automatically outpaint images and videos to any size, boosting Alibaba advertisers’ creative flexibility, click‑through rates, and asset adaptability across diverse ad placements.

AI · Advertising · Image Outpainting

0 likes · 14 min read

Architects' Tech Alliance

Apr 7, 2024 · Artificial Intelligence

How Sora Is Redefining Text‑to‑Video Generation: Inside the New AI Model

Sora, the newly announced text‑to‑video large model, can generate one‑minute high‑fidelity videos from textual prompts or static images, handling complex scenes, expressive characters, and sophisticated camera motions while also supporting video extension and frame‑filling, positioning it at the forefront of multimodal AI research.

AI model · Multimodal · Sora

0 likes · 6 min read

Architect

Mar 28, 2024 · Artificial Intelligence

Understanding OpenAI's Sora Video Generation Model: Architecture, Workflow, and Core Technologies

This article explains OpenAI's Sora video generation model, detailing its latent diffusion foundation, video compression network, spacetime patch representation, Diffusion Transformer processing, and decoding pipeline, while also reviewing related Stable Diffusion and Transformer concepts that enable high‑quality text‑to‑video synthesis.

AI · Latent Diffusion · Sora

0 likes · 17 min read

DevOps

Mar 26, 2024 · Artificial Intelligence

OpenAI’s Sora: A One‑Minute Text‑to‑Video Diffusion Transformer Model

OpenAI’s newly released Sora model demonstrates one‑minute text‑to‑video generation using a diffusion‑based transformer architecture that operates on spatiotemporal patches, compresses visual data into latent codes, and builds on a wide range of prior video generation research, while the article also advertises a DevOps certification program.

AI · OpenAI · Sora

0 likes · 8 min read

DaTaobao Tech

Mar 25, 2024 · Artificial Intelligence

Survey of AIGC Video Generation Algorithms

Since 2023, AI‑generated video research has expanded across six algorithmic categories—text‑to‑video, image‑to‑video, editing, style transfer, human motion, and long‑video generation—highlighting works such as CogVideo, Imagen Video, MagicVideo, ControlVideo, DCTNet, NUWA‑XL and OpenAI’s Sora, while analysis shows short‑clip diffusion models excel, editing remains costly, style transfer is efficient, and truly long, temporally consistent videos remain an open challenge.

AI · AIGC · Video Editing

0 likes · 13 min read

NewBeeNLP

Mar 22, 2024 · Artificial Intelligence

Unraveling Sora: How OpenAI Might Build Its Text‑to‑Video Engine

This article provides a step‑by‑step technical analysis of OpenAI’s Sora model, examining its possible overall architecture, video encoder‑decoder design, Spacetime Latent Patch mechanism, transformer‑based diffusion process, training strategies, and long‑term consistency techniques, while grounding each speculation in publicly available reports and related research.

AI analysis · Sora · Transformer

0 likes · 50 min read

NewBeeNLP

Mar 20, 2024 · Artificial Intelligence

How Open‑Sora 1.0 Replicates Sora: Architecture, Training Pipeline & Performance Insights

This article provides a comprehensive technical walkthrough of Open‑Sora 1.0, covering its Diffusion‑Transformer architecture, three‑stage training strategy, data‑preprocessing scripts, generation quality, and the Colossal‑AI acceleration that together make Sora‑level video synthesis openly reproducible.

AI Video · Diffusion Transformer · Open-Sora

0 likes · 12 min read

21CTO

Mar 17, 2024 · Artificial Intelligence

What Data Powers OpenAI’s Upcoming Video Model Sora?

OpenAI CTO Mira Murati provided vague answers about Sora’s training data, confirming the use of publicly available, licensed, and Shutterstock content while acknowledging uncertainty about social‑media sources, amid ongoing legal disputes over AI model data usage.

AI training data · OpenAI · Sora

0 likes · 4 min read

Alimama Tech

Mar 14, 2024 · Artificial Intelligence

High-Fidelity Image-to-Video Generation for E-commerce with AtomoVideo and Noise Rectification

Alibaba’s AI team introduced AtomoVideo, a diffusion‑based image‑to‑video generator enhanced by a training‑free Noise Rectification module that adds and corrects controlled noise to eliminate first‑frame errors, enabling merchants to automatically create high‑fidelity 4‑second 720p product videos with strong temporal consistency for e‑commerce advertising.

AI · AIGC · Video Generation

0 likes · 10 min read

DeWu Technology

Mar 11, 2024 · Artificial Intelligence

Understanding OpenAI's Sora Video Generation Model: Diffusion, Transformers, and Latent Space

OpenAI's Sora video generation model combines latent diffusion with a video compression encoder-decoder: it tokenizes spatio-temporal patches, processes them with a diffusion‑trained Transformer conditioned on DALL·E‑style text annotations, and then decodes the result into high‑resolution videos up to a minute long.

AI · Latent Diffusion · Sora

0 likes · 18 min read

Sohu Tech Products

Mar 6, 2024 · Artificial Intelligence

Analysis of OpenAI Sora: Data Engineering, Network Architecture, and World Model Implications

OpenAI’s Sora video model unifies image and video data into latent spacetime patches via a VAE, trains on original resolutions with GPT‑4‑expanded captions, employs a Diffusion Transformer backbone for patch‑wise denoising, and demonstrates 3D‑consistent, long‑term world‑model capabilities that hint at a unified computer‑vision paradigm and steps toward AGI.

AI research · OpenAI Sora · Transformer

0 likes · 9 min read

Architects' Tech Alliance

Feb 25, 2024 · Artificial Intelligence

How Sora Redefined Video Generation: Breakthroughs and Industry Impact

The article provides an in‑depth technical analysis of OpenAI's Sora, highlighting its 60‑second 1080p video generation capability, the novel patches‑vectorization and transformer training pipeline that leverages GPT‑generated prompts for multimodal alignment, and its potential to become a universal video‑generation base model that could reshape the AI industry.

AGI · Large Language Model · Sora

0 likes · 6 min read

CSS Magic

Feb 20, 2024 · Artificial Intelligence

OpenAI’s Sora Video Model Is Hyped—But Here Are the Flaws OpenAI Itself Acknowledges

The article walks through OpenAI’s own admission of Sora’s shortcomings, such as unrealistic physics, misplaced spatial details, and erratic object behavior, by showcasing concrete demo failures, additional observations, and technical notes about its diffusion‑based transformer architecture and metadata embedding.

AI limitations · OpenAI · Sora

0 likes · 7 min read

Rare Earth Juejin Tech Community

Feb 19, 2024 · Artificial Intelligence

Technical Review of OpenAI's Sora Video Generation Model

This article reviews OpenAI's Sora video generation model, summarizing its technical report and key innovations, such as patch-based visual tokens, compression networks, scaling transformers, and language understanding, and discussing its capabilities, highlights, and current limitations in AI video synthesis.

AI · OpenAI · Sora

0 likes · 9 min read

Architects' Tech Alliance

Feb 18, 2024 · Artificial Intelligence

How OpenAI’s Sora Redefines Video Generation with 3‑D Consistency and World Simulation

OpenAI’s Sora model introduces a diffusion‑transformer approach that generates high‑fidelity, 60‑second videos with consistent 3‑D camera motion, long‑term object persistence, and the ability to simulate interactive digital worlds, backed by a detailed technical report and research paper.

Artificial Intelligence · OpenAI · Sora

0 likes · 9 min read

21CTO

Feb 17, 2024 · Artificial Intelligence

How OpenAI’s Sora Is Pushing Video Generation to New Frontiers

OpenAI’s Sora model demonstrates large‑scale text‑conditional video generation using a diffusion transformer that operates on spatiotemporal patches, supporting variable durations, resolutions, and aspect ratios while showcasing emergent simulation abilities, flexible sampling, and multimodal editing capabilities, though it still has notable limitations.

AI research · Multimodal · Sora

0 likes · 19 min read

NewBeeNLP

Feb 17, 2024 · Artificial Intelligence

How Sora Highlights the Next Leap Toward AGI and Shifts AI Competition

The article analyzes OpenAI's Sora video model, arguing that its integration of large‑language‑model reasoning with diffusion techniques marks a major step toward true world understanding, reshapes creative workflows, widens the AI talent gap, and accelerates the path to artificial general intelligence.

AGI · AI trends · Sora

0 likes · 7 min read

Architect

Feb 16, 2024 · Artificial Intelligence

Can OpenAI’s Sora Redefine Text‑to‑Video Generation? An In‑Depth Technical Review

OpenAI’s newly unveiled Sora model transforms short text prompts into up‑to‑one‑minute high‑definition videos, showcasing advanced diffusion‑Transformer architecture, improved occlusion handling, and detailed visual fidelity, while the article examines its technical breakthroughs, compares it to earlier models, and discusses emerging safety and misuse concerns.

AI safety · OpenAI · Sora

0 likes · 12 min read

AntTech

Dec 20, 2022 · Artificial Intelligence

Towards Smooth Video Composition: A New Benchmark for GAN‑Based Video Generation

Researchers from multiple institutions propose a GAN‑based video generation framework that explicitly models short‑, medium‑, and long‑range temporal relations, introduces B‑spline motion embeddings and temporal shift modules, and demonstrates substantial quality improvements across several video datasets.

B-spline · GAN · StyleGAN-V

0 likes · 7 min read

Alimama Tech

Oct 26, 2022 · Artificial Intelligence

GPU Utilization Analysis and Optimization for Alibaba's Intelligent Creative Video Service

The paper analyzes why Alimama’s intelligent creative video service suffers from low GPU utilization (caused by Python GIL blocking, lack of kernel fusion, and serialized CUDA streams) and details service‑level changes (separate CPU/GPU processes, shared‑memory queues, priority scheduling) and operator‑level kernel‑fusion techniques (channels‑last layouts, custom pooling, TensorRT conversion) that raise utilization from ~30% to near 100% and boost throughput by 75%.

GPU optimization · Python · TensorRT

0 likes · 20 min read

MaGe Linux Operations

Jul 3, 2022 · Backend Development

How to Automate 10,000 Video‑Channel Posts with Python and OCR for Massive Traffic

This guide shows how to use Python to scrape high‑quality chat screenshots, apply OCR, generate silent chat videos, batch‑download matching audio from short‑video platforms, and combine them into thousands of unique WeChat Video Channel clips, leveraging volume to outsmart recommendation algorithms and boost traffic.

OCR · Python · Video Generation

0 likes · 11 min read

MaGe Linux Operations

Feb 16, 2022 · Artificial Intelligence

Recreate Wuhan University’s Cherry Blossom Bloom with Python and OpenCV

This tutorial shows how to use Python, OpenCV, and Pillow to capture, process, and animate Wuhan University’s cherry blossom scenes, turning pixel data into a time‑lapse video with custom text overlays and frame‑by‑frame control.

OpenCV · Python · Video Generation

0 likes · 5 min read

Python Programming Learning Circle

Jan 17, 2022 · Fundamentals

Creating a Cherry Blossom Animation with Python, OpenCV, and Pillow

This article demonstrates how to use Python, OpenCV, and Pillow to capture, annotate, and assemble cherry‑blossom images into a video, explaining pixel color representation, frame saving, canvas creation, text rendering, and video encoding steps with complete code examples.

Tutorial · Video Generation · image-processing

0 likes · 5 min read

Tencent Advertising Technology

Nov 2, 2021 · Artificial Intelligence

Tencent Advertising Multimedia AI Platform: Intelligent Creation, Fine‑grained Understanding, Similar‑Ad Retrieval, and Smart Review

This article presents Tencent's advertising multimedia AI platform, detailing its intelligent video creation engine, fine‑grained ad content understanding, large‑scale similar‑ad retrieval system, and automated ad review pipeline, while also introducing the team and current recruitment opportunities.

Multimedia · Video Generation · ad understanding

0 likes · 22 min read

Python Crawling & Data Mining

Jul 9, 2021 · Fundamentals

Build a Free Taobao Main Image Video Generator with Python, Tkinter & FFmpeg

This guide walks you through building a free Python Tkinter desktop application that merges multiple PNG or JPG images with background audio into a video using FFmpeg, covering environment setup, GUI design, file handling, log capture, video generation, and preview steps.

GUI · Python · Tkinter

0 likes · 11 min read

Alibaba Terminal Technology

Jun 2, 2021 · Operations

Turn Your Git History into a Stunning Video with Gource and Avconv

This guide shows how to install Gource and Avconv, configure Chinese font support, and use a series of command‑line options to transform any Git repository’s commit history into a high‑resolution video, optionally adding background music for a polished visual celebration of your project.

Tutorial · Video Generation · avconv

0 likes · 4 min read

Yanxuan Tech Team

Apr 19, 2021 · Artificial Intelligence

How AI Powers Personalized Ad Creatives: From Templates to Automated Video

This article explains how algorithmic "smart creative" technology automates personalized advertising by using data‑driven templates, image and video synthesis, and aesthetic scoring to generate high‑click‑through ad content while reducing manual production costs.

AI-generated creatives · Video Generation · image composition

0 likes · 7 min read

DataFunTalk

Nov 22, 2020 · Artificial Intelligence

Short Video Analysis in Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI-driven short video analysis workflow, covering industry trends, multi‑label video classification, intelligent cover selection, and video generation techniques, while discussing challenges, model building, label expansion, continuous data iteration, and future outlook for video AI in local services.

AI · Meituan · Video Generation

0 likes · 16 min read

DataFunSummit

Nov 5, 2020 · Artificial Intelligence

Short Video Analysis for Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI‑driven short‑video analysis pipeline for local‑life scenarios, covering industry trends, multi‑label classification, intelligent cover selection, and video generation, and discusses model construction, label‑system expansion, continuous data iteration, and practical applications in restaurant and hotel domains.

AI · Meituan · Video Generation

0 likes · 16 min read
