
Enlarging the Long-time Dependencies via RL-based Memory Network in Movie Affective Analysis

The paper introduces a reinforcement‑learning‑driven memory network that stores and updates historical video information via DDPG, overcoming LSTM/Transformer limitations on long‑duration movie sequences, and achieves state‑of‑the‑art affective prediction on LIRIS‑ACCEDE and related datasets, with real‑world deployments in AI content inspection and film‑element knowledge graphs.

Youku Technology

ACM MM (ACM International Conference on Multimedia) is a top-tier ACM conference in the multimedia field, rated a Class A international academic conference by the China Computer Federation (CCF). It is held annually and accepts papers on multimedia, multimedia retrieval, machine learning, artificial intelligence, computer vision, data science, HCI, and multimedia signal processing, as well as applications in healthcare, education, entertainment, and many other research directions.

Paper Title: Enlarging the Long-time Dependencies via RL-based Memory Network in Movie Affective Analysis

Abstract: Understanding the emotional content of movies is an active research topic in affective computing, with applications in movie quality assessment, climax detection, and multimedia retrieval. Mainstream sequence models such as LSTM and Transformer have inherent drawbacks on long-duration video sequences: vanishing or exploding gradients, limited memory capacity, and high computational cost. To address these issues, this work proposes a reinforcement-learning-based memory network. A readable and writable memory bank stores historical information and strengthens the model's memory. Reinforcement learning, specifically DDPG (Deep Deterministic Policy Gradient), uses a policy network and a value network to model long-term dependencies and adaptively update the memory bank. Because RL is optimized with one-step temporal-difference updates, it avoids the gradient problems of backpropagation through time (BPTT). Experiments on affective prediction with the LIRIS-ACCEDE dataset, and on related datasets for music emotion prediction and video summarization, show state-of-the-art performance. The method has also been deployed in business scenarios such as AI content inspection, Beidou Smart Investment, and film-element knowledge graphs.
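The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a memory bank of fixed-size slots, a soft attention read, a gated write in which the action supplies one gate per slot, and the standard DDPG one-step target y = r + γ·Q'(s', μ'(s')). All names (`read_memory`, `write_memory`, `toy_policy`, `td_target`) and the value of `GAMMA` are hypothetical, and `toy_policy` is a hand-written stand-in for a trained policy network.

```python
import math

GAMMA = 0.9  # discount factor; an assumed value, not taken from the paper


def read_memory(memory, query):
    """Soft read: softmax over dot-product scores, then a weighted sum of slots."""
    scores = [sum(m * q for m, q in zip(slot, query)) for slot in memory]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [
        sum(w / total * slot[i] for w, slot in zip(weights, memory))
        for i in range(len(query))
    ]


def write_memory(memory, feature, action):
    """Gated write: action[k] in [0, 1] blends slot k toward the new clip feature."""
    return [
        [(1.0 - a) * m + a * f for m, f in zip(slot, feature)]
        for slot, a in zip(memory, action)
    ]


def toy_policy(memory, feature):
    """Hypothetical stand-in for the actor: one sigmoid write gate per slot."""
    return [
        1.0 / (1.0 + math.exp(-sum(m * f for m, f in zip(slot, feature))))
        for slot in memory
    ]


def td_target(reward, next_q, gamma=GAMMA):
    """One-step TD target: depends on a single transition, not the whole sequence."""
    return reward + gamma * next_q


if __name__ == "__main__":
    memory = [[0.0, 0.0], [1.0, 0.0]]             # two slots, 2-D features
    clips = [[0.5, 0.5], [0.0, 1.0], [1.0, 1.0]]  # toy per-clip features
    for feature in clips:
        context = read_memory(memory, feature)    # history summary for the predictor
        action = toy_policy(memory, feature)      # the policy decides how to update
        memory = write_memory(memory, feature, action)
    print(memory)
    print(td_target(reward=-0.2, next_q=1.5))
```

In this sketch the value estimate is trained toward `td_target` one transition at a time, so no gradient has to flow back through every earlier clip. That is the property the abstract contrasts with BPTT.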

Authors: Zhang Jie, Zhao Yin, Qian Kai (all from Alibaba Entertainment AI Brain – Beidou Star team).

Team Overview: The Alibaba Entertainment Beidou Star AI Brain team uses big data and AI to mine user needs. It has built capabilities for structured content-acquisition evaluation, casting-suitability assessment, AI-driven content inspection, scheduling, and digital promotion, supporting content decisions across the full lifecycle and helping the platform reduce costs and improve efficiency.


Tags: reinforcement learning, multimedia AI, long-term dependencies, Memory Network, movie affective analysis