OneFlow Coop: Joint Optimization of Dynamic‑Graph Recomputation and Memory Allocation
This article introduces OneFlow Coop, a memory-optimization technique that jointly optimizes dynamic-graph recomputation strategies and GPU memory allocation. It analyzes the limitations of existing DTR methods, proposes three modules (recomputable in-place, op-guided tensor allocation, and layout-aware eviction), and demonstrates superior experimental results.
The article presents OneFlow Coop, a novel approach that combines dynamic tensor rematerialization (DTR) with memory-allocation strategies to reduce GPU memory consumption during neural-network training.
It first explains the background of DTR, noting that most GPU memory is occupied by intermediate feature tensors rather than model parameters, and describes how recomputation can free memory by re‑evaluating released tensors during the backward pass.
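The core idea behind recomputation can be sketched in a few lines: a tensor keeps the recipe (operation plus inputs) that produced it, so its storage can be released and rebuilt on demand. The sketch below is a toy simplification for illustration; the class and function names are hypothetical and not OneFlow's actual API.

```python
# Toy sketch of tensor recomputation (a hypothetical simplification,
# not OneFlow's implementation). A tensor remembers how it was made,
# so its storage can be evicted and recomputed when touched again.

class RematTensor:
    def __init__(self, data, op=None, inputs=()):
        self.data = data        # payload; set to None when evicted
        self.op = op            # function that produced this tensor
        self.inputs = inputs    # parent RematTensors

    def evict(self):
        # Free storage but keep the recipe; leaf tensors (no op) stay.
        if self.op is not None:
            self.data = None

    def materialize(self):
        # Recompute from parents if the storage was released.
        if self.data is None:
            self.data = self.op(*[t.materialize() for t in self.inputs])
        return self.data


def apply(op, *inputs):
    # Run an op eagerly and record its provenance for later recomputation.
    out = op(*[t.materialize() for t in inputs])
    return RematTensor(out, op=op, inputs=inputs)


# Example chain: x -> y = x * 2 -> z = y + 1
x = RematTensor(3.0)                  # leaf; cannot be evicted
y = apply(lambda a: a * 2, x)
z = apply(lambda a: a + 1, y)
y.evict()                             # release y's storage under memory pressure
print(y.materialize())                # -> 6.0, rebuilt from x on demand
```

During the backward pass, touching an evicted tensor triggers `materialize`, which walks the recipe chain until it reaches tensors that are still resident.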
The limitations of existing DTR methods are discussed: greedy selection can cause fragmented memory, ignore tensor ordering, and conflict with in‑place operations, leading to inefficient memory usage and higher recomputation costs.
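The fragmentation problem can be made concrete with a small numeric example: a cost-only policy may free plenty of memory in total while leaving no contiguous hole large enough to serve a request. The pool layout and sizes below are invented for illustration.

```python
# Illustration of fragmentation under cost-only eviction (made-up pool).
# blocks: list of (state, size_in_MB) in address order; "free" or a name.

def largest_hole(blocks):
    """Largest run of adjacent free memory in an address-ordered pool."""
    best = run = 0
    for state, size in blocks:
        run = run + size if state == "free" else 0
        best = max(best, run)
    return best

pool = [("free", 100), ("A", 100), ("pinned", 100), ("B", 100), ("free", 100)]
# Suppose a cost-only policy evicts A and B because they are cheapest:
evicted = [("free", s) if st in ("A", "B") else (st, s) for st, s in pool]
print(sum(s for st, s in evicted if st == "free"))  # 400 MB free in total
print(largest_hole(evicted))                        # -> 200: largest hole
```

Despite 400 MB being free overall, a 300 MB allocation still fails because the pinned block splits the free space into two 200 MB holes; this is exactly the layout information that cost-only eviction ignores.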
Coop addresses these issues through three modules: (1) recomputable in‑place, which enables tensors to share memory while remaining recomputable; (2) op‑guided tensor allocation, which places low‑cost tensors and high‑cost tensors on opposite ends of the memory pool based on operation type; and (3) layout‑aware eviction, which uses a sliding‑window algorithm to find the optimal contiguous free‑memory block with minimal release cost, reducing the search complexity from O(n²) to O(n).
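The sliding-window search in module (3) can be sketched as a two-pointer scan over the pool in address order: grow the window until it covers the requested size, shrink it from the left while coverage holds, and track the window with the lowest total eviction cost. This is an illustrative sketch of the idea, not Coop's code, and it simplifies by treating each tensor as a single block with a scalar cost (free blocks carry cost 0).

```python
# Sliding-window search for the cheapest contiguous region to evict
# (illustrative sketch of layout-aware eviction, not Coop's source).
# blocks: (size_in_bytes, eviction_cost) in address order; free = cost 0.

def cheapest_window(blocks, need):
    """Return (start, end) indices of the contiguous block range that
    covers `need` bytes with minimal total eviction cost, or None."""
    best, best_cost = None, float("inf")
    size = cost = 0
    left = 0
    for right, (sz, c) in enumerate(blocks):
        size += sz
        cost += c
        # Shrink from the left while the window still covers the request.
        while size - blocks[left][0] >= need:
            size -= blocks[left][0]
            cost -= blocks[left][1]
            left += 1
        if size >= need and cost < best_cost:
            best, best_cost = (left, right), cost
    return best

# Pool: [free 100 | tensor 200 cost 5 | free 300 | tensor 100 cost 1 | free 50]
pool = [(100, 0), (200, 5), (300, 0), (100, 1), (50, 0)]
print(cheapest_window(pool, 400))   # -> (2, 3): evict only the cost-1 tensor
```

Because each block enters and leaves the window at most once, the scan is O(n), versus the O(n²) of checking every candidate interval.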
Experimental results show that Coop consistently achieves lower memory fragmentation and faster search times compared to traditional DTR and the DTE variant across multiple models, while maintaining comparable computational overhead.
The article concludes with a brief Q&A, confirming OneFlow’s support for both dynamic and static graphs, its compatibility with PyTorch APIs, and the applicability of Coop to large‑model training and scenarios with limited GPU memory.
DataFunSummit