Tag

memory isolation


Bilibili Tech
Jun 4, 2024 · Big Data

Improving Resource Utilization and Isolation in Bilibili Big Data Clusters with the Amiya Over‑commit Component

By deploying the self‑developed Amiya over‑commit component together with kernel‑level cgroup memory isolation, explicit task priorities, OOM‑priority killing, and asynchronous reclamation, Bilibili’s big‑data clusters boosted daily resource utilization by about 15 %, eliminated DataNode OOM kills, cut memory‑reclaim latency to zero, and achieved a further 9 % overall efficiency gain.

Big Data · OOM Priority · cgroup
18 min read
Baidu Geek Talk
Jul 18, 2022 · Artificial Intelligence

GPU Container Virtualization for AI Heterogeneous Computing: Architecture and Best Practices

The article surveys GPU container virtualization for AI heterogeneous computing, detailing utilization challenges, historical architectures, various virtualization methods, Baidu's dual-engine user- and kernel-space design with isolation and scheduling features, performance benefits, best‑practice scenarios, and deployment guidance, concluding with a technical Q&A.

AI computing · Containerization · GPU virtualization
30 min read