Tag

oom

1 view collected around this technical thread.

Deepin Linux
May 1, 2025 · Fundamentals

Understanding Memory and Process Interaction: Virtual Memory, Paging, and Allocation in Linux

This article explains how memory serves as temporary working storage for processes, describes the fundamentals of physical and virtual memory, details paging, page tables, and multi‑level paging, covers allocation mechanisms such as brk() and mmap(), and outlines Linux memory‑management techniques including caching, swapping, and OOM handling.

MMU · Memory Management · Paging
0 likes · 33 min read
Ops Development Stories
Feb 6, 2025 · Cloud Native

Automate Java OOM Heapdump Collection with a Kubernetes DaemonSet

This guide explains how to automatically capture Java OOM heapdump files using a DaemonSet that watches for heapdump.prof creation, compresses and uploads them to Alibaba Cloud OSS, and notifies developers via a WeChat bot, providing a scalable, non‑intrusive solution for memory‑leak diagnostics in Kubernetes environments.

DaemonSet · Go · Kubernetes
0 likes · 19 min read
Zhuanzhuan Tech
Oct 31, 2024 · Backend Development

Root Cause Analysis of OOM in a Spring Boot Service: ScriptEngine Initialization and StringSequence Memory Consumption

This article details a step‑by‑step investigation of an OutOfMemoryError in a Spring Boot social app, revealing that frequent initialization of a script engine caused massive StringSequence instances via SPI loading, and shows how consolidating the engine eliminated the OOM issue.

Arthas · MemoryAnalysis · Performance
0 likes · 15 min read
IT Services Circle
Aug 31, 2024 · Databases

Production OOM Incident Caused by Incorrect Pagination and How to Fix It

The article analyzes a production out‑of‑memory crash triggered by a pagination bug that misused the OFFSET parameter, explains why the error escaped testing and code review, and presents corrected pagination techniques for Oracle, MySQL and MyBatis to prevent similar failures.

MyBatis · MySQL · Oracle
0 likes · 6 min read
Spring Full-Stack Practical Cases
Aug 15, 2024 · Backend Development

Master SpringBoot Interceptors to Avoid OOM: Proper postHandle and afterCompletion Strategies

This article explains how SpringBoot interceptors work, demonstrates custom interceptor implementation, shows how misuse of ThreadLocal can cause memory leaks and OOM errors, and provides a fix by moving cleanup logic to afterCompletion, complete with code samples and performance screenshots.

Interceptor · Spring MVC · SpringBoot
0 likes · 9 min read
IT Services Circle
Jul 14, 2024 · Backend Development

Understanding MyBatis Dynamic SQL, OOM Incidents, and the Importance of Backend Parameter Validation

This article explains MyBatis dynamic SQL, recounts a first‑hand OOM incident caused by missing backend validation, and shares practical lessons on parameter checking, balancing reusable versus specialized interfaces, and adopting defensive programming to build more reliable backend systems.

Backend Validation · Dynamic SQL · MyBatis
0 likes · 11 min read
Lobster Programming
May 22, 2024 · Backend Development

How to Diagnose and Prevent Java OOM Errors Before Your Service Crashes

This article explains common causes of Java OutOfMemoryError, demonstrates how to reproduce OOM with sample code, and provides step‑by‑step techniques using jmap, heap dumps, and VisualVM to locate and avoid memory leaks in backend applications.

Diagnostics · JVM · Memory Management
0 likes · 6 min read
Aikesheng Open Source Community
Mar 12, 2024 · Databases

Resolving MySQL OOM Caused by Triggers and table_open_cache_instances

This article analyzes a MySQL replica that repeatedly ran out of memory due to large triggers and a high table_open_cache_instances setting, demonstrates how to reproduce the issue with test scripts, and provides a practical fix by reducing the parameter to mitigate OOM.

Database · MySQL · Performance Tuning
0 likes · 9 min read
Java Architect Essentials
Nov 27, 2023 · Backend Development

Analyzing and Resolving OutOfMemoryError in Java MyBatis Applications

This article examines the frequent OutOfMemoryError incidents in a Java backend service, explains the underlying heap and metaspace causes, analyzes MyBatis source code that leads to memory leaks, demonstrates a reproducible scenario, and offers practical optimization recommendations to prevent OOM in production.

Memory Leak · MyBatis · Performance
0 likes · 6 min read
Tencent Music Tech Team
Oct 20, 2023 · Mobile Development

Root Cause Analysis of OOM Crash in iOS Karaoke App Caused by Swizzled NSMutableArray Protection

An OOM crash in the K‑song iOS karaoke app was traced to a configuration that swizzled several NSMutableArray methods, causing each observer lookup to autorelease objects, rapidly filling autorelease‑pool pages and exhausting memory; converting the protection code to manual reference counting eliminated the leak and stopped the crashes.

AutoreleasePool · Memory Leak · Objective-C
0 likes · 21 min read
WeiLi Technology Team
Aug 31, 2023 · Operations

Why Did My Kubernetes Node Stay NotReady? OOM Killer, PLEG, and Fixes

A high‑load Kubernetes node entered NotReady due to repeated OOM‑killer activity, daemonset restarts, and PLEG health failures, and the article walks through diagnosis, log analysis, root‑cause explanation, and practical remediation steps to restore node readiness.

Kubernetes · Operations · PLEG
0 likes · 9 min read
iQIYI Technical Product Team
Aug 11, 2023 · Artificial Intelligence

Debugging Random OOM Issues in PyTorch Distributed Training on A100 Clusters

The iQIYI backend team traced random OOM crashes in PyTorch Distributed Data Parallel on an A100 cluster to a malformed DDP message injected by a security scan, which forced a near‑terabyte allocation; using jemalloc for diagnostics, they mitigated the issue by adjusting scan policies and collaborating with PyTorch to harden the protocol.

AI infrastructure · Memory Debugging · PyTorch
0 likes · 9 min read
Efficient Ops
Jul 11, 2023 · Operations

Why Did Our kube-apiserver OOM? A Deep Dive into Kubernetes Control-Plane Failures

This article details a real-world Kubernetes control‑plane outage in which kube‑apiserver was repeatedly OOM‑killed, explores cluster metrics, logs, and heap and goroutine profiles, hypothesizes root causes such as etcd latency and DeleteCollection memory leaks, and offers step‑by‑step troubleshooting and prevention guidance.

Kubernetes · Operations · Troubleshooting
0 likes · 21 min read
Tongcheng Travel Technology Center
Jun 6, 2023 · Operations

Root Cause Analysis and GC Parameter Optimization for Elasticsearch OOM Issues in the Membership Service

This article details a comprehensive investigation of an out‑of‑memory crash in a critical Elasticsearch cluster, explains how GC logs and heap dumps revealed a to‑space‑exhausted condition, and describes the G1GC tuning parameters that eliminated the nightly spikes and stabilized performance.

Elasticsearch · G1GC · JVM Tuning
0 likes · 9 min read
Java Architect Essentials
Apr 20, 2023 · Operations

Diagnosing Metaspace OOM in Java Applications: A Step‑by‑Step Analysis

This article walks through a real‑world investigation of a Metaspace Out‑Of‑Memory error in a Java service, detailing how JVM monitoring tools, class‑loader behavior, and hot‑deployment agents contributed to the issue and presenting practical fixes and preventive measures.

Arthas · ClassLoader · JVM
0 likes · 12 min read
Efficient Ops
Feb 7, 2023 · Operations

Why Did kube-apiserver OOM? A Deep Dive into Kubernetes Control‑Plane Failures

This article details a real‑world Kubernetes control‑plane outage in which kube‑apiserver was repeatedly OOM‑killed, examines cluster metrics, logs, and heap and goroutine profiles, explores root‑cause hypotheses such as etcd latency and DeleteCollection memory leaks, and offers practical prevention steps.

Kubernetes · Operations · Troubleshooting
0 likes · 19 min read
NetEase Cloud Music Tech Team
Feb 6, 2023 · Mobile Development

Memory Monitoring and Leak Detection Practices in NetEase Cloud Music Android App

The NetEase Cloud Music Android team built a comprehensive memory‑monitoring system—combining LeakCanary and KOOM for leak detection, instrumented image loading for large‑bitmap tracking, periodic heap and thread metrics collection, and automated ticket generation—to identify, rank, and resolve leaks, oversized resources, and thread‑related OOM risks across development and production.

Android · KOOM · Leak Detection
0 likes · 15 min read
Tencent Database Technology
Jan 31, 2023 · Databases

Common MySQL OOM Scenarios and TDSQL‑C Memory Optimization Strategies

This article examines typical MySQL out‑of‑memory (OOM) situations in production, explains how to diagnose memory usage with Performance Schema and other tools, and presents a series of TDSQL‑C‑specific optimization techniques—including server‑parameter tuning, process‑list monitoring, cold‑page detection, buffer‑pool limits, and dynamic resizing—to mitigate OOM risks.

Database Performance · Memory Optimization · MySQL
0 likes · 12 min read
Qunar Tech Salon
Jan 31, 2023 · Operations

Root Cause Analysis and Mitigation of JVM GC‑Induced OOM and Memory Fragmentation in a Containerized Hotel Pricing Service

This article details how long JVM garbage‑collection pauses and glibc ptmalloc memory‑fragmentation caused container OOM kills in a hotel‑pricing system, and explains the step‑by‑step diagnosis, JVM tuning, Kubernetes health‑check adjustments, and the replacement of ptmalloc with jemalloc to eliminate the issue.

GC · JVM · Kubernetes
0 likes · 9 min read