Tag

Ops

1 view collected around this technical thread.

Efficient Ops
Jun 15, 2025 · Operations

Master Ansible: Automate 300+ Servers with Simple Playbooks

This guide introduces Ansible’s core concepts, installation steps, common commands, and a complete Nginx deployment playbook, showing how to efficiently automate configuration, scaling, and updates across hundreds of servers.

Ansible · Automation · Configuration Management
0 likes · 7 min read
Efficient Ops
May 6, 2025 · Databases

5 Must‑Have GUI Tools to Master Redis Management

Operations engineers struggling with countless Redis commands and opaque data structures can simplify their workflow with five recommended visual tools that turn complex Redis operations into intuitive interfaces, complete with monitoring, cluster support, and cross‑platform clients.

GUI · Ops · Redis
0 likes · 4 min read
Efficient Ops
Apr 14, 2025 · Operations

How LoggiFly Simplifies Docker Log Monitoring and Automated Alerts

LoggiFly is a lightweight Docker log monitoring tool that detects predefined keywords or regex patterns, supports multi‑channel notifications, can automatically restart or stop containers, and offers flexible deployment via environment variables or YAML configuration, helping ops teams maintain stable containerized environments.

Automation · Docker · LoggiFly
0 likes · 6 min read
Efficient Ops
Mar 2, 2025 · Operations

How to Diagnose Linux Server Performance Issues in the First 60 Seconds

This article walks you through the ten essential Linux command‑line tools—such as uptime, vmstat, iostat, and top—that Netflix’s performance engineers use to quickly assess system load, resource saturation, and errors within the critical first minute of troubleshooting.

Linux · Ops · Performance Monitoring
0 likes · 18 min read
Efficient Ops
Dec 29, 2024 · Operations

Turn Shell Commands into Real‑Time Visual Dashboards with Sampler

Sampler is a lightweight tool that lets you execute shell commands, visualize their output, and set up alerts using simple YAML configurations, offering a quick, server‑less alternative to full‑blown monitoring stacks for databases, message queues, and custom scripts.

Monitoring · Ops · sampler
0 likes · 16 min read
Java Tech Enthusiast
Dec 2, 2024 · Operations

Sampler: A Visual Server Monitoring Tool for Linux

Sampler is a Linux visual monitoring tool that runs from a single binary, uses simple YAML files to define widgets such as sparklines and bar charts, and displays real‑time CPU, memory, network, Docker container statistics and other metrics, while being easily extensible to services like MySQL, MongoDB and Kafka.

Linux · Ops · sampler
0 likes · 7 min read
Linux Ops Smart Journey
Nov 10, 2024 · Operations

Master Ansible: Using yum_repository, yum, and systemd Modules for Efficient Automation

This article explores three frequently used Ansible modules—yum_repository, yum, and systemd—detailing their parameters, usage examples, and practical commands to streamline package management and service control, helping DevOps engineers boost automation efficiency in cloud and container environments.

Ansible · Ops · Package Management
0 likes · 10 min read
Linux Ops Smart Journey
Nov 5, 2024 · Operations

Master 8 Essential Ansible Modules for Efficient Automation

This article introduces eight essential Ansible modules, including file, copy, template, fetch, and get_url, explaining their parameters, usage examples, and how they simplify automation tasks in operations, with code snippets and reference links for deeper learning.

Ansible · Automation · Configuration Management
0 likes · 11 min read
Baidu Tech Salon
Oct 16, 2024 · Big Data

Design and Implementation of an Online/Offline Integrated Task Scheduling System for Baidu's Mobile Operations Promotion Platform (OPS)

The paper presents Baidu’s Mobile Operations Promotion Platform redesign, introducing an online‑offline integrated task‑scheduling architecture that partitions settlement fields to the data‑warehouse, records all jobs in a unified MySQL operation table, orchestrates them via Turing Data Studio, and manages dependencies to achieve consistent, auditable, billion‑scale settlement processing.

Baidu · Offline Processing · Ops
0 likes · 14 min read
Beijing SF i-TECH City Technology Team
May 30, 2024 · Operations

Root Cause Analysis of CPU Sys Spikes and Memory Pressure in Linux Services

This article investigates two real‑world performance incidents—one caused by excessive disk I/O from a misconfigured Filebeat and another by kernel memory‑fragmentation bugs triggered by a trace feature—detailing observations, Linux diagnostic commands, analysis, and practical remediation steps.

CPU · Linux · Memory
0 likes · 15 min read
Python Programming Learning Circle
May 23, 2024 · Operations

Supervisor Process Monitoring and Management Guide

This article introduces Supervisor, a client/server process monitoring tool for Unix-like systems, explains its installation, configuration, and usage—including custom service and application files, command-line control with supervisorctl, advanced features like process groups, automatic restart policies, and web UI—providing practical examples and code snippets for reliable daemon management.

Automation · Linux · Ops
0 likes · 17 min read
Java Tech Enthusiast
Jan 7, 2024 · Operations

Using the Linux top Command for Real-Time System Monitoring

The Linux top command offers a dynamic, real‑time view of system processes and resource usage—showing overall statistics, CPU and memory breakdowns, and detailed process columns—while supporting customizable refresh intervals, batch mode, and interactive shortcuts for sorting, column selection, and monitoring crucial metrics like %idle, %wa, and %steal.

CPU · Linux · Ops
0 likes · 7 min read
Efficient Ops
Sep 26, 2023 · Operations

Mastering Zabbix: From Installation to Advanced Monitoring and Automation

This comprehensive guide walks you through Zabbix monitoring concepts, reliability calculations, installation methods, web UI configuration, host and template management, custom monitoring, alert integration with OneAlert, Grafana visualization, distributed monitoring, SNMP support, and practical scripts for large‑scale server environments.

Alerting · Automation · Grafana
0 likes · 28 min read
DeWu Technology
Aug 28, 2023 · Operations

Real-time Data Warehouse Business-Side Chaos Engineering Practice

The article describes how a real‑time data warehouse supporting ad‑delivery metrics adopts both technical and business‑side chaos‑engineering, using red‑blue team drills to inject faults, monitor indicator anomalies, and refine response procedures, thereby enhancing early risk detection, system resilience, and overall data stability for the advertising platform.

Chaos Engineering · Data Quality · Data Warehousing
0 likes · 16 min read
DevOps
Jul 20, 2023 · Operations

Why Continuous Testing Is Essential for Infrastructure and How to Implement It

The article explains why continuous testing of infrastructure is critical for stability and reliability, outlines a comprehensive testing scope ranging from unit to reliability tests, discusses tool selection and practical Terraform‑based examples, and shows how test‑driven development can improve IaC workflows.

Continuous Integration · IaC · Infrastructure Testing
0 likes · 9 min read
Efficient Ops
Jan 30, 2023 · Operations

Master Redis Monitoring: Key Metrics, Commands, and Performance Testing

This guide explains essential Redis monitoring metrics, the tools and commands for collecting performance, memory, activity, persistence, and error data, and shows how to use INFO, slowlog, and redis-benchmark to assess and improve database operations.

Metrics · Monitoring · Ops
0 likes · 6 min read
Efficient Ops
Nov 9, 2022 · Operations

9 Essential Linux Shell Scripts for System Administration and Automation

This article presents nine practical Linux shell script examples—ranging from DDoS IP blocking and alert notifications to MySQL backups, Nginx log management, network traffic monitoring, system initialization, and disk usage checks—that operations engineers can adapt and deploy in real-world environments.

Automation · Linux · Ops
0 likes · 10 min read
Practical DevOps Architecture
Sep 30, 2022 · Operations

Resolving Filebeat Startup Failure: EOF Error in Registrar State

This guide explains how to troubleshoot Filebeat failing to start due to an EOF error while loading registrar state, by inspecting logs, resetting the registry directory, and restarting the service on a Linux host.

Filebeat · Linux · Logstash
0 likes · 4 min read
Efficient Ops
Aug 8, 2022 · Operations

Master Essential Linux Ops: xargs, Background Jobs, Process Monitoring & More

This guide walks you through practical Linux operations—from using xargs for efficient file handling and running commands in the background, to monitoring high‑memory and high‑CPU processes, viewing multiple logs with multitail, continuous ping logging, checking TCP states, identifying top IPs on port 80, and leveraging SSH for port forwarding.

Linux · Monitoring · Ops
0 likes · 10 min read