Tag

Distributed Storage

1 view collected around this technical thread.

Linux Ops Smart Journey
Jun 4, 2025 · Cloud Native

Deploy Longhorn on Kubernetes with Helm: Step‑by‑Step Guide

This article provides a comprehensive, hands‑on tutorial for deploying the open‑source Longhorn distributed block storage system on a Kubernetes cluster using Helm, covering prerequisites, Helm chart preparation, installation, validation, and PVC mounting to ensure reliable stateful workloads.

Distributed Storage · Helm · Kubernetes
0 likes · 11 min read
AntData
Mar 14, 2025 · Fundamentals

Analysis of DeepSeek 3FS Storage Service Architecture and Design

This article provides an in‑depth technical analysis of DeepSeek's open‑source 3FS distributed file system, focusing on the StorageService architecture, space pooling, allocation mechanisms, reference counting, fragmentation handling, and the RDMA‑based read/write data path.

Distributed Storage · File System · RDMA
0 likes · 15 min read
AntData
Mar 5, 2025 · Cloud Native

DeepSeek 3FS Network Communication Module: Design, Implementation, and Impact on AI Infrastructure

This article provides an in‑depth analysis of DeepSeek's open‑source 3FS distributed storage system, focusing on its network communication module, RDMA‑based design, core classes such as IBSocket, Listener, and IOWorker, and how these innovations advance high‑performance AI infrastructure.

AI infrastructure · Distributed Storage · Folly Coroutines
0 likes · 15 min read
Deepin Linux
Feb 23, 2025 · Cloud Computing

Understanding Ceph Distributed Storage Architecture and Its Core Components

Ceph is a unified, open‑source distributed storage system whose layered architecture—comprising RADOS, LIBRADOS, and upper‑level services like RADOSGW, RBD, and CephFS—provides high performance, reliability, scalability, and flexible data access for cloud, big‑data, and AI workloads.

Big Data · Ceph · Cloud Computing
0 likes · 25 min read
Bilibili Tech
Jan 17, 2025 · Backend Development

NeighborHash: An Enhanced Batch Query Architecture for Real‑time Recommendation Systems

NeighborHash is a distributed batch‑query architecture for real‑time recommendation systems that combines a cache‑line‑optimized hash table—featuring Lodger Relocation, bidirectional cache‑aware probing, and inline‑chaining—with an NVMe‑backed key‑value service, versioned updates, and asynchronous memory‑access chaining to achieve sub‑microsecond, high‑throughput top‑N retrieval.

AMAC · Batch Query · Distributed Storage
0 likes · 20 min read
IT Architects Alliance
Jan 8, 2025 · Big Data

Understanding Distributed Storage: A Comparative Overview of HDFS, Ceph, and MinIO

This article explains the fundamentals, use cases, advantages, and trade‑offs of three major distributed storage solutions—HDFS, Ceph, and MinIO—guiding readers on how to select the most suitable system for big‑data, cloud‑native, and containerized environments.

Big Data · Ceph · Distributed Storage
0 likes · 12 min read
360 Zhihui Cloud Developer
Jan 7, 2025 · Big Data

High‑Performance Distributed Storage: Ceph vs Alibaba Pangu 2.0 vs XSKY INFINI

This article compares three high‑performance distributed storage systems—Ceph, Alibaba's Pangu 2.0, and XSKY INFINI—examining their architectures, key technologies such as RTC thread models, append‑only writes, kernel‑bypass, RDMA, data compression, and metadata management to reveal how they exploit modern flash hardware.

Ceph · Distributed Storage · NVMe
0 likes · 21 min read
IT Architects Alliance
Jan 2, 2025 · Blockchain

Challenges and Solutions for IT Architecture in the Web3.0 Era

The article examines how the shift from centralized Web2.0 to decentralized Web3.0 creates major architectural challenges—data scattering, smart‑contract security, high concurrency, and cross‑chain interoperability—and proposes technical solutions such as distributed storage, consistency algorithms, code auditing, caching, load balancing, and cross‑chain standards.

Distributed Storage · Web3.0 · blockchain
0 likes · 14 min read
JD Tech
Dec 26, 2024 · Databases

Optimizing Query Performance for JD's BIP Procurement System with JED, JimKV, and Elasticsearch

This article details how JD's BIP procurement system tackled massive query‑performance challenges by segmenting order data, leveraging the JED distributed MySQL solution, introducing JimKV for hot‑data caching, and offloading complex searches to Elasticsearch, resulting in dramatically reduced load and faster user experiences.

Database Optimization · Distributed Storage · Elasticsearch
0 likes · 11 min read
JD Retail Technology
Dec 12, 2024 · Databases

Optimizing Query Performance for JD's BIP Procurement System with JED, JimKV, and Elasticsearch

This article presents a comprehensive case study of how JD's procurement system (BIP) tackled massive data volume and complex query challenges by redesigning data models, introducing heterogeneous storage for inbound orders, leveraging JED and JimKV, and offloading complex searches to Elasticsearch, resulting in dramatically reduced database load and improved user experience.

Database Optimization · Distributed Storage · Elasticsearch
0 likes · 11 min read
High Availability Architecture
Aug 16, 2024 · Big Data

Introduction to Elasticsearch: Core Concepts, Query Types, Pagination, and Data Synchronization

This article provides a comprehensive overview of Elasticsearch, covering its distributed storage architecture, core data model concepts, analysis and query capabilities, practical next‑token pagination techniques, join strategies, and various data synchronization methods for integrating Elasticsearch with other systems.

Big Data · Data Synchronization · Distributed Storage
0 likes · 13 min read
DevOps Operations Practice
Aug 15, 2024 · Cloud Native

Five Best Open-Source Kubernetes Storage Solutions

This article reviews five leading open‑source storage solutions for Kubernetes (OpenEBS, Rook, GlusterFS, Ceph, and Longhorn), detailing their architectures, key features, and ideal use cases to help readers select the most appropriate storage option for various application requirements.

Distributed Storage · Kubernetes · cloud-native
0 likes · 6 min read
DataFunTalk
May 27, 2024 · Big Data

JD Retail’s Unified HDFS Storage: Cross‑Region and Hierarchical Storage Practices

This article details JD Retail’s large‑scale HDFS deployment, describing how cross‑region storage challenges were solved with a full‑copy topology, asynchronous block replication, flow‑control mechanisms, and a tiered storage strategy that automatically moves hot, warm, and cold data among SSD, HDD, and high‑density HDD nodes to improve performance and cut costs.

Big Data · Cross-Region · Distributed Storage
0 likes · 20 min read
DataFunTalk
May 21, 2024 · Big Data

Applying Alluxio to Autonomous Driving Model Training: Deployment, Performance, and Operational Insights

This article details how Alluxio was adopted to replace NAS in autonomous driving model training, describing the data closed‑loop workflow, the challenges of the previous system, Alluxio's architectural benefits, deployment strategies across single and multiple data centers, functional and performance testing, operational tuning, and the resulting cost and efficiency gains.

Alluxio · Data Pipeline · Distributed Storage
0 likes · 15 min read
Sohu Tech Products
Mar 13, 2024 · Databases

DingoDB Multi-Modal Vector Database: Design Philosophy, Architecture and Applications

DingoDB is a multi‑modal vector database that unifies storage and analysis of structured, semi‑structured and unstructured data through a Raft‑based distributed architecture, offering MySQL‑compatible SQL, high‑performance APIs, automatic sharding, real‑time index optimization, and hybrid scalar‑vector queries for enterprise knowledge bases, LLM memory, and real‑time decision‑making.

Data Architecture · DingoDB · Distributed Storage
0 likes · 11 min read
DataFunSummit
Feb 6, 2024 · Big Data

Exploring ByteDance's EB‑Scale HDFS: Architecture, Multi‑Datacenter Challenges, Tiered Storage, and Data Protection Practices

This article presents an in‑depth overview of ByteDance's EB‑scale HDFS, covering its new features, multi‑datacenter architecture, tiered storage implementation, data management services, capacity and fault‑tolerance strategies, as well as practical data‑protection mechanisms and related Q&A.

Big Data · Distributed Storage · HDFS
0 likes · 22 min read
DataFunTalk
Oct 3, 2023 · Big Data

Design and Practices of Alibaba Cloud's Billion‑Scale Real‑Time Log Analysis System

This article presents the architecture, core challenges, key design decisions, and future directions of Alibaba Cloud's SLS platform, which handles billions of daily log queries with sub‑300 ms latency by leveraging LSM‑based storage, indexing, columnar layout, distributed caching, and multi‑tenant isolation.

Distributed Storage · Indexing · LSM
0 likes · 17 min read
php中文网 Courses
Sep 28, 2023 · Backend Development

Implementing Distributed Data Storage and Retrieval with PHP Microservices

This article explains the challenges of traditional single-node data storage, introduces microservice architecture, and provides step-by-step PHP Swoole code examples for creating storage and retrieval microservices and a client script, demonstrating how to achieve scalable, fault‑tolerant distributed data storage and retrieval.

Backend Development · Distributed Storage · Microservices
0 likes · 5 min read
DataFunTalk
Sep 22, 2023 · Big Data

Design and Practice of Baidu's Tape Library Storage Architecture Based on the Aries Cloud Storage System

This article presents a comprehensive overview of Baidu Intelligent Cloud's tape‑library solution, detailing tape and tape‑library fundamentals, the Aries cloud storage stack, data and access models, the end‑to‑end data flow, key architectural design choices, implementation details, and a real‑world case study demonstrating large‑scale cold‑data storage, backup, and retrieval performance.

Aries · Cold Data · Distributed Storage
0 likes · 28 min read