
Design and Implementation of Autohome Enterprise Cloud Disk

The article describes the background, goals, architecture, security, and future plans of Autohome's internally developed enterprise cloud disk, detailing its multi‑layer design, RBAC permission model, file processing, task scheduling, and integration with various corporate systems to improve collaborative work efficiency.

HomeTech

Background

Autohome has grown from a small startup team to over 5,000 employees nationwide. Distributed offices, independent business units, and high staff turnover generate a large daily flow of documents and data. Internal interviews identified the following pain points.

Pain point 1: Employee documents, audio, and video data lack centralized management, resulting in poor collaboration and data leakage risk.

Many traditional desktops and laptops store work data locally, making centralized control impossible and increasing security risks for sharing and transferring files.

Pain point 2: Email attachments are limited to 32 MB, forcing large files to be transferred via alternative media, compromising security and privacy.

Pain point 3: Existing departmental FTP services are limited and cannot support advanced collaborative features such as encrypted sharing, permission control, versioning, and multi‑user editing.

Pain point 4: Documents cannot be accessed outside the office network, hindering business personnel who need remote access.

Goals and Solution

To address these issues and improve employee efficiency, a private enterprise cloud‑disk was proposed. After comparing commercial and open‑source solutions (e.g., Linkcloud, Aisu, Baidu, 163, 115), the decision was made to develop an in‑house product, named "Home Cloud Disk".

Qualitative goals:

Enable employees to access and collaborate on documents anytime, anywhere.

Provide a standard interface for business systems to achieve unified data storage.

Quantitative goals:

Offer 200 GB storage per employee.

Support cross‑platform access and data synchronization.

Ensure zero data loss.

Integrate a cloud‑disk entry into the Autohome App.

Cloud‑Disk System Design Overview

The Home Cloud Disk is built on a private‑cloud storage architecture, aiming to provide on‑demand file access and collaborative editing. The logical architecture consists of four layers:

· Front‑end (web, desktop client, mobile app, SDK)

· Business API (controller layer, functional and third‑party APIs)

· Backend services (data processing, scheduling, storage, databases, storage‑scheduling algorithms)

· Underlying compute layer

3.1 System Logical Architecture Diagram

3.2 Permission Management Design

The system adopts an extended RBAC model rather than a simple ACL, because data‑centric products require higher security and more complex resource relationships.

Roles are divided into personal, organizational, and custom types, and a user may hold multiple roles simultaneously. Roles support multiple hierarchy levels for permission inheritance. User and department information is synchronized daily so that organizational changes are picked up.

Core permission features:

RBAC design with 11 granular permission types and custom roles.

SSO (CAS) integration for unified login.
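
The role hierarchy and multi-role lookup described above can be illustrated with a minimal sketch. The class names, the direction of inheritance (a child role inherits from its parent), and the permission names are all assumptions made for illustration; the article does not list the 11 actual permission types.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A role with its own permissions and an optional parent to inherit from."""
    name: str
    permissions: set = field(default_factory=set)
    parent: "Role | None" = None

    def effective_permissions(self) -> set:
        # Walk up the hierarchy and union every ancestor's permissions.
        perms = set(self.permissions)
        role = self.parent
        while role is not None:
            perms |= role.permissions
            role = role.parent
        return perms


@dataclass
class User:
    name: str
    roles: list = field(default_factory=list)

    def can(self, permission: str) -> bool:
        # A user may hold several roles; one role granting access is enough.
        return any(permission in r.effective_permissions() for r in self.roles)


# Hypothetical permission names, not the real set used by the product.
org_member = Role("org_member", {"preview", "download"})
dept_editor = Role("dept_editor", {"upload", "edit"}, parent=org_member)
personal = Role("personal", {"share"})

alice = User("alice", [dept_editor, personal])
bob = User("bob", [org_member])
```

Here `alice.can("download")` is true through inheritance from `org_member`, while `bob.can("edit")` is false because `org_member` sits above `dept_editor` in the hierarchy.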

3.3 File Processing Design

The core process handles file storage, deduplication, compression, and synchronization.

Deduplication computes a 20‑byte SHA‑1 hash for each 8 KB data block and stores blocks with identical hashes only once. Accidental hash collisions are extremely unlikely; in practice, most errors stem from hardware or transmission faults.
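
A minimal sketch of block-level deduplication, assuming fixed-size 8 KB blocks and an in-memory dictionary standing in for the real block store. The `BlockStore` class and its method names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # 8 KB blocks, as described in the article


class BlockStore:
    """In-memory stand-in for a content-addressed block store (hypothetical)."""

    def __init__(self):
        self.blocks = {}  # 20-byte SHA-1 digest -> block bytes

    def put_file(self, data: bytes) -> list:
        """Split data into blocks, store unseen blocks, return the digest list."""
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha1(block).digest()
            # Keep the block only if this digest has not been seen before.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        return digests

    def get_file(self, digests: list) -> bytes:
        """Reassemble a file from its ordered list of block digests."""
        return b"".join(self.blocks[d] for d in digests)


store = BlockStore()
file_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
file_b = b"A" * BLOCK_SIZE + b"C" * 100
recipe_a = store.put_file(file_a)
recipe_b = store.put_file(file_b)
```

The two files share their first block, so the store holds three blocks rather than four, and each file is represented only by its ordered list of digests.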

Three synchronization strategies were evaluated. The project chose the server‑centric approach, in which the server copy is authoritative and client files that do not match it are deleted, to maximize data protection.
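
One way to read the server-centric strategy is as a planning step that compares two file listings. The sketch below assumes each side is described by a mapping from relative path to content hash; that format and the function name are hypothetical, and a mismatched file is treated as "replace with the server copy".

```python
def plan_server_centric_sync(server: dict, client: dict) -> dict:
    """Return the actions that make the client mirror the server.

    Both arguments map a relative path to a content hash (assumed format).
    """
    # Anything missing or different on the client is fetched from the server.
    download = sorted(p for p, h in server.items() if client.get(p) != h)
    # Anything the server does not know about is removed from the client.
    delete = sorted(p for p in client if p not in server)
    return {"download": download, "delete": delete}


server_state = {"a.txt": "h1", "b.txt": "h2"}
client_state = {"a.txt": "h1", "b.txt": "old", "c.txt": "h3"}
plan = plan_server_centric_sync(server_state, client_state)
```

With these inputs the plan downloads `b.txt` again and deletes `c.txt`, leaving the client identical to the server.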

File module core functions:

CRUD operations for documents, folders, and libraries.

Multi‑device file sync and roaming.

File tagging and custom fields (e.g., confidentiality flag).

API encapsulation for OpenAPI exposure.

3.4 Task Scheduling Design

Task scheduling is realized via an RPC framework, allowing components written in different languages to communicate through a unified protocol.

Core scheduling functions:

RPC dispatch service for inter‑component communication.

Background sync process for AD user information.

PDF/Office conversion process for online preview and editing.
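
The idea of a language-neutral dispatch service can be sketched as follows. The article does not name the RPC framework or wire format, so JSON messages, the `rpc_method` registry, and the `convert_document` handler are assumptions used only to show the shape of the approach.

```python
import json

HANDLERS = {}


def rpc_method(name):
    """Register a function under a method name (hypothetical registry)."""
    def decorator(func):
        HANDLERS[name] = func
        return func
    return decorator


@rpc_method("convert_document")
def convert_document(path, target):
    # Placeholder: a real worker would call a conversion service here.
    return {"path": path, "target": target, "status": "queued"}


def dispatch(raw_request: str) -> str:
    """Decode a JSON request, call the named handler, and encode the reply."""
    request = json.loads(raw_request)
    handler = HANDLERS.get(request["method"])
    if handler is None:
        return json.dumps({"id": request["id"], "error": "unknown method"})
    result = handler(**request.get("params", {}))
    return json.dumps({"id": request["id"], "result": result})
```

Because requests and replies are plain text messages, a caller written in C or any other language only needs to produce and parse the same message shape.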

3.5 Basic Module Design

Message service (email, in‑app notifications).

Online PDF/Office preview.

Comprehensive logging (server, controller, notification, RPC, scheduled tasks).

Cache service to reduce database load and improve response time.
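
As an illustration of how a cache service reduces database load, here is a small cache-aside sketch with a time-to-live. The article does not say which cache technology is used; the `TTLCache` class and the `loader` callback are hypothetical.

```python
import time


class TTLCache:
    """Cache-aside helper: serve from memory until an entry expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: the database is not touched
        value = loader(key)  # miss or expired: fall through to the loader
        self._entries[key] = (now + self.ttl, value)
        return value
```

A caller would pass a function that performs the real database query as `loader`; repeated reads within the TTL then skip the database entirely.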

3.6 Data Security Design

End‑to‑end HTTPS with dynamic token and two‑factor authentication.

Multi‑dimensional black/white list to prevent brute‑force attacks.

Data sharding algorithm to disperse storage across servers.

Integrity verification after upload/download.

Application‑level encryption for libraries, share passwords, and expiration policies.
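
Two of the measures above, dispersing data across servers and verifying integrity after a transfer, can be sketched briefly. The article does not specify the sharding algorithm or the checksum, so hash-modulo placement, SHA-256, and the server names are assumptions.

```python
import hashlib

SERVERS = ["storage-01", "storage-02", "storage-03"]  # hypothetical node names


def pick_server(block_digest: bytes, servers=SERVERS) -> str:
    """Map a block digest to a server using the digest modulo the node count."""
    index = int.from_bytes(block_digest, "big") % len(servers)
    return servers[index]


def verify_transfer(data: bytes, expected_sha256_hex: str) -> bool:
    """Recompute the checksum after a transfer and compare with the expected one."""
    return hashlib.sha256(data).hexdigest() == expected_sha256_hex


payload = b"quarterly report"
checksum = hashlib.sha256(payload).hexdigest()
target = pick_server(hashlib.sha1(payload).digest())
```

Modulo placement is the simplest option but moves most blocks when a server is added or removed; consistent hashing is a common alternative when the node set changes often.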

3.7 Third‑Party System Integration

The cloud disk integrates several corporate services:

SSO (CAS) for single sign‑on.

OWA for online document preview and collaborative editing.

OTP for two‑factor authentication.

OpenAPI for unified authentication, authorization, and dispatch.

Challenges Encountered

Data Security

Supporting web browsers, desktop clients, and mobile apps means supporting multiple network protocols, which exposes the system to threats such as CSRF, XSS, SQL injection, brute‑force attacks, and RPC vulnerabilities.

Multi‑Technology Stack Collaboration

The core is written in C, the web service uses Python/Django, and the UI is pure HTML. Coordinating across languages caused schedule delays, quality issues, and communication gaps among team members.

Corresponding Solutions

Data Security Measures

CSRF: enforce CSRF token on all requests.

XSS: escape HTML output.

SQL injection: use ORM with parameterized queries.

HTTPS everywhere; token verification for mobile login.

Brute‑force protection: captcha, account lockout, dynamic token.

Refactor vulnerable RPC interfaces.
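
Two of these measures are easy to show in isolation. The sketch below uses Python's standard `sqlite3` and `html` modules as stand-ins: the article's service relies on an ORM for parameterization, so the raw SQL, table name, and helper functions here are illustrative assumptions rather than the project's code.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO files (name) VALUES (?)", ("report.docx",))


def find_files(conn, name):
    # The ? placeholder keeps user input out of the SQL text itself.
    cur = conn.execute("SELECT name FROM files WHERE name = ?", (name,))
    return [row[0] for row in cur.fetchall()]


def render_file_name(name):
    # Escape &, <, >, and quotes before placing the value into HTML.
    return "<span>" + html.escape(name) + "</span>"
```

An injection attempt such as `x' OR '1'='1` is treated as a literal file name and matches nothing, and a name containing `<script>` is rendered as inert text.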

Team Collaboration Improvements

Standardize development processes and documentation.

Define clear daily and weekly tasks.

Implement code reviews.

Key Features of the Cloud Disk

Unified Account Login

The system integrates SSO (CAS) across all platforms, unifying Autohome and OA accounts for a consistent login experience. Access from the public internet additionally requires dynamic token verification as a second factor to prevent brute‑force attacks.

Online File Management

After login, users can upload, download, share, create new documents, manage version history, and use a recycle bin.

Document Time Machine

Automatic versioning allows users to roll back to any previous state.
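
A minimal sketch of the versioning idea, assuming every save appends a full copy and that a rollback is itself recorded as a new version so history is never rewritten. The `VersionedDocument` class is hypothetical and ignores storage concerns such as deduplication.

```python
class VersionedDocument:
    """Keeps every saved state of a document, oldest first."""

    def __init__(self, content: bytes):
        self._versions = [content]

    @property
    def current(self) -> bytes:
        return self._versions[-1]

    def save(self, content: bytes) -> int:
        """Append a new version and return its 1-based number."""
        self._versions.append(content)
        return len(self._versions)

    def rollback(self, version: int) -> int:
        """Restore an old version by saving its content as the newest version."""
        if not 1 <= version <= len(self._versions):
            raise ValueError("unknown version")
        return self.save(self._versions[version - 1])
```

Recording the rollback as a new version means a mistaken rollback can also be undone.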

Cloud Collaboration

Online preview and editing, shared storage with one‑click sharing, and activity notifications.

Access Permission Management

Sharing supports granular permissions: read‑only, read‑write, and full management.
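
The three sharing levels can be modeled as an ordered enumeration, where a higher level implies everything a lower level allows. The mapping from actions to required levels below is an assumption for illustration; the article names the levels but not the actions each one covers.

```python
from enum import IntEnum


class ShareLevel(IntEnum):
    READ_ONLY = 1
    READ_WRITE = 2
    FULL_MANAGEMENT = 3


# Hypothetical action-to-level mapping, not taken from the product.
REQUIRED_LEVEL = {
    "preview": ShareLevel.READ_ONLY,
    "download": ShareLevel.READ_ONLY,
    "edit": ShareLevel.READ_WRITE,
    "upload": ShareLevel.READ_WRITE,
    "reshare": ShareLevel.FULL_MANAGEMENT,
    "delete": ShareLevel.FULL_MANAGEMENT,
}


def is_allowed(level: ShareLevel, action: str) -> bool:
    """A share grants an action when its level meets the action's minimum."""
    return level >= REQUIRED_LEVEL[action]
```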

Audit Logging

The system records who performed which operations on which files and when.
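
An audit record of this kind needs four fields: who, what, which file, and when. The sketch below assumes JSON lines as the storage format and uses hypothetical field names.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    user: str        # who
    operation: str   # what
    file_path: str   # which file
    timestamp: str   # when, as an ISO 8601 string


def record(log: list, user: str, operation: str, file_path: str, now=None):
    """Append one audit event to the log as a JSON line and return it."""
    now = now or datetime.now(timezone.utc)
    event = AuditEvent(user, operation, file_path, now.isoformat())
    log.append(json.dumps(asdict(event), sort_keys=True))
    return event
```

Storing timestamps in UTC avoids ambiguity when offices in different regions write to the same log.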

Future Plans

Future enhancements include support for more file types (e.g., Photoshop, Illustrator, Xmind), preview watermarks, monitoring dashboards, and advanced data leakage prevention, with continuous iteration to improve user experience and collaborative efficiency.
