
Elasticsearch vs ClickHouse: Performance Comparison, Cost Analysis, and Deployment Guide

This article compares Elasticsearch and ClickHouse on write throughput, query speed, and server cost, and provides step-by-step deployment instructions for Zookeeper, Kafka, FileBeat, and ClickHouse, including troubleshooting tips and configuration examples.


The author introduces the need for a private‑cloud data analysis solution to reduce server overhead for SaaS services, and chooses ClickHouse as a cost‑effective alternative to Elasticsearch.

Elasticsearch vs ClickHouse

ClickHouse offers significantly higher write throughput (50-200 MB/s per server, over 600,000 records/s, more than 5× Elasticsearch) with far fewer write rejections, and faster queries (2-30 GB/s scan throughput when data is in the page cache, 5-30× faster than ES). Its stronger compression (on-disk data is roughly 1/3 to 1/30 the size of the equivalent data in ES) reduces disk usage and I/O, which in turn lowers CPU, memory, and overall server costs.

Cost analysis based on Alibaba Cloud pricing shows ClickHouse can halve server expenses compared to Elasticsearch.

Environment Deployment

1. Zookeeper Cluster

Installation and configuration steps include installing Java, setting up directories, extracting Zookeeper binaries, defining zoo.cfg with tickTime, initLimit, syncLimit, dataDir, clientPort, and server definitions, creating myid on each node, and starting the service.

yum install java-1.8.0-openjdk-devel.x86_64
# configure the Java environment variables in /etc/profile
...
sh zkServer.sh start
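The zoo.cfg described above might look like the following sketch; the hostnames and directory paths are illustrative placeholders, not taken from the original article:

```ini
# Basic timing: one tick = 2 s; followers get 10 ticks to connect, 5 to sync
tickTime=2000
initLimit=10
syncLimit=5
# Snapshot directory; each node's myid file lives here
dataDir=/usr/zookeeper/data
clientPort=2181
# One line per ensemble member: server.<myid>=<host>:<peer-port>:<election-port>
server.1=zk-node1:2888:3888
server.2=zk-node2:2888:3888
server.3=zk-node3:2888:3888
```

Each node's myid file must contain only its own server number, e.g. `echo 1 > /usr/zookeeper/data/myid` on the first node.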

2. Kafka Cluster

mkdir -p /usr/kafka
chmod 777 -R /usr/kafka
wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/3.2.0/kafka_2.12-3.2.0.tgz
...
nohup /usr/kafka/kafka_2.12-3.2.0/bin/kafka-server-start.sh ... &
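The elided portion above covers the broker configuration; a minimal server.properties for a three-node cluster pointed at the Zookeeper ensemble from step 1 could look roughly like this (broker IDs, hostnames, and replication settings are illustrative assumptions):

```properties
# Unique per broker: 0, 1, 2 on the three nodes
broker.id=0
listeners=PLAINTEXT://kafka-node1:9092
log.dirs=/usr/kafka/kafka-logs
num.partitions=3
default.replication.factor=2
# Every broker points at the same Zookeeper ensemble
zookeeper.connect=zk-node1:2181,zk-node2:2181,zk-node3:2181
```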

3. FileBeat

sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
cat > /etc/yum.repos.d/elastic.repo <<EOF
...
EOF

The highlights of the FileBeat configuration are keys_under_root: true, which flattens parsed JSON fields to the top level of the event, and the Kafka output settings.
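A filebeat.yml implementing those two points might look like the following sketch; the log paths, topic name, and broker addresses are placeholders (note that for a log input the flattening option is namespaced as json.keys_under_root):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log
    json.keys_under_root: true   # lift parsed JSON fields to the event's top level
    json.overwrite_keys: true    # let JSON fields override Beats metadata on conflict

output.kafka:
  hosts: ["kafka-node1:9092", "kafka-node2:9092", "kafka-node3:9092"]
  topic: "bi_inner_log"
  required_acks: 1
  compression: gzip
```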

4. ClickHouse

Before installation, verify that the CPU supports SSE 4.2, create the data directories, set the CPU frequency governor, disable memory overcommit and transparent huge pages, add the official repository, and install clickhouse-server and clickhouse-client. Configuration changes include setting the log level to information and reviewing the log paths.

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
mkdir -p /data/clickhouse
...
yum -y install clickhouse-server clickhouse-client
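The kernel and CPU tweaks mentioned above can be sketched as a small root-run script. The article says to "disable overcommit"; ClickHouse's own guidance is vm.overcommit_memory=0 (heuristic overcommit), which is what this sketch applies. Every write is guarded so the script degrades gracefully on systems or containers where a knob is absent:

```shell
#!/bin/sh
# Pin the CPU frequency governor to "performance" where cpufreq is exposed
for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    [ -w "$gov" ] && echo performance > "$gov"
done

# Heuristic memory overcommit (value 0), per ClickHouse's recommendations
[ -w /proc/sys/vm/overcommit_memory ] && echo 0 > /proc/sys/vm/overcommit_memory

# Transparent huge pages can cause latency spikes under ClickHouse; turn them off
[ -w /sys/kernel/mm/transparent_hugepage/enabled ] && \
    echo never > /sys/kernel/mm/transparent_hugepage/enabled

echo "tuning applied where permitted"
```

These settings do not survive a reboot; to make them permanent, add them to an init script or systemd unit.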

Troubleshooting and Solutions

Kafka Engine Table – Direct SELECT on a Kafka engine table is disabled by default; enable it by launching the client with --stream_like_engine_allow_direct_select 1.

clickhouse-client --stream_like_engine_allow_direct_select 1 --password xxxxx
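For reference, the Kafka engine table that feeds the pipeline is declared along these lines; the broker list, topic, consumer group, and columns are illustrative placeholders, as the article elides the real schema:

```sql
CREATE TABLE default.kafka_clickhouse_inner_log
(
    log_time DateTime,  -- illustrative columns; the real schema is not shown
    message  String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka-node1:9092,kafka-node2:9092,kafka-node3:9092',
         kafka_topic_list  = 'bi_inner_log',
         kafka_group_name  = 'clickhouse_consumer',
         kafka_format      = 'JSONEachRow';
```

With the flag above enabled, a direct SELECT against this table reads from the topic, which is handy for debugging; be aware that rows read this way are still consumed on behalf of the consumer group.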

Local Table Macro – Missing shard macro; define distinct shard values per node in the <macros> section.

<macros>
  <shard>01</shard>
  <replica>example01-01-1</replica>
</macros>
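These macros are substituted into the replicated table's Zookeeper path and replica name. A local table using them might be declared as follows (the columns and path layout are illustrative, since the article does not show the actual schema):

```sql
CREATE TABLE default.bi_inner_log_local ON CLUSTER clickhouse_cluster
(
    log_time DateTime,  -- illustrative columns; the real schema is not shown
    message  String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/bi_inner_log_local', '{replica}')
ORDER BY log_time;
```

With <shard>01</shard> on this node, the path resolves to /clickhouse/tables/01/bi_inner_log_local, and each node registers itself under its own {replica} name.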

Replica Already Exists – Remove stale Zookeeper nodes before recreating the replicated table.

Distributed Table Authentication – Ensure correct user/password in <remote_servers> configuration.

<remote_servers>
  <clickhouse_cluster>
    <shard>
      <internal_replication>true</internal_replication>
      <replica>
        <host>ip1</host>
        <port>9000</port>
        <user>default</user>
        <password>xxxx</password>
      </replica>
    </shard>
    ...
  </clickhouse_cluster>
</remote_servers>

After resolving these issues, the author creates a distributed table and a materialized view to sync Kafka data into ClickHouse.

CREATE TABLE default.bi_inner_log_all ON CLUSTER clickhouse_cluster AS default.bi_inner_log_local ENGINE = Distributed(...);
CREATE MATERIALIZED VIEW default.view_bi_inner_log ON CLUSTER clickhouse_cluster TO default.bi_inner_log_all AS SELECT ... FROM default.kafka_clickhouse_inner_log;

Conclusion: By following official documentation and troubleshooting steps, the full data pipeline—from log collection to ClickHouse storage—was successfully built, offering high performance and lower cost.

Big Data · Database · Deployment · Elasticsearch · Zookeeper · Kafka · ClickHouse
Written by

Code Ape Tech Column

Former Ant Group P8 engineer and pure technologist, sharing full-stack Java content, interview preparation, and career advice through this column. Site: java-family.cn
