TL;DR — Quick Summary
Complete ClickHouse guide: columnar storage, MergeTree engines, replication, materialized views, data ingestion, and a practical log analytics example.
ClickHouse is an open-source column-oriented DBMS built for real-time OLAP queries at scale. Developed at Yandex and open-sourced in 2016, it processes billions of rows per second on commodity hardware by combining columnar storage, vectorized query execution, and aggressive data compression. This guide covers everything from architecture internals and installation to replication, materialized views, and a practical log analytics walkthrough.
Prerequisites
- Linux host or Docker with at least 4 GB RAM (16 GB+ recommended for production).
- Basic SQL knowledge.
- For replication: ZooKeeper 3.6+ or ClickHouse Keeper (bundled since v22.4).
ClickHouse Architecture
Columnar Storage
Traditional row-oriented databases store all columns of a row together on disk. ClickHouse stores each column in a separate file. When a query touches only 3 of 100 columns, ClickHouse reads roughly 3% of the data. This column-affinity also enables far better compression ratios because repeated similar values pack tightly.
Vectorized Query Execution
ClickHouse processes data in blocks of 8,192–65,536 rows at a time rather than row by row. SIMD CPU instructions operate on entire vectors of values simultaneously. Combined with query compilation via LLVM, aggregations that would take minutes on a row store finish in seconds.
MergeTree Engine Family
The MergeTree family is ClickHouse’s workhorse. Data arrives as parts (sorted, immutable on-disk chunks) and is periodically merged in the background:
| Engine | Best For |
|---|---|
| MergeTree | Append-only analytics, base engine |
| ReplacingMergeTree | Last-write-wins deduplication |
| SummingMergeTree | Pre-aggregated counters/sums |
| AggregatingMergeTree | Custom aggregate states (used with materialized views) |
| CollapsingMergeTree | Mutable rows via sign column |
| ReplicatedMergeTree | HA replication via Keeper/ZooKeeper |
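The dedup variant from the table above can be sketched as follows — a minimal, hypothetical `users` table where rows sharing the `ORDER BY` key are collapsed during background merges, keeping the row with the highest version column:

```sql
-- Illustrative table; names are assumptions, not from the original text.
CREATE TABLE users
(
    user_id    UInt64,
    email      String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)  -- version column: highest wins
ORDER BY user_id;

-- Merges are asynchronous, so reads may still see duplicates;
-- FINAL forces deduplication at query time (at a read cost).
SELECT * FROM users FINAL WHERE user_id = 42;
```

Because deduplication only happens at merge time, treat ReplacingMergeTree as eventually-deduplicated rather than unique-constrained.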
Data Compression
ClickHouse defaults to LZ4 (fast decompression) and supports ZSTD (higher ratio). Per-column codecs stack: CODEC(Delta, LZ4) can shrink timestamp columns by 90% or more. Typical compression ratios are 5–15× over raw data.
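A sketch of stacked per-column codecs (table and column names are illustrative):

```sql
CREATE TABLE metrics
(
    -- Delta stores differences between consecutive values, which
    -- compresses near-monotonic timestamps extremely well.
    ts    DateTime CODEC(Delta, LZ4),
    -- DoubleDelta suits slowly-changing counters; ZSTD trades CPU for ratio.
    hits  UInt64   CODEC(DoubleDelta, ZSTD(3)),
    -- Gorilla is designed for float time-series values.
    value Float64  CODEC(Gorilla, LZ4)
)
ENGINE = MergeTree()
ORDER BY ts;
```

Codecs apply left to right at write time, so the specialized encoding runs before the general-purpose compressor.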
Approximate Query Processing
For interactive dashboards, ClickHouse supports sampling on MergeTree tables declared with a SAMPLE BY clause. Adding SAMPLE 0.01 to a query then processes roughly 1% of the data while returning statistically representative results in milliseconds.
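A minimal sketch, assuming a hypothetical `hits` table — the sampling expression must be part of the table's primary key:

```sql
CREATE TABLE hits
(
    event_time DateTime,
    user_id    UInt64,
    url        String
)
ENGINE = MergeTree()
ORDER BY (event_time, intHash32(user_id))  -- sampling key must appear here
SAMPLE BY intHash32(user_id);

-- Reads ~1% of rows; scale counts back up to estimate totals.
SELECT count() * 100 AS est_total FROM hits SAMPLE 0.01;
```

Hashing the user ID gives a uniformly distributed sampling key, so any 1% slice is representative of the whole dataset.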
Installation
apt (Ubuntu/Debian)
apt-get install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL 'https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key' \
| gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] \
https://packages.clickhouse.com/deb stable main" \
> /etc/apt/sources.list.d/clickhouse.list
apt-get update
apt-get install -y clickhouse-server clickhouse-client
systemctl enable --now clickhouse-server
yum/dnf (RHEL/Rocky/AlmaLinux)
yum install -y yum-utils
yum-config-manager --add-repo \
https://packages.clickhouse.com/rpm/clickhouse.repo
yum install -y clickhouse-server clickhouse-client
systemctl enable --now clickhouse-server
Docker
docker run -d \
--name clickhouse \
--ulimit nofile=262144:262144 \
-p 8123:8123 -p 9000:9000 \
-v clickhouse-data:/var/lib/clickhouse \
clickhouse/clickhouse-server:latest
Port 8123 = HTTP interface. Port 9000 = native TCP (used by clickhouse-client).
Initial Configuration
config.xml — Network and Storage
Edit /etc/clickhouse-server/config.xml:
<!-- Accept connections from all interfaces -->
<listen_host>0.0.0.0</listen_host>
<!-- Data and logs storage paths -->
<path>/var/lib/clickhouse/</path>
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
<user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
users.xml — Profiles and Quotas
Edit /etc/clickhouse-server/users.xml:
<profiles>
<default>
<max_memory_usage>10000000000</max_memory_usage> <!-- 10 GB -->
<max_threads>8</max_threads>
<use_uncompressed_cache>0</use_uncompressed_cache>
</default>
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
<quotas>
<default>
<interval>
<duration>3600</duration>
<queries>100</queries>
<errors>10</errors>
</interval>
</default>
</quotas>
Set a strong password for the default user or create a dedicated user per application.
Table Design and Schema
Data Types
| Concern | Recommended Type |
|---|---|
| String columns with < 10k distinct values | LowCardinality(String) |
| Nullable fields | Avoid; use sentinel values ('', 0) |
| High-precision timestamps | DateTime64(3) (milliseconds) |
| IPv4/IPv6 | IPv4, IPv6 native types |
| JSON blobs | String + JSON functions or Tuple |
LowCardinality wraps a dictionary encoding around any type, cutting memory and improving compression for repeated values like HTTP status codes, country codes, or log levels.
Partitioning and Ordering
CREATE TABLE events
(
event_date Date,
event_time DateTime64(3),
user_id UInt64,
action LowCardinality(String),
revenue Decimal(18, 4),
properties String
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (action, user_id, event_time)
SETTINGS index_granularity = 8192;
- `PARTITION BY` controls physical data separation. Monthly partitions (`toYYYYMM`) allow instant partition drops for data retention.
- `ORDER BY` determines the sparse primary index and the physical sort order. Order columns from low to high cardinality (as in `(action, user_id, event_time)` above) so the index prunes effectively and similar values sort together for compression.
- Primary key vs sorting key: the primary key defaults to the ORDER BY key but can be a prefix of it. Use `PRIMARY KEY (action)` to keep a shorter in-memory index while still sorting by `(action, user_id, event_time)`.
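A sketch of the prefix form on a table like the one above (the `events_pk` name is illustrative):

```sql
-- PRIMARY KEY must be a prefix of ORDER BY: the sparse index holds
-- only `action` marks, while data stays sorted by all three columns.
CREATE TABLE events_pk
(
    event_date Date,
    event_time DateTime64(3),
    user_id    UInt64,
    action     LowCardinality(String)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
PRIMARY KEY (action)
ORDER BY (action, user_id, event_time);
```

The shorter index cuts memory use; the trade-off is coarser pruning for queries that filter on `user_id` or `event_time` without `action`.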
Data Ingestion
clickhouse-client
# Interactive
clickhouse-client --host=localhost --user=default --password=secret
# Batch insert from CSV
clickhouse-client --query="INSERT INTO events FORMAT CSVWithNames" < events.csv
HTTP Interface
curl -X POST 'http://localhost:8123/?query=INSERT+INTO+events+FORMAT+JSONEachRow' \
--data-binary '{"event_date":"2026-03-23","user_id":1,"action":"click",...}'
Kafka Engine
The Kafka engine reads directly from Kafka topics:
CREATE TABLE events_kafka
(
event_time DateTime64(3),
user_id UInt64,
action LowCardinality(String)
)
ENGINE = Kafka
SETTINGS
kafka_broker_list = 'broker:9092',
kafka_topic_list = 'events',
kafka_group_name = 'clickhouse-consumer',
kafka_format = 'JSONEachRow';
-- Materialized view to push from Kafka engine to MergeTree
CREATE MATERIALIZED VIEW events_mv TO events AS
SELECT * FROM events_kafka;
S3 Engine
SELECT count(), sum(revenue)
FROM s3(
'https://s3.us-east-1.amazonaws.com/bucket/events/2026-03-*.parquet',
'ACCESS_KEY', 'SECRET_KEY',
'Parquet'
);
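The same table function works for batch loading, not just ad-hoc queries. A hedged sketch — bucket path and credentials are placeholders:

```sql
-- One-shot backfill from Parquet files in S3 into the local MergeTree table.
INSERT INTO events
SELECT *
FROM s3(
    'https://s3.us-east-1.amazonaws.com/bucket/events/2026-03-*.parquet',
    'ACCESS_KEY', 'SECRET_KEY',
    'Parquet'
);
```

Column names in the Parquet files must line up with the target table's schema (or be selected explicitly).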
clickhouse-local
Process files without a running server:
clickhouse-local --query="
SELECT action, count() as cnt
FROM file('events.parquet', Parquet)
GROUP BY action ORDER BY cnt DESC LIMIT 10
"
SQL Dialect and Extensions
ClickHouse speaks a largely ANSI-compatible SQL dialect with powerful extensions:
-- arrayJoin: explode array column into rows
SELECT user_id, arrayJoin(tags) AS tag
FROM posts;
-- SAMPLE: approximate query on 1% of data
SELECT count() FROM events SAMPLE 0.01;
-- FINAL: force deduplication read for ReplacingMergeTree
SELECT * FROM users FINAL WHERE user_id = 42;
-- WITH ROLLUP: multi-level subtotals
SELECT action, toDate(event_time) AS day, sum(revenue)
FROM events
GROUP BY action, day WITH ROLLUP;
-- Window functions
SELECT user_id,
revenue,
sum(revenue) OVER (PARTITION BY user_id ORDER BY event_time) AS running_total
FROM events;
-- Lambda functions
SELECT arrayFilter(x -> x > 100, [50, 150, 200]) AS big_values;
Dictionaries for Lookups
The HASHED layout requires a numeric key; longest-prefix IP lookups use the IP_TRIE layout instead, and the HTTP source needs a format:
CREATE DICTIONARY geo_country (
    ip_prefix String,
    country String
)
PRIMARY KEY ip_prefix
SOURCE(HTTP(URL 'https://...' FORMAT 'CSV'))
LAYOUT(IP_TRIE())
LIFETIME(3600);
-- client_ip here is assumed to be a String column; ip_trie matches the longest prefix
SELECT dictGet('geo_country', 'country', tuple(IPv4StringToNum(client_ip))) AS country, count()
FROM access_log GROUP BY country;
Replication and Sharding
ReplicatedMergeTree
Replace MergeTree() with ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}'). ClickHouse Keeper (or ZooKeeper) coordinates replicas.
CREATE TABLE events ON CLUSTER my_cluster
(
event_date Date,
event_time DateTime64(3),
user_id UInt64,
action LowCardinality(String)
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (action, user_id, event_time);
Distributed Table Engine
A Distributed table is a virtual layer that routes queries across shards:
CREATE TABLE events_dist ON CLUSTER my_cluster
AS events
ENGINE = Distributed(my_cluster, default, events, rand());
Writes to events_dist are spread across shards by the sharding key (rand() here); reads fan out to all shards and the partial results are merged automatically.
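The my_cluster name referenced by ON CLUSTER and the Distributed engine must be defined under remote_servers in config.xml. A minimal two-shard, two-replica sketch — hostnames are placeholders:

```xml
<remote_servers>
    <my_cluster>
        <shard>
            <replica><host>ch-1a</host><port>9000</port></replica>
            <replica><host>ch-1b</host><port>9000</port></replica>
        </shard>
        <shard>
            <replica><host>ch-2a</host><port>9000</port></replica>
            <replica><host>ch-2b</host><port>9000</port></replica>
        </shard>
    </my_cluster>
</remote_servers>
```

The {shard} and {replica} macros used in the ReplicatedMergeTree path are set per node in the macros section of the same file.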
Materialized Views
Materialized views in ClickHouse fire on every INSERT into the source table — effectively creating real-time aggregation pipelines:
-- Target: AggregatingMergeTree stores intermediate states
CREATE TABLE events_hourly_agg
(
action LowCardinality(String),
hour DateTime,
cnt AggregateFunction(count),
total_rev AggregateFunction(sum, Decimal(18,4))
)
ENGINE = AggregatingMergeTree()
ORDER BY (action, hour);
-- View triggers on INSERT to events
CREATE MATERIALIZED VIEW events_hourly_mv
TO events_hourly_agg AS
SELECT
action,
toStartOfHour(event_time) AS hour,
countState() AS cnt,
sumState(revenue) AS total_rev
FROM events
GROUP BY action, hour;
-- Query merges states on read
SELECT action, hour,
countMerge(cnt) AS requests,
sumMerge(total_rev) AS revenue
FROM events_hourly_agg
GROUP BY action, hour
ORDER BY hour DESC;
Refreshable Materialized Views (v23.12+) allow periodic full recalculations — useful for complex transformations that don’t compose incrementally:
CREATE MATERIALIZED VIEW monthly_cohorts
REFRESH EVERY 1 HOUR
ENGINE = MergeTree() ORDER BY tuple()
AS SELECT ...
FROM events
WHERE event_date >= today() - 90;
Monitoring
System Tables
-- Slow queries in the last hour
SELECT query, query_duration_ms, read_rows, memory_usage
FROM system.query_log
WHERE event_time > now() - INTERVAL 1 HOUR
AND query_duration_ms > 5000
ORDER BY query_duration_ms DESC LIMIT 20;
-- Current server metrics
SELECT metric, value FROM system.metrics
WHERE metric IN ('Query', 'Connection', 'MemoryTracking');
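Part counts, on-disk size, and achieved compression per table come from system.parts:

```sql
-- Disk usage, compression ratio, and active part count per table.
SELECT
    database,
    table,
    count() AS parts,
    formatReadableSize(sum(bytes_on_disk)) AS on_disk,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 1) AS ratio
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC;
```

A steadily growing part count for one table is an early warning sign of too-small inserts or over-fine partitioning.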
Prometheus Endpoint
Enable in config.xml:
<prometheus>
<endpoint>/metrics</endpoint>
<port>9363</port>
<metrics>true</metrics>
<events>true</events>
<asynchronous_metrics>true</asynchronous_metrics>
</prometheus>
Scrape http://clickhouse-host:9363/metrics with Prometheus. The official Grafana ClickHouse plugin (grafana-clickhouse-datasource) connects directly via the native protocol for live dashboard queries.
ClickHouse vs Alternatives
| | ClickHouse | TimescaleDB | Apache Druid | DuckDB | BigQuery | Elasticsearch |
|---|---|---|---|---|---|---|
| Storage | Columnar | Row + compression | Columnar segments | Columnar | Columnar | Inverted index |
| Query speed | Very fast | Fast | Fast | Very fast | Fast | Medium |
| Ingestion rate | Very high | High | High | Low | High | High |
| Streaming | Kafka engine | Continuous aggs | Native | No | Pub/Sub | Logstash |
| Self-hosted | Yes | Yes | Yes | Yes | No | Yes |
| Best for | OLAP/logs/events | Time-series + SQL | Event analytics | Local files/BI | Serverless DW | Full-text search |
| Scale model | Sharding + replication | Hypertable | Cluster | Single node | Serverless | Cluster |
Practical Example: Log Analytics
Real-world scenario: you receive 500 million nginx access log lines per day and need sub-second dashboards.
-- Schema
CREATE TABLE nginx_logs
(
log_date Date,
log_time DateTime64(3),
remote_addr IPv4,
method LowCardinality(String),
uri String,
status UInt16,
bytes_sent UInt64,
request_time Float32,
referer String,
user_agent String
)
ENGINE = MergeTree()
PARTITION BY log_date
ORDER BY (status, log_time)
TTL log_date + INTERVAL 90 DAY DELETE
SETTINGS index_granularity = 8192;
-- Real-time P95 latency per endpoint (materialized view)
CREATE TABLE nginx_latency_agg
(
uri String,
hour DateTime,
p95_state AggregateFunction(quantile(0.95), Float32)
)
ENGINE = AggregatingMergeTree()
ORDER BY (uri, hour);
CREATE MATERIALIZED VIEW nginx_latency_mv TO nginx_latency_agg AS
SELECT
uri,
toStartOfHour(log_time) AS hour,
quantileState(0.95)(request_time) AS p95_state
FROM nginx_logs
GROUP BY uri, hour;
-- Query: top 10 slowest endpoints today
SELECT uri,
       quantileMerge(0.95)(p95_state) AS p95_sec  -- nginx request_time is in seconds
FROM nginx_latency_agg
WHERE hour >= toStartOfDay(now())
GROUP BY uri
ORDER BY p95_sec DESC LIMIT 10;
Ingest by piping tail -F /var/log/nginx/access.log into clickhouse-client, or use the Kafka engine for distributed collection.
Gotchas and Edge Cases
- FINAL is expensive — `SELECT ... FINAL` on large ReplacingMergeTree tables forces a full merge pass at read time. Prefer `optimize_on_insert = 1` or schedule `OPTIMIZE TABLE ... FINAL` off-peak.
- Small inserts kill performance — never insert row-by-row. Batch at least 10,000 rows per INSERT or use the async insert mode (`async_insert = 1`).
- JOIN order matters — always put the smaller table on the right side of a JOIN. ClickHouse loads the right-side table into memory as a hash table.
- Nullable overhead — every `Nullable(T)` column adds a separate null-map bitfield. Avoid unless nulls are semantically required.
- Partition granularity — partitioning by day on a high-volume table creates too many small parts. Prefer monthly partitions and rely on ORDER BY for pruning.
Summary
- Columnar storage plus vectorized execution delivers 100–1000× faster aggregations than OLTP databases for analytical workloads.
- The MergeTree family covers dedup (Replacing), pre-aggregation (Summing, Aggregating), and mutability (Collapsing).
- Schema design is the most impactful tuning knob: `ORDER BY` for query patterns, `PARTITION BY` for retention, `LowCardinality` for enums.
- Materialized views plus `AggregatingMergeTree` create real-time aggregation pipelines that update on every insert.
- Kafka and S3 engines eliminate external ETL for streaming and batch ingestion.
- Replication via ReplicatedMergeTree plus ClickHouse Keeper provides HA without additional infrastructure.