Elasticsearch Setup for Log Analysis: Installation and Configuration
Elasticsearch is a distributed search and analytics engine built on Apache Lucene, designed for horizontal scalability and near-real-time search. When combined with Logstash (or Filebeat) for collection and Kibana for visualization, it forms the ELK Stack — one of the most widely deployed solutions for centralized log analysis, infrastructure monitoring, and security event management.
This guide walks through setting up a production-ready Elasticsearch environment for log analysis on Ubuntu Linux, covering everything from initial installation through cluster security and lifecycle management.
Prerequisites
Before you begin, ensure you have:
- Ubuntu Server 22.04 LTS or 24.04 LTS with at least 8 GB RAM (16 GB recommended).
- Root or sudo access.
- Sufficient disk space: Estimate your daily log volume and multiply by your retention period. SSD storage is strongly recommended for Elasticsearch data.
- Open ports: 9200 (Elasticsearch HTTP), 9300 (Elasticsearch transport), 5601 (Kibana), 5044 (Logstash Beats input).
Check available memory and disk:
free -h
df -h /
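As a back-of-the-envelope sizing check, multiply daily ingest by retention and replication; the numbers below are assumptions to replace with your own measurements:

```shell
# Rough storage estimate: daily volume x retention x (1 + replicas),
# plus ~15% headroom for segment merges and overhead.
daily_gb=20        # assumed daily log volume in GB
retention_days=30  # assumed retention period
replicas=1         # one replica per primary shard
raw=$(( daily_gb * retention_days * (1 + replicas) ))
total=$(( raw + raw * 15 / 100 ))
echo "Plan for roughly ${total} GB of cluster storage"
```

With these example numbers the estimate comes out to about 1.4 TB, which is why SSD capacity planning belongs at the start of the project rather than after the first disk-full alert.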
Installing Elasticsearch
Add the Elastic Repository
Import the Elastic GPG key and add the repository:
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
Install and Start Elasticsearch
sudo apt update
sudo apt install elasticsearch -y
During installation, Elasticsearch 8.x generates a superuser password and enrollment tokens. Save the output — you’ll need the password for initial setup:
The generated password for the elastic built-in superuser is: <password>
Configure Elasticsearch to start on boot:
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
Before starting, configure the cluster settings.
Cluster Configuration
Edit the main Elasticsearch configuration file:
sudo nano /etc/elasticsearch/elasticsearch.yml
Single-Node Development Setup
For a single-node deployment (development or small trusted environments; TLS over HTTP is disabled below for simplicity, which is acceptable only on networks you control):
cluster.name: log-analysis
node.name: es-node-01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false
Multi-Node Production Cluster
For a three-node cluster:
Node 1 (es-node-01):
cluster.name: log-analysis
node.name: es-node-01
node.roles: [master, data, ingest]
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.1.50
http.port: 9200
discovery.seed_hosts: ["192.168.1.50", "192.168.1.51", "192.168.1.52"]
cluster.initial_master_nodes: ["es-node-01", "es-node-02", "es-node-03"]
Nodes 2 and 3 use the same configuration with their respective node.name and network.host values.
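Rather than hand-editing three nearly identical files, the per-node configs can be generated from a template. A minimal sketch using the example names and IPs above; it writes to a scratch directory instead of /etc/elasticsearch so you can inspect the output before copying it into place:

```shell
# Generate an elasticsearch.yml per node; only node.name and network.host vary.
outdir=$(mktemp -d)
i=1
for ip in 192.168.1.50 192.168.1.51 192.168.1.52; do
  name=$(printf 'es-node-%02d' "$i")
  cat > "$outdir/$name.yml" << EOF
cluster.name: log-analysis
node.name: $name
node.roles: [master, data, ingest]
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: $ip
http.port: 9200
discovery.seed_hosts: ["192.168.1.50", "192.168.1.51", "192.168.1.52"]
cluster.initial_master_nodes: ["es-node-01", "es-node-02", "es-node-03"]
EOF
  i=$((i + 1))
done
ls "$outdir"
```

Copy each generated file to /etc/elasticsearch/elasticsearch.yml on its node.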
Start Elasticsearch
sudo systemctl start elasticsearch
Verify it’s running:
curl -u elastic:YourPassword -X GET "localhost:9200"
Expected response:
{
"name": "es-node-01",
"cluster_name": "log-analysis",
"cluster_uuid": "...",
"version": {
"number": "8.17.0",
"build_flavor": "default",
"build_type": "deb"
},
"tagline": "You Know, for Search"
}
Check cluster health:
curl -u elastic:YourPassword "localhost:9200/_cluster/health?pretty"
A green status means all primary and replica shards are allocated.
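For scripted monitoring, the status field can be pulled out of the health response with standard tools. A minimal sketch that parses a captured sample response; in practice you would pipe the live curl output in instead:

```shell
# Sample _cluster/health response (captured; normally:
#   curl -s -u elastic:YourPassword localhost:9200/_cluster/health)
health='{"cluster_name":"log-analysis","status":"green","number_of_nodes":3}'
status=$(printf '%s' "$health" | grep -o '"status":"[a-z]*"' | cut -d'"' -f4)
echo "cluster status: $status"
[ "$status" = "green" ] || echo "WARNING: cluster is $status" >&2
```

Dropped into cron with a real curl call, this is enough for a crude green/yellow/red alert before you have Kibana alerting configured.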
Installing and Configuring Kibana
Install Kibana from the same repository:
sudo apt install kibana -y
Configure Kibana:
sudo nano /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
server.name: "kibana-server"
elasticsearch.hosts: ["http://192.168.1.50:9200"]
elasticsearch.username: "kibana_system"
elasticsearch.password: "YourKibanaPassword"
Set the kibana_system user password in Elasticsearch:
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u kibana_system -i
Start Kibana:
sudo systemctl enable kibana
sudo systemctl start kibana
Access Kibana at http://your-server:5601 and log in with the elastic user.
Setting Up Logstash
Install Logstash:
sudo apt install logstash -y
Creating a Pipeline Configuration
Logstash pipelines define how data flows from input through filters to output. By default, Logstash concatenates every file in /etc/logstash/conf.d into a single pipeline, so filters and outputs should be wrapped in conditionals to keep event types from landing in the wrong index. Create a pipeline configuration for syslog data:
sudo nano /etc/logstash/conf.d/syslog.conf
input {
beats {
port => 5044
}
}
filter {
if [fields][log_type] == "syslog" {
grok {
match => {
"message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}"
}
}
date {
match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
}
mutate {
remove_field => [ "message" ]
}
}
}
output {
if [fields][log_type] == "syslog" {
elasticsearch {
hosts => ["http://192.168.1.50:9200"]
user => "elastic"
password => "YourPassword"
index => "syslog-%{+YYYY.MM.dd}"
}
}
}
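Before restarting Logstash over a pattern typo, the field extraction can be sanity-checked offline with GNU grep against a sample line. This is only a rough stand-in; grok's SYSLOGTIMESTAMP and DATA patterns are more forgiving than this hand-written regex:

```shell
# Sample syslog line and a PCRE approximation of the grok pattern above
# (requires GNU grep with -P support, standard on Ubuntu).
line='Jan  5 10:15:22 web-server-01 sshd[1234]: Failed password for root from 10.0.0.5'
# Skip timestamp and hostname, then capture the program name before the [pid]:
prog=$(printf '%s\n' "$line" | grep -oP '^\w{3}\s+\d+ [\d:]+ \S+ \K[^\[:]+')
# Everything after the first ": " is the message body.
msg=$(printf '%s\n' "$line" | grep -oP ': \K.*')
echo "program=$prog"
echo "message=$msg"
```

If the regex fails on your real log lines, the grok pattern likely needs adjusting too.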
Nginx Access Log Pipeline
sudo nano /etc/logstash/conf.d/nginx.conf
Because all files in conf.d share one pipeline, events for this filter arrive through the beats input on port 5044 already defined in syslog.conf; a second input block is not needed.
filter {
if [fields][log_type] == "nginx_access" {
grok {
match => {
"message" => '%{IPORHOST:client_ip} - %{DATA:user} \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}" %{NUMBER:status_code} %{NUMBER:bytes} "%{DATA:referrer}" "%{DATA:user_agent}"'
}
}
date {
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
}
geoip {
source => "client_ip"
target => "geoip"
}
useragent {
source => "user_agent"
target => "ua"
}
mutate {
convert => {
"status_code" => "integer"
"bytes" => "integer"
}
}
}
}
output {
if [fields][log_type] == "nginx_access" {
elasticsearch {
hosts => ["http://192.168.1.50:9200"]
user => "elastic"
password => "YourPassword"
index => "nginx-access-%{+YYYY.MM.dd}"
}
}
}
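The combined-log layout that the grok pattern expects can be eyeballed with plain awk on a sample line. Default whitespace splitting means quoted fields span several columns, so this is only a rough positional check, not a parser:

```shell
# Sample nginx combined-format access line.
line='203.0.113.9 - alice [05/Jan/2026:10:15:22 +0000] "GET /index.html HTTP/1.1" 404 512 "-" "curl/8.5.0"'
# With default splitting, field 9 is the status code and field 10 the byte count.
status=$(printf '%s\n' "$line" | awk '{print $9}')
bytes=$(printf '%s\n' "$line" | awk '{print $10}')
echo "status=$status bytes=$bytes"
```

If your nginx uses a custom log_format, both this field numbering and the grok pattern above must change to match it.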
Start Logstash:
sudo systemctl enable logstash
sudo systemctl start logstash
Configuring Filebeat on Source Servers
Install Filebeat on each server whose logs you want to collect:
sudo apt install filebeat -y
Configure Filebeat:
sudo nano /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: filestream
id: syslog
enabled: true
paths:
- /var/log/syslog
- /var/log/auth.log
fields:
log_type: syslog
- type: filestream
id: nginx-access
enabled: true
paths:
- /var/log/nginx/access.log
fields:
log_type: nginx_access
output.logstash:
hosts: ["192.168.1.50:5044"]
processors:
- add_host_metadata: ~
- add_cloud_metadata: ~
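If some sources emit multi-line events (Java stack traces, for example), filestream inputs can fold continuation lines into a single event with a multiline parser. A sketch of an extra input; the path, log_type, and continuation pattern are assumptions to adapt:

```yaml
- type: filestream
  id: app-logs
  paths:
    - /var/log/myapp/app.log   # hypothetical application log path
  fields:
    log_type: app
  parsers:
    - multiline:
        type: pattern
        pattern: '^\s'         # lines starting with whitespace...
        match: after           # ...are appended to the previous event
```

Without this, each stack-trace line arrives as its own document and is useless to search.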
Start Filebeat:
sudo systemctl enable filebeat
sudo systemctl start filebeat
Verify logs are flowing:
curl -u elastic:YourPassword "localhost:9200/_cat/indices?v&s=index"
You should see syslog-* and nginx-access-* indices appearing.
Index Lifecycle Management (ILM)
ILM automates index management to prevent unbounded disk growth. Create a policy that moves indices through hot, warm, and delete phases:
curl -u elastic:YourPassword -X PUT "localhost:9200/_ilm/policy/logs-policy" \
-H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_primary_shard_size": "30gb",
"max_age": "1d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "7d",
"actions": {
"shrink": {
"number_of_shards": 1
},
"forcemerge": {
"max_num_segments": 1
},
"set_priority": {
"priority": 50
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}'
This policy rolls an index over once a primary shard reaches 30 GB or the index is a day old, moves it to warm storage 7 days later (shrinking to one shard and merging segments for efficiency), and deletes it after 30 days; note that min_age in the warm and delete phases is measured from rollover, not index creation. One caveat: the rollover action only works when writes go through a write alias or a data stream. With the date-suffixed index names used in the Logstash outputs above, rollover will fail and indices will not advance past the hot phase, so either remove the rollover action for date-based indices or point Logstash at a rollover alias instead.
Apply ILM to an Index Template
curl -u elastic:YourPassword -X PUT "localhost:9200/_index_template/logs-template" \
-H 'Content-Type: application/json' -d'
{
"index_patterns": ["syslog-*", "nginx-access-*"],
"template": {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1,
"index.lifecycle.name": "logs-policy",
"index.lifecycle.rollover_alias": "logs"
}
}
}'
JVM and System Tuning
JVM Heap Size
Set the heap to 50% of available RAM, up to 31 GB:
sudo nano /etc/elasticsearch/jvm.options.d/heap.options
-Xms8g
-Xmx8g
Both values must be equal to prevent heap resizing during operation.
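The 8 GB above assumes a 16 GB machine; a small script can derive the right value on any host (Linux only, since it reads /proc/meminfo):

```shell
# Compute heap = min(RAM / 2, 31 GB) and print the jvm.options lines.
ram_kb=$(grep MemTotal /proc/meminfo | awk '{print $2}')
heap=$(( ram_kb / 1024 / 1024 / 2 ))   # half of RAM, in whole GB
[ "$heap" -gt 31 ] && heap=31          # stay under the compressed-oops ceiling
[ "$heap" -lt 1 ] && heap=1            # floor for tiny test VMs
printf -- '-Xms%dg\n-Xmx%dg\n' "$heap" "$heap"
```

Redirect the output into /etc/elasticsearch/jvm.options.d/heap.options to apply it.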
OS-Level Tuning
Elasticsearch requires several kernel parameter adjustments:
# Increase virtual memory map count
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.d/99-elasticsearch.conf
sudo sysctl -p /etc/sysctl.d/99-elasticsearch.conf
# Disable swap (Elasticsearch should never swap)
sudo swapoff -a
# Remove swap entries from /etc/fstab to persist across reboots
Set file descriptor limits. Elasticsearch runs as a systemd service, so limits in /etc/security/limits.conf do not apply to it; use a systemd override instead (the Debian package already ships LimitNOFILE=65535, so this step is only needed if you want different values):
sudo mkdir -p /etc/systemd/system/elasticsearch.service.d
sudo tee /etc/systemd/system/elasticsearch.service.d/limits.conf > /dev/null << 'EOF'
[Service]
LimitNOFILE=65535
LimitNPROC=4096
EOF
sudo systemctl daemon-reload
Restart Elasticsearch after making these changes:
sudo systemctl restart elasticsearch
Security Hardening
Firewall Rules
Restrict Elasticsearch and Kibana ports to trusted networks only:
sudo ufw allow from 192.168.1.0/24 to any port 9200 comment "Elasticsearch HTTP"
sudo ufw allow from 192.168.1.0/24 to any port 9300 comment "Elasticsearch transport"
sudo ufw allow from 192.168.1.0/24 to any port 5601 comment "Kibana"
sudo ufw allow from 192.168.1.0/24 to any port 5044 comment "Logstash Beats"
Create Dedicated Users
Avoid using the elastic superuser for daily operations. Create role-specific users:
# Create a read-only user for Kibana dashboards
curl -u elastic:YourPassword -X POST "localhost:9200/_security/user/dashboard_viewer" \
-H 'Content-Type: application/json' -d'
{
"password": "ViewerPass123",
"roles": ["viewer"],
"full_name": "Dashboard Viewer"
}'
Enable Audit Logging
# Add to elasticsearch.yml
xpack.security.audit.enabled: true
xpack.security.audit.logfile.events.include: ["authentication_failed", "access_denied", "connection_denied"]
Monitoring and Troubleshooting
Cluster Health
# Overall health
curl -u elastic:YourPassword "localhost:9200/_cluster/health?pretty"
# Node stats
curl -u elastic:YourPassword "localhost:9200/_nodes/stats?pretty" | head -50
# Index sizes
curl -u elastic:YourPassword "localhost:9200/_cat/indices?v&s=store.size:desc"
# Shard allocation
curl -u elastic:YourPassword "localhost:9200/_cat/shards?v&s=index"
Common Issues
Cluster status is yellow: This usually means replica shards are unassigned, often because you have a single-node cluster. For single-node setups, set replicas to zero on existing indices (and set number_of_replicas: 0 in your index template so new daily indices inherit it):
curl -u elastic:YourPassword -X PUT "localhost:9200/_settings" \
-H 'Content-Type: application/json' -d'{"index.number_of_replicas": 0}'
Elasticsearch won’t start — bootstrap checks failed: Check the logs:
sudo journalctl -u elasticsearch --no-pager | tail -30
Common causes: insufficient vm.max_map_count, heap size not set, or file descriptor limits too low.
Logstash pipeline not processing: Test the pipeline configuration:
sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/syslog.conf
High disk usage — indices not being deleted: Verify ILM is active:
curl -u elastic:YourPassword "localhost:9200/_ilm/policy/logs-policy?pretty"
curl -u elastic:YourPassword "localhost:9200/syslog-*/_ilm/explain?pretty"
Useful Kibana Queries (KQL)
Once your data is flowing, use Kibana’s Discover tab with these queries:
# Find all 5xx errors in Nginx logs
status_code >= 500
# Search for authentication failures
syslog_program: "sshd" AND syslog_message: "Failed password"
# Find logs from a specific host
host.name: "web-server-01" AND syslog_program: "kernel"
# Time-scoped queries
@timestamp >= "2026-01-05T00:00:00" AND @timestamp < "2026-01-06T00:00:00"
Summary
A well-configured ELK Stack transforms scattered log files across dozens of servers into a searchable, visualizable, and alertable centralized system. The key points from this guide:
- Install Elasticsearch, Kibana, and Logstash from the official Elastic repository to keep versions synchronized.
- Use single-node discovery for development; deploy at least three nodes for production resilience.
- Deploy Filebeat on every source server as a lightweight shipper; use Logstash for parsing and enrichment.
- Configure ILM policies from day one to automate index rollover and deletion — retroactive cleanup is painful.
- Set JVM heap to 50% of RAM (max 31 GB) and disable swap entirely.
- Restrict network access with firewall rules and create role-based users instead of sharing the superuser credentials.
- Monitor cluster health daily and set up Kibana alerts for disk usage, cluster status changes, and log volume anomalies.
With these foundations in place, you can expand your deployment to handle application logs, security events, metrics, and traces — building a comprehensive observability platform from a solid Elasticsearch core.