TL;DR — Quick Summary

Set up Prometheus with Node Exporter for Linux server monitoring. Learn PromQL, alerting rules, Alertmanager, storage tuning, and security hardening.

Prometheus is the de facto standard for open-source infrastructure monitoring. Combined with Node Exporter, it gives you deep visibility into every Linux server in your fleet — CPU usage, memory pressure, disk saturation, network throughput, and load averages — all queryable with a powerful expression language and alertable through Alertmanager. This guide walks you from bare metal to a production-ready monitoring stack: architecture, installation, PromQL, alerting, Alertmanager, recording rules, storage tuning, and security hardening.

Prerequisites

  • Ubuntu 22.04 or Debian 12 (steps are identical on RHEL/Rocky with dnf substituted for apt)
  • Root or sudo access
  • Ports 9090 (Prometheus), 9100 (Node Exporter), and 9093 (Alertmanager) open in your firewall
  • Basic familiarity with systemd and YAML

Architecture Overview

Prometheus uses a pull model: it scrapes HTTP endpoints called targets at a configurable interval and stores the resulting time-series data locally in its own purpose-built time-series database (TSDB). This is the opposite of push-based systems like StatsD.

┌─────────────────────────────────────────────────────────┐
│                     Linux Server                         │
│                                                          │
│   Node Exporter :9100  ←  Prometheus :9090  →  Grafana  │
│         ↑                      ↓                         │
│   /proc /sys              TSDB on disk                   │
│                                ↓                         │
│                          Alertmanager :9093              │
│                          (email/Slack/PD)                │
└─────────────────────────────────────────────────────────┘

Key concepts:

  • Scrape — Prometheus fetches /metrics from each target at scrape_interval
  • TSDB — Local time-series database; each sample is a float64 value with a millisecond-precision timestamp, attached to a metric name and a set of labels
  • PromQL — Query language for slicing, aggregating, and computing rates over time-series
  • Alertmanager — Separate process that receives firing alerts from Prometheus and routes notifications
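Every target serves plain text in the Prometheus exposition format: one sample per line, labels in braces, the float value at the end. A Node Exporter scrape returns lines like these (values illustrative):

```
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 184503.27
node_cpu_seconds_total{cpu="0",mode="user"} 912.44
node_load1 0.35
```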

Step 1: Install Prometheus

Create the system user and directories

sudo useradd --system --no-create-home --shell /bin/false prometheus

sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus /var/lib/prometheus

Download and install the binary

# Check https://github.com/prometheus/prometheus/releases for the latest version
PROM_VERSION="2.51.2"
cd /tmp
wget "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.linux-amd64.tar.gz"
tar xzf prometheus-${PROM_VERSION}.linux-amd64.tar.gz
cd prometheus-${PROM_VERSION}.linux-amd64

sudo cp prometheus promtool /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/prometheus /usr/local/bin/promtool

sudo cp -r consoles console_libraries /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus/consoles /etc/prometheus/console_libraries

Write prometheus.yml

sudo nano /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    environment: "production"
    datacenter: "dc1"

rule_files:
  - /etc/prometheus/rules/*.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]
        labels:
          instance: "web-01"
          environment: "production"
sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml
# Validate syntax before starting
promtool check config /etc/prometheus/prometheus.yml
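Hand-editing static_configs scales poorly as hosts are added. As a sketch, file-based service discovery lets Prometheus read targets from files on disk and pick up changes automatically; the job name, paths, and labels below are placeholders:

```yaml
# Hypothetical extra entry under scrape_configs in prometheus.yml
  - job_name: "node-fleet"
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.yml
        refresh_interval: 1m
```

```yaml
# /etc/prometheus/targets/web.yml (one targets file per group of hosts)
- targets: ["10.0.0.11:9100", "10.0.0.12:9100"]
  labels:
    environment: "production"
```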

Create the systemd unit

sudo nano /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Monitoring System
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --storage.tsdb.retention.time=30d \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-lifecycle

Restart=on-failure
RestartSec=5s
NoNewPrivileges=true
ProtectSystem=strict
ReadWritePaths=/var/lib/prometheus

[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now prometheus
sudo systemctl status prometheus

Step 2: Install Node Exporter

Node Exporter exposes hardware and OS metrics from /proc and /sys on port 9100.

NODE_VERSION="1.7.0"
cd /tmp
wget "https://github.com/prometheus/node_exporter/releases/download/v${NODE_VERSION}/node_exporter-${NODE_VERSION}.linux-amd64.tar.gz"
tar xzf node_exporter-${NODE_VERSION}.linux-amd64.tar.gz

sudo useradd --system --no-create-home --shell /bin/false node_exporter
sudo cp node_exporter-${NODE_VERSION}.linux-amd64/node_exporter /usr/local/bin/
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
sudo nano /etc/systemd/system/node_exporter.service
[Unit]
Description=Prometheus Node Exporter
Documentation=https://github.com/prometheus/node_exporter
After=network.target

[Service]
Type=simple
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter \
  --collector.systemd \
  --collector.processes

Restart=on-failure
RestartSec=5s
NoNewPrivileges=true
ProtectSystem=strict

[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter

# Verify metrics are exposed
curl -s http://localhost:9100/metrics | head -30

Key Node Exporter Metrics

| Metric | Description |
|--------|-------------|
| node_cpu_seconds_total | CPU time by mode (user, system, idle, iowait) — use rate() |
| node_memory_MemAvailable_bytes | Available memory (includes reclaimable cache) |
| node_memory_MemTotal_bytes | Total physical memory |
| node_filesystem_avail_bytes | Free disk space on a filesystem |
| node_filesystem_size_bytes | Total filesystem size |
| node_network_receive_bytes_total | Bytes received per network interface — use rate() |
| node_network_transmit_bytes_total | Bytes transmitted per network interface — use rate() |
| node_load1 / node_load5 / node_load15 | System load averages |
| node_disk_io_time_seconds_total | Time spent doing I/O — use rate() for saturation |
| node_time_seconds | Current system time (for clock drift detection) |
| node_systemd_unit_state | State of systemd units when --collector.systemd enabled |

PromQL Basics

PromQL is Prometheus’s query language. Open the Prometheus UI at http://your-server:9090 to experiment.

CPU usage percentage

# CPU usage across all cores, averaged
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
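To see the arithmetic behind this expression, here is a hand-worked sketch of what rate() computes, using two hypothetical samples of the idle counter for a single core taken 30 seconds apart:

```shell
#!/bin/sh
# rate() divides the counter increase by the elapsed time. The idle counter
# went from 1000.0s to 1024.0s over a 30s window: idle 80% of the time,
# so the core was busy 100 - 80 = 20%.
idle_t0=1000.0
idle_t1=1024.0
window=30

awk -v a="$idle_t0" -v b="$idle_t1" -v w="$window" \
  'BEGIN { printf "CPU busy: %.1f%%\n", 100 - ((b - a) / w) * 100 }'
# Prints: CPU busy: 20.0%
```

Over a real [5m] window Prometheus performs the same division using all samples in the range, and avg by (instance) then averages the per-core results.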

Memory usage percentage

(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

Disk usage percentage

(1 - (node_filesystem_avail_bytes{fstype!~"tmpfs|devtmpfs"} / node_filesystem_size_bytes)) * 100

Network throughput (bytes/sec)

rate(node_network_receive_bytes_total{device!="lo"}[5m])

Key PromQL functions

| Function | Example | Purpose |
|----------|---------|---------|
| rate() | rate(http_requests_total[5m]) | Per-second average rate over a range vector |
| irate() | irate(cpu_seconds_total[5m]) | Instantaneous rate (last two samples) — spiky |
| increase() | increase(errors_total[1h]) | Total increase over a range (= rate × duration) |
| histogram_quantile() | histogram_quantile(0.99, rate(http_duration_bucket[5m])) | P99 latency from a histogram metric |
| avg by() | avg by (job) (metric) | Average across a dimension |
| sum without() | sum without (cpu) (metric) | Aggregate dropping specified labels |
| topk() | topk(5, metric) | Top N time-series by value |
| predict_linear() | predict_linear(disk_avail[6h], 3600*24) | Predict value in N seconds |
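predict_linear() fits a least-squares line over the samples in the range and extrapolates it forward. The core idea can be sketched with two hypothetical disk-free readings (a real query uses every sample in the window, not just the endpoints):

```shell
#!/bin/sh
# Free space fell from 80 GiB to 74 GiB over 6 hours: slope = -1 GiB/hour.
# Extrapolating 24 hours ahead: 74 + (-1 * 24) = 50 GiB predicted free.
awk -v start=80 -v end=74 -v window_h=6 -v ahead_h=24 \
  'BEGIN {
     slope = (end - start) / window_h
     printf "predicted free in 24h: %.0f GiB\n", end + slope * ahead_h
   }'
# Prints: predicted free in 24h: 50 GiB
```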

Alerting Rules

Create the rules directory and an alerts file:

sudo mkdir -p /etc/prometheus/rules
sudo nano /etc/prometheus/rules/alerts.yml
groups:
  - name: node_alerts
    interval: 30s
    rules:

      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value | printf \"%.1f\" }}% for more than 5 minutes."

      - alert: CriticalCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 95
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Critical CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value | printf \"%.1f\" }}% — investigate immediately."

      - alert: LowDiskSpace
        expr: (node_filesystem_avail_bytes{fstype!~"tmpfs|devtmpfs"} / node_filesystem_size_bytes) * 100 < 15
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
          description: "Filesystem {{ $labels.mountpoint }} has {{ $value | printf \"%.1f\" }}% free."

      - alert: CriticalDiskSpace
        expr: (node_filesystem_avail_bytes{fstype!~"tmpfs|devtmpfs"} / node_filesystem_size_bytes) * 100 < 5
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Critical disk space on {{ $labels.instance }}"
          description: "Filesystem {{ $labels.mountpoint }} is nearly full."

      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
          description: "Memory usage is {{ $value | printf \"%.1f\" }}%."

      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.job }}/{{ $labels.instance }} has been unreachable for 1 minute."

      - alert: HighLoad
        expr: node_load1 / count without(cpu, mode)(node_cpu_seconds_total{mode="idle"}) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High system load on {{ $labels.instance }}"
          description: "Load average per CPU is {{ $value | printf \"%.2f\" }}."
sudo chown -R prometheus:prometheus /etc/prometheus/rules
promtool check rules /etc/prometheus/rules/alerts.yml
# The unit above defines no ExecReload; use the lifecycle endpoint enabled by --web.enable-lifecycle
curl -X POST http://localhost:9090/-/reload
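Beyond syntax checking, promtool can also unit-test alerting rules against synthetic series before they ever fire in production. A sketch for the InstanceDown rule; the file path and series values are illustrative:

```yaml
# /etc/prometheus/rules/alerts_test.yml
rule_files:
  - alerts.yml
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      - series: 'up{job="node", instance="web-01"}'
        values: "0 0 0 0 0 0"
    alert_rule_test:
      - eval_time: 5m
        alertname: InstanceDown
        exp_alerts:
          - exp_labels:
              severity: critical
              job: node
              instance: web-01
            exp_annotations:
              summary: "Instance web-01 is down"
              description: "node/web-01 has been unreachable for 1 minute."
```

Run it with promtool test rules /etc/prometheus/rules/alerts_test.yml; it should report SUCCESS when the expected alert fires.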

Alertmanager Setup

Install Alertmanager

AM_VERSION="0.27.0"
cd /tmp
wget "https://github.com/prometheus/alertmanager/releases/download/v${AM_VERSION}/alertmanager-${AM_VERSION}.linux-amd64.tar.gz"
tar xzf alertmanager-${AM_VERSION}.linux-amd64.tar.gz

sudo useradd --system --no-create-home --shell /bin/false alertmanager
sudo mkdir -p /etc/alertmanager /var/lib/alertmanager
sudo cp alertmanager-${AM_VERSION}.linux-amd64/alertmanager /usr/local/bin/
sudo chown alertmanager:alertmanager /usr/local/bin/alertmanager /etc/alertmanager /var/lib/alertmanager

Configure routes

sudo nano /etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: "smtp.example.com:587"
  smtp_from: "alerts@example.com"
  smtp_auth_username: "alerts@example.com"
  smtp_auth_password: "your-smtp-password"
  slack_api_url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"

route:
  group_by: ["alertname", "instance"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: "ops-team"
  routes:
    - matchers:
        - severity = "critical"
      receiver: "pagerduty"
      continue: true
    - matchers:
        - severity = "warning"
      receiver: "slack-warnings"

receivers:
  - name: "ops-team"
    email_configs:
      - to: "ops@example.com"
        send_resolved: true

  - name: "slack-warnings"
    slack_configs:
      - channel: "#alerts"
        title: "{{ .GroupLabels.alertname }}"
        text: "{{ range .Alerts }}{{ .Annotations.description }}{{ end }}"
        send_resolved: true

  - name: "pagerduty"
    pagerduty_configs:
      - routing_key: "YOUR_PAGERDUTY_INTEGRATION_KEY"
        send_resolved: true

inhibit_rules:
  - source_matchers:
      - severity = "critical"
    target_matchers:
      - severity = "warning"
    equal: ["alertname", "instance"]
sudo nano /etc/systemd/system/alertmanager.service
[Unit]
Description=Prometheus Alertmanager
After=network.target

[Service]
Type=simple
User=alertmanager
Group=alertmanager
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager

Restart=on-failure
NoNewPrivileges=true
ProtectSystem=strict
ReadWritePaths=/var/lib/alertmanager

[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now alertmanager

Recording Rules

Recording rules pre-compute expensive PromQL expressions and store the result as a new metric. This dramatically speeds up dashboards that aggregate across hundreds of nodes.

# /etc/prometheus/rules/recording.yml
groups:
  - name: node_recording
    interval: 1m
    rules:
      - record: instance:node_cpu_usage:percent
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

      - record: instance:node_memory_usage:percent
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

      - record: instance:node_filesystem_usage:percent
        expr: (1 - (node_filesystem_avail_bytes{fstype!~"tmpfs|devtmpfs"} / node_filesystem_size_bytes)) * 100

      - record: instance:node_network_receive_bytes:rate5m
        expr: sum by (instance) (rate(node_network_receive_bytes_total{device!="lo"}[5m]))

After writing this file, run promtool check rules /etc/prometheus/rules/recording.yml and reload Prometheus.


Storage Tuning

Prometheus TSDB compresses time-series data efficiently — expect about 1-2 bytes per sample on disk. A server scraping 1,000 series every 15 seconds ingests about 67 samples per second, which works out to roughly 300 MB per month.
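That estimate is easy to script. A sketch assuming 1,000 series scraped every 15 seconds at 1.7 bytes per sample (the bytes-per-sample figure is an assumption; real compression varies with data shape):

```shell
#!/bin/sh
# Rough TSDB sizing: samples/sec x bytes/sample x retention in seconds.
series=1000
scrape_interval=15       # seconds
bytes_per_sample=1.7     # assumed; typically 1-2 bytes after compression
retention_days=30

awk -v s="$series" -v i="$scrape_interval" -v b="$bytes_per_sample" -v d="$retention_days" \
  'BEGIN {
     samples_per_sec = s / i
     total_bytes = samples_per_sec * b * d * 86400
     printf "~%.0f MB over %d days\n", total_bytes / 1024 / 1024, d
   }'
# Prints roughly "~280 MB over 30 days" with these inputs
```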

# In prometheus.service ExecStart flags:

# Keep data for 90 days
--storage.tsdb.retention.time=90d

# OR cap storage size (Prometheus will delete oldest blocks first)
--storage.tsdb.retention.size=20GB

# Compress the write-ahead log (enabled by default since v2.20)
--storage.tsdb.wal-compression

# Head block duration before compaction to disk (default 2h; rarely needs changing)
--storage.tsdb.min-block-duration=2h

Check current storage usage:

du -sh /var/lib/prometheus/
# View TSDB stats in the UI
curl http://localhost:9090/api/v1/status/tsdb | python3 -m json.tool

Security

Basic authentication (Prometheus 2.24+)

# Generate a bcrypt hash (htpasswd is in the apache2-utils package)
htpasswd -nBC 10 admin
# Copy the hash output

sudo nano /etc/prometheus/web.yml
basic_auth_users:
  admin: "$2y$10$your-bcrypt-hash-here"
# Add to ExecStart in prometheus.service
--web.config.file=/etc/prometheus/web.yml

TLS termination with nginx reverse proxy

server {
    listen 443 ssl;
    server_name prometheus.example.com;

    ssl_certificate     /etc/letsencrypt/live/prometheus.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/prometheus.example.com/privkey.pem;

    auth_basic "Prometheus";
    auth_basic_user_file /etc/nginx/.prometheus-htpasswd;

    location / {
        proxy_pass http://127.0.0.1:9090;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Bind Prometheus to localhost only when using a reverse proxy:

--web.listen-address=127.0.0.1:9090

Firewall rules

# Block direct access — only nginx exposed externally
sudo ufw deny 9090
sudo ufw deny 9093
# Node Exporter: allow only from Prometheus server IP
sudo ufw allow from 10.0.0.5 to any port 9100

Comparison Table

| Feature | Prometheus | Zabbix | Nagios | Datadog | Netdata |
|---------|------------|--------|--------|---------|---------|
| Model | Pull (scrape) | Push/Pull | Pull (active checks) | Push (agent) | Push (parent-child) |
| Storage | Local TSDB | PostgreSQL/MySQL | Flat files | SaaS cloud | DB / Netdata Cloud |
| Query language | PromQL (powerful) | Custom + SQL | None | DogStatsD metrics | Custom |
| Alerting | Alertmanager | Built-in | Built-in | Built-in | Built-in |
| Dashboards | Grafana (external) | Built-in | None native | Built-in | Built-in |
| Auto-discovery | Yes (file/SD) | Yes | Limited | Yes (Agent) | Yes |
| Cost | Free / open-source | Free (enterprise paid) | Free (enterprise paid) | SaaS pricing | Free (cloud paid) |
| Scalability | Thanos/Cortex/Mimir | Good with proxy | Poor (monolith) | Managed | Good |
| Learning curve | Moderate (PromQL) | Steep | Steep | Low | Very low |
| Best for | Cloud-native, Kubernetes | Enterprise IT | Legacy / SNMP | Managed SaaS | Real-time per-host |

Production Stack Recipe

A minimal production setup for a fleet of Linux servers:

Prometheus server (t3.small or 2 vCPUs / 4 GB RAM):
  - /var/lib/prometheus  →  separate data volume (100 GB+)
  - Retention: 30d time  OR  20 GB size
  - Scrapers: node_exporter on every host, blackbox_exporter for HTTP checks

Node Exporter (on every server):
  - systemd service, port 9100
  - Collectors: --collector.systemd --collector.processes
  - Firewall: allow 9100 only from Prometheus server IP

Alertmanager:
  - Deduplicated alert routing: Slack for warnings, PagerDuty for critical
  - Inhibit rules: suppress warnings when critical fires for same instance

Grafana (optional but recommended):
  - Docker: docker run -d -p 3000:3000 grafana/grafana
  - Import dashboard ID 1860 (Node Exporter Full) from grafana.com/dashboards

Reverse proxy (nginx + TLS):
  - Prometheus: https://prometheus.internal.example.com  (VPN-only)
  - Grafana:    https://grafana.example.com  (public, auth required)

Gotchas and Edge Cases

rate() requires at least two samples in the range window. Counter resets (e.g. a process restart) are handled automatically by both rate() and irate(), but irate() looks only at the last two samples, so it is spiky; use rate() for alerting rules.

Cardinality explosion kills performance. Never use high-cardinality labels like user_id, session_id, or request_url in metrics. Each unique label combination creates a separate time-series. A few thousand extra series slow queries; millions crash Prometheus.
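The scale of a cardinality blow-up is easy to underestimate. A quick sketch with made-up but plausible numbers:

```shell
#!/bin/sh
# Every unique label combination is a separate time-series.
hosts=100
metrics_per_host=500
echo "baseline series:    $(( hosts * metrics_per_host ))"    # 50000

# Now add a user_id label with 10,000 distinct values to ONE metric:
user_ids=10000
echo "one labeled metric: $(( hosts * user_ids ))"            # 1000000
```

One mislabeled metric creates 20 times more series than the entire baseline fleet.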

for clause in alerts prevents flapping. An alert with for: 5m must evaluate true continuously for 5 minutes before it sends a notification. Omit for, and a single scrape above the threshold triggers a page.

Recording rules need time to populate. After adding a recording rule, the new metric only exists from that point forward — you cannot query it for historical data that predates the rule.

Scrape timeout must not exceed the scrape interval. Prometheus rejects a configuration where scrape_timeout (default 10s) is greater than scrape_interval. If a target is slow, raise scrape_timeout for that job, and raise scrape_interval with it if needed.
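A per-job override handles slow targets; both keys are standard scrape_config fields, and the job name and port here are placeholders:

```yaml
  - job_name: "slow-exporter"
    scrape_interval: 60s
    scrape_timeout: 30s        # must stay <= scrape_interval
    static_configs:
      - targets: ["localhost:9104"]
```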

The systemd collector is disabled by default. On any distribution, not just Ubuntu 22.04, node_systemd_unit_state metrics are missing until you add --collector.systemd explicitly to the systemd ExecStart.


Summary

  • Prometheus uses a pull model — it scrapes /metrics endpoints at a configurable interval
  • Node Exporter exposes hardware and OS metrics from /proc and /sys on port 9100
  • PromQL functions rate(), increase(), and histogram_quantile() are the core of useful queries
  • Write alerting rules with a for clause to prevent flapping; route through Alertmanager for deduplication and silencing
  • Recording rules pre-compute expensive aggregations — critical for multi-node dashboards
  • Tune storage retention with --storage.tsdb.retention.time or --storage.tsdb.retention.size
  • Always put Prometheus behind a reverse proxy with TLS and restrict direct port access via firewall
  • Avoid high-cardinality labels — they are the most common cause of performance problems