jq is the command-line JSON processor that every Linux admin and DevOps engineer should have in their toolkit. When you’re working with REST APIs, parsing Kubernetes output, processing Terraform state, or extracting data from log files, jq lets you slice, filter, and transform JSON directly from the terminal — no Python script needed.

This guide covers everything from basic field extraction to advanced transformations, with real-world examples you’ll use daily.

Prerequisites

  • A Linux system (any distribution)
  • Basic familiarity with shell pipes and command-line tools
  • JSON data to work with (we’ll use examples throughout)

Installing jq

On Debian/Ubuntu:

sudo apt update && sudo apt install -y jq

On RHEL/Fedora/AlmaLinux:

sudo dnf install -y jq

On Alpine:

sudo apk add jq

On macOS:

brew install jq

Verify:

jq --version

jq vs Other JSON Processing Tools

ToolBest ForDrawbacks
jqQuick CLI transforms, shell pipesComplex logic gets verbose
Python jsonMulti-step processing, complex logicRequires Python runtime, more code
yqYAML processing (jq-like syntax)Separate install, YAML-specific
fxInteractive JSON explorationNot scriptable
gronMaking JSON greppableOne-directional (flatten only)

jq wins when you need fast, scriptable transforms in a pipeline with zero dependencies beyond the binary.

Basic jq: Extracting Data

The Identity Filter

The simplest jq operation pretty-prints JSON:

echo '{"name":"nginx","version":"1.25.3","status":"running"}' | jq '.'
{
  "name": "nginx",
  "version": "1.25.3",
  "status": "running"
}

Extracting Fields

Use dot notation to pull specific values:

echo '{"name":"nginx","version":"1.25.3","status":"running"}' | jq '.name'
"nginx"

To get the raw string without quotes, add -r:

echo '{"name":"nginx","version":"1.25.3"}' | jq -r '.name'
nginx

The -r flag is essential for scripting. Without it, jq outputs JSON-encoded strings (with quotes), which breaks variable assignments and comparisons in bash.

Nested Objects

echo '{"server":{"host":"10.0.0.1","port":8080}}' | jq '.server.port'
8080

Array Access

echo '["nginx","apache","caddy"]' | jq '.[0]'
"nginx"

Get all elements:

echo '["nginx","apache","caddy"]' | jq '.[]'
"nginx"
"apache"
"caddy"

Working with Real API Responses

Here’s a realistic scenario: querying a REST API and extracting what you need.

# Get GitHub user info
curl -s https://api.github.com/users/torvalds | jq '{login, name, public_repos, followers}'
{
  "login": "torvalds",
  "name": "Linus Torvalds",
  "public_repos": 7,
  "followers": 223000
}

Extracting from Arrays of Objects

Most APIs return arrays. Here’s how to work with them:

# Simulate an API response
cat << 'EOF' | jq '.servers[] | {name, status}'
{
  "servers": [
    {"name": "web-01", "status": "running", "cpu": 45},
    {"name": "web-02", "status": "stopped", "cpu": 0},
    {"name": "db-01", "status": "running", "cpu": 78}
  ]
}
EOF
{
  "name": "web-01",
  "status": "running"
}
{
  "name": "web-02",
  "status": "stopped"
}
{
  "name": "db-01",
  "status": "running"
}

Filtering JSON Data with jq

The select() Function

Filter arrays to only matching elements:

cat << 'EOF' | jq '.servers[] | select(.status == "running")'
{
  "servers": [
    {"name": "web-01", "status": "running", "cpu": 45},
    {"name": "web-02", "status": "stopped", "cpu": 0},
    {"name": "db-01", "status": "running", "cpu": 78}
  ]
}
EOF
{
  "name": "web-01",
  "status": "running",
  "cpu": 45
}
{
  "name": "db-01",
  "status": "running",
  "cpu": 78
}

Numeric Comparisons

# Servers with CPU usage above 50%
cat servers.json | jq '.servers[] | select(.cpu > 50) | .name'

Combining Conditions

# Running servers with high CPU
cat servers.json | jq '.servers[] | select(.status == "running" and .cpu > 50)'

String Matching

# Servers whose name contains "web"
cat servers.json | jq '.servers[] | select(.name | contains("web"))'

For regex matching:

# Names matching a pattern
cat servers.json | jq '.servers[] | select(.name | test("web-[0-9]+"))'

Transforming JSON with jq

Building New Objects

Construct new JSON structures from existing data:

cat << 'EOF' | jq '.servers[] | {hostname: .name, healthy: (.status == "running"), load: (.cpu / 100)}'
{
  "servers": [
    {"name": "web-01", "status": "running", "cpu": 45},
    {"name": "db-01", "status": "running", "cpu": 78}
  ]
}
EOF
{
  "hostname": "web-01",
  "healthy": true,
  "load": 0.45
}
{
  "hostname": "db-01",
  "healthy": true,
  "load": 0.78
}

map() for Array Transforms

Transform every element in an array:

echo '[1,2,3,4,5]' | jq 'map(. * 2)'
[2, 4, 6, 8, 10]

Practical example — extract just names into an array:

cat servers.json | jq '[.servers[].name]'
["web-01", "web-02", "db-01"]

Sorting and Grouping

# Sort by CPU descending
cat servers.json | jq '.servers | sort_by(.cpu) | reverse'

# Group by status
cat servers.json | jq '.servers | group_by(.status)'

Aggregations

# Count servers
cat servers.json | jq '.servers | length'

# Average CPU
cat servers.json | jq '.servers | map(.cpu) | add / length'

# Max CPU
cat servers.json | jq '.servers | max_by(.cpu) | .name'

jq in Shell Scripts

Storing Results in Variables

#!/bin/bash
response=$(curl -s https://api.github.com/users/torvalds)

name=$(echo "$response" | jq -r '.name')
repos=$(echo "$response" | jq -r '.public_repos')

echo "User: $name has $repos public repos"

Gotcha: Always quote the variable in echo "$response". Unquoted variables lose whitespace and break JSON parsing.

Iterating Over JSON Arrays

#!/bin/bash
# Process each server from API
curl -s https://api.example.com/servers | jq -r '.[] | .name' | while read -r server; do
    echo "Checking $server..."
    ping -c 1 -W 2 "$server" > /dev/null 2>&1 && echo "  UP" || echo "  DOWN"
done

Generating Config Files from JSON

# Convert JSON inventory to /etc/hosts entries
cat inventory.json | jq -r '.hosts[] | "\(.ip)\t\(.hostname)"' >> /etc/hosts

Conditional Processing

#!/bin/bash
status=$(curl -s https://api.example.com/health | jq -r '.status')

if [ "$status" != "healthy" ]; then
    echo "ALERT: Service is $status" | mail -s "Health Check Failed" admin@example.com
fi

Modifying JSON with jq

Updating Values

echo '{"version": "1.0", "debug": false}' | jq '.version = "2.0" | .debug = true'
{
  "version": "2.0",
  "debug": true
}

Adding Fields

echo '{"name": "app"}' | jq '. + {"env": "production", "replicas": 3}'

Deleting Fields

echo '{"name":"app","secret":"abc123","version":"1.0"}' | jq 'del(.secret)'

Updating Nested Values

echo '{"database":{"host":"old-host","port":5432}}' | jq '.database.host = "new-host"'

Real-World jq Recipes

Parse Kubernetes Pod Status

kubectl get pods -o json | jq -r '.items[] | select(.status.phase != "Running") | "\(.metadata.name)\t\(.status.phase)"'

Extract Terraform Output Values

terraform output -json | jq -r 'to_entries[] | "\(.key) = \(.value.value)"'

Parse Docker Container Stats

docker inspect $(docker ps -q) | jq '.[] | {name: .Name, ip: .NetworkSettings.IPAddress, status: .State.Status}'

Process CloudWatch Log Entries

aws logs get-log-events --log-group-name /app/prod --log-stream-name main \
  | jq -r '.events[] | "\(.timestamp | . / 1000 | strftime("%Y-%m-%d %H:%M:%S")) \(.message)"'

CSV Output from JSON

cat servers.json | jq -r '.servers[] | [.name, .status, (.cpu | tostring)] | @csv'
"web-01","running","45"
"web-02","stopped","0"
"db-01","running","78"

Troubleshooting

“parse error: Invalid numeric literal”: Your input isn’t valid JSON. Validate it first with jq '.' file.json. Common cause: the API returned HTML (error page) instead of JSON.

“null” output when you expect data: The field name is wrong or doesn’t exist at that path. Use jq 'keys' to see available fields, or jq '.' | head -20 to inspect the structure.

Quotes in variable assignments: Always use jq -r (raw output) when assigning to shell variables. Without it, you get "value" instead of value, which breaks string comparisons.

“Cannot iterate over null”: You’re trying to iterate (.[]) over a field that’s null or missing. Add the optional operator: .field[]? or check with select(. != null).

Whitespace issues in pipes: Quote your variables. echo $json | jq '.' breaks on whitespace. Use echo "$json" | jq '.' instead.

Summary

  • jq is the standard tool for JSON processing on the command line — one binary, no dependencies, works everywhere
  • Use .field, .[index], and .[] for basic extraction; add -r for raw string output in scripts
  • select() filters arrays by condition — supports string matching, numeric comparisons, and regex with test()
  • Build new JSON structures with object construction {key: .field} and map() for array transforms
  • Combine with curl, kubectl, terraform, and docker for real-world automation
  • sort_by(), group_by(), length, add handle sorting and aggregation
  • Modify JSON with assignment (.key = value), addition (. + {}), and deletion (del(.key))
  • Always use -r flag and quote variables when integrating jq into bash scripts

Bash Scripting for Sysadmins: The Essential Guide | Install Security Updates from the Command Line in Ubuntu