jq is the command-line JSON processor that every Linux admin and DevOps engineer should have in their toolkit. When you’re working with REST APIs, parsing Kubernetes output, processing Terraform state, or extracting data from log files, jq lets you slice, filter, and transform JSON directly from the terminal — no Python script needed.
This guide covers everything from basic field extraction to advanced transformations, with real-world examples you’ll use daily.
Prerequisites
- A Linux system (any distribution)
- Basic familiarity with shell pipes and command-line tools
- JSON data to work with (we’ll use examples throughout)
Installing jq
On Debian/Ubuntu:
sudo apt update && sudo apt install -y jq
On RHEL/Fedora/AlmaLinux:
sudo dnf install -y jq
On Alpine:
sudo apk add jq
On macOS:
brew install jq
Verify:
jq --version
jq vs Other JSON Processing Tools
| Tool | Best For | Drawbacks |
|---|---|---|
| jq | Quick CLI transforms, shell pipes | Complex logic gets verbose |
| Python json | Multi-step processing, complex logic | Requires Python runtime, more code |
| yq | YAML processing (jq-like syntax) | Separate install, YAML-specific |
| fx | Interactive JSON exploration | Not scriptable |
| gron | Making JSON greppable | One-directional (flatten only) |
jq wins when you need fast, scriptable transforms in a pipeline with zero dependencies beyond the binary.
Basic jq: Extracting Data
The Identity Filter
The simplest jq operation pretty-prints JSON:
echo '{"name":"nginx","version":"1.25.3","status":"running"}' | jq '.'
{
"name": "nginx",
"version": "1.25.3",
"status": "running"
}
Extracting Fields
Use dot notation to pull specific values:
echo '{"name":"nginx","version":"1.25.3","status":"running"}' | jq '.name'
"nginx"
To get the raw string without quotes, add -r:
echo '{"name":"nginx","version":"1.25.3"}' | jq -r '.name'
nginx
The
-rflag is essential for scripting. Without it, jq outputs JSON-encoded strings (with quotes), which breaks variable assignments and comparisons in bash.
Nested Objects
echo '{"server":{"host":"10.0.0.1","port":8080}}' | jq '.server.port'
8080
Array Access
echo '["nginx","apache","caddy"]' | jq '.[0]'
"nginx"
Get all elements:
echo '["nginx","apache","caddy"]' | jq '.[]'
"nginx"
"apache"
"caddy"
Working with Real API Responses
Here’s a realistic scenario: querying a REST API and extracting what you need.
# Get GitHub user info
curl -s https://api.github.com/users/torvalds | jq '{login, name, public_repos, followers}'
{
"login": "torvalds",
"name": "Linus Torvalds",
"public_repos": 7,
"followers": 223000
}
Extracting from Arrays of Objects
Most APIs return arrays. Here’s how to work with them:
# Simulate an API response
cat << 'EOF' | jq '.servers[] | {name, status}'
{
"servers": [
{"name": "web-01", "status": "running", "cpu": 45},
{"name": "web-02", "status": "stopped", "cpu": 0},
{"name": "db-01", "status": "running", "cpu": 78}
]
}
EOF
{
"name": "web-01",
"status": "running"
}
{
"name": "web-02",
"status": "stopped"
}
{
"name": "db-01",
"status": "running"
}
Filtering JSON Data with jq
The select() Function
Filter arrays to only matching elements:
cat << 'EOF' | jq '.servers[] | select(.status == "running")'
{
"servers": [
{"name": "web-01", "status": "running", "cpu": 45},
{"name": "web-02", "status": "stopped", "cpu": 0},
{"name": "db-01", "status": "running", "cpu": 78}
]
}
EOF
{
"name": "web-01",
"status": "running",
"cpu": 45
}
{
"name": "db-01",
"status": "running",
"cpu": 78
}
Numeric Comparisons
# Servers with CPU usage above 50%
cat servers.json | jq '.servers[] | select(.cpu > 50) | .name'
Combining Conditions
# Running servers with high CPU
cat servers.json | jq '.servers[] | select(.status == "running" and .cpu > 50)'
String Matching
# Servers whose name contains "web"
cat servers.json | jq '.servers[] | select(.name | contains("web"))'
For regex matching:
# Names matching a pattern
cat servers.json | jq '.servers[] | select(.name | test("web-[0-9]+"))'
Transforming JSON with jq
Building New Objects
Construct new JSON structures from existing data:
cat << 'EOF' | jq '.servers[] | {hostname: .name, healthy: (.status == "running"), load: (.cpu / 100)}'
{
"servers": [
{"name": "web-01", "status": "running", "cpu": 45},
{"name": "db-01", "status": "running", "cpu": 78}
]
}
EOF
{
"hostname": "web-01",
"healthy": true,
"load": 0.45
}
{
"hostname": "db-01",
"healthy": true,
"load": 0.78
}
map() for Array Transforms
Transform every element in an array:
echo '[1,2,3,4,5]' | jq 'map(. * 2)'
[2, 4, 6, 8, 10]
Practical example — extract just names into an array:
cat servers.json | jq '[.servers[].name]'
["web-01", "web-02", "db-01"]
Sorting and Grouping
# Sort by CPU descending
cat servers.json | jq '.servers | sort_by(.cpu) | reverse'
# Group by status
cat servers.json | jq '.servers | group_by(.status)'
Aggregations
# Count servers
cat servers.json | jq '.servers | length'
# Average CPU
cat servers.json | jq '.servers | map(.cpu) | add / length'
# Max CPU
cat servers.json | jq '.servers | max_by(.cpu) | .name'
jq in Shell Scripts
Storing Results in Variables
#!/bin/bash
response=$(curl -s https://api.github.com/users/torvalds)
name=$(echo "$response" | jq -r '.name')
repos=$(echo "$response" | jq -r '.public_repos')
echo "User: $name has $repos public repos"
Gotcha: Always quote the variable in
echo "$response". Unquoted variables lose whitespace and break JSON parsing.
Iterating Over JSON Arrays
#!/bin/bash
# Process each server from API
curl -s https://api.example.com/servers | jq -r '.[] | .name' | while read -r server; do
echo "Checking $server..."
ping -c 1 -W 2 "$server" > /dev/null 2>&1 && echo " UP" || echo " DOWN"
done
Generating Config Files from JSON
# Convert JSON inventory to /etc/hosts entries
cat inventory.json | jq -r '.hosts[] | "\(.ip)\t\(.hostname)"' >> /etc/hosts
Conditional Processing
#!/bin/bash
status=$(curl -s https://api.example.com/health | jq -r '.status')
if [ "$status" != "healthy" ]; then
echo "ALERT: Service is $status" | mail -s "Health Check Failed" admin@example.com
fi
Modifying JSON with jq
Updating Values
echo '{"version": "1.0", "debug": false}' | jq '.version = "2.0" | .debug = true'
{
"version": "2.0",
"debug": true
}
Adding Fields
echo '{"name": "app"}' | jq '. + {"env": "production", "replicas": 3}'
Deleting Fields
echo '{"name":"app","secret":"abc123","version":"1.0"}' | jq 'del(.secret)'
Updating Nested Values
echo '{"database":{"host":"old-host","port":5432}}' | jq '.database.host = "new-host"'
Real-World jq Recipes
Parse Kubernetes Pod Status
kubectl get pods -o json | jq -r '.items[] | select(.status.phase != "Running") | "\(.metadata.name)\t\(.status.phase)"'
Extract Terraform Output Values
terraform output -json | jq -r 'to_entries[] | "\(.key) = \(.value.value)"'
Parse Docker Container Stats
docker inspect $(docker ps -q) | jq '.[] | {name: .Name, ip: .NetworkSettings.IPAddress, status: .State.Status}'
Process CloudWatch Log Entries
aws logs get-log-events --log-group-name /app/prod --log-stream-name main \
| jq -r '.events[] | "\(.timestamp | . / 1000 | strftime("%Y-%m-%d %H:%M:%S")) \(.message)"'
CSV Output from JSON
cat servers.json | jq -r '.servers[] | [.name, .status, (.cpu | tostring)] | @csv'
"web-01","running","45"
"web-02","stopped","0"
"db-01","running","78"
Troubleshooting
“parse error: Invalid numeric literal”: Your input isn’t valid JSON. Validate it first with jq '.' file.json. Common cause: the API returned HTML (error page) instead of JSON.
“null” output when you expect data: The field name is wrong or doesn’t exist at that path. Use jq 'keys' to see available fields, or jq '.' | head -20 to inspect the structure.
Quotes in variable assignments: Always use jq -r (raw output) when assigning to shell variables. Without it, you get "value" instead of value, which breaks string comparisons.
“Cannot iterate over null”: You’re trying to iterate (.[]) over a field that’s null or missing. Add the optional operator: .field[]? or check with select(. != null).
Whitespace issues in pipes: Quote your variables. echo $json | jq '.' breaks on whitespace. Use echo "$json" | jq '.' instead.
Summary
- jq is the standard tool for JSON processing on the command line — one binary, no dependencies, works everywhere
- Use
.field,.[index], and.[]for basic extraction; add-rfor raw string output in scripts select()filters arrays by condition — supports string matching, numeric comparisons, and regex withtest()- Build new JSON structures with object construction
{key: .field}andmap()for array transforms - Combine with
curl,kubectl,terraform, anddockerfor real-world automation sort_by(),group_by(),length,addhandle sorting and aggregation- Modify JSON with assignment (
.key = value), addition (. + {}), and deletion (del(.key)) - Always use
-rflag and quote variables when integrating jq into bash scripts
Bash Scripting for Sysadmins: The Essential Guide | Install Security Updates from the Command Line in Ubuntu