Every system administrator reaches a point where Bash scripts become unwieldy. A disk-usage alert that started as a five-line script now spans three hundred lines, handles edge cases poorly, and breaks silently on a different distribution. Python bridges the gap between quick shell one-liners and full configuration-management tools like Ansible. It gives you proper error handling, rich data structures, thousands of libraries, and code that is readable months after you wrote it.

This article walks through the core Python skills a Linux sysadmin needs: file operations, subprocess management, SSH automation, REST API integration, log parsing, system monitoring, and notifications. Every section includes production-ready code you can adapt to your environment.


Prerequisites

Before starting, make sure you have the following:

  • Python 3.10+ — verify with python3 --version
  • pip — the Python package installer (usually bundled with Python)
  • venv — the built-in virtual environment module
  • A Linux server (Ubuntu 22.04/24.04, Debian 12, or RHEL 9 recommended)
  • Basic familiarity with the Linux command line and Bash
  • An SSH key pair configured for remote server access (see SSH Hardening)

# Verify Python installation
python3 --version
# Python 3.12.3

# Verify pip
pip3 --version
# pip 24.0 from /usr/lib/python3/dist-packages/pip (python 3.12)

Setting Up a Python Environment

Never install Python packages globally on a production server. Use virtual environments to isolate dependencies for each project.

Creating a virtual environment

# Create a project directory
mkdir -p ~/sysadmin-scripts
cd ~/sysadmin-scripts

# Create a virtual environment
python3 -m venv venv

# Activate it
source venv/bin/activate

# Your prompt changes to show (venv)
(venv) user@server:~/sysadmin-scripts$

Installing essential libraries

# Install all libraries we will use in this article
pip install paramiko requests psutil python-dotenv jinja2

# Freeze dependencies for reproducibility
pip freeze > requirements.txt

Project structure

A well-organized sysadmin scripts project looks like this:

sysadmin-scripts/
├── venv/
├── requirements.txt
├── .env                  # Secrets (API keys, passwords)
├── config.yaml           # Server lists, thresholds
├── scripts/
│   ├── disk_monitor.py
│   ├── log_parser.py
│   ├── remote_exec.py
│   └── backup_rotation.py
└── logs/
    └── automation.log
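The logs/ directory at the bottom of the tree is where your scripts should write their own output. A minimal sketch using the built-in logging module with size-based rotation; the logger name and the size limits here are arbitrary examples:

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

LOG_PATH = Path("logs/automation.log")
LOG_PATH.parent.mkdir(parents=True, exist_ok=True)

# Rotate at ~1 MB, keeping three old files (automation.log.1, .2, .3)
handler = RotatingFileHandler(LOG_PATH, maxBytes=1_000_000, backupCount=3)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
)

logger = logging.getLogger("sysadmin")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Automation run started")
```

Each script in scripts/ can share this setup, so cron output redirection becomes optional rather than the only record of what ran.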

File and Directory Operations

Python’s pathlib module (standard library since Python 3.4) provides an object-oriented interface for filesystem paths. Combined with shutil, it handles virtually every file operation a sysadmin needs.

Working with pathlib

from pathlib import Path

# Create paths (works on any OS)
log_dir = Path("/var/log/myapp")
backup_dir = Path("/backup/logs")

# Create directories (parents=True acts like mkdir -p)
backup_dir.mkdir(parents=True, exist_ok=True)

# List all .log files
for log_file in log_dir.glob("*.log"):
    print(f"{log_file.name}  {log_file.stat().st_size / 1024:.1f} KB")

# Find files recursively
for conf in Path("/etc").rglob("*.conf"):
    print(conf)

Automated log rotation script

#!/usr/bin/env python3
"""Rotate and compress log files older than N days."""

import gzip
import shutil
from pathlib import Path
from datetime import datetime, timedelta

LOG_DIR = Path("/var/log/myapp")
ARCHIVE_DIR = Path("/var/log/myapp/archive")
MAX_AGE_DAYS = 7

ARCHIVE_DIR.mkdir(parents=True, exist_ok=True)

cutoff = datetime.now() - timedelta(days=MAX_AGE_DAYS)

for log_file in LOG_DIR.glob("*.log"):
    mtime = datetime.fromtimestamp(log_file.stat().st_mtime)
    if mtime < cutoff:
        # Compress the file
        gz_path = ARCHIVE_DIR / f"{log_file.name}.gz"
        with open(log_file, "rb") as f_in:
            with gzip.open(gz_path, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)
        log_file.unlink()
        print(f"Archived: {log_file.name} -> {gz_path.name}")

Backup script with timestamp directories

#!/usr/bin/env python3
"""Create timestamped backups and remove backups older than 30 days."""

import shutil
from pathlib import Path
from datetime import datetime, timedelta

SOURCE = Path("/etc/nginx")
BACKUP_ROOT = Path("/backup/nginx")
RETENTION_DAYS = 30

# Create timestamped backup
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
backup_path = BACKUP_ROOT / timestamp

shutil.copytree(SOURCE, backup_path)
print(f"Backup created: {backup_path}")

# Clean old backups
cutoff = datetime.now() - timedelta(days=RETENTION_DAYS)
for old_backup in BACKUP_ROOT.iterdir():
    if old_backup.is_dir():
        mtime = datetime.fromtimestamp(old_backup.stat().st_mtime)
        if mtime < cutoff:
            shutil.rmtree(old_backup)
            print(f"Removed old backup: {old_backup.name}")

Running System Commands with subprocess

The subprocess module lets you run shell commands from Python with full control over input, output, and error handling. Always prefer subprocess.run() over the older os.system().

Basic command execution

import subprocess

# Run a command and capture output
result = subprocess.run(
    ["df", "-h", "/"],
    capture_output=True,
    text=True,
    timeout=30
)

print(result.stdout)
print(f"Return code: {result.returncode}")

# Check for errors
if result.returncode != 0:
    print(f"Error: {result.stderr}")

Running multiple commands safely

import subprocess
import sys

def run_cmd(cmd, description=""):
    """Run a command and handle errors."""
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=60,
            check=True  # Raises CalledProcessError on non-zero exit
        )
        print(f"[OK] {description}")
        return result.stdout
    except subprocess.CalledProcessError as e:
        print(f"[FAIL] {description}: {e.stderr.strip()}")
        sys.exit(1)
    except subprocess.TimeoutExpired:
        print(f"[TIMEOUT] {description}: command exceeded 60s")
        sys.exit(1)

# Example: system update sequence
run_cmd(["sudo", "apt", "update"], "Update package lists")
run_cmd(["sudo", "apt", "upgrade", "-y"], "Upgrade packages")
run_cmd(["sudo", "apt", "autoremove", "-y"], "Remove unused packages")

Parsing command output

import subprocess

def get_listening_ports():
    """Return a list of ports with listening services."""
    result = subprocess.run(
        ["ss", "-tlnp"],
        capture_output=True,
        text=True,
        check=True
    )
    ports = []
    for line in result.stdout.strip().split("\n")[1:]:  # Skip header
        parts = line.split()
        if len(parts) >= 4:
            addr = parts[3]
            port = addr.rsplit(":", 1)[-1]
            ports.append(port)
    return sorted(set(ports))

print("Listening ports:", get_listening_ports())

SSH Automation with Paramiko

Paramiko is the standard Python library for SSH2 connections. It lets you execute commands on remote servers, transfer files over SFTP, and manage SSH keys — all from Python.

Installing Paramiko

pip install paramiko

Executing remote commands

import paramiko
from pathlib import Path
def ssh_exec(hostname, username, command, key_path="~/.ssh/id_ed25519"):
    """Execute a command on a remote server via SSH."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    try:
        client.connect(
            hostname=hostname,
            username=username,
            key_filename=str(Path(key_path).expanduser()),
            timeout=10
        )
        stdin, stdout, stderr = client.exec_command(command, timeout=30)
        exit_code = stdout.channel.recv_exit_status()

        return {
            "stdout": stdout.read().decode().strip(),
            "stderr": stderr.read().decode().strip(),
            "exit_code": exit_code
        }
    finally:
        client.close()

# Example usage
result = ssh_exec("web01.example.com", "deploy", "uptime")
print(result["stdout"])

Running commands on multiple servers

import paramiko
from pathlib import Path
from concurrent.futures import ThreadPoolExecutor, as_completed

SERVERS = [
    {"host": "web01.example.com", "user": "deploy"},
    {"host": "web02.example.com", "user": "deploy"},
    {"host": "db01.example.com",  "user": "deploy"},
]

def check_uptime(server):
    """Check uptime on a single server."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        client.connect(
            hostname=server["host"],
            username=server["user"],
            key_filename=str(Path("~/.ssh/id_ed25519").expanduser()),
            timeout=10
        )
        _, stdout, _ = client.exec_command("uptime -p")
        return server["host"], stdout.read().decode().strip()
    except Exception as e:
        return server["host"], f"ERROR: {e}"
    finally:
        client.close()

# Execute in parallel with ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = {executor.submit(check_uptime, s): s for s in SERVERS}
    for future in as_completed(futures):
        host, uptime = future.result()
        print(f"{host}: {uptime}")
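Hardcoding SERVERS works for a handful of hosts, but the config.yaml file from the project structure is a better home for the inventory. A sketch using pyyaml (install with pip install pyyaml); the servers: key layout is an assumption, not a fixed schema, and the inline string stands in for what a real script would read with yaml.safe_load(open("config.yaml")):

```python
import yaml  # pip install pyyaml

# Example of what config.yaml might contain
CONFIG_TEXT = """
servers:
  - host: web01.example.com
    user: deploy
  - host: db01.example.com
    user: deploy
"""

config = yaml.safe_load(CONFIG_TEXT)
SERVERS = config["servers"]

for server in SERVERS:
    print(f"{server['host']} ({server['user']})")
```

The parsed SERVERS list has the same shape as the hardcoded one above, so the ThreadPoolExecutor loop works unchanged.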

SFTP file transfers

import paramiko
from pathlib import Path

def upload_file(hostname, username, local_path, remote_path):
    """Upload a file to a remote server via SFTP."""
    transport = paramiko.Transport((hostname, 22))
    key = paramiko.Ed25519Key.from_private_key_file(
        str(Path("~/.ssh/id_ed25519").expanduser())
    )
    transport.connect(username=username, pkey=key)
    sftp = paramiko.SFTPClient.from_transport(transport)

    try:
        sftp.put(str(local_path), str(remote_path))
        print(f"Uploaded {local_path} -> {hostname}:{remote_path}")
    finally:
        sftp.close()
        transport.close()

# Upload a configuration file
upload_file(
    "web01.example.com",
    "deploy",
    Path("./configs/nginx.conf"),
    "/tmp/nginx.conf"
)
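If the file you are pushing differs per host, jinja2 (installed earlier) can render it from a template before the upload. A minimal sketch; the template text and variable names are illustrative, and in practice the template would live in its own .j2 file rather than an inline string:

```python
from jinja2 import Template

# Hypothetical nginx vhost template; real templates would be loaded from disk
NGINX_TEMPLATE = """\
server {
    listen 80;
    server_name {{ server_name }};
    root {{ web_root }};
}
"""

rendered = Template(NGINX_TEMPLATE).render(
    server_name="web01.example.com",
    web_root="/var/www/html",
)
print(rendered)
```

Write the rendered text to a local file, then hand that file to upload_file() as the local_path.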

Working with REST APIs

The requests library makes HTTP calls straightforward. Sysadmins use APIs to interact with monitoring platforms, cloud providers, DNS services, and internal tooling.

GET requests — fetching data

import requests

# Check server status from a monitoring API
response = requests.get(
    "https://api.example.com/v1/servers",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=10
)

if response.status_code == 200:
    servers = response.json()
    for server in servers:
        print(f"{server['name']}: {server['status']}")
else:
    print(f"API error: {response.status_code} {response.text}")

POST requests — sending data

import requests

# Create a DNS record via API
payload = {
    "type": "A",
    "name": "web03.example.com",
    "content": "203.0.113.50",
    "ttl": 300
}

response = requests.post(
    "https://api.cloudflare.com/client/v4/zones/ZONE_ID/dns_records",
    headers={
        "Authorization": "Bearer YOUR_CF_TOKEN",
        "Content-Type": "application/json"
    },
    json=payload,
    timeout=15
)

result = response.json()
if result.get("success"):
    print(f"DNS record created: {result['result']['id']}")
else:
    print(f"Error: {result.get('errors')}")

Robust API client with retries

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_api_session(base_url, api_key, retries=3):
    """Create a requests session with retry logic."""
    session = requests.Session()
    session.headers.update({
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    })

    retry_strategy = Retry(
        total=retries,
        backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504]
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("https://", adapter)
    session.mount("http://", adapter)

    return session

# Usage
api = create_api_session("https://api.example.com", "YOUR_KEY")
response = api.get("https://api.example.com/v1/health", timeout=10)
print(response.json())

Log Parsing and Analysis

Sysadmins spend a significant amount of time analyzing log files. Python makes it easy to parse, filter, and aggregate log data.

Parsing syslog with regular expressions

#!/usr/bin/env python3
"""Parse syslog and report failed SSH login attempts."""

import re
from collections import Counter
from pathlib import Path

LOG_FILE = Path("/var/log/auth.log")
PATTERN = re.compile(
    r"(\w+\s+\d+\s+[\d:]+)\s+\S+\s+sshd\[\d+\]:\s+"
    r"Failed password for (?:invalid user )?(\S+) from (\S+)"
)

ip_counter = Counter()
user_counter = Counter()

with open(LOG_FILE) as f:
    for line in f:
        match = PATTERN.search(line)
        if match:
            timestamp, user, ip = match.groups()
            ip_counter[ip] += 1
            user_counter[user] += 1

print("Top 10 attacking IPs:")
for ip, count in ip_counter.most_common(10):
    print(f"  {ip:20s} {count:5d} attempts")

print("\nTop 10 targeted usernames:")
for user, count in user_counter.most_common(10):
    print(f"  {user:20s} {count:5d} attempts")

Parsing structured logs (JSON)

#!/usr/bin/env python3
"""Analyze JSON-formatted application logs."""

import json
from collections import Counter
from pathlib import Path
from datetime import datetime

LOG_FILE = Path("/var/log/myapp/access.json")

status_codes = Counter()
slow_requests = []

with open(LOG_FILE) as f:
    for line in f:
        try:
            entry = json.loads(line)
            status_codes[entry["status"]] += 1
            if entry.get("response_time", 0) > 2.0:
                slow_requests.append(entry)
        except json.JSONDecodeError:
            continue

print("Status code distribution:")
for code, count in sorted(status_codes.items()):
    print(f"  {code}: {count}")

print(f"\nSlow requests (>2s): {len(slow_requests)}")
for req in slow_requests[:5]:
    print(f"  {req['method']} {req['path']} - {req['response_time']:.2f}s")

Real-time log tailing

#!/usr/bin/env python3
"""Tail a log file and alert on error patterns."""

import time
from pathlib import Path

LOG_FILE = Path("/var/log/myapp/error.log")
ALERT_PATTERNS = ["CRITICAL", "OOM", "segfault", "disk full"]

def tail_file(filepath, interval=1.0):
    """Yield new lines as they are appended to a file."""
    with open(filepath) as f:
        f.seek(0, 2)  # Go to end of file
        while True:
            line = f.readline()
            if line:
                yield line.strip()
            else:
                time.sleep(interval)

print(f"Tailing {LOG_FILE}... (Ctrl+C to stop)")
for line in tail_file(LOG_FILE):
    for pattern in ALERT_PATTERNS:
        if pattern in line:
            print(f"[ALERT] {line}")
            # Here you could send a notification
            break

System Monitoring Scripts

The psutil library provides cross-platform access to system metrics — CPU, memory, disk, network, and process information. It is the foundation for building custom monitoring tools.

Installing psutil

pip install psutil

Comprehensive system health check

#!/usr/bin/env python3
"""Comprehensive system health check script."""

import psutil
import socket
from datetime import datetime

def bytes_to_human(n):
    """Convert bytes to human-readable format."""
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if n < 1024:
            return f"{n:.1f} {unit}"
        n /= 1024
    return f"{n:.1f} PB"

def check_system():
    """Run all system checks and return results."""
    results = {}
    hostname = socket.gethostname()

    # CPU
    cpu_percent = psutil.cpu_percent(interval=1)
    cpu_count = psutil.cpu_count()
    load_avg = psutil.getloadavg()
    results["cpu"] = {
        "usage_percent": cpu_percent,
        "cores": cpu_count,
        "load_avg_1m": load_avg[0],
        "load_avg_5m": load_avg[1],
        "load_avg_15m": load_avg[2],
    }

    # Memory
    mem = psutil.virtual_memory()
    swap = psutil.swap_memory()
    results["memory"] = {
        "total": bytes_to_human(mem.total),
        "used": bytes_to_human(mem.used),
        "available": bytes_to_human(mem.available),
        "percent": mem.percent,
        "swap_percent": swap.percent,
    }

    # Disk
    results["disks"] = []
    for partition in psutil.disk_partitions():
        try:
            usage = psutil.disk_usage(partition.mountpoint)
            results["disks"].append({
                "mount": partition.mountpoint,
                "device": partition.device,
                "total": bytes_to_human(usage.total),
                "used_percent": usage.percent,
            })
        except PermissionError:
            continue

    # Network
    net = psutil.net_io_counters()
    results["network"] = {
        "bytes_sent": bytes_to_human(net.bytes_sent),
        "bytes_recv": bytes_to_human(net.bytes_recv),
    }

    return hostname, results

hostname, health = check_system()
print(f"=== System Health Report: {hostname} ===")
print(f"Timestamp: {datetime.now().isoformat()}\n")

print(f"CPU: {health['cpu']['usage_percent']}% "
      f"({health['cpu']['cores']} cores, "
      f"load: {health['cpu']['load_avg_1m']:.2f})")

print(f"Memory: {health['memory']['percent']}% "
      f"({health['memory']['used']} / {health['memory']['total']}) "
      f"Swap: {health['memory']['swap_percent']}%")

for disk in health["disks"]:
    status = "WARNING" if disk["used_percent"] > 85 else "OK"
    print(f"Disk {disk['mount']}: {disk['used_percent']}% "
          f"of {disk['total']} [{status}]")

print(f"Network: sent={health['network']['bytes_sent']}, "
      f"recv={health['network']['bytes_recv']}")

Disk usage alerts with thresholds

#!/usr/bin/env python3
"""Monitor disk usage and alert when thresholds are exceeded."""

import psutil

THRESHOLDS = {
    "/": 85,
    "/var": 80,
    "/home": 90,
    "/tmp": 75,
}

alerts = []

for partition in psutil.disk_partitions():
    mount = partition.mountpoint
    if mount in THRESHOLDS:
        usage = psutil.disk_usage(mount)
        if usage.percent > THRESHOLDS[mount]:
            alerts.append({
                "mount": mount,
                "usage": usage.percent,
                "threshold": THRESHOLDS[mount],
                "free_gb": usage.free / (1024 ** 3),
            })

if alerts:
    print("DISK USAGE ALERTS:")
    for alert in alerts:
        print(f"  {alert['mount']}: {alert['usage']}% used "
              f"(threshold: {alert['threshold']}%, "
              f"free: {alert['free_gb']:.1f} GB)")
else:
    print("All disk partitions within normal thresholds.")

Process monitoring

#!/usr/bin/env python3
"""Monitor specific processes and alert if they stop running."""

import psutil

# Note: PostgreSQL's daemon process is named "postgres", not "postgresql"
REQUIRED_PROCESSES = ["nginx", "postgres", "redis-server", "sshd"]

running = {p.info["name"] for p in psutil.process_iter(["name"])}

for proc_name in REQUIRED_PROCESSES:
    if proc_name in running:
        print(f"  [RUNNING] {proc_name}")
    else:
        print(f"  [STOPPED] {proc_name} -- ACTION REQUIRED")

Sending Notifications

Automation is only useful if you know when something goes wrong. Python can send alerts via email, Slack, or any webhook-based service.

Sending email notifications

import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

def send_email(subject, body, to_addr, from_addr, smtp_server, smtp_port=587,
               username=None, password=None):
    """Send an email notification."""
    msg = MIMEMultipart()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.attach(MIMEText(body, "plain"))

    with smtplib.SMTP(smtp_server, smtp_port) as server:
        server.starttls()
        if username and password:
            server.login(username, password)
        server.send_message(msg)

# Example usage
send_email(
    subject="[ALERT] Disk usage above 90% on web01",
    body="Partition /var is at 92% usage. Free space: 3.2 GB.",
    to_addr="admin@example.com",
    from_addr="alerts@example.com",
    smtp_server="smtp.example.com",
    username="alerts@example.com",
    password="app-specific-password"
)

Slack webhook notifications

import requests

def send_slack_alert(webhook_url, message, severity="warning"):
    """Send an alert to a Slack channel via webhook."""
    colors = {
        "info": "#36a64f",
        "warning": "#ff9900",
        "critical": "#ff0000",
    }

    payload = {
        "attachments": [{
            "color": colors.get(severity, "#36a64f"),
            "title": f"Server Alert ({severity.upper()})",
            "text": message,
            "footer": "SysAdmin Automation",
        }]
    }

    response = requests.post(
        webhook_url,
        json=payload,
        timeout=10
    )
    return response.status_code == 200

# Example usage
send_slack_alert(
    webhook_url="https://hooks.slack.com/services/T00/B00/XXXX",
    message="Disk usage on /var exceeded 90% on web01.example.com",
    severity="critical"
)

Combining monitoring with notifications

#!/usr/bin/env python3
"""Monitor system resources and send alerts when thresholds are exceeded."""

import psutil
import requests
import socket
from datetime import datetime

SLACK_WEBHOOK = "https://hooks.slack.com/services/T00/B00/XXXX"
HOSTNAME = socket.gethostname()
CPU_THRESHOLD = 90
MEMORY_THRESHOLD = 85
DISK_THRESHOLD = 85

def alert(message, severity="warning"):
    """Send alert to Slack."""
    payload = {
        "text": f"*[{severity.upper()}]* `{HOSTNAME}` - {message}\n"
                f"_Timestamp: {datetime.now().isoformat()}_"
    }
    requests.post(SLACK_WEBHOOK, json=payload, timeout=10)

# Check CPU
cpu = psutil.cpu_percent(interval=2)
if cpu > CPU_THRESHOLD:
    alert(f"CPU usage at {cpu}% (threshold: {CPU_THRESHOLD}%)", "critical")

# Check memory
mem = psutil.virtual_memory()
if mem.percent > MEMORY_THRESHOLD:
    alert(f"Memory usage at {mem.percent}% (threshold: {MEMORY_THRESHOLD}%)", "critical")

# Check disks
for part in psutil.disk_partitions():
    try:
        usage = psutil.disk_usage(part.mountpoint)
        if usage.percent > DISK_THRESHOLD:
            alert(f"Disk {part.mountpoint} at {usage.percent}% "
                  f"(threshold: {DISK_THRESHOLD}%)", "warning")
    except PermissionError:
        continue

Scheduling Python Scripts

Using cron

The most straightforward way to schedule Python scripts on Linux is with cron.

# Edit your crontab
crontab -e

# Run disk monitor every 15 minutes
*/15 * * * * /home/deploy/sysadmin-scripts/venv/bin/python /home/deploy/sysadmin-scripts/scripts/disk_monitor.py >> /var/log/disk_monitor.log 2>&1

# Run full health check every hour
0 * * * * /home/deploy/sysadmin-scripts/venv/bin/python /home/deploy/sysadmin-scripts/scripts/health_check.py >> /var/log/health_check.log 2>&1

# Run log analysis daily at 6 AM
0 6 * * * /home/deploy/sysadmin-scripts/venv/bin/python /home/deploy/sysadmin-scripts/scripts/log_parser.py >> /var/log/log_analysis.log 2>&1

Always use the full path to the Python binary inside your virtual environment. This ensures the script uses the correct Python version and has access to installed packages.
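An alternative to spelling out the interpreter path in every crontab entry is to put the venv interpreter in each script's shebang line (the path below matches this article's example layout) and mark the script executable with chmod +x; cron can then invoke the script directly:

```python
#!/home/deploy/sysadmin-scripts/venv/bin/python
"""Runs under the venv interpreter when invoked directly (./disk_monitor.py)."""

import sys

# When launched via the shebang, sys.executable is the venv Python,
# so imports resolve against the venv's site-packages.
print(sys.executable)
```

The crontab line shortens to the script path alone, at the cost of the interpreter path now living inside each script.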

Using systemd timers

For more advanced scheduling with logging integration and dependency management, use systemd timers.

# /etc/systemd/system/health-check.service
[Unit]
Description=System Health Check
After=network.target

[Service]
Type=oneshot
User=deploy
ExecStart=/home/deploy/sysadmin-scripts/venv/bin/python /home/deploy/sysadmin-scripts/scripts/health_check.py
StandardOutput=journal
StandardError=journal

# /etc/systemd/system/health-check.timer
[Unit]
Description=Run health check every 15 minutes

[Timer]
OnBootSec=5min
OnUnitActiveSec=15min
Persistent=true

[Install]
WantedBy=timers.target

# Enable and start the timer
sudo systemctl daemon-reload
sudo systemctl enable health-check.timer
sudo systemctl start health-check.timer

# Check timer status
systemctl list-timers --all | grep health

Useful Libraries Reference Table

Library        | Purpose                                          | Install                   | Documentation
pathlib        | File/directory path operations                   | Built-in                  | docs.python.org/3/library/pathlib
subprocess     | Run system commands                              | Built-in                  | docs.python.org/3/library/subprocess
shutil         | High-level file operations (copy, move, archive) | Built-in                  | docs.python.org/3/library/shutil
os             | Low-level OS interface, environment variables    | Built-in                  | docs.python.org/3/library/os
paramiko       | SSH2 protocol, remote command execution, SFTP    | pip install paramiko      | paramiko.org
requests       | HTTP client for REST APIs                        | pip install requests      | requests.readthedocs.io
psutil         | System monitoring (CPU, memory, disk, network)   | pip install psutil        | psutil.readthedocs.io
python-dotenv  | Load environment variables from .env files       | pip install python-dotenv | pypi.org/project/python-dotenv
jinja2         | Template engine for config file generation       | pip install jinja2        | jinja.palletsprojects.com
pyyaml         | Parse and write YAML configuration files         | pip install pyyaml        | pyyaml.org
fabric         | High-level SSH automation (built on Paramiko)    | pip install fabric        | fabfile.org
schedule       | Human-friendly job scheduling                    | pip install schedule      | schedule.readthedocs.io
logging        | Structured application logging                   | Built-in                  | docs.python.org/3/library/logging

Troubleshooting

ModuleNotFoundError when running from cron

Cron uses a minimal environment. If your script fails with ModuleNotFoundError, you are probably not using the full path to the virtual environment Python binary.

# Wrong -- uses system Python, which lacks your packages
* * * * * python3 /home/deploy/scripts/monitor.py

# Correct -- uses the venv Python
* * * * * /home/deploy/sysadmin-scripts/venv/bin/python /home/deploy/scripts/monitor.py

Paramiko connection refused

If Paramiko fails to connect with NoValidConnectionsError (its usual wrapper around a refused connection), verify that SSH is running on the target and that the port is correct.

# Specify a custom port if your server uses a non-standard SSH port
client.connect(hostname="web01.example.com", port=2222, username="deploy",
               key_filename="/home/deploy/.ssh/id_ed25519")

Permission denied on file operations

When scripts run as a non-root user, they may not have access to files under /var/log or /etc. Either run the script with sudo or adjust file permissions.

# Grant read access to auth.log for the deploy user
sudo usermod -aG adm deploy

# Or use ACLs for fine-grained control
sudo setfacl -m u:deploy:r /var/log/auth.log

psutil returns 0.0% CPU on first call

psutil.cpu_percent() returns 0.0 on the first call because it needs two measurements to calculate usage. Always pass an interval or call it twice.

# Correct: pass interval for a blocking measurement
cpu = psutil.cpu_percent(interval=1)

# Or call it twice with a delay
psutil.cpu_percent()
import time; time.sleep(1)
cpu = psutil.cpu_percent()

requests timeout or SSL errors

When connecting to APIs behind corporate proxies or with self-signed certificates, you may encounter SSL errors.

# Disable SSL verification (development only -- never in production)
response = requests.get("https://internal-api.local", verify=False)

# Or specify a custom CA bundle
response = requests.get("https://api.example.com", verify="/etc/ssl/custom-ca.pem")

Summary

Python is the Swiss Army knife of system administration. In this guide, you learned how to:

  • Set up isolated Python environments with venv for sysadmin projects
  • Manage files and directories programmatically with pathlib and shutil
  • Execute and parse system commands with subprocess
  • Automate SSH connections and file transfers with paramiko
  • Interact with REST APIs using requests with proper retry logic
  • Parse and analyze log files with regex and JSON processing
  • Build monitoring scripts using psutil for CPU, memory, disk, and network metrics
  • Send notifications via email and Slack webhooks
  • Schedule scripts reliably with cron and systemd timers

Start small: pick one repetitive task you perform weekly (log rotation, health checks, backup verification) and automate it with Python. As your confidence grows, chain these scripts together into a comprehensive automation toolkit.

For related guides on securing the servers where these scripts will run, see the Linux Server Security Checklist and SSH Hardening: 12 Steps to Secure Your Linux Server.