Ubuntu System Monitoring and Log Management: Complete 2025 Guide

By Hafiz Ali | Linux System Administrator with 8+ years of experience managing Ubuntu servers and VPN infrastructure. Certified RHCE and Ubuntu Server Specialist.

Last updated: December 2024 | Tested on Ubuntu 22.04 LTS and 24.04 LTS


Effective system monitoring and log management are crucial for maintaining Ubuntu server health, performance, and security. This comprehensive guide covers everything from basic monitoring commands to advanced log analysis techniques used by professional system administrators.

Monitoring Tools Overview

Tool Category | Tools | Best For | Complexity
Built-in Commands | top, htop, vmstat, iostat | Quick checks | Easy
System Monitoring | systemd-journald, syslog | Log analysis | Easy
Advanced Tools | Netdata, Prometheus, Grafana | Enterprise monitoring | Medium
Security Monitoring | fail2ban, auditd | Security analysis | Medium

Real-time System Monitoring

Process Monitoring with htop

# Install htop (if not already installed)
sudo apt update && sudo apt install htop

# Launch htop for real-time monitoring
htop

# Key htop features:
# - Color-coded CPU/Memory usage
# - Process tree view (F5)
# - Process sorting (F6)
# - Process search (F3)
# - Customize display (F2)

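Basic System Monitoring Commands

# CPU and memory usage (interactive)
top
htop

# Memory usage details
free -h
cat /proc/meminfo

# Disk usage and I/O (iostat is part of the sysstat package)
df -h
iostat -x 1

# Per-connection and per-process network traffic
# (install with: sudo apt install iftop nethogs; both need root)
sudo iftop
sudo nethogs

# System load averages
uptime
cat /proc/loadavg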
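
Load averages are easier to read relative to the number of CPU cores: roughly speaking, a 1-minute load of 4.0 means a 4-core machine is fully busy, while a 16-core machine has plenty of headroom. The following is a minimal sketch of that normalization. The function name `load_per_core` is hypothetical, not a standard command.

```bash
#!/bin/bash
# Divide a load average by the number of CPU cores.
# load_per_core is an illustrative helper, not part of any package.
load_per_core() {
    local load="$1"
    local cores="$2"
    awk -v l="$load" -v c="$cores" 'BEGIN { printf "%.2f\n", l / c }'
}

# Example on a live system (values near or above 1.00 suggest saturation):
# load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

The helper takes the load and core count as arguments, so it can be fed from `/proc/loadavg` and `nproc` or from recorded values.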

Log Management Fundamentals

Ubuntu Log File Locations

# System logs
/var/log/syslog
/var/log/auth.log
/var/log/kern.log

# Application logs
/var/log/nginx/          # Web server
/var/log/mysql/          # Database
/var/log/apache2/        # Apache
/var/log/dpkg.log        # Package management

# Systemd journal
/var/log/journal/

Systemd Journal Management

# View all journal logs
sudo journalctl

# Follow new log entries in real-time
sudo journalctl -f

# View logs from current boot only
sudo journalctl -b

# View logs for specific service
sudo journalctl -u nginx
sudo journalctl -u mysql

# View logs with time range
sudo journalctl --since "2024-12-01 00:00:00" --until "2024-12-01 23:59:59"

# View kernel messages
sudo journalctl -k

# Show disk usage by journal
sudo journalctl --disk-usage

# Reduce journal size now (one-time cleanup; for a persistent limit,
# set SystemMaxUse= in /etc/systemd/journald.conf)
sudo journalctl --vacuum-size=1G
sudo journalctl --vacuum-time=30days
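
The vacuum commands trim the journal once. A persistent cap is normally set in the journald configuration instead. The sketch below prints a drop-in file using `SystemMaxUse` and `MaxRetentionSec`, both documented journald.conf options. The helper name `journald_limits` and the file name `size-limit.conf` are illustrative choices, and the 1G / 30day values simply mirror the vacuum commands.

```bash
#!/bin/bash
# Print a journald drop-in that caps disk usage and retention.
# journald_limits is a hypothetical helper name.
journald_limits() {
    local max_use="${1:-1G}"
    local max_age="${2:-30day}"
    printf '[Journal]\nSystemMaxUse=%s\nMaxRetentionSec=%s\n' "$max_use" "$max_age"
}

# To apply (requires root; the drop-in file name is an arbitrary choice):
# sudo mkdir -p /etc/systemd/journald.conf.d
# journald_limits 1G 30day | sudo tee /etc/systemd/journald.conf.d/size-limit.conf
# sudo systemctl restart systemd-journald
```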

Advanced Log Analysis

Essential Log Analysis Commands

# Search for errors in logs
sudo grep -i error /var/log/syslog
sudo journalctl -p err

# Search for specific patterns
sudo grep "connection refused" /var/log/syslog
sudo grep "authentication failure" /var/log/auth.log

# Count lines containing "error" (case-insensitive)
sudo grep -ci "error" /var/log/syslog

# Monitor logs in real-time with filtering
sudo tail -f /var/log/syslog | grep -i error

# Analyze log file sizes
sudo du -sh /var/log/* | sort -hr

# Check for log rotation issues
ls -la /var/log/*.1
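
Raw grep output gets unwieldy on a busy server. Grouping matches by the program that logged them shows where errors concentrate. The sketch below assumes the program tag is the first whitespace-separated field ending in a colon, which should hold for both the traditional and the ISO-timestamp syslog line formats. `count_errors_by_program` is a hypothetical helper name.

```bash
#!/bin/bash
# Count lines containing "error" per logging program.
# Reads syslog-style lines on stdin; prints "<count> <program>", highest first.
count_errors_by_program() {
    grep -i 'error' | awk '
        {
            # The program tag (e.g. "nginx[101]:") is assumed to be the
            # first field that ends with a colon.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /:$/) {
                    prog = $i
                    sub(/(\[[0-9]+\])?:$/, "", prog)   # strip "[pid]:" or ":"
                    counts[prog]++
                    break
                }
            }
        }
        END { for (p in counts) printf "%d %s\n", counts[p], p }
    ' | sort -k1,1nr -k2,2
}
```

One possible invocation is `sudo cat /var/log/syslog | count_errors_by_program | head`.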

Log Analysis Script Examples

#!/bin/bash
# Simple log analysis script

echo "=== System Log Analysis Report ==="
echo "Generated: $(date)"
echo

# Most recent errors in syslog
echo "--- Recent Errors in Syslog ---"
sudo grep -i error /var/log/syslog | tail -10

# Failed login attempts
echo
echo "--- Failed Login Attempts ---"
sudo grep "authentication failure" /var/log/auth.log | tail -5

# Disk usage
echo
echo "--- Log Disk Usage ---"
sudo du -sh /var/log/

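Performance Monitoring Tools

System Performance Commands

# Real-time system monitoring (mpstat, iostat, and pidstat are part of
# the sysstat package: sudo apt install sysstat)
vmstat 1 10
mpstat 1 10
iostat -x 1 10

# Per-process I/O and CPU (install iotop with: sudo apt install iotop)
sudo iotop
pidstat 1

# Network monitoring (netstat is in the legacy net-tools package; ss replaces it)
ss -tuln
netstat -tuln
ip -s link

# Memory analysis
cat /proc/meminfo
sudo slabtop

# Disk performance (runs until interrupted with Ctrl+C)
iostat -x 1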

Custom Monitoring Script

#!/bin/bash
# System health monitoring script

echo "=== System Health Check ==="
echo "Timestamp: $(date)"

# CPU and Memory
echo "--- CPU & Memory ---"
echo "Load: $(uptime | awk -F'load average:' '{print $2}')"
echo "Memory: $(free -h | awk '/^Mem:/ {print $3 "/" $2}')"

# Disk space
echo "--- Disk Usage ---"
df -h | grep -v tmpfs

# Services status
echo "--- Critical Services ---"
services=("nginx" "mysql" "ssh")
for service in "${services[@]}"; do
    status=$(systemctl is-active "$service")
    echo "$service: $status"
done

# Recent errors
echo "--- Recent Errors ---"
sudo journalctl -p err --since "1 hour ago" | tail -5

Advanced Monitoring Setup

Netdata Installation & Configuration

# Option 1: install Netdata with the kickstart script
bash <(curl -Ss https://my-netdata.io/kickstart.sh)

# Option 2: install from the Ubuntu repository
sudo apt install netdata

# Configure Netdata
sudo nano /etc/netdata/netdata.conf

# Key configuration settings (in netdata.conf).
# Binding to 0.0.0.0 exposes the dashboard on all interfaces,
# so restrict port 19999 with a firewall.
[web]
    bind to = 0.0.0.0:19999

# Restart Netdata
sudo systemctl restart netdata

# Access the Netdata web interface at:
# http://your-server-ip:19999

Prometheus & Grafana Setup

# Install Prometheus node exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
tar xvfz node_exporter-*.tar.gz
cd node_exporter-*
./node_exporter &

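# Install Grafana (apt-key is deprecated; store the key in a keyring instead)
sudo apt install -y apt-transport-https software-properties-common
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://packages.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update && sudo apt install grafana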

# Start Grafana
sudo systemctl daemon-reload
sudo systemctl start grafana-server
sudo systemctl enable grafana-server

Security Monitoring

Fail2ban Installation

# Install fail2ban for SSH protection
sudo apt update && sudo apt install fail2ban

# Configure fail2ban
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
sudo nano /etc/fail2ban/jail.local

# Key settings for the [sshd] jail in jail.local (bantime is in seconds):
[sshd]
enabled = true
port = ssh
logpath = /var/log/auth.log
maxretry = 3
bantime = 3600

# Start and enable fail2ban
sudo systemctl enable fail2ban
sudo systemctl start fail2ban

# Check fail2ban status
sudo fail2ban-client status
sudo fail2ban-client status sshd

Security Log Monitoring

# Monitor SSH login attempts
sudo grep "Failed password" /var/log/auth.log
sudo grep "Accepted password" /var/log/auth.log
sudo grep "Accepted publickey" /var/log/auth.log   # key-based logins

# Check for brute force attacks
sudo fail2ban-client status sshd

# Monitor sudo usage
sudo grep "sudo:" /var/log/auth.log

# Check for unusual processes
ps aux --sort=-%cpu | head -10
ps aux --sort=-%mem | head -10
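
A list of individual "Failed password" lines says little about who is attacking. Counting attempts per source address makes brute-force patterns easier to spot. The sketch below assumes the usual sshd message layout, in which the address follows the word "from". `failed_logins_by_ip` is a hypothetical helper name.

```bash
#!/bin/bash
# Count failed SSH password attempts per source address.
# Reads auth.log-style lines on stdin; prints "<count> <address>", highest first.
failed_logins_by_ip() {
    grep 'Failed password' \
        | awk '{ for (i = 1; i < NF; i++) if ($i == "from") { print $(i + 1); break } }' \
        | sort | uniq -c \
        | awk '{ print $1, $2 }' \
        | sort -k1,1nr -k2,2
}
```

One possible invocation is `sudo cat /var/log/auth.log | failed_logins_by_ip | head -5`.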

Performance Benchmarking

System Benchmark Commands

# CPU benchmark (install sysbench first)
sudo apt install sysbench
sysbench cpu --cpu-max-prime=20000 run

# Memory benchmark
sysbench memory --memory-total-size=10G run

# Disk I/O benchmark
sysbench fileio --file-total-size=1G prepare
sysbench fileio --file-total-size=1G --file-test-mode=rndrw run
sysbench fileio --file-total-size=1G cleanup

# Network performance (install on both hosts: sudo apt install iperf3)
iperf3 -s  # On server
iperf3 -c server-ip  # On client

Performance Thresholds & Alerts

#!/bin/bash
# Performance threshold monitoring

# CPU threshold (90%)
# top's "us" field covers user-space time only, so derive total usage
# from the idle column (15th field) of a one-second vmstat sample.
CPU_THRESHOLD=90
cpu_idle=$(vmstat 1 2 | tail -1 | awk '{print $15}')
cpu_usage=$((100 - cpu_idle))

if [ "$cpu_usage" -gt "$CPU_THRESHOLD" ]; then
    echo "ALERT: High CPU usage: ${cpu_usage}%"
fi

# Memory threshold (85%)
MEM_THRESHOLD=85
mem_usage=$(free | awk '/^Mem:/ {printf("%.0f", $3/$2 * 100.0)}')

if [ "$mem_usage" -gt "$MEM_THRESHOLD" ]; then
    echo "ALERT: High Memory usage: ${mem_usage}%"
fi

# Disk threshold (90%)
DISK_THRESHOLD=90
disk_usage=$(df / | awk 'NR==2 {print $5}' | cut -d'%' -f1)

if [ "$disk_usage" -gt "$DISK_THRESHOLD" ]; then
    echo "ALERT: High Disk usage: ${disk_usage}%"
fi
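
The disk check above looks only at the root filesystem. A variant that scans every mounted filesystem can be sketched as a filter over `df -P` output, where usage is the fifth column and the mount point the sixth. `disks_over_threshold` is a hypothetical helper name, and mount points containing spaces are not handled.

```bash
#!/bin/bash
# Print "<mount point> <usage>%" for every filesystem at or above a limit.
# Expects `df -P` output on stdin (POSIX format, one filesystem per line).
disks_over_threshold() {
    local limit="${1:-90}"
    awk -v limit="$limit" '
        NR > 1 {
            use = $5
            sub(/%$/, "", use)          # "95%" -> "95"
            if (use + 0 >= limit + 0) print $6, use "%"
        }
    '
}
```

One possible invocation is `df -P -x tmpfs -x devtmpfs | disks_over_threshold 90`. Empty output means no filesystem has reached the limit.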

Log Rotation & Management

Configure Logrotate

# Check current logrotate configuration
cat /etc/logrotate.conf
ls /etc/logrotate.d/

# Create custom logrotate configuration
sudo nano /etc/logrotate.d/custom-app

Example logrotate configuration:

/var/log/custom-app/*.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 644 root root
    postrotate
        /usr/bin/systemctl reload custom-app
    endscript
}

Log Cleanup Script

#!/bin/bash
# Automated log cleanup script (run as root, e.g. with sudo)

echo "Starting log cleanup..."

# Delete rotated log files older than 30 days
find /var/log -type f -name "*.log.*" -mtime +30 -delete
find /var/log -type f -name "*.gz" -mtime +30 -delete

# Clean temporary files
find /tmp -type f -mtime +7 -delete
find /var/tmp -type f -mtime +30 -delete

# Remove unused packages and clean the package cache
sudo apt autoremove -y
sudo apt autoclean

# Clean systemd journal
sudo journalctl --vacuum-time=30d

echo "Log cleanup completed: $(date)"

Monitoring Best Practices

  • Set up alerts for critical thresholds (CPU > 90%, Memory > 85%, Disk > 90%)
  • Monitor key services (SSH, web server, database, VPN services)
  • Regular log reviews: schedule daily/weekly log analysis
  • Implement log rotation to prevent disk space issues
  • Use centralized logging for multiple servers
  • Monitor security logs for unauthorized access attempts
  • Set up performance baselines to detect anomalies
  • Automate cleanup tasks with cron jobs
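
A performance baseline needs recorded history to compare against. One lightweight way to build it is to append a CSV row of load and memory usage at a fixed interval. The sketch below is one possible shape. The function name `baseline_row`, the chosen columns, and the file paths in the cron example are all illustrative assumptions.

```bash
#!/bin/bash
# Print one CSV row: UTC timestamp, 1-minute load, memory used (%).
# The input paths are parameters so the function can be run on sample files.
baseline_row() {
    local loadavg_file="${1:-/proc/loadavg}"
    local meminfo_file="${2:-/proc/meminfo}"
    local load1 mem_pct
    load1=$(awk '{print $1}' "$loadavg_file")
    mem_pct=$(awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END { if (total > 0) printf "%.0f", (total - avail) / total * 100 }
    ' "$meminfo_file")
    printf '%s,%s,%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$load1" "$mem_pct"
}

# Example cron entry (hypothetical paths), sampling every 5 minutes from a
# script that defines baseline_row and then calls it:
# */5 * * * * /usr/local/bin/baseline.sh >> /var/log/baseline.csv
```

After a few weeks of samples, values far outside the recorded range for the same time of day are reasonable candidates for investigation.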

Frequently Asked Questions

How often should I check system logs?

Daily: Quick scan of critical logs (auth.log, syslog)
Weekly: Comprehensive log review and analysis
Real-time: Set up alerts for critical errors and security events

What's the difference between htop and top?

top: Basic process monitor, available on all systems
htop: Enhanced version with colors, mouse support, and better UI
Use htop for interactive monitoring and top for scripts.

How do I set up email alerts for high resource usage?

#!/bin/bash
# Simple email alert script
# Requires a working "mail" command (e.g. the mailutils package) and a configured MTA

THRESHOLD=90
CURRENT_USAGE=$(df / | awk 'NR==2 {print $5}' | cut -d'%' -f1)

if [ "$CURRENT_USAGE" -gt "$THRESHOLD" ]; then
    echo "Warning: Disk usage is ${CURRENT_USAGE}%" | mail -s "Disk Alert" admin@example.com
fi

# Add to crontab to run every hour:
# 0 * * * * /path/to/disk-alert.sh
