Real-time view of running processes, CPU, memory, and load average. Pre-installed on virtually every Linux system.
Common Use Cases
Common Options
top              # interactive mode
top -b -n 1      # batch mode, single snapshot (good for scripting)
top -p 1234      # watch a specific PID
top -u myuser    # filter by user
top -d 0.5       # refresh every 0.5s
Interactive Keys
| Key | Action |
|---|---|
| 1 | Toggle per-CPU breakdown |
| M | Sort by memory |
| P | Sort by CPU (default) |
| T | Sort by time |
| k | Kill a process by PID |
| r | Renice a process |
| f | Field manager (add/remove columns) |
| W | Save config to ~/.toprc |
| q | Quit |
Reading the Header
top - 14:23:01 up 10 days, 3:12, 2 users, load average: 0.45, 0.61, 0.72
Tasks: 312 total, 3 running, 308 sleeping, 1 zombie
%Cpu(s): 8.3 us, 2.1 sy, 0.0 ni, 88.4 id, 1.2 wa, 0.0 hi, 0.0 si
MiB Mem : 15987.2 total, 1023.4 free, 12845.1 used, 2118.7 buff/cache
MiB Swap: 2048.0 total, 512.0 free, 1536.0 used. 1456.3 avail Mem
CPU Column Meanings
| Field | Meaning |
|---|---|
| us | User space CPU time |
| sy | Kernel (system) CPU time |
| ni | Niced (low priority) processes |
| id | Idle: lower = busier CPU |
| wa | I/O wait: CPU idle while waiting for disk |
| hi | Hardware interrupts |
| si | Software interrupts |
| st | Stolen time (VM host taking CPU) |
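For scripting, the fields in the table above can be pulled out of a batch-mode snapshot. The following is a minimal sketch rather than anything top provides itself: cpu_field is a hypothetical helper name, and it assumes the %Cpu(s) line layout shown under Reading the Header, with a period as the decimal separator.

```shell
# cpu_field LABEL: read top output on stdin and print the value that
# precedes the given two-letter label (us, sy, ni, id, wa, hi, si, st).
# Hypothetical helper; assumes the "%Cpu(s): 8.3 us, 2.1 sy, ..." layout.
cpu_field() {
    awk -v want="$1" '
        /^%Cpu/ {
            # Drop the "%Cpu(s):" prefix, then split the rest on commas.
            sub(/^%Cpu[^:]*:[ \t]*/, "")
            n = split($0, parts, ",")
            for (i = 1; i <= n; i++) {
                # Each part looks like " 88.4 id": value first, label second.
                split(parts[i], kv, " ")
                if (kv[2] == want) { print kv[1]; exit }
            }
        }'
}

# Example (not run here): top -b -n 1 | cpu_field wa
```

Splitting on commas rather than on whitespace keeps the helper working when top prints a value flush against the previous comma, as in "ni,100.0 id".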
Red Flags
Gotchas
-d 1 or higher.
Enhanced interactive process viewer. Color-coded CPU/memory bars, mouse support, tree view, and easier process management than top.
Common Options
htop                 # interactive
htop -p 1234,5678    # watch specific PIDs
htop -u myuser       # filter by user
htop -d 5            # 0.5s refresh (delay is in tenths of a second)
htop -t              # tree view by default
Key Shortcuts
| Key | Action |
|---|---|
| F2 | Setup / configuration |
| F3 or / | Search processes |
| F4 | Filter processes |
| F5 | Tree view |
| F6 | Sort by column |
| F9 | Kill (signal menu) |
| Space | Tag process |
| u | Filter by user |
| H | Toggle user threads |
| K | Toggle kernel threads |
Gotchas
apt install htop or yum install htop.
H to collapse threads.
Reports per-CPU statistics. Essential for spotting uneven load distribution across cores.
Common Options
mpstat               # all CPUs summary since boot
mpstat -P ALL 1 5    # per-CPU stats, 1s interval, 5 times
mpstat -P 0,1,2 1    # specific CPUs only
mpstat -I ALL 1      # include interrupt stats
Sample Output
CPU   %usr  %sys  %iowait  %irq  %soft  %idle
all   23.4   4.1     18.2   0.1    0.3   53.9
0     45.2   8.3     32.1   0.2    0.5   13.7   # CPU 0 hot
1      1.2   0.2      0.1   0.0    0.1   98.4   # CPU 1 idle
Red Flags
apt install sysstat.
Reports virtual memory, CPU activity, I/O, and process states in a compact format. Great for a quick overall system snapshot.
Common Options
vmstat 1       # update every 1 second
vmstat 1 10    # 1s interval, 10 samples
vmstat -s      # memory stats summary
vmstat -d      # disk stats
vmstat -t 1    # include timestamp
Output Columns
| Group | Column | Meaning |
|---|---|---|
| procs | r | Processes waiting to run (run queue) |
| procs | b | Processes in uninterruptible sleep (I/O wait) |
| memory | swpd | Virtual memory used (swap) |
| memory | free | Idle memory |
| memory | buff | Buffer memory (I/O buffers) |
| memory | cache | Page cache memory |
| swap | si | Swap in (from disk to mem) KB/s |
| swap | so | Swap out (mem to disk) KB/s |
| io | bi | Blocks read from disk per second |
| io | bo | Blocks written to disk per second |
| cpu | us | User CPU % |
| cpu | sy | System CPU % |
| cpu | id | Idle CPU % |
| cpu | wa | I/O wait % |
| cpu | st | Stolen (VM) |
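The si and so columns are the ones usually watched for memory pressure. As a rough sketch, the hypothetical helper below (swapping_samples is not part of vmstat) counts how many samples in a vmstat run showed any swap traffic. It locates the columns from the header row instead of assuming fixed positions.

```shell
# swapping_samples: read vmstat output on stdin and print
# "<samples with si or so above 0>/<total samples>".
# Hypothetical helper; column positions are taken from the "r b swpd ..." header.
swapping_samples() {
    awk '
        # The second header line names the columns; remember where si/so sit.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number; count the ones with swap traffic.
        si && $1 ~ /^[0-9]+$/ {
            total++
            if ($si > 0 || $so > 0) hits++
        }
        END { printf "%d/%d\n", hits, total }'
}

# Example (not run here): vmstat 1 10 | swapping_samples
```

Note that the first row vmstat prints reports averages since boot, so it may be worth discarding before drawing conclusions from the count.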
Red Flags
Gotchas
Per-process statistics for CPU, memory, I/O, and context switches. Like top but in time-series form, great for logging.
Common Options
pidstat 1            # CPU stats for all active processes
pidstat -u 1         # CPU usage per process
pidstat -d 1         # disk I/O per process
pidstat -r 1         # memory stats per process
pidstat -w 1         # context switches per process
pidstat -p 1234 1    # watch specific PID
pidstat -t 1         # include threads
Red Flags
Linux profiling with hardware performance counters. Use for CPU flame graphs, cache misses, branch mispredictions, and kernel tracing.
Common Use Cases
# CPU profiling: sample call stacks system-wide at 99 Hz for 10s
perf record -F 99 -ag -- sleep 10
perf report

# Profile a specific command
perf stat -d ./my-program

# Count events system-wide
perf stat -a sleep 5

# Flame graph of one process (with Brendan Gregg's scripts); -a is dropped because -p already selects the target
perf record -F 99 -g -p 1234 -- sleep 30
perf script | stackcollapse-perf.pl | flamegraph.pl > flame.svg

# Top functions by CPU
perf top

# Trace syscalls
perf trace -p 1234

# Count cache misses
perf stat -e cache-misses,cache-references ./my-program
Common perf stat Metrics
| Metric | What it means |
|---|---|
| instructions | Total instructions executed |
| cycles | CPU cycles consumed |
| IPC | Instructions per cycle: higher is better |
| cache-misses | Cache misses (usually last-level cache): high = memory bound |
| branch-misses | Branch mispredictions: high = CPU pipeline stalls |
| page-faults | Memory page faults |
Gotchas
linux-tools-$(uname -r) and linux-perf.
-fno-omit-frame-pointer or use DWARF unwinding (--call-graph dwarf).
Quick snapshot of total, used, free, and available memory including swap.
Common Options
free -h      # human-readable (KB/MB/GB)
free -m      # megabytes
free -s 1    # update every 1 second
free -t      # include total row
Sample Output
        total    used    free    shared    buff/cache    available
Mem:    15987    12845   1023    312       2118          2847
Swap:   2048     1536    512
Key: available vs free
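The available column estimates how much memory can be handed to new workloads without swapping, so it is usually a better number to alert on than free. A minimal sketch of turning it into a percentage follows; mem_available_pct is a hypothetical helper name, and it assumes the English column headers shown in the sample output (for example by running free under LC_ALL=C).

```shell
# mem_available_pct: read `free` output on stdin and print the available
# column as a percentage of total memory.
# Hypothetical helper; assumes an "available" header on the first line.
mem_available_pct() {
    awk '
        # Header row: find which column is named "available". Data rows
        # carry a leading "Mem:" label, so the value sits one field later.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "available") col = i + 1 }
        $1 == "Mem:" && col { printf "%.1f\n", 100 * $col / $2 }'
}

# Example (not run here): LC_ALL=C free -m | mem_available_pct
```

With the sample output above, 2847 of 15987 MB works out to roughly 17.8 percent available, even though only about 6 percent is strictly free.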
Red Flags
Detailed memory statistics including page faults, swap activity, and kernel memory.
vmstat -s             # memory event counters since boot
cat /proc/meminfo     # raw kernel memory breakdown
cat /proc/slabinfo    # kernel slab allocator stats
Reports physical memory usage accounting for shared memory correctly. Shows PSS (proportional set size) — more accurate than RSS.
Common Options
smem -r           # reverse the sort order (default key is PSS), largest first
smem -s rss -r    # sort by RSS descending
smem -s pss -r    # sort by PSS descending
smem -u           # per-user summary
smem -t           # show totals
smem -P nginx     # filter by process name
Memory Metrics
| Metric | Meaning |
|---|---|
| VSZ | Virtual memory: includes everything mapped (not all physical) |
| RSS | Resident Set Size: physical memory used (double-counts shared) |
| PSS | Proportional Set Size: shared memory split proportionally. Most accurate. |
| USS | Unique Set Size: memory used exclusively by this process |
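When smem is not installed, the same RSS and PSS figures can be approximated from /proc/PID/smaps, which lists Rss and Pss in kB for every mapping. The sketch below simply sums one key; sum_smaps_field is a hypothetical helper name, and reading another user's smaps file normally requires root.

```shell
# sum_smaps_field KEY: read /proc/PID/smaps content on stdin and print the
# total, in kB, of every line whose key is exactly KEY (e.g. Pss or Rss).
# Hypothetical helper; keys such as Pss_Anon are deliberately not matched.
sum_smaps_field() {
    awk -v key="$1:" '$1 == key { kb += $2 } END { print kb + 0 }'
}

# Example (not run here):
#   sum_smaps_field Pss < /proc/1234/smaps
#   sum_smaps_field Rss < /proc/1234/smaps
```

Comparing the two totals for one process gives a feel for how much of its RSS is shared with other processes.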
Shows the memory map of a process — all mapped regions, sizes, and permissions.
pmap 1234                 # basic memory map
pmap -x 1234              # extended (RSS, dirty pages)
pmap -d 1234              # show device format
pmap -x 1234 | tail -1    # just the totals line
Reports CPU and I/O statistics for devices. The go-to tool for diagnosing disk bottlenecks.
Common Options
iostat -x 1       # extended stats, 1s interval
iostat -xz 1      # extended, skip idle devices
iostat -x 1 10    # 10 samples
iostat -p sda 1   # specific device
iostat -t -x 1    # with timestamp
Key Columns (iostat -x)
| Column | Meaning |
|---|---|
| r/s | Reads per second |
| w/s | Writes per second |
| rkB/s | KB read per second |
| wkB/s | KB written per second |
| await | Average I/O wait time (ms), including queue time |
| r_await | Read wait time (ms) |
| w_await | Write wait time (ms) |
| svctm | Service time (deprecated, ignore) |
| %util | Device busy %: how saturated the disk is |
| aqu-sz | Average queue depth |
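Because the set and order of iostat -x columns differs between sysstat versions, scripts are safer looking columns up by name. The following sketch lists devices whose %util exceeds a threshold; busy_devices is a hypothetical helper name and the default of 80 is an arbitrary assumption, not a recommended limit.

```shell
# busy_devices [LIMIT]: read `iostat -x` output on stdin and print
# "<device> <%util>" for every row whose %util is above LIMIT (default 80).
# Hypothetical helper; the %util column is located from the Device header.
busy_devices() {
    awk -v limit="${1:-80}" '
        # Header row of each report: remember where %util is.
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # A blank line ends the device table, so stop matching until the
        # next header (this skips the avg-cpu block of the next report).
        NF == 0 { col = 0 }
        col && NF >= col && $col + 0 > limit { print $1, $col }'
}

# Example (not run here): iostat -xz 1 5 | busy_devices 90
```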
Red Flags
Gotchas
Shows real-time disk I/O per process. Like top but for disk usage.
iotop            # interactive, requires root
iotop -o         # only show processes doing I/O
iotop -b -n 5    # batch mode, 5 iterations
iotop -p 1234    # watch specific PID
iotop -a         # accumulated I/O totals
CONFIG_TASK_IO_ACCOUNTING.
Reports disk space usage per filesystem. First tool to check when a disk-full error occurs.
df -h         # human readable
df -hT        # include filesystem type
df -i         # inode usage (not space!)
df -h /var    # specific mount point
Red Flags
Estimates file and directory space usage. Use to find what's consuming disk space.
du -sh *                 # size of each item in current dir
du -sh /var/log/*        # size breakdown in /var/log
du -h --max-depth=1 /    # top-level directory sizes

# Find top 10 largest directories
du -xh / | sort -rh | head -10

# Largest files anywhere on system
find / -xdev -type f -printf "%s %p\n" | sort -rn | head -20
Without -x, du crosses mount boundaries and can scan NFS or tmpfs unintentionally.
Lists open files, and in Linux everything is a file: sockets, pipes, devices. Invaluable for debugging file descriptor leaks and network connections.
Common Use Cases
# What files does a process have open?
lsof -p 1234

# Who has a file open?
lsof /var/log/app.log

# What process is using a port?
lsof -i :8080
lsof -i TCP:443

# All network connections
lsof -i

# Files open by a user
lsof -u myuser

# Deleted files still held open (space not reclaimed!)
lsof +L1

# Count open files per PID (NR > 1 skips the header row)
lsof -n | awk 'NR > 1 {print $2}' | sort | uniq -c | sort -rn | head
Red Flags
-n (no DNS) and -P (no port names) to speed it up.
Socket statistics: a faster replacement for netstat. Shows TCP/UDP connections, listen ports, and socket buffers.
Common Options
ss -tulnp                  # listening TCP+UDP sockets with PID
ss -tan                    # all TCP connections (numeric)
ss -s                      # socket summary stats
ss -tp                     # TCP with process info
ss -o state established    # only ESTABLISHED connections (-o adds timer info)
ss -o state time-wait      # only TIME_WAIT (-o adds timer info)
ss dst 10.0.0.1            # connections to a specific IP
ss sport = :8080           # connections from source port 8080

# Count connections by state (NR > 1 skips the header row)
ss -tan | awk 'NR > 1 {print $1}' | sort | uniq -c
Flags
| Flag | Meaning |
|---|---|
| -t | TCP sockets |
| -u | UDP sockets |
| -l | Listening sockets only |
| -n | Numeric (no DNS/service name resolution) |
| -p | Show process (PID/name) |
| -e | Extended info (timers, uid) |
| -m | Socket memory info |
| -i | Internal TCP info (RTT, cwnd, retransmits) |
Red Flags
tcp_tw_reuse or use connection pooling.
Classic network statistics tool. Largely superseded by ss but still widely available.
netstat -tulnp    # listening sockets with PID
netstat -an       # all connections numeric
netstat -s        # protocol statistics (retransmits etc)
netstat -rn       # routing table
netstat -i        # interface stats
Real-time bandwidth usage per connection. Like top but for network traffic.
iftop                       # interactive, auto-selects interface
iftop -i eth0               # specific interface
iftop -n                    # no DNS resolution
iftop -P                    # show ports
iftop -B                    # show bytes (not bits)
iftop -f "host 10.0.0.1"    # filter by host
Shows network bandwidth usage per process — the missing tool between iftop (per connection) and top (per process).
nethogs         # all interfaces
nethogs eth0    # specific interface
nethogs -d 2    # 2s refresh interval
nethogs -t      # trace mode (non-interactive)
Captures and analyzes raw network packets. The definitive tool for deep network debugging.
Common Use Cases
# Capture all traffic on eth0
tcpdump -i eth0

# Capture specific port
tcpdump -i eth0 port 8080

# Capture to file for Wireshark analysis
tcpdump -i eth0 -w capture.pcap

# Read pcap file
tcpdump -r capture.pcap

# Filter by host and port
tcpdump -i eth0 host 10.0.0.1 and port 443

# Show port 80 packets that carry a payload (skips bare SYN, FIN, and ACK packets)
tcpdump -i eth0 -A 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'

# DNS queries
tcpdump -i any port 53

# Count packets by source IP (-c 1000 stops the capture so the pipeline can finish)
tcpdump -i eth0 -nn -q -c 1000 | awk '{print $3}' | cut -d. -f1-4 | sort | uniq -c | sort -rn
Common Flags
| Flag | Meaning |
|---|---|
| -i | Interface (any for all) |
| -n | No DNS resolution |
| -nn | No DNS and no port name resolution |
| -v / -vv | More verbose output |
| -A | Print packet payload as ASCII |
| -X | Print payload as hex and ASCII |
| -s 0 | Capture full packet (default is 262144) |
| -c N | Capture N packets then stop |
| -w file | Write to pcap file |
Gotchas
-s 0 carefully and write to file instead of displaying live.
wireshark group.
ping tests basic reachability and latency. mtr (Matt's Traceroute) combines traceroute and ping for live hop-by-hop latency and packet loss.
ping -c 10 8.8.8.8            # 10 pings to Google DNS
ping -i 0.2 8.8.8.8           # fast ping (0.2s interval)
mtr 8.8.8.8                   # interactive traceroute
mtr --report -c 50 8.8.8.8    # 50 packets, report mode (good for sharing)
mtr -n 8.8.8.8                # no DNS resolution
Red Flags
Snapshot of current processes. Essential for finding PIDs, checking process state, and understanding process relationships.
Common Options
ps aux                        # all processes, all users (BSD style)
ps -ef                        # full format (UNIX style)
ps -ef --forest               # tree view showing parent/child
ps aux --sort=-%cpu           # sort by CPU descending
ps aux --sort=-%mem           # sort by memory descending
ps -p 1234 -o pid,cmd,rss     # custom output columns
ps -u myuser                  # by user

# Find a process by name
ps aux | grep nginx
pgrep -la nginx               # cleaner alternative
Process States
| State | Meaning |
|---|---|
| R | Running or runnable (on CPU or in run queue) |
| S | Sleeping: waiting for event (interruptible) |
| D | Uninterruptible sleep: usually waiting on I/O. Cannot be killed. |
| T | Stopped (SIGSTOP or traced by debugger) |
| Z | Zombie: exited but parent hasn't called wait() |
| I | Idle kernel thread |
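A quick tally of these states shows at a glance whether D or Z processes are piling up. The sketch below is one way to do it; count_states is a hypothetical helper name, and it relies on the state letter being the first character of each ps STAT value (modifiers such as s, +, or < follow it).

```shell
# count_states: read one ps STAT value per line on stdin and print
# "<state letter> <count>" lines, sorted by letter.
# Hypothetical helper; only the first character of each value is used.
count_states() {
    awk 'NF { n[substr($1, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Example (not run here): ps -eo stat= | count_states
```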
Red Flags
Traces system calls made by a process. Invaluable for debugging "what is this process actually doing?" without source code.
Common Options
# Attach to running process
strace -p 1234

# Trace a new command
strace ls /tmp

# Follow child processes too
strace -f -p 1234

# Filter to specific syscalls (modern libc opens files via openat)
strace -e trace=open,openat,read,write -p 1234
strace -e trace=network -p 1234
strace -e trace=file -p 1234

# Show timing info
strace -T -p 1234    # time spent in each syscall
strace -t -p 1234    # wall clock timestamps

# Summary: count syscalls and time
strace -c ./my-program

# Write output to file
strace -o /tmp/strace.log -p 1234
Common Use Cases
| Problem | strace filter |
|---|---|
| What files is it opening? | -e trace=openat,open |
| What network calls? | -e trace=network |
| Why is it slow? | -T -c (summary with time) |
| What is it writing? | -e trace=write -s 1024 |
| Signal handling? | -e signal=all |
Gotchas
Without -f, you only see the parent. Most multi-process apps need -f.
Like strace but traces library calls instead of syscalls. Useful for seeing malloc, fopen, and other libc calls.
ltrace ./my-program             # trace library calls
ltrace -p 1234                  # attach to running process
ltrace -c ./my-program          # summary count
ltrace -e malloc+free ./app     # only malloc/free calls
# How many FDs does a process have?
ls /proc/1234/fd | wc -l

# What is the FD limit?
cat /proc/1234/limits | grep "open files"

# Check process memory maps
cat /proc/1234/maps

# Process environment variables
cat /proc/1234/environ | tr '\0' '\n'

# Actual binary path (useful for containers)
ls -la /proc/1234/exe
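The first two checks are often combined to see how close a process is to its descriptor limit. A minimal sketch follows; fd_soft_limit and fd_usage_pct are hypothetical helper names, and the parsing assumes the "Max open files" row layout of /proc/PID/limits (soft limit in the fourth field).

```shell
# fd_soft_limit: read /proc/PID/limits content on stdin and print the
# soft limit from the "Max open files" row.
fd_soft_limit() {
    awk '/^Max open files/ { print $4 }'
}

# fd_usage_pct USED LIMIT: print USED as an integer percentage of LIMIT,
# or "n/a" when the limit is not a positive number (e.g. "unlimited").
fd_usage_pct() {
    awk -v used="$1" -v max="$2" 'BEGIN {
        if (max + 0 <= 0) { print "n/a"; exit }
        printf "%d\n", 100 * used / max
    }'
}

# Example (not run here):
#   used=$(ls /proc/1234/fd | wc -l)
#   limit=$(fd_soft_limit < /proc/1234/limits)
#   fd_usage_pct "$used" "$limit"
```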
High-level eBPF tracing language. Trace kernel and userspace events with minimal overhead. The modern replacement for many strace/ltrace use cases.
One-liners
# Files opened by process name
bpftrace -e 'tracepoint:syscalls:sys_enter_openat /comm == "nginx"/ { printf("%s\n", str(args->filename)); }'

# Syscall count per process
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'

# Disk I/O latency histogram
bpftrace -e 'kprobe:blk_account_io_start { @start[arg0] = nsecs; } kprobe:blk_account_io_done /@start[arg0]/ { @usecs = hist((nsecs - @start[arg0]) / 1000); delete(@start[arg0]); }'

# TCP connect latency
bpftrace -e 'kprobe:tcp_v4_connect { @start[tid] = nsecs; } kretprobe:tcp_v4_connect /@start[tid]/ { @ms = hist((nsecs - @start[tid]) / 1000000); delete(@start[tid]); }'

# List available tracepoints
bpftrace -l 'tracepoint:syscalls:*'
Gotchas
Collection of ready-made eBPF tools from the BCC toolkit. Each tool is designed for low-overhead use in production and answers a specific performance question.
Essential BCC Tools
| Tool | What it answers |
|---|---|
| execsnoop | What processes are being exec'd right now? |
| opensnoop | What files are being opened? |
| biolatency | Block I/O latency histogram |
| biosnoop | Per-I/O latency with process name |
| tcpconnect | What TCP connections are being made? |
| tcpaccept | What TCP connections are being accepted? |
| tcpretrans | TCP retransmits with details |
| runqlat | CPU run queue latency histogram |
| offcputime | Time spent off CPU (blocked, sleeping) |
| profile | CPU flame graph profiler |
| memleak | Outstanding allocations (memory leaks) |
| funccount | Count function calls |
| trace | Trace arbitrary kernel/user functions |
| cachestat | Page cache hit/miss ratio |
| cachetop | Page cache top by process |
| fileslower | Slow file reads/writes |
| ext4slower | Slow ext4 operations |
runqlat                 # run queue latency: are tasks waiting for CPU?
biolatency -D           # disk I/O latency per device
tcpretrans              # watch TCP retransmits live
cachestat 1             # page cache stats every second
memleak -p 1234         # watch for memory leaks in PID
profile -F 99 -af 30    # CPU flame graph for 30s
Built-in kernel tracing framework. Accessed via /sys/kernel/debug/tracing (or /sys/kernel/tracing on newer kernels). No external tools needed; available on any kernel built with ftrace support.
cd /sys/kernel/debug/tracing

# List available tracers
cat available_tracers

# Function call tracing
echo function > current_tracer
echo 1 > tracing_on
cat trace
echo 0 > tracing_on

# Trace a specific function
echo my_function > set_ftrace_filter
echo function > current_tracer

# Easier with trace-cmd (wrapper)
trace-cmd record -e sched_switch -p function sleep 5
trace-cmd report
System Activity Reporter. Collects, records, and reports historical system performance data. Part of sysstat. The only standard tool for looking back in time.
Common Options
# CPU usage every 1s, 5 samples
sar -u 1 5

# Memory stats
sar -r 1 5

# Disk I/O
sar -d 1 5

# Network stats
sar -n DEV 1 5

# TCP stats
sar -n TCP 1 5

# Load average and run queue
sar -q 1 5

# Historical data from today's log
sar -u -f /var/log/sysstat/sa$(date +%d)

# Historical data from specific time range
sar -u -s 09:00:00 -e 10:00:00
Gotchas
/etc/default/sysstat → set ENABLED="true", then systemctl enable --now sysstat.
Combines vmstat, iostat, ifstat, and netstat into one colorized output. Great for a live overview of all subsystems at once.
dstat                              # default: cpu, disk, net, paging, system
dstat -cdngy                       # cpu, disk, net, paging, system
dstat --top-cpu                    # show top CPU process
dstat --top-io                     # show top I/O process
dstat --top-mem                    # show top memory process
dstat -t 1 60                      # with timestamp, 1s interval, 60 samples
dstat --output /tmp/dstat.csv 1    # export to CSV for analysis
dool (fork) on newer systems. Some distros ship it as dstat still.
Cross-platform monitoring tool with a rich curses UI. Shows CPU, memory, disk, network, processes, and alerts in one screen.
glances                   # interactive TUI
glances -w                # web server mode (port 61208)
glances -s                # server mode (for remote monitoring)
glances -c remote-host    # connect to remote glances server
glances --export csv      # export metrics to CSV
Advanced system and process monitor. Records all activity to disk and allows replaying historical sessions. Captures processes that have already exited.
atop                                 # interactive
atop -r /var/log/atop/atop_20240426  # replay saved log
atop -A                              # sort by the most active resource (automatic mode)
atop -w /tmp/atop.log 1 60           # write log, 1s interval, 60 samples
Brendan Gregg's methodology: for every resource, check Utilization, Saturation, and Errors.
| Resource | Utilization | Saturation | Errors |
|---|---|---|---|
| CPU | mpstat %usr+%sys | vmstat r > CPUs | dmesg \| grep error |
| Memory | free avail | vmstat si/so > 0 | dmesg OOM killer |
| Disk | iostat %util | iostat aqu-sz > 1 | smartctl -a /dev/sda |
| Network | sar -n DEV txkB/s | netstat -s retransmits | ip -s link |
| File descriptors | lsof -p PID \| wc -l | /proc/PID/limits | EMFILE errors in logs |
Shown in top, uptime, and /proc/loadavg. Represents the average number of processes in R (running) or D (uninterruptible sleep) state over 1, 5, and 15 minutes.
| Scenario | Meaning |
|---|---|
| Load 1m > 5m > 15m | Load is increasing — problem is getting worse |
| Load 1m < 5m < 15m | Load is decreasing — problem is recovering |
| Load high, CPU idle | D state processes — blocked on I/O, not CPU |
| Load = CPUs | Fully utilized but not saturated |
| Load >> CPUs | Saturated — tasks are queuing |
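The last two rows amount to dividing the load average by the CPU count. A small sketch of that comparison follows; load_status is a hypothetical helper name, and the "queuing" and "within capacity" labels are illustrative wording, not standard terms.

```shell
# load_status LOAD CPUS: print the load-per-CPU ratio and a rough label.
# Hypothetical helper; a ratio above 1 means more runnable (or D-state)
# tasks than CPUs on average.
load_status() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        ratio = load / cpus
        printf "%.2f %s\n", ratio, (ratio > 1 ? "queuing" : "within capacity")
    }'
}

# Example (not run here):
#   load_status "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"
```

Since Linux counts D-state tasks in the load average, a "queuing" result should still be cross-checked against CPU idle time before concluding the CPUs are the bottleneck.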
| Signal | Number | Default Action | Use case |
|---|---|---|---|
| SIGHUP | 1 | Terminate | Reload config (many daemons) |
| SIGINT | 2 | Terminate | Ctrl+C |
| SIGQUIT | 3 | Core dump | Ctrl+\ (quit with core) |
| SIGKILL | 9 | Terminate | Force kill; cannot be caught or ignored |
| SIGTERM | 15 | Terminate | Graceful shutdown (default kill signal) |
| SIGSTOP | 19 | Stop | Pause process; cannot be caught |
| SIGCONT | 18 | Continue | Resume stopped process |
| SIGUSR1 | 10 | Terminate | App-defined (e.g. log rotation) |
| SIGUSR2 | 12 | Terminate | App-defined |
| SIGPIPE | 13 | Terminate | Write to broken pipe |
| SIGCHLD | 17 | Ignore | Child process stopped or exited |
kill -15 1234        # graceful terminate (default)
kill -9 1234         # force kill
kill -HUP 1234       # reload config
killall nginx        # kill all processes named nginx
pkill -f "my-app"    # kill by full command line match
kill -0 1234         # check if process exists (no signal sent)
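These commands are commonly chained: send SIGTERM first, poll with kill -0, and fall back to SIGKILL only if the process is still alive. The sketch below shows one way to write that; stop_gracefully is a hypothetical helper name and the 5 second default timeout is an arbitrary assumption.

```shell
# stop_gracefully PID [TIMEOUT]: send SIGTERM, wait up to TIMEOUT seconds
# (default 5) for the process to disappear, then send SIGKILL.
# Hypothetical helper; kill -0 only tests for existence and sends nothing.
stop_gracefully() {
    sg_pid=$1
    sg_timeout=${2:-5}
    kill -TERM "$sg_pid" 2>/dev/null || return 0    # already gone
    sg_waited=0
    while [ "$sg_waited" -lt "$sg_timeout" ]; do
        kill -0 "$sg_pid" 2>/dev/null || return 0   # exited after SIGTERM
        sleep 1
        sg_waited=$((sg_waited + 1))
    done
    kill -KILL "$sg_pid" 2>/dev/null                # last resort
    return 0
}

# Example (not run here): stop_gracefully 1234 10
```

A process that has exited but has not yet been reaped by its parent still answers kill -0, so the helper may wait out the full timeout for a zombie.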