Linux sysctl Guide
Read, write, and persist Linux kernel parameters — networking, memory, VM, kernel, and security
sysctl namespace
│
├── net/
│   ├── core/        → socket buffers, netdev backlog
│   ├── ipv4/        → TCP tuning, forwarding, ICMP, conntrack
│   ├── ipv6/        → IPv6 forwarding, router advertisements
│   └── netfilter/   → conntrack table size and timeouts
│
├── vm/
│   ├── swappiness   → how aggressively kernel uses swap
│   ├── dirty_*      → writeback thresholds for page cache
│   ├── overcommit_* → memory allocation policy
│   └── nr_hugepages → explicit (pre-allocated) hugepage pool
│
├── kernel/
│   ├── pid_max      → max process IDs
│   ├── panic        → behaviour on kernel oops/panic
│   ├── core_pattern → core dump filename template
│   └── sched_*      → CFS scheduler tunables
│
└── fs/
    ├── file-max      → system-wide open file limit
    ├── inotify/      → inotify watch limits
    └── pipe-max-size → max pipe buffer size
Basics Reading & Writing

The sysctl command reads and writes kernel parameters at runtime. Every parameter maps to a file under /proc/sys/ — the dot-separated sysctl name corresponds to a slash-separated path.

Basic Usage

Shell
# Read a single parameter
sysctl net.ipv4.tcp_max_syn_backlog

# Read all parameters matching a pattern
sysctl -a | grep tcp_max

# Write a parameter (takes effect immediately, lost on reboot)
sysctl -w net.ipv4.tcp_max_syn_backlog=65535

# Read all parameters
sysctl -a

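# sysctl has no built-in "show non-default values" option.
# Save a baseline once, then diff the running values against it later.
sysctl -a 2>/dev/null | sort > /tmp/sysctl.baseline
sysctl -a 2>/dev/null | sort | diff /tmp/sysctl.baseline -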

Name ↔ Path Mapping

sysctl name                 /proc/sys path
──────────────────────────────────────────────────────────
net.ipv4.ip_forward    ↔    /proc/sys/net/ipv4/ip_forward
vm.swappiness          ↔    /proc/sys/vm/swappiness
kernel.pid_max         ↔    /proc/sys/kernel/pid_max
fs.file-max            ↔    /proc/sys/fs/file-max

Dots in name = directory separators in /proc/sys
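
The name-to-path mapping is mechanical enough to script. A minimal sketch, assuming a hypothetical helper called sysctl_to_path (it is not part of any tool) and assuming that no path component contains a literal dot; an interface named eth0.100 would break this simple translation:

```shell
# Hypothetical helper: translate a dotted sysctl name to its /proc/sys path.
# Assumes no component contains a literal dot (e.g. an interface named eth0.100).
sysctl_to_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_to_path net.ipv4.ip_forward
sysctl_to_path fs.file-max
```

The output can be passed straight to cat when the sysctl binary is not available.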
Runtime vs persistent: sysctl -w changes take effect immediately but are lost on reboot. To persist, write to a file in /etc/sysctl.d/ (see the Persisting Changes section).
Basics /proc/sys

Every sysctl parameter is a virtual file under /proc/sys. You can read and write parameters directly using standard file I/O — useful in scripts where you want to avoid the sysctl binary.

Shell
# Read directly from /proc/sys
cat /proc/sys/net/ipv4/tcp_max_syn_backlog

# Write directly (same as sysctl -w)
echo 65535 > /proc/sys/net/ipv4/tcp_max_syn_backlog

# Explore the sysctl tree
ls /proc/sys/net/ipv4/ | head -20
ls /proc/sys/vm/

# Check if a parameter is read-only
ls -l /proc/sys/kernel/osrelease
# -r--r--r-- means read-only
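Permissions: Writing to /proc/sys requires root. Most parameters are world-readable but root-writable. A few are read-only even as root (e.g. kernel.osrelease). Under sudo, a redirect such as sudo echo 1 > file is still opened by your own shell as the calling user, so pipe through sudo tee or use sudo sysctl -w instead.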
Basics Persisting Changes

To survive reboots, write parameters to a .conf file under /etc/sysctl.d/. Files are loaded in lexicographic order, with later files overriding earlier ones.

File Format

/etc/sysctl.d/99-tuning.conf
# Lines starting with # are comments
# Format: parameter = value

net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.ip_local_port_range = 1024 65535

vm.swappiness = 10

fs.file-max = 2097152

Apply Without Rebooting

Shell
# Apply all files in /etc/sysctl.d/ and /etc/sysctl.conf
sysctl --system

# Apply a specific file
sysctl -p /etc/sysctl.d/99-tuning.conf

# Apply the legacy /etc/sysctl.conf
sysctl -p

Load Order

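Directories are listed here from lowest to highest priority (a file in a later directory replaces a same-named file from an earlier one):

/usr/lib/sysctl.d/*.conf        ← distro defaults (don't edit)
/usr/local/lib/sysctl.d/*.conf
/run/sysctl.d/*.conf            ← runtime / container overrides
/etc/sysctl.d/*.conf            ← your changes go here
/etc/sysctl.conf                ← legacy; read last by sysctl --system

The surviving files from all directories are then applied in lexicographic order of their file names, regardless of directory, and for any given parameter the last assignment wins.

Naming tip: prefix with 99- so your file sorts last and overrides any distro defaults.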
Best practice: Create /etc/sysctl.d/99-custom.conf rather than editing /etc/sysctl.conf directly. This survives package upgrades and is easier to track in version control or config management (Ansible, Chef, Puppet).
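
Because a drop-in file and the running kernel can disagree (for example after a manual sysctl -w), it can help to compare the two. The following is a rough sketch only: conf_settings and check_drift are hypothetical helper names, and the parser assumes plain "name = value" lines with dotted names and single spaces inside multi-value settings.

```shell
# Hypothetical helper: print "name value" for each assignment in a
# sysctl.d-style file, skipping blank lines and # or ; comment lines.
conf_settings() {
  sed -e 's/^[[:space:]]*//' -e '/^[#;]/d' -e '/^$/d' "$1" |
    while IFS='=' read -r name value; do
      name=$(printf '%s' "$name" | sed 's/[[:space:]]*$//')
      value=$(printf '%s' "$value" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
      printf '%s %s\n' "$name" "$value"
    done
}

# Hypothetical helper: report settings whose running value differs from the file.
# Whitespace in the running value is normalised to single spaces before comparing.
check_drift() {
  conf_settings "$1" | while read -r name want; do
    have=$(sysctl -n "$name" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/ $//')
    [ "$have" = "$want" ] || printf '%s: file=%s running=%s\n' "$name" "$want" "$have"
  done
}

# Example (path taken from the File Format section above):
# check_drift /etc/sysctl.d/99-tuning.conf
```

No output from check_drift means every setting in the file matches the running kernel; running sysctl --system afterwards brings drifted values back in line.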
Networking TCP

TCP parameters control connection queues, congestion control, retransmit behaviour, TIME_WAIT handling, and keepalives. These are among the most commonly tuned parameters on servers.

Connection Queues

Client SYN arrives
      │
      ▼
SYN queue (half-open)              ← tcp_max_syn_backlog
  - SYN received, SYN-ACK sent
  - awaiting client ACK
      │  ACK arrives
      ▼
Accept queue (fully established)   ← somaxconn (and listen backlog)
  - waiting for app to call accept()
      │  accept()
      ▼
Application
Parameter | Default | Description
net.core.somaxconn | 4096 | Max accept queue depth per socket (128 on kernels before 5.4). Also capped by the backlog arg passed to listen().
net.ipv4.tcp_max_syn_backlog | 1024 | Max SYN queue depth (half-open connections) per socket.
net.ipv4.tcp_syncookies | 1 | Send SYN cookies when SYN queue is full; mitigates SYN flood without dropping connections.
net.ipv4.tcp_fin_timeout | 60 | Seconds an orphaned socket (already closed by the application) stays in FIN_WAIT_2 waiting for the peer's FIN. It does not shorten TIME_WAIT, which is fixed at 60 seconds in the kernel.
net.ipv4.tcp_tw_reuse | 0 | Allow reuse of TIME_WAIT sockets for new outbound connections; requires TCP timestamps. Recent kernels default to 2 (reuse on loopback only).
net.ipv4.ip_local_port_range | 32768 60999 | Ephemeral port range for outbound connections. Widen for high-connection-rate clients.
net.ipv4.tcp_max_tw_buckets | 262144 | Max simultaneous TIME_WAIT sockets. Excess are immediately destroyed.

Retransmit & Keepalive

Parameter | Default | Description
net.ipv4.tcp_retries2 | 15 | Max retransmit attempts before giving up on an established connection (~13–30 min).
net.ipv4.tcp_syn_retries | 6 | Max SYN retransmit attempts for outbound connections (~127s total).
net.ipv4.tcp_keepalive_time | 7200 | Seconds of idle before sending first keepalive probe (2 hours).
net.ipv4.tcp_keepalive_intvl | 75 | Seconds between keepalive probes.
net.ipv4.tcp_keepalive_probes | 9 | Number of unanswered probes before declaring the connection dead.
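
The timings in this table follow from simple arithmetic, which the sketch below works through. The function names are hypothetical, and the SYN calculation assumes a 1-second initial retransmission timeout that doubles after every attempt.

```shell
# Worst-case seconds before keepalives declare an idle, dead peer gone:
# the idle time, then one interval per unanswered probe.
keepalive_detect_secs() {  # args: <keepalive_time> <keepalive_intvl> <keepalive_probes>
  echo $(( $1 + $2 * $3 ))
}

# Seconds until an outbound connect() gives up, assuming a 1 s initial
# timeout that doubles each time: 1 + 2 + 4 + ... over (retries + 1) waits.
syn_timeout_secs() {       # args: <tcp_syn_retries>
  echo $(( (1 << ($1 + 1)) - 1 ))
}

keepalive_detect_secs 7200 75 9   # defaults        -> 7875
keepalive_detect_secs 60 10 6     # tighter setting -> 120
syn_timeout_secs 6                # default         -> 127
```

With the defaults, a silently dead peer is noticed only after about 2 hours 11 minutes, which is why keepalives are often tightened on servers. Keepalives apply only to sockets that enable SO_KEEPALIVE.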

Congestion Control

Shell
# Show available algorithms
sysctl net.ipv4.tcp_available_congestion_control

# Show active algorithm
sysctl net.ipv4.tcp_congestion_control

# Enable BBR (better for high-BDP paths and streaming)
# If bbr is missing from the available list above, load it first: modprobe tcp_bbr
sysctl -w net.ipv4.tcp_congestion_control=bbr
sysctl -w net.core.default_qdisc=fq   # BBR works best with fq qdisc

Common Server Tuning

Shell
# High-connection-rate server (web, API)
sysctl -w net.core.somaxconn=65535
sysctl -w net.ipv4.tcp_max_syn_backlog=65535
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_fin_timeout=15

# Tighten keepalives for detecting dead connections faster
sysctl -w net.ipv4.tcp_keepalive_time=60
sysctl -w net.ipv4.tcp_keepalive_intvl=10
sysctl -w net.ipv4.tcp_keepalive_probes=6
somaxconn and listen() backlog are both caps: The effective accept queue size is min(somaxconn, listen_backlog). Raising somaxconn alone isn't enough if your application passes a small value to listen(). Check your framework's default (Node.js: 511, many Java stacks: 50–100).
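
The min() rule above can be checked with a one-line calculation. This is an illustrative sketch; effective_backlog is a hypothetical helper name.

```shell
# Effective accept-queue limit: the smaller of somaxconn and the listen() backlog.
effective_backlog() {  # args: <somaxconn> <listen backlog>
  if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

effective_backlog 4096 511     # Node.js default backlog      -> 511
effective_backlog 4096 65535   # large backlog, capped by somaxconn -> 4096

# On a live system, `ss -ltn` shows the limit actually in force for each
# listening socket in the Send-Q column and the current queue length in Recv-Q.
```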
Networking Socket Buffers

TCP and UDP socket buffers control how much data the kernel holds in flight per connection. Too small = throughput limited by buffer, especially on high-latency links. Too large = wasted memory on idle connections.

Buffer Parameters

Parameter | Default | Description
net.core.rmem_max | 212992 | Hard ceiling (bytes) on the receive buffer an application can request via SO_RCVBUF.
net.core.wmem_max | 212992 | Hard ceiling (bytes) on the send buffer an application can request via SO_SNDBUF.
net.core.rmem_default | 212992 | Default receive buffer before the application adjusts it.
net.core.wmem_default | 212992 | Default send buffer.
net.ipv4.tcp_rmem | 4096 131072 6291456 | TCP receive buffer: min / default / max (bytes). Kernel auto-tunes within this range.
net.ipv4.tcp_wmem | 4096 16384 4194304 | TCP send buffer: min / default / max (bytes).
net.ipv4.udp_rmem_min | 4096 | Minimum UDP receive buffer per socket.
net.ipv4.tcp_mem | auto | System-wide TCP memory (pages): low / pressure / max. Kernel sets this from RAM at boot.

High-Bandwidth Tuning (10 GbE / long-haul)

Shell
# BDP rule: buffer ≥ bandwidth × RTT
# Example: 10 Gbps × 10ms RTT = 12.5 MB needed per connection

sysctl -w net.core.rmem_max=134217728       # 128 MB ceiling
sysctl -w net.core.wmem_max=134217728
sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"
sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"

# Enable auto-tuning (on by default, verify it's on)
sysctl net.ipv4.tcp_moderate_rcvbuf           # should be 1
Auto-tuning: Modern kernels auto-tune TCP buffers within the tcp_rmem / tcp_wmem range. You mostly need to raise the maximum; the kernel grows the buffer as needed. The rmem_max / wmem_max values cap what the application can request via SO_RCVBUF / SO_SNDBUF.
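
The BDP rule used in this section is easy to recompute for other links. A small sketch, with bdp_bytes as a hypothetical helper name and decimal units (1 Mbit = 1,000,000 bits):

```shell
# Bandwidth-delay product in bytes.
# Mbit/s * 1,000,000 / 8 bits per byte * ms / 1000 = Mbit/s * ms * 125
bdp_bytes() {  # args: <bandwidth in Mbit/s> <RTT in ms>
  echo $(( $1 * $2 * 125 ))
}

bdp_bytes 10000 10    # 10 Gbps, 10 ms RTT -> 12500000 (12.5 MB)
bdp_bytes 1000 80     # 1 Gbps, 80 ms RTT  -> 10000000 (10 MB)
```

The third field of tcp_rmem / tcp_wmem should be at least this large for a single connection to fill the path.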
Networking conntrack

Netfilter connection tracking parameters. Critical on firewalls, NAT gateways, and load balancers where the tracked connection count can be very high.

Parameter | Default | Description
net.netfilter.nf_conntrack_max | 131072 | Max number of tracked connections. When full, new connections are dropped with "table full" in dmesg.
net.netfilter.nf_conntrack_tcp_timeout_established | 432000 | Seconds to keep an established TCP connection in the table (5 days by default, which is very long).
net.netfilter.nf_conntrack_tcp_timeout_time_wait | 120 | Seconds to keep TIME_WAIT entries.
net.netfilter.nf_conntrack_udp_timeout | 30 | Seconds to keep a unidirectional UDP flow.
net.netfilter.nf_conntrack_udp_timeout_stream | 120 | Seconds to keep a bidirectional UDP flow.
net.netfilter.nf_conntrack_buckets | auto | Hash table size. Set to nf_conntrack_max / 4 for good performance.
Shell
# Check current table usage vs max
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max

# Increase table size (e.g. for a busy NAT gateway)
sysctl -w net.netfilter.nf_conntrack_max=2097152

# Shorten TCP established timeout from 5 days to 1 hour
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=3600

# Watch for table-full drops
dmesg | grep "nf_conntrack: table full"
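
The count and max values read above are most useful as a ratio. A minimal sketch, with conntrack_pct as a hypothetical helper name; the live check runs only if the nf_conntrack files are present, which requires the module to be loaded.

```shell
# Percentage of the conntrack table in use (integer, rounded down).
conntrack_pct() {  # args: <nf_conntrack_count> <nf_conntrack_max>
  echo $(( $1 * 100 / $2 ))
}

conntrack_pct 98304 131072   # -> 75

# Live check; these files exist only while the nf_conntrack module is loaded.
dir=/proc/sys/net/netfilter
if [ -r "$dir/nf_conntrack_count" ] && [ -r "$dir/nf_conntrack_max" ]; then
  pct=$(conntrack_pct "$(cat "$dir/nf_conntrack_count")" "$(cat "$dir/nf_conntrack_max")")
  echo "conntrack table ${pct}% full"
fi
```

Alerting well before 100% leaves time to raise the limit or shorten timeouts before packets are dropped.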
"nf_conntrack: table full, dropping packet" This appears in dmesg when the conntrack table is exhausted. New connections are silently dropped. Either raise nf_conntrack_max, shorten timeouts to expire stale entries faster, or add NOTRACK rules in the raw table for traffic that doesn't need stateful tracking.
Networking Routing & Forwarding
Parameter | Default | Description
net.ipv4.ip_forward | 0 | Enable IPv4 packet forwarding. Must be 1 for routers, VPN gateways, and container hosts.
net.ipv6.conf.all.forwarding | 0 | Enable IPv6 packet forwarding.
net.ipv4.conf.all.rp_filter | 1 | Reverse path filtering. 1 (strict) drops packets whose source address is not reachable through the interface they arrived on; 2 (loose) only requires that some route to the source exists. Use 2 or 0 for asymmetric routing.
net.ipv4.conf.all.accept_redirects | 1 | Accept ICMP redirects. Disable on servers; routers should not be redirecting your traffic.
net.ipv4.conf.all.send_redirects | 1 | Send ICMP redirects. Disable if not acting as a router.
net.ipv4.route.max_size | n/a | Routing cache max entries (removed in kernel 3.6; now caches are per-CPU and unbounded).
Shell
# Enable forwarding (required for Docker, Kubernetes, VPNs)
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv6.conf.all.forwarding=1

# Disable ICMP redirects on a server (not a router)
sysctl -w net.ipv4.conf.all.accept_redirects=0
sysctl -w net.ipv4.conf.all.send_redirects=0

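# Disable reverse path filtering for asymmetric routing.
# The kernel applies the higher of conf.all and conf.<iface>, so set both;
# eth0 is a placeholder interface name.
sysctl -w net.ipv4.conf.all.rp_filter=0
sysctl -w net.ipv4.conf.eth0.rp_filter=0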
Networking UDP
Parameter | Default | Description
net.core.netdev_max_backlog | 1000 | Per-CPU queue of packets waiting to be processed by the kernel. Increase if the second column of /proc/net/softnet_stat (packets dropped because the backlog was full) keeps growing. Drops counted by the NIC itself (rx_missed_errors in ethtool -S) point at the NIC ring buffer instead.
net.ipv4.udp_rmem_min | 4096 | Minimum per-socket UDP receive buffer.
net.ipv4.udp_wmem_min | 4096 | Minimum per-socket UDP send buffer.
net.ipv4.udp_mem | auto | System-wide UDP memory (pages): low / pressure / max.
UDP receive drops: Check the drops column of /proc/net/udp, or the d field inside skmem in ss -uanm output, for per-socket drop counters. System-wide UDP drops appear in netstat -su under "receive buffer errors."
Memory Swappiness

vm.swappiness controls how aggressively the kernel reclaims anonymous memory (heap, stack) relative to file-backed page cache. It's a tendency knob, not a hard threshold.

vm.swappiness = 0    → prefer reclaiming page cache, avoid swapping anonymous memory until absolutely necessary
vm.swappiness = 10   → light preference for reclaiming page cache over swap (good for databases, latency-sensitive workloads)
vm.swappiness = 60   → kernel default, balanced
vm.swappiness = 100  → treat anonymous and file-backed memory equally, swap aggressively
Workload | Recommended Value | Reason
General server | 10–30 | Reduces latency spikes from unexpected swapping
Database (MySQL, PostgreSQL) | 1–10 | Databases manage their own caches; kernel swap causes stalls
Redis / in-memory store | 0–1 | Any swap evicts data that must stay in RAM
Desktop / shared workstation | 60–100 | Users benefit from aggressive page cache retention
Container host (Kubernetes) | 0 | Pods expect predictable memory; swap causes OOM confusion
Shell
# Check current value
sysctl vm.swappiness

# Reduce for a database server
sysctl -w vm.swappiness=10

# Check current swap usage
free -h
vmstat -s | grep swap
swappiness=0 does not disable swap: Setting it to 0 means "avoid swapping unless OOM is imminent"; the kernel will still swap if necessary. To fully disable swap: swapoff -a (and remove swap entries from /etc/fstab).
Memory Dirty Pages & Writeback

Dirty pages are modified memory pages that haven't been written to disk yet. The kernel periodically flushes them in the background (writeback). These parameters control when and how aggressively that happens.

Parameter | Default | Description
vm.dirty_ratio | 20 | % of available memory (free plus reclaimable pages, not total RAM) that can be dirty before a writing process is blocked and forced to flush. High values = more throughput but bigger write latency spikes on flush.
vm.dirty_background_ratio | 10 | % of available memory at which the background flusher threads (kworker; pdflush on very old kernels) start writing. Less disruptive than hitting dirty_ratio.
vm.dirty_writeback_centisecs | 500 | How often (centiseconds) the background flusher wakes up. Default = every 5 seconds.
vm.dirty_expire_centisecs | 3000 | How old (centiseconds) a dirty page must be before it is eligible for writeback. Default = 30 seconds.

Tuning for Different Workloads

Latency-sensitive (DB writes)

Lower dirty ratios mean smaller flush bursts and more predictable latency at the cost of slightly lower write throughput.

Throughput-oriented (bulk ingest)

Higher dirty ratios allow the kernel to batch more writes together, increasing disk throughput at the risk of a large stall when the threshold is hit.
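
To see why percentage thresholds produce large flush bursts on big-memory machines, it helps to turn the ratios into absolute sizes. A rough sketch, with dirty_limit_kb as a hypothetical helper name; the caller supplies the memory figure (for example MemAvailable from /proc/meminfo), which only approximates what the kernel actually uses.

```shell
# Kilobytes of dirty page cache allowed at a given ratio.
# The memory figure is supplied by the caller and is only an approximation
# of the kernel's own "dirtyable memory" calculation.
dirty_limit_kb() {  # args: <available memory in kB> <ratio percent>
  echo $(( $1 * $2 / 100 ))
}

avail_kb=67108864                 # example: 64 GiB available
dirty_limit_kb "$avail_kb" 20     # default dirty_ratio
dirty_limit_kb "$avail_kb" 10     # default dirty_background_ratio
dirty_limit_kb "$avail_kb" 5      # lowered dirty_ratio
```

On this example machine the default dirty_ratio allows about 12.8 GiB of unwritten data before writers block, while a ratio of 5 cuts that to about 3.2 GiB.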

Shell
# Check current dirty and in-flight writeback totals
grep -E "Dirty|Writeback" /proc/meminfo

# Reduce flush thresholds for lower write latency
sysctl -w vm.dirty_ratio=5
sysctl -w vm.dirty_background_ratio=2

# Force immediate writeback of all dirty pages
sync

# Separately, drop clean page cache, dentries, and inodes (use with care!).
# This writes nothing back, so run sync first.
echo 3 > /proc/sys/vm/drop_caches
Memory OOM Killer

When the system runs out of memory and cannot reclaim any, the Out-Of-Memory killer selects a process to kill. The victim is chosen by an OOM score, derived mainly from the process's memory footprint (resident memory, swap, and page tables) and then shifted by the per-process oom_score_adj adjustment.

Parameter / File | Description
vm.panic_on_oom | 0 = invoke OOM killer (default). 1 = kernel panic instead. Useful on systems where partial failure is worse than reboot.
vm.oom_kill_allocating_task | 0 = OOM killer picks the "best" victim by score. 1 = kill the task that triggered OOM immediately; faster, less heuristic.
/proc/<pid>/oom_score | Current OOM score for a process (read-only). Higher = more likely to be killed.
/proc/<pid>/oom_score_adj | Adjustment from -1000 to +1000. -1000 = never kill. +1000 = always kill first.
Shell
# See which process was OOM-killed
dmesg | grep -i "oom\|killed process"
journalctl -k | grep -i oom

# See OOM score for all processes (sorted)
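for p in /proc/[0-9]*; do
  # A process can exit mid-loop; skip it rather than feed printf empty values
  score=$(cat "$p/oom_score" 2>/dev/null) || continue
  adj=$(cat "$p/oom_score_adj" 2>/dev/null) || continue
  printf "%5d %5d %s\n" "$score" "$adj" "$(cat "$p/comm" 2>/dev/null)"
done | sort -rn | head -20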

# Protect a critical process from OOM kill (pgrep may match several PIDs)
for pid in $(pgrep myapp); do echo -1000 > "/proc/$pid/oom_score_adj"; done

# Make a process a preferred OOM victim
for pid in $(pgrep lowpriority); do echo 500 > "/proc/$pid/oom_score_adj"; done
oom_score_adj persists only while the process runs: Set it from your application's init script or systemd unit (OOMScoreAdjust=-500 in the [Service] section) to make it permanent.
Memory Memory Overcommit

Linux overcommits virtual memory by default — malloc() succeeds even when there isn't enough physical RAM to back the allocation, betting that not all of it will be used at once.

vm.overcommit_memory | Behaviour
0 (default) | Heuristic: allow reasonable overcommit; kernel uses a formula to reject obviously excessive allocations.
1 | Always allow: never fail malloc(). Used by some HPC and scientific workloads.
2 | Strict: total committed memory cannot exceed swap + (overcommit_ratio % of RAM). malloc() can fail.
Shell
# Check current setting
sysctl vm.overcommit_memory
sysctl vm.overcommit_ratio     # used when overcommit_memory=2 (default 50%)

# Check committed memory vs limit
grep -E "CommitLimit|Committed_AS" /proc/meminfo
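
The strict-mode limit can be estimated before switching to overcommit_memory=2. A small sketch of the formula from the table above, with commit_limit_kb as a hypothetical helper name; memory reserved for hugepages, which the kernel excludes from the RAM term, is ignored here.

```shell
# Approximate CommitLimit in kB under vm.overcommit_memory=2:
# swap plus overcommit_ratio percent of RAM (hugepage reservations ignored).
commit_limit_kb() {  # args: <RAM in kB> <swap in kB> <overcommit_ratio>
  echo $(( $2 + $1 * $3 / 100 ))
}

commit_limit_kb 16777216 2097152 50   # 16 GiB RAM, 2 GiB swap -> 10485760 (10 GiB)
```

If Committed_AS in /proc/meminfo is already above the computed figure, enabling strict mode would make new allocations fail immediately.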
Memory Hugepages

Hugepages (2 MB or 1 GB) reduce TLB pressure for large working sets by using fewer, larger page table entries. Critical for databases (PostgreSQL, Oracle, MySQL), Java heaps, and DPDK workloads.

Parameter | Description
vm.nr_hugepages | Number of pre-allocated hugepages of the default size (2 MB on x86-64). Memory is reserved immediately and cannot be used for anything else.
vm.nr_overcommit_hugepages | Additional hugepages that can be allocated on demand (not pre-reserved).
Transparent Hugepages (THP) | Not a sysctl; configured at /sys/kernel/mm/transparent_hugepage/enabled. Values: always, madvise, never.
Shell
# Show hugepage stats
grep Huge /proc/meminfo

# Pre-allocate 512 × 2MB hugepages (= 1 GB reserved)
sysctl -w vm.nr_hugepages=512

# Check / change Transparent Hugepage setting
cat /sys/kernel/mm/transparent_hugepage/enabled
echo madvise > /sys/kernel/mm/transparent_hugepage/enabled

# Disable THP entirely (often better for databases)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Transparent Hugepages and databases: THP can cause latency spikes in databases (PostgreSQL, MongoDB, Redis) due to khugepaged compaction and copy-on-write overhead. Most database vendors recommend setting THP to madvise or never.
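
Sizing vm.nr_hugepages is a round-up division of the region to be backed by the hugepage size. An illustrative sketch, with hugepages_needed as a hypothetical helper name:

```shell
# Number of hugepages needed to cover a region, rounded up.
hugepages_needed() {  # args: <region size in MB> <hugepage size in MB>
  echo $(( ($1 + $2 - 1) / $2 ))
}

hugepages_needed 1024 2    # 1 GB with 2 MB pages -> 512
hugepages_needed 8193 2    # rounds up            -> 4097
```

The Hugepagesize line in /proc/meminfo gives the page size to use as the second argument on a given machine.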
Kernel Limits
Parameter | Default | Description
kernel.pid_max | 32768 | Maximum PID value. When exhausted, fork() fails. Raise to 4194304 on busy systems running many containers.
kernel.threads-max | ~30000 | Max number of threads system-wide. Each thread consumes a PID.
fs.file-max | ~9M | System-wide open file descriptor limit. Per-process limit is set via ulimit -n or systemd's LimitNOFILE.
fs.inotify.max_user_watches | 8192 | Max inotify watches per user. IDEs, editors, and build tools exhaust this on large codebases.
fs.inotify.max_user_instances | 128 | Max inotify instances per user.
vm.max_map_count | 65530 | Max virtual memory map areas per process. Elasticsearch and JVM workloads commonly exhaust this.
Shell
# Raise PID limit for container-heavy hosts
sysctl -w kernel.pid_max=4194304

# Fix "too many open files" system-wide
sysctl -w fs.file-max=2097152

# Fix inotify limit (IDEs, webpack, etc.)
sysctl -w fs.inotify.max_user_watches=524288

# Fix Elasticsearch "max virtual memory areas" error
sysctl -w vm.max_map_count=262144

# Check current open file descriptor count
cat /proc/sys/fs/file-nr    # open / unused / max
Kernel Panic & Crash Behaviour
Parameter | Default | Description
kernel.panic | 0 | Seconds to wait before automatic reboot after a kernel panic. 0 = hang forever. Set to e.g. 10 for auto-reboot on production servers.
kernel.panic_on_oops | 0 | Panic (and thus reboot if kernel.panic > 0) on kernel oops. Useful when you prefer a clean reboot over running with a possibly corrupted kernel state.
vm.panic_on_oom | 0 | Panic instead of invoking the OOM killer.
kernel.unknown_nmi_panic | 0 | Panic on an unknown NMI (Non-Maskable Interrupt). Useful for hardware fault isolation.
Shell
# Auto-reboot 10 seconds after a kernel panic
sysctl -w kernel.panic=10
sysctl -w kernel.panic_on_oops=1

# Check current panic setting
sysctl kernel.panic
Kernel Core Dumps
Parameter | Default | Description
kernel.core_pattern | core | Template for core dump filename. Supports %p (PID), %e (executable), %t (timestamp). Can pipe to a handler when the pattern starts with "|" (|/usr/lib/...).
kernel.core_uses_pid | 0 | Append PID to the core filename. Equivalent to adding %p to core_pattern.
fs.suid_dumpable | 0 | 0 = no core for setuid processes. 1 = dump (insecure). 2 = dump readable only by root.
Shell
# Write core dumps to /tmp with PID and name in filename
sysctl -w kernel.core_pattern=/tmp/core.%e.%p.%t

# Check what systemd-coredump uses (modern distros)
cat /proc/sys/kernel/core_pattern
# usually: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h

# List core dumps captured by systemd-coredump
coredumpctl list
Kernel Scheduler
Parameter | Default | Description
kernel.sched_min_granularity_ns | 750000 | Minimum time a task runs before being preempted (nanoseconds). Lower = more responsive, higher context-switch overhead.
kernel.sched_latency_ns | 6000000 | Target scheduling latency: time within which every runnable task should run at least once (nanoseconds).
kernel.sched_migration_cost_ns | 500000 | Cost of migrating a task between CPUs. Higher value = less migration = better cache locality at the cost of load imbalance.
kernel.sched_autogroup_enabled | 1 | Group tasks by session for desktop fairness. Disable on servers to let all processes compete equally.

On kernels 5.13 and later, the first three tunables are no longer sysctls; they moved to debugfs under /sys/kernel/debug/sched/.
Shell
# Disable autogroup for server workloads
sysctl -w kernel.sched_autogroup_enabled=0

# Increase migration cost for cache-sensitive workloads
# (kernels before 5.13; newer kernels expose this under /sys/kernel/debug/sched/)
sysctl -w kernel.sched_migration_cost_ns=5000000
Security ASLR

Address Space Layout Randomisation randomises the memory addresses of the stack, heap, and shared libraries, making it harder to exploit memory corruption vulnerabilities.

kernel.randomize_va_space | Behaviour
0 | Disabled: all addresses deterministic. Never use in production.
1 | Randomise stack, VDSO, shared libraries.
2 | Full ASLR: also randomise heap (default on most distros).
Shell
# Check ASLR setting
sysctl kernel.randomize_va_space

# Ensure full ASLR is enabled
sysctl -w kernel.randomize_va_space=2
Security dmesg & Kernel Pointers
Parameter | Default | Description
kernel.dmesg_restrict | 0 | 1 = only root can read dmesg. Prevents unprivileged users from reading kernel addresses and hardware info.
kernel.kptr_restrict | 0 | 1 = hide kernel symbol addresses from /proc/kallsyms for non-root. 2 = hide even from root.
kernel.perf_event_paranoid | 2 | Controls what unprivileged users can do with perf; each level adds to the one before. -1 = no restrictions. 0 = no raw tracepoint access. 1 = no CPU-wide event access. 2 = no kernel profiling (user-space measurements only). 3 = no unprivileged access at all (some distros).
Shell
# Harden kernel info exposure
sysctl -w kernel.dmesg_restrict=1
sysctl -w kernel.kptr_restrict=1

# Allow non-root perf for development machines
sysctl -w kernel.perf_event_paranoid=1
Security Unprivileged BPF
Parameter | Default | Description
kernel.unprivileged_bpf_disabled | 0 or 1 | 0 = unprivileged users can load BPF programs. 1 = root only (write-once: cannot be set back to 0 without reboot). 2 = root only but reversible.
net.core.bpf_jit_enable | 1 | Enable BPF JIT compiler. 0 = interpret. 1 = JIT. 2 = JIT with debugging output.
net.core.bpf_jit_harden | 0 | 1 = harden JIT for unprivileged users (constant blinding). 2 = harden for all users.
Shell
# Restrict BPF to root (production servers)
sysctl -w kernel.unprivileged_bpf_disabled=1

# Harden the JIT against JIT-spraying attacks (constant blinding for all users)
sysctl -w net.core.bpf_jit_harden=2
Security SYN Cookies & ICMP
Parameter | Default | Description
net.ipv4.tcp_syncookies | 1 | Send SYN cookies when the SYN queue overflows; allows legitimate connections to proceed without a full queue entry. Essential for SYN flood mitigation.
net.ipv4.icmp_echo_ignore_broadcasts | 1 | Ignore ICMP echo requests to broadcast addresses (prevents Smurf attack amplification).
net.ipv4.icmp_ignore_bogus_error_responses | 1 | Suppress logging of bogus ICMP error responses.
net.ipv4.conf.all.log_martians | 0 | Log packets with impossible source addresses (martians); useful for detecting spoofed traffic.
Shell
# Recommended security baseline for a public-facing server
sysctl -w net.ipv4.tcp_syncookies=1
sysctl -w net.ipv4.icmp_echo_ignore_broadcasts=1
sysctl -w net.ipv4.icmp_ignore_bogus_error_responses=1
sysctl -w net.ipv4.conf.all.accept_redirects=0
sysctl -w net.ipv4.conf.all.send_redirects=0
sysctl -w net.ipv4.conf.all.rp_filter=1
sysctl -w kernel.dmesg_restrict=1
sysctl -w kernel.randomize_va_space=2
Reference Quick-Reference Table

Commonly changed parameters, their defaults, and the typical direction of change.

Parameter | Default | Typical Change | Reason
net.core.somaxconn | 4096 | ↑ 65535 | High-traffic servers / load balancers
net.ipv4.tcp_max_syn_backlog | 1024 | ↑ 65535 | Burst connection handling
net.ipv4.ip_local_port_range | 32768 60999 | 1024 65535 | More ephemeral ports for clients
net.ipv4.tcp_tw_reuse | 0 | ↑ 1 | Reuse TIME_WAIT ports for outbound
net.ipv4.tcp_fin_timeout | 60 | ↓ 15 | Release orphaned FIN_WAIT_2 sockets faster
net.core.rmem_max | 212992 | ↑ 134217728 | High-bandwidth / high-latency paths
net.core.wmem_max | 212992 | ↑ 134217728 | High-bandwidth / high-latency paths
net.ipv4.tcp_congestion_control | cubic | bbr | Better throughput on lossy or high-BDP links
net.netfilter.nf_conntrack_max | 131072 | ↑ 2097152 | Busy NAT gateways / firewalls
vm.swappiness | 60 | ↓ 10 | Reduce latency spikes on servers
vm.dirty_ratio | 20 | ↓ 5 | Reduce write latency spikes
vm.max_map_count | 65530 | ↑ 262144 | Elasticsearch, JVM applications
fs.file-max | ~9M | ↑ 2097152 | Many open files (databases, proxies)
fs.inotify.max_user_watches | 8192 | ↑ 524288 | IDEs, build tools on large codebases
kernel.pid_max | 32768 | ↑ 4194304 | Container hosts with many processes
kernel.panic | 0 | 10 | Auto-reboot on production systems after panic
net.ipv4.ip_forward | 0 | ↑ 1 | Docker, Kubernetes, VPNs, routers
Reference Tuning Profiles

Web / API Server

/etc/sysctl.d/99-web-server.conf
# Connection capacity
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15

# Socket buffers
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

# Congestion control
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq

# OS limits
fs.file-max = 2097152
vm.swappiness = 10

Database Server (PostgreSQL / MySQL)

/etc/sysctl.d/99-database.conf
# Avoid swapping — databases manage their own cache
vm.swappiness = 1

# Reduce write latency spikes from page cache flush
vm.dirty_ratio = 5
vm.dirty_background_ratio = 2
vm.dirty_writeback_centisecs = 100

# Disable transparent hugepages (causes latency spikes)
# Set in /etc/rc.local or a systemd unit:
# echo never > /sys/kernel/mm/transparent_hugepage/enabled

# More shared memory for PostgreSQL
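# shmmax is in bytes (64 GB here); shmall is in pages.
# Comments must be on their own line: a trailing "# ..." becomes part of the value.
kernel.shmmax = 68719476736
kernel.shmall = 4294967296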

# Faster keepalive to detect dead connections
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

Kubernetes / Container Host

/etc/sysctl.d/99-k8s.conf
# Required for pod networking
# (the net.bridge.* keys exist only once the br_netfilter module is loaded)
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1

# More PIDs for containers
kernel.pid_max = 4194304

# Minimise swapping. swappiness = 0 does not disable swap;
# to turn it off, also run swapoff -a and remove swap entries from /etc/fstab.
vm.swappiness = 0

# inotify for many file watchers across pods
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 512

# Connection tracking for services with many pods
net.netfilter.nf_conntrack_max = 2097152