
Mastering Linux Command Line: The DevOps Admin Toolkit You Can't Fake

Master the Linux command line for DevOps: piping, file systems, networking, process troubleshooting, systemd, disk management, security, and automation mindset.

June 2026 · 11 min read


Mastering the Linux command line isn’t just a checkbox on a DevOps job description—it’s the difference between being the person who reacts to outages and the person who prevents them. Every layer of modern infrastructure, from containers to cloud VMs, runs on Linux. Here’s the admin toolkit you can’t fake.

Master the Shell (and Don’t Stop at Bash)

You’ll live in the terminal, but knowing basic Bash syntax isn’t enough. You need to be fluent in piping, redirection, and process substitution. That’s where the magic happens: chaining grep, awk, sed, and xargs into one-liners that debug production in seconds.

Process management: ps aux, top, htop, kill, and systemctl are your first responders. Understand zombie processes and how to find them.
Job control: Background tasks (&), nohup, screen, and tmux let you keep long-running jobs alive when your SSH session drops.
History tricks: Ctrl+R reverse search, !$ for last argument, and !! for last command save minutes every day.
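
To make piping, redirection, process substitution, and xargs concrete, here is a minimal sketch that runs entirely in a scratch directory. The access log, the package lists, and every file name in it are made up for illustration, and it assumes Bash, since process substitution is not part of plain POSIX sh.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: a tiny access log and two package lists.
work=$(mktemp -d)
cat > "$work/access.log" <<'EOF'
10.0.0.1 GET /health 200
10.0.0.2 GET /api/users 500
10.0.0.1 GET /api/users 200
10.0.0.3 POST /api/orders 500
10.0.0.2 GET /api/users 500
EOF
printf '%s\n' curl git nginx > "$work/host_a.txt"
printf '%s\n' git curl vim   > "$work/host_b.txt"

# Pipeline: keep only 5xx responses, then count them per client IP.
errors_by_ip=$(awk '$4 ~ /^5/ {print $1}' "$work/access.log" | sort | uniq -c | sort -rn)

# Redirection: stdout goes to a file, stderr is discarded (the second path does not exist).
ls "$work/access.log" "$work/missing" > "$work/ls.out" 2> /dev/null || true

# Process substitution: comm needs sorted input, so sort each list on the fly.
only_on_a=$(comm -23 <(sort "$work/host_a.txt") <(sort "$work/host_b.txt"))

# xargs: turn lines of input into arguments for another command.
line_total=$(printf '%s\n' "$work/host_a.txt" "$work/host_b.txt" | xargs cat | wc -l)

echo "$errors_by_ip"
echo "only on host A: $only_on_a"
echo "total package lines: $line_total"
rm -rf -- "$work"
```

The same shapes carry over to real logs: swap the sample file for the log you are investigating and adjust the awk field numbers to its format.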

Real-world test: Can you find the top 5 memory-consuming processes on a box without htop? If you can’t do it with ps aux --sort=-%mem | head -n 6 (one header row plus five processes), practice until you can.
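
As a sketch of the memory check and the zombie hunt mentioned above, the two helpers below wrap ps. The function names top_mem and find_zombies are hypothetical, and the --sort option assumes the procps version of ps found on most distros (BusyBox ps does not have it). Because ps prints a header row, asking for N processes means keeping N + 1 lines.

```shell
#!/usr/bin/env bash
# top_mem N: header row plus the N processes with the highest %MEM (default 5).
top_mem() {
  local n=${1:-5}
  ps -eo pid,user,%mem,rss,comm --sort=-%mem | head -n "$((n + 1))"
}

# find_zombies: header row plus every process whose state starts with Z (defunct).
# The PPID column shows the parent that has not reaped the zombie yet.
find_zombies() {
  ps -eo pid,ppid,stat,comm | awk 'NR == 1 || $3 ~ /^Z/'
}

top_mem 5
find_zombies
```

A zombie cannot be killed directly; it disappears once its parent collects the exit status or the parent itself exits.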

Filesystem Deep Dive: Permissions, Inodes, and Links

DevOps means managing users, config files, and secrets. A misaligned permission is a security incident waiting to happen.

Permissions beyond chmod 777: Understand setuid, setgid, and sticky bits. Why does /tmp have the sticky bit? Because you don’t want others deleting your temp files.
Inodes and hard links: ls -li shows inode numbers. Hard links share the same inode—deleting one doesn’t free the data until all links are gone. Symlinks break if the target moves. Know the difference when symlinking configs.
ACLs (Access Control Lists): setfacl and getfacl let you assign permissions to multiple users without changing group ownership. Essential for shared directories in CI/CD pipelines.
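
The link and sticky-bit behavior described above is easy to verify in a scratch directory. This sketch assumes GNU coreutils (stat -c); the file names are placeholders, and the setfacl line is left as a comment because it depends on a user that may not exist on your machine.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
echo "listen 8080" > "$demo/app.conf"
ln    "$demo/app.conf" "$demo/hard.conf"   # hard link: a second name for the same inode
ln -s "$demo/app.conf" "$demo/soft.conf"   # symlink: a separate inode that stores a path

inode_orig=$(stat -c %i "$demo/app.conf")
inode_hard=$(stat -c %i "$demo/hard.conf")
links_before=$(stat -c %h "$demo/hard.conf")   # link count of the shared inode
ls -li "$demo"

# Remove the original name: the data survives through the hard link,
# while the symlink now points at nothing.
rm -- "$demo/app.conf"
hard_content=$(cat "$demo/hard.conf")
if [ -e "$demo/soft.conf" ]; then symlink_state=ok; else symlink_state=dangling; fi

# Sticky bit: world-writable like /tmp, but only a file's owner (or root) may delete it.
mkdir "$demo/shared"
chmod 1777 "$demo/shared"
shared_mode=$(stat -c %a "$demo/shared")
ls -ld "$demo/shared"                      # the mode string ends in "t"

# ACL example, not executed here ("deploy" is a hypothetical user):
#   setfacl -m u:deploy:rx "$demo/shared" && getfacl "$demo/shared"

echo "same inode: $inode_orig = $inode_hard, links: $links_before"
echo "after rm: hard link reads '$hard_content', symlink is $symlink_state"
rm -rf -- "$demo"
```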

Networking: The Lifeline of Distributed Systems

Containers communicate over networks. If you can’t diagnose a dropped packet, you’re blind.

Socket stats, not netstat: ss is faster and more detailed than netstat. Use ss -tuln to see listening ports.
Connectivity checks: curl -v, telnet, and nc (netcat) test whether a port answers; nslookup and dig check DNS resolution. Use tcpdump for packet-level inspection, and learn to filter by port and host.
Routing and interfaces: ip route show, ip addr, and ip link replace the old ifconfig and route. These are standard in modern distros and Docker.

Pro tip: If you’re debugging a connection timeout between services, run tcpdump -i any port 80 on the receiving host (as root, substituting the service’s port) to see whether the SYN packet ever arrives. That separates a network issue from an application issue.
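
A rough sketch of that workflow follows. The helper names are hypothetical. port_open uses Bash's built-in /dev/tcp redirection in place of nc or telnet, which is a substitution on my part for hosts where neither is installed, and capture_syn narrows the tcpdump filter to SYN packets for one host and port. It assumes Bash, coreutils timeout, and root access for tcpdump.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection completes within 3 seconds.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2> /dev/null
}

# listening_ports: TCP and UDP listeners, numeric addresses only.
listening_ports() {
  ss -tuln
}

# capture_syn HOST PORT: run on the receiving machine; prints only packets with
# the SYN flag set. The tcp[tcpflags] test applies to IPv4 traffic.
capture_syn() {
  sudo tcpdump -ni any "host $1 and tcp port $2 and tcp[tcpflags] & tcp-syn != 0"
}

# Example usage (host and port are placeholders):
#   port_open 10.0.0.5 8080 && echo reachable || echo "refused or timed out"
#   capture_syn 10.0.0.7 8080
```

If capture_syn prints nothing while the client retries, the packets are being dropped before they reach the host. If SYNs arrive but nothing answers, look at the local firewall and at whether the service is listening on that address.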

Process and Resource Troubleshooting

A CPU spike or OOM killer won’t announce itself politely. You need to hunt it.

strace and ltrace: Trace system calls and library calls. When a service hangs, strace -p <PID> shows what it’s waiting on (file I/O, network socket, lock).
lsof: List open files. Perfect for finding which process holds a lock on a config file or which PID is using a deleted file (eating disk space).
/proc filesystem: Direct access to kernel data structures. cat /proc/meminfo, cat /proc/cpuinfo, and ls -l /proc/<PID>/fd give raw data without extra tools.
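
When a box is so stripped down that ps and lsof are missing, /proc still answers the basic questions. The helpers below are a small sketch with hypothetical names. They assume a Linux kernel recent enough to report MemAvailable and a /proc/cpuinfo that lists one "processor" line per CPU, which holds on x86 and arm64.

```shell
#!/usr/bin/env bash
# mem_available_kb: the kernel's estimate of memory usable without swapping, in kB.
mem_available_kb() {
  awk '/^MemAvailable:/ {print $2}' /proc/meminfo
}

# cpu_count: number of logical CPUs listed by the kernel.
cpu_count() {
  grep -c '^processor' /proc/cpuinfo
}

# open_fds [PID]: how many file descriptors a process holds (default: this shell).
open_fds() {
  ls "/proc/${1:-$$}/fd" | wc -l
}

echo "available memory: $(mem_available_kb) kB"
echo "logical CPUs:     $(cpu_count)"
echo "fds in this shell: $(open_fds)"

# For a hung service, ls -l /proc/<PID>/fd shows what each descriptor points at,
# and strace -p <PID> shows the system call it is currently blocked in.
```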

Systemd and Service Management

Modern Linux runs on systemd. Hating it doesn’t make it go away.

Unit files: Know the structure of .service, .timer, and .socket files. They replace SysV init scripts and cron for many modern setups.
Journalctl: journalctl -u nginx.service -f tails logs for a specific service. journalctl --since "1 hour ago" saves you from scrolling.
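Targets vs runlevels: Targets replace SysV runlevels. systemctl get-default shows the target the machine boots into, systemctl set-default changes it, and systemctl isolate multi-user.target switches the running system to headless mode immediately.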
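
To show what a service and timer pair looks like in place of a cron entry, this sketch writes two unit files into a scratch directory. The unit name logsweep and the script path /usr/local/bin/logsweep.sh are hypothetical. Nothing is installed or started; the commands for that appear only as comments.

```shell
#!/usr/bin/env bash
# write_logsweep_units DIR: write a hypothetical service/timer pair into DIR.
write_logsweep_units() {
  local dir=$1
  cat > "$dir/logsweep.service" <<'EOF'
[Unit]
Description=Compress old logs (hypothetical example)

[Service]
Type=oneshot
ExecStart=/usr/local/bin/logsweep.sh
EOF
  cat > "$dir/logsweep.timer" <<'EOF'
[Unit]
Description=Run logsweep.service once a day

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

units=$(mktemp -d)
write_logsweep_units "$units"
ls "$units"

# To install for real (as root), copy both files to /etc/systemd/system, then:
#   systemctl daemon-reload
#   systemctl enable --now logsweep.timer
#   systemctl list-timers logsweep.timer
#   journalctl -u logsweep.service --since "1 hour ago"
```

A timer activates the service with the same base name unless told otherwise, and Persistent=true asks systemd to run a missed job after the machine comes back up, which plain cron does not do.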

Disk Management and Filesystem Health

Docker images, logs, and databases eat disk space. Full disks are one of the most common causes of “mysterious” application failures.

df -h and du -sh * are your quick health checks.
lsblk and blkid: List block devices and their UUIDs—critical for mounting volumes reliably.
fdisk/parted and mkfs: Partitioning and formatting. With cloud instances, you often attach extra volumes and need to mount them manually.
mount and /etc/fstab: Automount on reboot. Use UUIDs instead of device names (/dev/sda) because names can change.

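Real-world example: df reports a full disk and writes fail with “No space left on device”, but du can’t find the files using the space? Check lsof | grep deleted: a process is still holding a deleted file open, and the kernel can’t free its blocks until that file descriptor is closed. Restart the process to free the space. If df -h shows free space and you still get the error, check df -i, because the filesystem may be out of inodes.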
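
You can reproduce the deleted-but-still-open situation safely with a temporary file. This is a minimal sketch assuming Bash on Linux with /proc mounted; descriptor number 9 is an arbitrary choice.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)
exec 9> "$tmp"                 # hold the file open on descriptor 9
echo "log line" >&9
rm -- "$tmp"                   # the name is gone, the inode and its blocks are not

# Child processes inherit descriptor 9, so readlink can inspect it via /proc/self.
fd_target=$(readlink /proc/self/fd/9)
echo "$fd_target"              # the original path with " (deleted)" appended

exec 9>&-                      # closing the last descriptor lets the kernel free the space
```

On a real server, lsof +L1 lists open files whose link count has dropped to zero, which is a narrower search than grepping the full lsof output for "deleted".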

Security Basics: Users, Groups, and SSH

You’re not managing a single server anymore; you’re managing access to a fleet.

useradd, usermod, groupadd, passwd: Create and manage users with minimal privileges. Never log in as root; use sudo.
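SSH key management: ssh-keygen, ssh-copy-id, and ~/.ssh/authorized_keys. Understand how ssh-agent holds keys in memory, and treat agent forwarding (ssh -A) with care: anyone with root on the intermediate host can use your forwarded agent while you are connected.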
sudoers file: Control exactly which commands each user can run. visudo prevents syntax errors that lock you out.
Fail2ban and ufw/iptables: ufw and iptables give you a basic firewall; Fail2ban watches logs and bans addresses after repeated failed SSH logins. Every public-facing machine needs both.
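
Key-based logins often fail for a dull reason: sshd, with its default StrictModes setting, ignores an authorized_keys file that other users could write to. The check below is a sketch with a hypothetical function name. It enforces the conservative 700/600 convention, which is stricter than what sshd actually requires, and it assumes GNU stat.

```shell
#!/usr/bin/env bash
# check_ssh_perms DIR: print a suggested fix and return 1 if DIR is not mode 700
# or DIR/authorized_keys exists with a mode other than 600.
check_ssh_perms() {
  local dir=$1 rc=0
  if [ "$(stat -c %a "$dir")" != 700 ]; then
    echo "fix: chmod 700 $dir"
    rc=1
  fi
  if [ -f "$dir/authorized_keys" ] && [ "$(stat -c %a "$dir/authorized_keys")" != 600 ]; then
    echo "fix: chmod 600 $dir/authorized_keys"
    rc=1
  fi
  return "$rc"
}

# Example usage:
#   check_ssh_perms "$HOME/.ssh" && echo "permissions look fine"
# Generating a key pair for a new host (the comment string is a placeholder):
#   ssh-keygen -t ed25519 -C "ops@example" -f "$HOME/.ssh/id_ed25519"
```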

The Automation Mindset

A good DevOps engineer doesn’t repeat manual steps. If you SSH into a box to fix something twice, you should write a script or Ansible playbook.

But that starts here: knowing Linux inside out means your automation scripts are tight, use exit codes correctly, handle signals, and don’t leave zombie processes.
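
One way to put exit codes and signal handling into practice is a small wrapper like the one below. The name with_workdir and the WORKDIR variable are hypothetical, and it assumes Bash. The body runs in a subshell, so the strict-mode settings and traps do not leak into the calling script.

```shell
#!/usr/bin/env bash
# with_workdir CMD [ARGS...]: run CMD with a private scratch directory exported
# as WORKDIR, remove the directory no matter how CMD ends, and pass CMD's exit
# status back to the caller.
with_workdir() (
  set -euo pipefail
  WORKDIR=$(mktemp -d)
  export WORKDIR
  trap 'rm -rf -- "$WORKDIR"' EXIT   # cleanup on success, failure, or signal
  trap 'exit 130' INT                # Ctrl+C: exit with 128 + 2, which fires EXIT
  trap 'exit 143' TERM               # kill: exit with 128 + 15, which fires EXIT
  "$@"
)

# Example usage: the failing step's status reaches the caller intact.
rc=0
with_workdir sh -c 'echo "working in $WORKDIR"; exit 3' || rc=$?
echo "job exited with status $rc"
```

Because the command runs in the foreground and the shell waits for it, no child is left unreaped. Scripts that start background jobs should also call wait before they exit.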

Final test: Without looking it up, write a one-liner that finds all .log files older than 7 days in /var/log, compresses them with gzip, and deletes the originals. If you can do that, you’re already ahead of most.
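
If you want to check your attempt afterwards, here is one possible answer wrapped in a function so it can be tried on a scratch directory first. The function name is hypothetical. It relies on two behaviors: gzip replaces each file with a .gz copy by default, so no separate delete step is needed, and find's -mtime +7 matches files last modified more than seven full 24-hour periods ago.

```shell
#!/usr/bin/env bash
# compress_old_logs DIR DAYS: gzip every *.log file under DIR whose last
# modification is more than DAYS full days old. gzip removes the original
# after writing the .gz file.
compress_old_logs() {
  find "$1" -type f -name '*.log' -mtime +"$2" -exec gzip -- {} +
}

# The one-liner from the challenge would then be (as root):
#   find /var/log -type f -name '*.log' -mtime +7 -exec gzip -- {} +
```

Using -exec ... {} + batches many files into each gzip invocation and copes with unusual file names, which a pipeline through plain xargs does not do unless you add -print0 and -0.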
