Mastering Linux Command Line: The DevOps Admin Toolkit You Can't Fake
Master the Linux command line for DevOps: piping, file systems, networking, process troubleshooting, systemd, disk management, security, and automation mindset.
June 2026 · 11 min read
Mastering the Linux command line isn’t just a checkbox on a DevOps job description—it’s the difference between being the person who reacts to outages and the person who prevents them. Every layer of modern infrastructure, from containers to cloud VMs, runs on Linux. Here’s the admin toolkit you can’t fake.
Master the Shell (and Don’t Stop at Bash)
You’ll live in the terminal. But knowing Bash alone isn’t enough. You need to be fluent in piping, redirection, and process substitution. That’s where the magic happens—chaining grep, awk, sed, and xargs into one-liners that debug production in seconds.
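As a rough illustration of those three building blocks, the sketch below runs against a small made-up access log. The file, its contents, and the variable names are assumptions created only for the demo. It needs bash, because process substitution is not available in plain POSIX sh.

```bash
#!/usr/bin/env bash
# Hypothetical sample data so the demo is self-contained.
log=$(mktemp)
cat > "$log" <<'EOF'
GET /health 200
GET /login 500
POST /login 500
GET /cart 404
GET /login 200
GET /health 200
EOF

# Piping: count responses per status code, most frequent first.
status_counts=$(awk '{print $NF}' "$log" | sort | uniq -c | sort -rn \
  | awk '{print $2 "=" $1}' | xargs)

# Redirection: write the failing requests (5xx) to a separate file.
grep ' 5[0-9][0-9]$' "$log" > "$log.errors"

# Process substitution: compare the output of two commands without temp files.
# Here: which paths were requested but never returned 200?
never_ok=$(comm -23 <(awk '{print $2}' "$log" | sort -u) \
                    <(awk '$3 == 200 {print $2}' "$log" | sort -u))

echo "$status_counts"
echo "$never_ok"
```

Each stage does one small job, so a pipeline like this can be built up and checked one command at a time.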
- Process management: ps aux, top, htop, kill, and systemctl are your first responders. Understand zombie processes and how to find them.
- Job control: Background tasks (&), nohup, screen, and tmux let you keep long-running jobs alive when your SSH session drops.
- History tricks: Ctrl+R reverse search, !$ for last argument, and !! for last command save minutes every day.
Real-world test: Can you find the top 5 memory-consuming processes on a box without htop? If you can’t do it with ps aux --sort=-%mem | head -n 6 (the header line plus five processes), practice until you can.
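One way to wrap that check in a reusable function is sketched below. The name top_mem is made up for this example, and the code assumes the procps version of ps shipped by most Linux distros, since --sort is not a POSIX option.

```bash
# top_mem N: print the ps header plus the N processes using the most memory.
# Assumes procps ps (--sort is not available in every ps implementation).
top_mem() {
  n=${1:-5}
  ps aux --sort=-%mem | head -n "$((n + 1))"   # +1 keeps the header row
}

top_mem 5
```

The extra line matters: the first row of ps aux output is the column header, not a process.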
Filesystem Deep Dive: Permissions, Inodes, and Links
DevOps means managing users, config files, and secrets. A misaligned permission is a security incident waiting to happen.
- Permissions beyond chmod 777: Understand setuid, setgid, and sticky bits. Why does /tmp have the sticky bit? Because you don’t want others deleting your temp files.
- Inodes and hard links: ls -li shows inode numbers. Hard links share the same inode, so deleting one doesn’t free the data until all links are gone. Symlinks break if the target moves. Know the difference when symlinking configs.
- ACLs (Access Control Lists): setfacl and getfacl let you assign permissions to multiple users without changing group ownership. Essential for shared directories in CI/CD pipelines.
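The link and sticky-bit behavior described above can be checked in a throwaway directory. This is a minimal sketch; every file and directory name in it is hypothetical.

```bash
# Demo in a scratch directory; all names here are made up for illustration.
demo=$(mktemp -d)
echo "db_host=10.0.0.5" > "$demo/app.conf"
ln "$demo/app.conf" "$demo/hard.conf"       # hard link: second name, same inode
ln -s "$demo/app.conf" "$demo/soft.conf"    # symlink: separate inode that stores a path

inode_of() { ls -i "$1" | awk '{print $1}'; }
orig_inode=$(inode_of "$demo/app.conf")
hard_inode=$(inode_of "$demo/hard.conf")

rm "$demo/app.conf"                         # remove the original name
hard_content=$(cat "$demo/hard.conf")       # data is still reachable via the hard link
if [ -e "$demo/soft.conf" ]; then soft_state=ok; else soft_state=dangling; fi

# Sticky bit: in a world-writable directory, only a file's owner
# (or the directory owner, or root) may delete or rename it.
mkdir "$demo/shared"
chmod 1777 "$demo/shared"
ls -ld "$demo/shared"                       # mode ends in "t", as it does on /tmp

echo "inodes: $orig_inode $hard_inode, symlink: $soft_state"
```

After the rm, the hard link still serves the data while the symlink points at nothing, which is the failure mode to expect when a symlinked config target gets moved.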
Networking: The Lifeline of Distributed Systems
Containers communicate over networks. If you can’t diagnose a dropped packet, you’re blind.
- Socket stats, not netstat: ss is faster and more detailed than netstat. Use ss -tuln to see listening ports.
- Connectivity checks: curl -v, telnet, nc (netcat), and nslookup/dig are your first tools. Use tcpdump for packet-level inspection, and learn to filter by port and host.
- Routing and interfaces: ip route show, ip addr, and ip link replace the old ifconfig and route. These are standard in modern distros and Docker.
Pro tip: If you’re debugging a connection timeout between services, tcpdump -i any port 80 will show you if the SYN packet ever arrives. That separates a network issue from an application issue.
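To narrow that capture to one host and to SYN packets only, a filter expression can be added. The helper below only builds and prints the command, so it can be reviewed before being run as root. The function name and the address 10.0.0.5 are placeholders, not values from any real setup.

```bash
# syn_capture_cmd HOST PORT: print (but do not run) a tcpdump command that
# shows only TCP SYN packets for one host and port. Running it needs root.
syn_capture_cmd() {
  printf "tcpdump -nn -i any 'host %s and tcp port %s and tcp[tcpflags] & tcp-syn != 0'\n" "$1" "$2"
}

# Hypothetical backend address and port.
syn_capture_cmd 10.0.0.5 80
```

The -nn flag turns off name and port resolution, which keeps the output readable and avoids extra DNS traffic during the capture.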
Process and Resource Troubleshooting
A CPU spike or OOM killer won’t announce itself politely. You need to hunt it.
- strace and ltrace: Trace system calls and library calls. When a service hangs, strace -p <PID> shows what it’s waiting on (file I/O, network socket, lock).
- lsof: List open files. Perfect for finding which process holds a lock on a config file or which PID is using a deleted file (eating disk space).
- /proc filesystem: Direct access to kernel data structures. cat /proc/meminfo, /proc/cpuinfo, and /proc/<PID>/fd give raw data without tools.
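A few of these lookups, sketched as plain reads from /proc on a Linux host. The variable and function names are illustrative, and $$ is simply the PID of the shell running the snippet.

```bash
# Read kernel data straight from /proc (Linux only).
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
cpu_count=$(grep -c '^processor' /proc/cpuinfo)
fd_count=$(ls "/proc/$$/fd" | wc -l)         # open file descriptors of this shell

echo "MemTotal: ${mem_total_kb} kB, CPUs: ${cpu_count}, open fds: ${fd_count}"

# list_zombies: print the ps header plus any process whose state starts with Z.
list_zombies() {
  ps -eo pid,ppid,stat,comm | awk 'NR == 1 || $3 ~ /^Z/'
}
list_zombies
```

A zombie cannot be killed directly; the PPID column shows which parent has failed to reap it.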
Systemd and Service Management
Modern Linux runs on systemd. Hating it doesn’t make it go away.
- Unit files: Know the structure of .service, .timer, and .socket files. They replace SysV init scripts and cron for many modern setups.
- Journalctl: journalctl -u nginx.service -f tails logs for a specific service. journalctl --since "1 hour ago" saves you from scrolling.
- Targets vs runlevels: systemctl get-default shows the target the machine boots into, and systemctl isolate multi-user.target switches a running system from graphical to headless mode.
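For a sense of what a .service and .timer pair looks like, the sketch below writes a minimal example into a scratch directory instead of touching /etc. The unit names, the script path /usr/local/bin/log-cleanup.sh, and the daily schedule are all assumptions made for illustration.

```bash
# Write a hypothetical oneshot service and its timer into a scratch directory.
units=$(mktemp -d)

cat > "$units/log-cleanup.service" <<'EOF'
[Unit]
Description=Compress old application logs (example)

[Service]
Type=oneshot
ExecStart=/usr/local/bin/log-cleanup.sh
EOF

cat > "$units/log-cleanup.timer" <<'EOF'
[Unit]
Description=Run log-cleanup.service daily (example)

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF

ls "$units"
# To install for real (as root), something like:
#   cp "$units"/log-cleanup.* /etc/systemd/system/
#   systemctl daemon-reload
#   systemctl enable --now log-cleanup.timer
#   systemctl list-timers log-cleanup.timer
```

A timer activates the service with the same base name by default, and Persistent=true asks systemd to catch up on a run that was missed while the machine was off.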
Disk Management and Filesystem Health
Docker images, logs, and databases eat disk space. Full disks are one of the most common causes of “mysterious” application failures.
- df -h and du -sh * are your quick health checks.
- lsblk and blkid: List block devices and their UUIDs, which is critical for mounting volumes reliably.
- fdisk/parted and mkfs: Partitioning and formatting. With cloud instances, you often attach extra volumes and need to mount them manually.
- mount and /etc/fstab: Automount on reboot. Use UUIDs instead of device names (/dev/sda) because names can change.
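Real-world example: df says the disk is full, but du can’t find the files using the space? Check lsof | grep deleted: a process is probably still holding a deleted file open. Restart the process to free the space. If you instead get “No space left on device” while df -h still shows free space, run df -i to check whether the filesystem has run out of inodes.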
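When the disk really is filling up, the next question is where. Below is a small sketch of a helper for that; the function name biggest and the scratch data are made up for the demo. It uses du -sk with a numeric sort so it does not depend on GNU sort -h.

```bash
# biggest DIR [N]: list the N largest entries directly under DIR, sizes in KiB.
biggest() {
  dir=${1:-.}
  n=${2:-5}
  du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# Self-contained demo on scratch data (names are hypothetical).
scratch=$(mktemp -d)
mkdir "$scratch/images" "$scratch/logs"
dd if=/dev/zero of="$scratch/images/layer.bin" bs=1024 count=2048 2>/dev/null
dd if=/dev/zero of="$scratch/logs/app.log" bs=1024 count=16 2>/dev/null

biggest "$scratch" 1     # expect the images directory to come first
```

On a real host, start at a mount point that df -h reports as nearly full and work downward one directory level at a time.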
Security Basics: Users, Groups, and SSH
You’re not managing a single server anymore; you’re managing access to a fleet.
- useradd, usermod, groupadd, passwd: Create and manage users with minimal privileges. Never log in as root; use sudo.
- SSH key management: ssh-keygen, ssh-copy-id, and ~/.ssh/authorized_keys. Understand ssh-agent for forwarding keys safely.
- sudoers file: Control exactly which commands each user can run. visudo prevents syntax errors that lock you out.
- Fail2ban and ufw/iptables: Basic firewall. Rate-limit SSH attempts. Every public-facing machine needs this.
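As one example of restricting a user to a single command, the sketch below drafts a sudoers drop-in in a scratch directory and syntax-checks it before it goes anywhere near /etc. The user deploy, the file name, and the nginx restart command are hypothetical.

```bash
# Draft a sudoers drop-in in a scratch directory, then syntax-check it.
workdir=$(mktemp -d)
dropin="$workdir/deploy-nginx"

cat > "$dropin" <<'EOF'
# Allow deploy to restart nginx, and nothing else, without a password.
deploy ALL=(root) NOPASSWD: /usr/bin/systemctl restart nginx.service
EOF
chmod 0440 "$dropin"

# visudo -c checks syntax only; -f points it at the draft instead of /etc/sudoers.
if command -v visudo >/dev/null 2>&1; then
  visudo -cf "$dropin" || echo "syntax check failed: do not install this file"
else
  echo "visudo not found; skipping syntax check"
fi
```

Once the check passes, the file would normally be copied into /etc/sudoers.d/ as root, owned by root with mode 0440.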
The Automation Mindset
A good DevOps engineer doesn’t repeat manual steps. If you SSH into a box to fix something twice, you should write a script or Ansible playbook.
But that starts here: knowing Linux inside out means your automation scripts are tight, use exit codes correctly, handle signals, and don’t leave zombie processes.
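A minimal bash skeleton showing those habits might look like the following. The run_step helper and the step names are illustrative, not a standard.

```bash
#!/usr/bin/env bash
set -euo pipefail          # fail on errors, unset variables, and broken pipelines

workdir=$(mktemp -d)
cleanup() { rm -rf "$workdir"; }
trap cleanup EXIT          # runs whenever the script exits
trap 'exit 130' INT        # Ctrl+C: 128 + signal 2
trap 'exit 143' TERM       # kill <PID>: 128 + signal 15

# run_step NAME CMD...: run a command and report its exit code instead of hiding it.
run_step() {
  local name=$1; shift
  local rc=0
  "$@" || rc=$?
  echo "$name: exit $rc"
  return "$rc"
}

run_step "write marker" touch "$workdir/marker"
run_step "expected failure" false || echo "handled a non-zero exit"

# Wait for background jobs so none are left behind unreaped.
sleep 0.1 &
wait
```

Turning the signals into an exit is what lets the EXIT trap do the cleanup, so temporary files disappear whether the script finishes, fails, or is interrupted.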
Final test: Without looking it up, write a one-liner that finds all .log files older than 7 days in /var/log, compresses them with gzip, and deletes the originals. If you can do that, you’re already ahead of most.
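If you want to check your answer afterwards, here is one possible solution, run against a scratch directory standing in for /var/log. The file names are made up, and touch -d with a relative date assumes GNU coreutils.

```bash
# Scratch directory standing in for /var/log (file names are hypothetical).
logdir=$(mktemp -d)
echo "recent entry" > "$logdir/new.log"
echo "old entry" > "$logdir/old.log"
touch -d '10 days ago' "$logdir/old.log"     # GNU touch: backdate the mtime

# The one-liner: -mtime +7 matches files whose age in whole days is greater than 7.
# gzip replaces each original with a .gz file, so no separate delete step is needed.
find "$logdir" -type f -name '*.log' -mtime +7 -exec gzip {} +

ls "$logdir"
```

Ending -exec with + passes many files to a single gzip invocation, which is faster than starting one process per file.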