Master Linux Text Processing: grep, cut, sort, uniq, diff, and More
This guide provides a comprehensive overview of essential Linux text‑processing commands—including grep, cut, sort, uniq, tee, diff, paste, and tr—detailing their key options, usage examples, and practical tips for efficiently searching, extracting, sorting, comparing, merging, and transforming file contents.
Command-line Text Processing Tools
grep: Search for patterns
The grep command searches files for lines matching a given pattern or regular expression. If no file is specified or the filename is "-", it reads from standard input.
-i: ignore case
-v: invert match, print non-matching lines
-n: show line numbers of matches
-r: recursively search sub-directories
-l: print only matching file names
-c: print only the count of matching lines
-A N: print N lines after each match
-B N: print N lines before each match
-C N: print N lines before and after each match
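As a minimal sketch of how the counting, inverting, and file-listing options behave, the following runs grep against a small made-up passwd-style file. The file contents, the temporary directory, and the variable names are illustrative assumptions, not taken from a real system.

```shell
# Hypothetical sample data standing in for a real passwd file
tmp=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin' \
  'alice:x:1000:1000::/home/alice:/bin/bash' > "$tmp/passwd"

# -c prints the number of matching lines instead of the lines themselves
bash_count=$(grep -c 'bash$' "$tmp/passwd")

# -v inverts the match, -n prefixes each printed line with its line number
non_bash=$(grep -vn 'bash$' "$tmp/passwd")

# -r searches the directory recursively, -l prints only the matching file names
matching_files=$(grep -rl '^ftp' "$tmp")

echo "$bash_count"
echo "$non_bash"
echo "$matching_files"
rm -rf "$tmp"
```

Short options can be combined, so `-vn` here is equivalent to `-v -n`.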
grep --color=auto passwd # highlight matches
alias grep='grep --color=auto' # create alias
echo "alias grep='grep --color=auto'" >> /etc/bashrc # persist alias (needs write access to /etc/bashrc)
source /etc/bashrc # reload configuration
grep 'a*' passwd # 'a*' also matches zero occurrences of "a", so every line is printed
vim passwd
cat passwd | grep 'bash$' # lines ending with "bash"
cat passwd | grep -ni 'bash$' # case‑insensitive with line numbers
cat passwd | grep -niv 'bash$' # invert match, case‑insensitive
cat passwd | grep -nB 2 'ftp' # match line and 2 lines before
cat passwd | grep -nA 2 '^ftp' # match line and 2 lines after
cat passwd | grep -nC 2 '^ftp' # match line and 2 lines surrounding
cat passwd | grep -o '^ftp' # print only the matching part
cut: Extract columns or character ranges
The cut command extracts specific fields or character positions from each line of input.
-c: select by character position
-d: specify a custom field delimiter (default is TAB)
-f: select fields, split on the -d delimiter
# Show the content of passwd file
cat passwd
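To make the field and character numbering concrete, here is a small sketch that applies cut to a single passwd-style line. The line itself and the variable names are made up for illustration.

```shell
# Hypothetical passwd-style record
line='alice:x:1000:1000::/home/alice:/bin/bash'

# Fields 1 and 7, split on ":"; selected fields are rejoined with the same delimiter
user_shell=$(printf '%s\n' "$line" | cut -d: -f1,7)

# A field range: fields 3 through 4
ids=$(printf '%s\n' "$line" | cut -d: -f3-4)

# Character positions 1 through 5 (no delimiter involved)
first_five=$(printf '%s\n' "$line" | cut -c1-5)

echo "$user_shell"
echo "$ids"
echo "$first_five"
```

Fields and characters are both numbered starting from 1.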
# Extract the first column (usernames)
cat passwd | cut -d":" -f1
# Extract first and seventh columns
cat passwd | cut -d":" -f1,7
# Extract characters 1‑5 of each line
cat passwd | cut -c 1-5
# Extract characters from position 10 to end
cat passwd | cut -c 10-
sort: Sort file contents
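The sort command orders the lines of text files.
-b: ignore leading blanks
-n: sort numerically
-u: output unique lines only
-o <file>: write output to the specified file
-r: reverse order
-t <char>: use <char> as the field separator
-k <n>: sort on the key starting at field <n>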
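The difference between text and numeric ordering is easiest to see side by side. The sketch below uses a tiny made-up data file (names and values are assumptions) and restricts the key to field 2 with `-k2,2`.

```shell
# Hypothetical "name:value" records
tmp=$(mktemp -d)
printf '%s\n' 'c:10' 'a:9' 'b:100' > "$tmp/data.txt"

# Without -n, field 2 is compared as text, so "10" < "100" < "9"
lexical=$(sort -t: -k2,2 "$tmp/data.txt")

# With the n modifier, field 2 is compared as a number
numeric=$(sort -t: -k2,2n "$tmp/data.txt")

# Reverse numeric order, written to a file with -o instead of standard output
sort -t: -k2,2nr -o "$tmp/sorted.txt" "$tmp/data.txt"
descending=$(cat "$tmp/sorted.txt")

echo "$lexical"
echo "$numeric"
echo "$descending"
rm -rf "$tmp"
```

Writing `-k2,2` rather than `-k2` ends the key at field 2, so later fields do not influence the comparison.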
# Sort by the third field numerically (field delimiter ':')
cat passwd | sort -n -t":" -k3
# Sort by the third field in reverse numeric order
cat passwd | sort -nr -t":" -k3
# Remove duplicate lines
cat 3.txt | sort -u
uniq: Remove or report duplicate lines
The uniq command filters adjacent duplicate lines, so it is usually run on sorted input.
-c or --count: prefix each line with its number of occurrences
-d or --repeated: show only duplicated lines, one copy each
-u or --unique: show only lines that are not repeated
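Because uniq only compares neighbouring lines, its result depends on whether the input is sorted. The following sketch shows the difference on a made-up word list (the file name and contents are assumptions).

```shell
# Hypothetical input where one "apple" is separated from the others
tmp=$(mktemp -d)
printf '%s\n' apple banana apple apple > "$tmp/fruits.txt"

# Unsorted input: only the adjacent pair at the end is collapsed
adjacent_only=$(uniq "$tmp/fruits.txt")

# Sorted input: all duplicates are adjacent, so the counts are complete
counts=$(sort "$tmp/fruits.txt" | uniq -c)

# -d keeps one copy of each repeated line, -u keeps lines that occur once
dups=$(sort "$tmp/fruits.txt" | uniq -d)
singles=$(sort "$tmp/fruits.txt" | uniq -u)

echo "$adjacent_only"
echo "$counts"
echo "$dups"
echo "$singles"
rm -rf "$tmp"
```

The width of the count column printed by `uniq -c` varies between implementations, so scripts that parse it should split on whitespace rather than fixed positions.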
# Show unique lines (adjacent duplicates removed)
cat 3.txt | uniq
# Count occurrences of each line (sort first so that duplicates are adjacent)
sort 3.txt | uniq -c
# Show only duplicated lines
cat 3.txt | uniq -d
tee: Duplicate input to a file
The tee command reads from standard input and writes to both standard output and one or more files.
-a or --append: append to the file instead of overwriting
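Since tee passes its input through unchanged, it can sit in the middle of a pipeline to save an intermediate result. This is a minimal sketch; the temporary directory and variable names are assumptions made for the example.

```shell
tmp=$(mktemp -d)

# tee saves the original text to abc.txt while tr receives the same text downstream
upper=$(echo "helloworld" | tee "$tmp/abc.txt" | tr 'a-z' 'A-Z')

# -a appends to the file; standard output is discarded here because only the file matters
echo "second line" | tee -a "$tmp/abc.txt" > /dev/null

content=$(cat "$tmp/abc.txt")

echo "$upper"
echo "$content"
rm -rf "$tmp"
```

Without `-a`, the second tee call would have overwritten abc.txt and left only "second line" in it.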
echo "helloworld" | tee abc.txt # write to file and screen
echo "helloworld" | tee -a abc.txt # append to file
cat abc.txt
cat vsftpd.conf | grep -v "^#" | grep -v "^$" | tee 4.txt
diff: Compare files line by line
The diff command shows differences between two files or directories.
-c: context format
-u: unified format
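In scripts, the exit status of diff is often as useful as its output: 0 means the files are identical, 1 means they differ, and 2 signals an error. The sketch below checks that status and picks the changed lines out of a unified diff. The two sample files and all variable names are made up for illustration.

```shell
# Two hypothetical files that differ on the second line
tmp=$(mktemp -d)
printf '%s\n' one two three > "$tmp/1.txt"
printf '%s\n' one TWO three > "$tmp/2.txt"

# Capture the exit status without aborting when the files differ
status=0
diff -u "$tmp/1.txt" "$tmp/2.txt" > "$tmp/file.patch" || status=$?

# In unified format, removed lines start with "-" and added lines with "+";
# the patterns skip the "---" and "+++" header lines
removed=$(grep '^-[^-]' "$tmp/file.patch")
added=$(grep '^+[^+]' "$tmp/file.patch")

echo "$status"
echo "$removed"
echo "$added"
rm -rf "$tmp"
```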
# Compare two files
diff 1.txt 2.txt
# Unified diff
diff -u 1.txt 2.txt
# Context diff
diff -c 1.txt 2.txt
# Compare two directories
diff test1 test2
# Create a patch file
diff 1.txt 2.txt > file.patch
# Apply the patch
patch 1.txt file.patch
paste: Merge lines of files side by side
The paste command joins corresponding lines of the given files.
-d<delimiter> or --delimiters=<delimiter>: use the specified delimiter instead of TAB
-s or --serial: paste files sequentially rather than in parallel
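The difference between parallel and serial mode can be sketched with two short made-up files (file names and contents are assumptions for this example).

```shell
tmp=$(mktemp -d)
printf '%s\n' a b c > "$tmp/names.txt"
printf '%s\n' 1 2 3 > "$tmp/nums.txt"

# Parallel: line N of each file is joined into output line N
parallel=$(paste -d: "$tmp/names.txt" "$tmp/nums.txt")

# Serial: all lines of one file are joined into a single output line
serial=$(paste -s -d, "$tmp/names.txt")

echo "$parallel"
echo "$serial"
rm -rf "$tmp"
```

Serial mode with a delimiter is a common way to turn a column of values into a comma-separated list.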
# Parallel paste
paste 3.txt 4.txt
# Serial paste (each file as a single line)
paste -s 3.txt 4.txt
# Use "|" as delimiter
paste -d"|" 3.txt 4.txt
# Use ":" as delimiter
paste -d":" 3.txt 4.txt
tr: Translate or delete characters
The tr command reads from standard input, translates or deletes characters, and writes the result to standard output.
-c, --complement: use the complement of the set of characters
-d, --delete: delete characters in the set
-s, --squeeze-repeats: replace repeated characters with a single instance
-t, --truncate-set1: truncate SET1 to the length of SET2
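A point that is easy to miss is that tr works on individual characters, never on words. The sketch below illustrates this along with the delete, squeeze, and complement options. The input strings and variable names are made up, and the two sets in the first command are deliberately the same length so the mapping does not depend on how an implementation pads a shorter set.

```shell
# Character-by-character mapping: Y->N, E->O, S->!
mapped=$(echo "YES, SEE?" | tr 'YES' 'NO!')

# Range mapping: lowercase to uppercase
upper=$(echo "hello" | tr 'a-z' 'A-Z')

# -d deletes every character in the set
letters=$(echo "a1b2c3" | tr -d '0-9')

# -s squeezes runs of a repeated character from the set into one
squeezed=$(echo "aaa  bbb" | tr -s 'a-z ')

# -c with -d deletes everything NOT in the set (the newline is kept on purpose)
digits=$(echo "a1b2c3" | tr -cd '0-9\n')

echo "$mapped"
echo "$upper"
echo "$letters"
echo "$squeezed"
echo "$digits"
```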
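# Map characters, not words: Y becomes N, E becomes O, and S also becomes O (GNU tr pads the shorter set with its last character)
cat 1.txt | tr "YES" "NO"
# With a trailing space in the second set, S becomes a space instead
cat 1.txt | tr "YES" "NO "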
# Convert lowercase to uppercase
cat 1.txt | tr "a-z" "A-Z"
# Delete all lowercase letters
cat passwd | tr -d "a-z"
# Delete all uppercase letters
cat passwd | tr -d "A-Z"
# Squeeze repeated characters
tr -s "a-z" < passwd
MaGe Linux Operations
