1. Pattern Matching Utilities
A subset of text processing utilities/tools specifically designed to search for patterns in text using regular expressions or other pattern-based logic.
Examples: sed (uses patterns for substitution or deletion), awk (uses patterns for selecting and processing data), grep (explicit pattern searching).
sed (stream editor)
Name origin: Stream Editor
man sed: sed - stream editor for filtering and transforming text
Primary Purpose: Processes text line by line and applies transformations such as substitution, deletion, or insertion. Most often used for replacing text in files.
Syntax
sed [options] 'command' filename
Use options like -n to suppress default output or -i for in-place editing.
Common commands include: s (substitution), p (print), d (delete), a (append), i (insert).
Examples
sed 's/old/new/g' file.txt # Substitute all occurrences of "old" with "new" (does not modify the original file)
sed -i 's/old//' file.txt # Substitute the first "old" on each line with "" (delete it) in place (modifies the file).
sed '3,5s/old/new/' file.txt # Substitute "old" with "new" in lines 3 to 5
echo "Hello World" | sed 's/World/Universe/' # print: Hello Universe
sed -n '5,30p' file.txt # print lines 5 to 30
sed -n '/foo/p' file.txt # print lines containing foo
sed -n '/foo/Ip' file.txt # print lines containing foo, case insensitive
sed -n '/foo/!p' file.txt # print lines not containing foo
sed -i '2d' file.txt # Delete line 2 and overwrite the file.txt file
sed '/foo/d' file.txt # delete lines matching foo
sed G file.txt # append a newline after each line
sed '/foo/a extra-line' file.txt # append a line with "extra-line" after lines containing foo
sed '5 i apple' file.txt # insert apple before line 5
sed -e 's/foo/bar/' -e '/delete/d' file.txt # Apply multiple commands using -e
sed -f script.sed file.txt # run sed commands from a script file, e.g., script.sed containing: s/old/new/
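The address, print, and multi-command forms above can be tried without touching a real file. The following is a minimal sketch; the temporary file and its four lines are made up for illustration.

```shell
# Hypothetical sample input; the file contents are invented for this sketch.
tmp=$(mktemp)
printf '%s\n' 'old one' 'keep old old' 'foo line' 'last old' > "$tmp"

# Replace every "old" with "new", but only on lines 1-2 (address range + s///g).
ranged=$(sed '1,2s/old/new/g' "$tmp")

# Print only the lines that contain "foo" (-n suppresses the default output).
only_foo=$(sed -n '/foo/p' "$tmp")

# Chain two commands with -e: substitute the first "old", then delete "foo" lines.
chained=$(sed -e 's/old/new/' -e '/foo/d' "$tmp")

printf '%s\n' "$ranged" "---" "$only_foo" "---" "$chained"
rm -f "$tmp"
```

Because none of these calls use -i, the temporary file itself is never modified; each result is captured from standard output.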
awk (pattern scanning and processing language)
Name origin: Aho, Weinberger, and Kernighan
man awk: gawk - pattern scanning and processing language
Primary Purpose: awk is a language for processing structured (field-based) text. It is more powerful than sed: it keeps sed's pattern-then-action style but uses C-like syntax for the actions, and it can treat a file like a small table of records and fields.
Syntax
awk 'pattern {action}' filename
pattern: A condition or expression that determines which lines of the input file are processed. If omitted, the action is applied to all lines.
{action}: A block of code to execute for each line that matches the pattern. Actions can include printing, calculations, string manipulations, etc. If omitted, the default action is to print the line.
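The two defaults described above (a missing action prints the line, a missing pattern matches every line) can be checked with a short sketch. The three input lines are hypothetical and are generated inline.

```shell
# Hypothetical three-line input, generated inline so the sketch is self-contained.
input=$(printf '%s\n' 'Aaron 45' 'Bob 70' 'Chuck 75')

# Pattern only: the action defaults to printing the matching line.
pattern_only=$(printf '%s\n' "$input" | awk '$2 >= 70')

# Action only: it runs for every line. NR is awk's built-in record (line) counter.
action_only=$(printf '%s\n' "$input" | awk '{ print NR ": " $1 }')

printf '%s\n' "$pattern_only" "$action_only"
```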
Examples
$ awk '{print}' datafile # Quote the program so the shell passes it to awk unchanged
Aaron 45 55 60 90
Bob 70 75 88 100
Chuck 75 80 85 100
Donald 80 70 70 95
$
$ awk '/Bob/ {print}' datafile # /Bob/: A pattern that matches lines containing "Bob".
Bob 70 75 88 100
$
$ # ~: Checks if a string or field matches a pattern (regular expression)
$ # The !~ operator is used to check for non-matching patterns
$ awk '$1 ~ /Bo/ {print}' datafile # Print lines where the first field contains "Bo"
Bob 70 75 88 100
$ cat prog.awk
BEGIN { print "Starting to read" }
{ print }
END {print "Finished reading" }
$
$ awk -f prog.awk datafile
Starting to read
Aaron 45 55 60 90
Bob 70 75 88 100
Chuck 75 80 85 100
Donald 80 70 70 95
Finished reading
$ awk '{ print $1, $2 }' datafile
Aaron 45
Bob 70
Chuck 75
Donald 80
$
$ awk '$5 > 99 { print $0 }' datafile # Print lines where the 5th field > 99. $0 is the entire line; it is the default argument of print, so it can be omitted
Bob 70 75 88 100
Chuck 75 80 85 100
$
$ awk 'BEGIN { print "Names" } { print $1 } END { } ' datafile
Names
Aaron
Bob
Chuck
Donald
$ cat prog2.awk
# AWK supports C-like expressions
BEGIN { print "Average"; total = 0; count = 0; }
{ total = total + $2; ++count; }
END { avg = total / count
print total, avg }
$
$ awk -f prog2.awk datafile
Average
270 67.5
$ cat prog3.awk
BEGIN {
print "Total for Bob"
}
# /Bob/ is a pattern that matches any line containing the string "Bob"
/Bob/ {
sum = $2 + $3 + $4
print sum
}
$
$ awk -f prog3.awk datafile
Total for Bob
233
$ # -F: Specifies the field separator (whitespace by default)
$ # -F',' sets the delimiter to a comma (useful for CSV files).
$ cat data.csv
ID,Name,Score
1,Bob,85
2,Alice,90
3,John,78
4,Mary,92
$ awk -F',' '{print $2}' data.csv
Name
Bob
Alice
John
Mary
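As a sketch of how -F combines with patterns and an END block, the snippet below recreates the data.csv contents shown above in a temporary file, skips the header row with NR > 1, and computes the average score. The variable names are illustrative only.

```shell
# Recreate the data.csv contents shown above in a temporary file.
csv=$(mktemp)
printf '%s\n' 'ID,Name,Score' '1,Bob,85' '2,Alice,90' '3,John,78' '4,Mary,92' > "$csv"

# NR > 1 skips the header row; the END block runs once after the last line.
average=$(awk -F',' 'NR > 1 { total += $3; count++ } END { if (count > 0) print total / count }' "$csv")

# Combine the header skip with a numeric condition to list names scoring 90 or more.
top=$(awk -F',' 'NR > 1 && $3 >= 90 { print $2 }' "$csv")

echo "Average: $average"
echo "Top scorers:"
echo "$top"
rm -f "$csv"
```

Without the NR > 1 guard, the header line would be counted as a record and its non-numeric Score field would be treated as 0, which would lower the average.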
grep (print lines that match patterns)
Name origin: Global Regular Expression Print
man grep: grep, egrep, fgrep, rgrep - print lines that match patterns
Primary Purpose: Scans files or streams for lines matching specific patterns or regular expressions.
Syntax
grep [options] pattern [file...]
[options]: Flags to modify the behavior of grep.
pattern: A string or regular expression to search for.
[file...]: The file(s) to search in. If no file is provided, grep reads from standard input.
Examples
$ cat filename
hello world
Hello again
Say hi to the world
helloworld
$
$ grep 'hello' filename # Print lines containing "hello"
hello world
helloworld
$
$ grep -i "hello" filename # Print lines containing "hello", Case-insensitive
hello world
Hello again
helloworld
$
$ grep -n 'hello' filename # Print the line numbers where "hello" occurs
1:hello world
4:helloworld
$
$ grep -v 'hello' filename # Print lines that do not contain "hello"
Hello again
Say hi to the world
$
$ grep -w 'hello' filename # Match only the whole word "hello" (not substrings like "helloworld")
hello world
$
$ # grep -r 'hello' /path/to/directory # Search for "hello" in all files under a directory
$ grep -r 'hello' ./
.//script.sed:s/hi/hello/
.//filename:hello world
.//filename:helloworld
$
$ grep -c 'hello' filename # Count how many lines contain "hello"
2
$
$ grep -E 'hello|world' filename # Match lines with a pattern using extended regular expressions
hello world
Say hi to the world
helloworld
$
$ grep 'hello' *.txt # Search for a Pattern in Multiple Files. Search for "hello" in all .txt files
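grep also reports its result through its exit status (0 when at least one line matches, 1 when none does), which makes it usable in conditionals. A minimal sketch, reusing the sample text from the filename example and reading it from standard input rather than a file:

```shell
# Same sample text as the "filename" example above, held in a variable.
text=$(printf '%s\n' 'hello world' 'Hello again' 'Say hi to the world' 'helloworld')

# With no file argument, grep reads standard input; -c counts matching lines.
count=$(printf '%s\n' "$text" | grep -c 'hello')

# -q prints nothing; the exit status (0 = match found, 1 = no match) drives the if.
if printf '%s\n' "$text" | grep -q -w 'world'; then
  found="yes"
else
  found="no"
fi

echo "lines with hello: $count, whole word world present: $found"
```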
2. Shell
What It Is:
- A command-line interpreter that provides an interface to interact with the operating system.
- Processes user commands and executes them (e.g., running programs, managing files).
Key Features:
- Provides built-in commands such as cd and kill, and runs standard utilities for file management (ls, rm), process management (ps), and more.
- Can run external programs like sed, awk, and grep.
Common Shells:
- Bash (Bourne Again Shell): The most common shell for Linux systems.
- Zsh: An extended shell with additional features.
- Fish: A user-friendly command line shell.
Bash (Bourne Again Shell)
Bash is a specific implementation of a shell, widely used on Linux and macOS. Bash is an extension of the original Bourne Shell (sh) with additional features.
Key Features:
- Supports advanced scripting features like arrays, associative arrays, and functions.
- Includes built-in commands and utilities.
- Offers command-line editing, job control, and history management.
Example Use Case: Using Bash as an interactive shell:
$ echo "Hello, World!"
Hello, World!
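The scripting features listed above (arrays, associative arrays, and functions) might look like the following in practice. This is a sketch that assumes Bash 4 or newer, since associative arrays are not available in older versions; describe is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Indexed array: an ordered list of values.
tools=(sed awk grep)

# Associative array (Bash 4 or newer): maps string keys to values.
declare -A origin
origin[sed]="Stream Editor"
origin[awk]="Aho, Weinberger, and Kernighan"
origin[grep]="Global Regular Expression Print"

# Function: describe is a hypothetical helper that looks a tool up in the map.
describe() {
  local name=$1
  echo "$name: ${origin[$name]}"
}

# Iterate over the indexed array and describe each tool.
for tool in "${tools[@]}"; do
  describe "$tool"
done
```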
3. Shell Scripting
What It Is:
- Writing scripts (automated sequences of commands) to be executed by a shell.
- Combines shell commands, utilities (like sed, awk, and grep), and control flow structures (like loops and conditionals).
Key Features:
- Enables automation of repetitive tasks.
- Supports variables, loops, conditionals, and functions.
Example: Find all .log files, replace the word "ERROR" with "WARNING", and save the output to new files (filename_processed.log).
Step1: Create a script.sh file and give it execute permission
$ touch script.sh
$ chmod +x ./script.sh
Step2: Edit the script.sh file
#!/bin/bash
for file in *.log; do
sed 's/ERROR/WARNING/g' "$file" > "${file%.log}_processed.log"
done
Step3: Execute the script.sh file
$ ./script.sh
In this workflow:
- sed processes the text.
- Shell provides the environment to run commands.
- Shell Scripting with Bash combines commands into an automated workflow.
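A slightly more defensive variant of the script above is sketched below. It wraps the loop in a function, skips the case where no .log file matches, and ignores files produced by an earlier run. The scratch directory, the two sample log files, and the process_logs name are all assumptions made for illustration.

```shell
#!/bin/bash
# Work in a scratch directory with two hypothetical log files.
workdir=$(mktemp -d)
printf '%s\n' 'ERROR disk full' 'INFO ok' > "$workdir/app.log"
printf '%s\n' 'ERROR timeout' > "$workdir/db.log"

# process_logs is a hypothetical helper name; it wraps the loop from script.sh
# and prints how many files it processed.
process_logs() {
  local dir=$1 file processed=0
  for file in "$dir"/*.log; do
    # If nothing matches, the glob stays literal; skip it instead of failing.
    [ -e "$file" ] || continue
    # Skip outputs of an earlier run, since they also end in .log.
    case $file in *_processed.log) continue ;; esac
    sed 's/ERROR/WARNING/g' "$file" > "${file%.log}_processed.log"
    processed=$((processed + 1))
  done
  echo "$processed"
}

count=$(process_logs "$workdir")
first_line=$(head -n 1 "$workdir/app_processed.log")
echo "Processed $count file(s); first line of app_processed.log: $first_line"
rm -rf "$workdir"
```

The check for _processed.log matters on repeated runs: without it, a second run would also rewrite the earlier outputs into files named like app_processed_processed.log.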
POSIX
POSIX (Portable Operating System Interface) is a set of standard operating system interfaces based on the Unix operating system.
As a standard, POSIX helps maintain compatibility between operating systems. POSIX defines both the system and user-level application programming interfaces (APIs), along with command line shells, and utility interfaces, for software compatibility (portability) with variants of Unix and other operating systems.
A POSIX operating system is one that adheres to the POSIX standards. Because those standards define the application programming interfaces (APIs), the shell, and the utilities, applications written against them can run on different Unix-like platforms with minimal modification.
Operating systems that are POSIX-certified or largely POSIX-compliant include Linux, macOS, FreeBSD, OpenBSD, and Oracle Solaris.
Windows is not a POSIX operating system, but there are several ways to use POSIX on Windows, for example:
- Windows Subsystem for Linux (WSL): A compatibility layer that allows developers to run Linux binary executables on Windows 10 and 11. WSL allows developers to access a Linux environment that can run POSIX software on Windows files.
- PowerShell: A shell that blends Windows and Unix conventions. Its language grammar was modeled on the IEEE POSIX 1003.2 standard for Unix shells, but it is not itself a POSIX-compliant shell.