Unit 4. Advanced Shell Programming

INDEX

4.1. Splitting, Comparing, Sorting, Merging & Ordering Files.
4.2. Filtering utilities: grep, sed etc.
4.3. awk utility

🔧 Advanced Shell Programming: File Operations in Linux

  • When working with large sets of data, files, or logs, shell programming provides powerful tools to split, compare, sort, merge, and order files.
  • These operations help automate data processing and improve productivity.

📂 1. Splitting Files

The split command divides a large file into smaller chunks.

✅ Syntax:

  • split [options] file prefix
✅ Example:
  • split -l 1000 bigfile.txt part_
🔹 This splits bigfile.txt into smaller files with 1000 lines each, named part_aa, part_ab, etc.
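Splitting by size and then reassembling the pieces is also common. The lines below are a minimal sketch, assuming a hypothetical binary file named bigfile.bin; -b (split by bytes) is a standard split option.
  • split -b 10M bigfile.bin chunk_        # split into 10 MB pieces named chunk_aa, chunk_ab, ...
  • cat chunk_* > restored.bin             # reassemble; the shell expands chunk_* in order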

🆚 2. Comparing Files

Linux offers multiple tools to compare files:

a. diff

Compares two files line-by-line.
  • diff file1.txt file2.txt
🔹 Outputs the differences between the files.

b. cmp

Compares files byte-by-byte.
  • cmp file1.txt file2.txt
🔹 Shows the first mismatch location.

c. comm

Compares two sorted files line-by-line.
  • comm file1.txt file2.txt
🔹 Outputs three columns: lines unique to the first file, lines unique to the second file, and lines common to both.
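The -1, -2, and -3 options suppress the corresponding columns, which makes comm useful in scripts. The file names below are placeholders.
  • comm -12 file1.txt file2.txt           # show only the lines common to both sorted files
  • comm -3 file1.txt file2.txt            # show only the lines that differ between the files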

🔃 3. Sorting Files

The sort command arranges the contents of a file based on alphabetical or numerical order.

✅ Syntax:

  • sort [options] filename

✅ Examples:

  • Sort alphabetically:
    • sort names.txt
  • Sort numerically:
    • sort -n numbers.txt
  • Reverse sort:
    • sort -r data.txt
  • Sort by a specific column:
    • sort -k 2 employees.txt
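Column sorting is often combined with a field delimiter. A small sketch, assuming a hypothetical comma-separated employees.csv whose third field is a numeric salary:
  • sort -t',' -k3 -n employees.csv        # sort numerically on the third comma-separated field
  • sort -t',' -k3 -nr employees.csv       # same, but highest values first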

🔀 4. Merging Files

You can concatenate files with cat, merge already-sorted files with sort, or use paste for side-by-side merging.

✅ a. cat – Sequential merge (concatenation):

  • cat file1.txt file2.txt > merged.txt

✅ b. paste – Horizontal merge:

  • paste file1.txt file2.txt
🔹 Joins corresponding lines of each file with a tab.
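When the inputs are already sorted, sort -m merges them without re-sorting, and paste -d changes the join character. A brief sketch with placeholder file names:
  • sort -m sorted1.txt sorted2.txt > merged_sorted.txt   # merge two sorted files, preserving order
  • paste -d',' file1.txt file2.txt                       # join corresponding lines with a comma instead of a tab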

📊 5. Ordering Files

Ordering refers to organizing data logically, especially for reports or processing.
  • Alphabetical order:

    • sort names.txt > ordered_names.txt
  • Numeric order by field:

    • sort -k 3 -n employee_data.txt
  • Unique lines only:

    • sort -u file.txt
  • Sort and remove duplicates (case insensitive):

    • sort -fu file.txt

Using awk and cut for Custom File Handling.

  • Extract specific columns:

    • cut -d',' -f1,3 data.csv
  • Conditional sorting/filtering:

    • awk '$3 > 50' marks.txt | sort -k 3 -n
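These tools combine naturally in a single pipeline. The line below is a hedged sketch, assuming a whitespace-separated marks.txt whose third field is a numeric score; head and the output file name are illustrative additions:
  • awk '$3 > 50' marks.txt | sort -k 3 -nr | head -n 5 > top_scores.txt   # keep scores above 50, rank them, save the top five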

🔍 Advanced Shell Programming: Filtering Utilities.

  • Filtering utilities in shell programming are powerful tools used to extract, manipulate, and format data from text files or streams.
  • They are especially useful for log processing, pattern matching, and text transformation in automated scripts.

1. grep – Global Regular Expression Print

grep is used to search for patterns in files. It returns lines that match a given pattern.

✅ Basic Syntax:

  • grep [options] pattern filename

✅ Examples:

  • Search for the word "error" in a log file:
    • grep "error" server.log
  • Case-insensitive search:
    • grep -i "login" auth.log
  • Show line numbers with matches:
    • grep -n "failed" access.log
  • Use regular expressions:
    • grep "^user" users.txt   # lines starting with 'user'
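A few more grep options that are useful in scripts, shown as a short sketch against assumed log files and directories:
  • grep -c "error" server.log             # count matching lines instead of printing them
  • grep -r "timeout" /var/log/            # search recursively through a directory
  • grep -A 2 "failed" access.log          # show 2 lines of context after each match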

2. sed – Stream Editor

sed is a powerful text-processing utility that can perform find-and-replace operations, insert or delete lines, and more.

✅ Basic Syntax:

  • sed [options] 'command' filename

✅ Examples:

  • Replace the first occurrence of "foo" with "bar":
    • sed 's/foo/bar/' file.txt
  • Replace all occurrences on a line:
    • sed 's/foo/bar/g' file.txt
  • Delete a line containing a pattern:
    • sed '/delete/d' file.txt
  • Print only modified lines:
    • sed -n 's/old/new/p' file.txt
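sed can also edit files in place. A minimal sketch, assuming GNU sed (the -i option behaves slightly differently on BSD/macOS) and placeholder file names:
  • sed -i.bak 's/DEBUG/INFO/g' config.txt          # edit in place, keeping a backup as config.txt.bak
  • sed '/^user/s/inactive/active/' users.txt       # substitute only on lines matching a pattern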

3. awk – Pattern Scanning and Processing

awk is a full scripting language designed for data extraction and reporting.

✅ Basic Syntax:

  • awk 'pattern {action}' filename

✅ Examples:

  • Print the second column of a file:
    • awk '{print $2}' data.txt
  • Print lines where the third column is greater than 50:
    • awk '$3 > 50' marks.txt
  • Use a field separator (e.g., comma):
    • awk -F, '{print $1, $3}' data.csv
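Conditions can also use built-in variables such as NR, the current line number. A small sketch, assuming a CSV file named data.csv whose first line is a header:
  • awk -F, 'NR > 1 { print $2 }' data.csv          # print the second field of every record except the header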
 

4. cut – Extract Columns

cut is used to extract specific fields from a file, especially when the data is structured with delimiters.

✅ Examples:

  • Get the first column:
    • cut -d',' -f1 file.csv
  • Extract multiple fields:
    • cut -d',' -f1,3 file.csv
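cut can also select fixed character positions rather than delimited fields. A short sketch; the log file name is a placeholder:
  • cut -c1-8 logfile.txt                  # extract characters 1-8 of each line (e.g., a fixed-width timestamp)
  • cut -d':' -f1,7 /etc/passwd            # user name and login shell from the colon-separated passwd file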
 

5. sort, uniq, and tr – Supporting Filters.

  • sort: Sorts input data.
    • sort names.txt
  • uniq: Removes duplicates from a sorted list.
    • sort names.txt | uniq
  • tr: Translates or deletes characters.
    • tr 'a-z' 'A-Z' < input.txt
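These filters are often chained together. The classic word-frequency pipeline below is a sketch, assuming a plain-text input.txt:
  • tr 'A-Z' 'a-z' < input.txt | tr -s ' ' '\n' | sort | uniq -c | sort -rn | head -n 10   # ten most frequent words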

AWK Utility in Advanced Shell Programming.

  • awk is a powerful text-processing tool in Unix/Linux systems, commonly used for pattern scanning, field extraction, and reporting.
  • It works by reading input line by line, splitting each line into fields, and performing actions based on patterns or conditions.

Key Features of awk:

  • Works line-by-line and splits lines into fields (default delimiter is whitespace).
  • Supports patterns, conditionals, loops, functions, and string manipulation.
  • Useful for filtering, transforming, summarizing, and formatting data.

📌 Basic Syntax:

    • awk 'pattern { action }' filename
  • pattern – a condition or regex.
  • action – code to execute when the pattern is matched.
  • filename – the file to process.

🔹 Common Built-in Variables:
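  • $0 – the entire current line (record).
  • $1, $2, … $n – the individual fields of the line.
  • NR – the number of the current record (line number).
  • NF – the number of fields in the current record.
  • FS – the input field separator (whitespace by default).
  • OFS – the output field separator used by print.
  • FILENAME – the name of the file currently being processed.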

💡 Examples:

  1. Print each line:
    • awk '{ print }' file.txt
  2. Print the first column:
    • awk '{ print $1 }' file.txt
  3. Print lines where second field equals "pass":
    • awk '$2 == "pass" { print $0 }' results.txt
  4. Add numbers in column 3:
    • awk '{ sum += $3 } END { print "Total:", sum }' data.txt
  5. Change the field separator (e.g., CSV file):
    • awk -F',' '{ print $2 }' data.csv
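Per-line actions can be combined with an END block for simple reports. A minimal sketch, assuming a hypothetical marks.txt whose third column holds numeric scores:
    • awk '{ sum += $3; count++ } END { if (count > 0) print "Average:", sum / count }' marks.txt   # average of column 3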

🧠 When to Use AWK:

  • Extract specific columns from files.
  • Generate reports from structured data.
  • Perform inline calculations on data.
  • Filter and reformat text files.

✅ Summary.

File operations:
  • Split (split) – divide files into parts
  • Compare (diff, cmp, comm) – find differences
  • Sort (sort) – alphabetical/numerical sorting
  • Merge (cat, paste) – combine files
  • Order (sort, uniq, awk) – organize data

Filtering utilities:
  • grep – search text using patterns, e.g. grep "login" log.txt
  • sed – modify text streams, e.g. sed 's/error/fixed/' file.txt
  • awk – extract and process fields, e.g. awk '{print $1}' data.txt
  • cut – cut specific columns, e.g. cut -d',' -f1 file.csv
  • sort – sort lines, e.g. sort names.txt
  • uniq – remove duplicates, e.g. sort file.txt | uniq
  • tr – translate characters, e.g. tr 'a-z' 'A-Z'
