
Unit 4. Advanced Shell Programming


INDEX

4.1. Splitting, Comparing, Sorting, Merging & Ordering Files.
4.2. Filtering utilities: grep, sed etc.
4.3. awk utility

🔧 Advanced Shell Programming: File Operations in Linux

  • When working with large sets of data, files, or logs, shell programming provides powerful tools to split, compare, sort, merge, and order files.
  • These operations help automate data processing and improve productivity.

📂 1. Splitting Files

The split command divides a large file into smaller chunks.

✅ Syntax:

  • split [options] file prefix
✅ Example:
  • split -l 1000 bigfile.txt part_
🔹 This splits bigfile.txt into smaller files with 1000 lines each, named part_aa, part_ab, etc.
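
split can also divide by size rather than by line count, and the pieces can be reassembled with cat. A minimal sketch (file names are placeholders):

  • split -b 10M bigfile.bin chunk_ # split into 10 MB pieces: chunk_aa, chunk_ab, ...
  • cat chunk_* > restored.bin # concatenate the pieces back into a single file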

🆚 2. Comparing Files

Linux offers multiple tools to compare files:

a. diff

Compares two files line-by-line.
  • diff file1.txt file2.txt
🔹 Outputs the differences between the files.

b. cmp

Compares files byte-by-byte.
  • cmp file1.txt file2.txt
🔹 Reports the byte and line number of the first mismatch.

c. comm

Compares two sorted files line-by-line.
  • comm file1.txt file2.txt
🔹 Produces three columns: lines unique to file1, lines unique to file2, and lines common to both files.
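
As a quick illustration of the three tools, assume two small sorted files old.txt and new.txt (the names are illustrative):

  • diff old.txt new.txt # line-by-line differences in diff's edit-script format
  • cmp old.txt new.txt # reports the first differing byte and its line number
  • comm -12 old.txt new.txt # -1 and -2 suppress the unique columns, leaving only the common lines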

🔃 3. Sorting Files

The sort command arranges the contents of a file based on alphabetical or numerical order.

✅ Syntax:

  • sort [options] filename

✅ Examples:

  • Sort alphabetically:
    • sort names.txt
  • Sort numerically:
    • sort -n numbers.txt
  • Reverse sort:
    • sort -r data.txt
  • Sort by a specific column:
    • sort -k 2 employees.txt
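
Options can also be combined. For example, to sort a comma-separated file numerically on its second field (the file name is assumed for illustration):

  • sort -t ',' -k 2,2n sales.csv # -t sets the field delimiter, -k 2,2n sorts on field 2 as a number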

🔀 4. Merging Files

You can concatenate files with cat, merge already-sorted files with sort -m, or use paste for side-by-side merging.

✅ a. cat – Sequential merge (concatenation):

  • cat file1.txt file2.txt > merged.txt

✅ b. paste – Horizontal merge:

  • paste file1.txt file2.txt
🔹 Joins corresponding lines of each file with a tab.
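
Two more merging idioms worth knowing (file names are placeholders): sort -m merges files that are already sorted without re-sorting them, and paste -d changes the join character:

  • sort -m sorted1.txt sorted2.txt > merged_sorted.txt # merge pre-sorted files, keeping the order
  • paste -d ',' file1.txt file2.txt # join corresponding lines with a comma instead of a tab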

📊 5. Ordering Files

Ordering refers to organizing data logically, especially for reports or processing.
  • Alphabetical order:

    • sort names.txt > ordered_names.txt
  • Numeric order by field:

    • sort -k 3 -n employee_data.txt
  • Unique lines only:

    • sort -u file.txt
  • Sort and remove duplicates (case insensitive):

    • sort -fu file.txt
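
A common ordering pipeline is a frequency report, counting how often each line occurs (the log file name is illustrative):

  • sort access.log | uniq -c | sort -rn | head -n 10 # ten most frequent lines, most common first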

Using awk and cut for Custom File Handling.

  • Extract specific columns:

    • cut -d',' -f1,3 data.csv
  • Conditional sorting/filtering:

    • awk '$3 > 50' marks.txt | sort -k 3 -n
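
The two tools also combine naturally in pipelines; for example (column number and file name assumed), extracting one CSV field with cut and then sorting it numerically:

  • cut -d ',' -f3 data.csv | sort -n | tail -n 5 # the five largest values in the third column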

🔍 Advanced Shell Programming: Filtering Utilities.

  • Filtering utilities in shell programming are powerful tools used to extract, manipulate, and format data from text files or streams.
  • They are especially useful for log processing, pattern matching, and text transformation in automated scripts.

1. grep – Global Regular Expression Print

grep is used to search for patterns in files. It returns lines that match a given pattern.

✅ Basic Syntax:

  • grep [options] pattern filename

✅ Examples:

  • Search for the word "error" in a log file:
    • grep "error" server.log
  • Case-insensitive search:
    • grep -i "login" auth.log
  • Show line numbers with matches:
    • grep -n "failed" access.log
  • Use regular expressions:
    • grep "^user" users.txt # lines starting with 'user'
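
A few more grep options that appear constantly in scripts (file and directory names are illustrative):

  • grep -v "DEBUG" app.log # invert the match: print lines that do NOT contain DEBUG
  • grep -c "failed" access.log # count matching lines instead of printing them
  • grep -rn "TODO" src/ # search a directory recursively, showing file names and line numbers
  • grep -E "error|warning" app.log # extended regex: match either word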

2. sed – Stream Editor

sed is a powerful text-processing utility that can perform find-and-replace operations, insert or delete lines, and more.

✅ Basic Syntax:

  • sed [options] 'command' filename

✅ Examples:

  • Replace the first occurrence of "foo" with "bar":
    • sed 's/foo/bar/' file.txt
  • Replace all occurrences on each line:
    • sed 's/foo/bar/g' file.txt
  • Delete lines containing a pattern:
    • sed '/delete/d' file.txt
  • Print only modified lines:
    • sed -n 's/old/new/p' file.txt
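
sed can also edit a file in place and restrict commands to a range of lines. A cautious sketch (the .bak suffix keeps a backup copy; file names are examples):

  • sed -i.bak 's/foo/bar/g' config.txt # edit in place, saving the original as config.txt.bak
  • sed '2,5d' file.txt # delete lines 2 through 5
  • sed -n '10,20p' file.txt # print only lines 10 to 20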

3. awk – Pattern Scanning and Processing

awk is a full scripting language designed for data extraction and reporting.

✅ Basic Syntax:

  • awk 'pattern {action}' filename

✅ Examples:

  • Print the second column of a file:
    • awk '{print $2}' data.txt
  • Print lines where the third column is greater than 50:
    • awk '$3 > 50' marks.txt
  • Use a field separator (e.g., a comma):
    • awk -F, '{print $1, $3}' data.csv
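
Because awk keeps state across lines, quick summaries are easy. A small sketch (the column meaning and file name are assumptions):

  • awk '$3 > 50 { count++ } END { print count, "records above 50" }' marks.txt # count qualifying rows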

4. cut – Extract Columns

cut is used to extract specific fields (columns) from a file, especially one whose fields are separated by a delimiter.

✅ Examples:

  • Get the first column:
    • cut -d',' -f1 file.csv
  • Extract multiple fields:
    • cut -d',' -f1,3 file.csv
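
cut can also slice by character position, and the delimiter form works on system files such as /etc/passwd:

  • cut -c1-8 logins.txt # first 8 characters of each line (file name is illustrative)
  • cut -d ':' -f1 /etc/passwd # first field of each line, i.e. the user names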

5. sort, uniq, and tr – Supporting Filters.

  • sort: Sorts input data.
    • sort names.txt
  • uniq: Removes duplicates from a sorted list.
    • sort names.txt | uniq
  • tr: Translates or deletes characters.
    • tr 'a-z' 'A-Z' < input.txt
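
These filters chain together well; a classic example is a rough word-frequency count (the input file name is illustrative):

  • tr -s ' ' '\n' < input.txt | sort | uniq -c | sort -rn | head # most frequent words first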

AWK Utility in Advanced Shell Programming.

  • awk is a powerful text-processing tool in Unix/Linux systems, commonly used for pattern scanning, field extraction, and reporting.
  • It works by reading input line by line, splitting each line into fields, and performing actions based on patterns or conditions.

✅ Key Features of awk:

  • Works line-by-line and splits lines into fields (default delimiter is whitespace).
  • Supports patterns, conditionals, loops, functions, and string manipulation.
  • Useful for filtering, transforming, summarizing, and formatting data.

📌 Basic Syntax:

    • awk 'pattern { action }' filename
  • pattern – a condition or regex.
  • action – code to execute when the pattern is matched.
  • filename – the file to process.

🔹 Common Built-in Variables:

  • NR – the number of the current record (line).
  • NF – the number of fields in the current record.
  • $0 – the entire current line; $1, $2, … are its individual fields.
  • FS / OFS – the input and output field separators.

💡 Examples:

  1. Print each line:
    • awk '{ print }' file.txt
  2. Print the first column:
    • awk '{ print $1 }' file.txt
  3. Print lines where the second field equals "pass":
    • awk '$2 == "pass" { print $0 }' results.txt
  4. Add the numbers in column 3:
    • awk '{ sum += $3 } END { print "Total:", sum }' data.txt
  5. Change the field separator (e.g., for a CSV file):
    • awk -F',' '{ print $2 }' data.csv
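
Putting several of these pieces together, a single awk program can produce a small formatted report. The field layout and file name below are assumptions, not a fixed convention:

  • awk -F ',' 'NR > 1 { sum += $3; n++ } END { printf "Rows: %d  Average: %.2f\n", n, sum/n }' data.csv # skip the header row (NR > 1), then report the row count and the average of column 3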

🧠 When to Use AWK:

  • Extract specific columns from files.
  • Generate reports from structured data.
  • Perform inline calculations on data.
  • Filter and reformat text files.

✅ Summary.

Operation   Command            Purpose
Split       split              Divide files into parts
Compare     diff, cmp, comm    Find differences
Sort        sort               Alphabetical/numerical sorting
Merge       cat, paste         Combine files
Order       sort, uniq, awk    Organize data

Utility   Purpose                       Example
grep      Search text using patterns    grep "login" log.txt
sed       Modify text streams           sed 's/error/fixed/' file.txt
awk       Extract and process fields    awk '{print $1}' data.txt
cut       Cut specific columns          cut -d',' -f1 file.csv
sort      Sort lines                    sort names.txt
uniq      Remove duplicates             sort file.txt | uniq
tr        Translate characters          tr 'a-z' 'A-Z'