Text Processing

Linux Text Processing Tools

cut - Extracting Sections of Text

cut is the simplest of the three tools covered here (cut, sed, and awk). It extracts selected fields, characters, or bytes from each line of its input.

  • Basic Examples:

    # Extract the first field (username) from /etc/passwd (delimiter is ':')
    cut -d: -f1 /etc/passwd

    # Extract fields 1 and 5
    cut -d: -f1,5 /etc/passwd

    # Extract fields 2 through 4
    cut -d: -f2-4 /etc/passwd

    # Extract the first field using a space as the delimiter
    echo "Hello World Example" | cut -d" " -f1 # Output: Hello
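
Beyond delimited fields, cut can also select by character position and accepts open-ended ranges. The sketch below illustrates both, along with one common surprise: cut does not collapse repeated delimiters. The sample line and the variable names are made up for illustration.

```shell
# Sample passwd-style line, invented for this example.
line="alice:x:1000:1000:Alice Example:/home/alice:/bin/bash"

# Select by character position instead of by field: characters 1 to 5
first_chars=$(printf '%s\n' "$line" | cut -c1-5)

# Open-ended range: field 6 through the end of the line
tail_fields=$(printf '%s\n' "$line" | cut -d: -f6-)

# Every delimiter counts as a separator, so two spaces in a row
# leave an empty field between them
empty_field=$(printf 'a  b\n' | cut -d' ' -f2)

printf '%s\n' "$first_chars" "$tail_fields" "[$empty_field]"
```

Because of the last behavior, cut tends to fit cleanly delimited data better than whitespace-aligned columns.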

sed - Stream Editor for Text Transformations

sed is a stream editor: it reads its input line by line, applies editing commands such as substitute, delete, or print to each line, and writes the result to standard output.

  • Basic Examples:

    # Replace the first "apple" on each line with "orange"
    # (the result goes to standard output; fruits.txt itself is unchanged)
    sed 's/apple/orange/' fruits.txt

    # Replace every occurrence of "apple" with "orange" (g = global)
    sed 's/apple/orange/g' fruits.txt

    # Delete lines containing "banana"
    sed '/banana/d' fruits.txt

    # Print only lines containing "grape"
    sed '/grape/!d' fruits.txt # '!' negates the address, so every non-matching line is deleted

    # In-place replacement (modifies the file directly)
    # This is GNU sed syntax; BSD/macOS sed requires a backup suffix argument, e.g. -i ''
    sed -i 's/apple/orange/g' fruits.txt
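
The commands above can be tried safely on a throwaway file. The following is a minimal sketch, assuming a temporary file created with mktemp and sample contents invented for illustration. It also shows -n with the p command as another way to print only matching lines, and -i.bak, which keeps a backup of the original when editing in place.

```shell
# Work on a temporary file so nothing real is modified.
tmp=$(mktemp)
printf '%s\n' "apple pie and apple juice" "banana split" "grape soda" > "$tmp"

# -n suppresses automatic printing; p prints only the matching lines
only_grape=$(sed -n '/grape/p' "$tmp")

# Without the g flag, only the first match on each line is replaced
first_only=$(sed 's/apple/orange/' "$tmp" | head -n 1)

# -i.bak edits in place and keeps the original as "$tmp.bak"
sed -i.bak 's/apple/orange/g' "$tmp"
edited=$(head -n 1 "$tmp")
backup=$(head -n 1 "$tmp.bak")

printf '%s\n' "$only_grape" "$first_only" "$edited" "$backup"

# Clean up the temporary file and its backup
rm -f "$tmp" "$tmp.bak"
```

Writing the suffix directly after -i, with no space, is accepted by both GNU and BSD sed, which makes it a reasonable default for scripts that need to run on Linux and macOS.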

awk - Pattern-Directed Text Processing Language

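awk is the most complex but also the most versatile of the three. It's a pattern-scanning and text-processing language: it splits each input line into fields ($1, $2, and so on) and runs the action of every rule whose pattern matches that line.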

  • Basic Examples:

    # Print the first field of each line in data.txt
    awk '{print $1}' data.txt

    # Print lines where the third field is greater than 10
    awk '$3 > 10 {print $0}' data.txt

    # Calculate the sum of the second field
    awk '{sum += $2} END {print sum}' data.txt

    # Print the line number and the entire line
    awk '{print NR, $0}' data.txt

    # Print the number of lines in the file
    awk 'END {print NR}' data.txt
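
The pieces above (patterns, actions, END blocks, and NR) combine naturally. Here is a small sketch using inline sample data instead of data.txt; the rows and variable names are invented for illustration. It also uses -F, which sets the input field separator when the data is not whitespace-delimited.

```shell
# Three whitespace-separated records: name, age, score (made-up values).
data=$(printf '%s\n' "alice 30 12" "bob 25 7" "carol 41 15")

# Pattern plus action: names of rows whose third field exceeds 10
big=$(printf '%s\n' "$data" | awk '$3 > 10 {print $1}')

# Average of the second field; in END, NR holds the total record count
avg=$(printf '%s\n' "$data" | awk '{sum += $2} END {if (NR > 0) print sum / NR}')

# -F sets the field separator, here ':' as in /etc/passwd
user=$(printf 'alice:x:1000\n' | awk -F: '{print $1, $3}')

printf '%s\n' "$big" "$avg" "$user"
```

The NR > 0 guard avoids a division by zero when the input is empty.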

Key Differences and When to Use Each

  1. cut: Simple field extraction. Use when you need to quickly extract columns from delimited data.
  2. sed: Text transformations and substitutions. Use for find-and-replace operations, deleting lines, or other simple edits.
  3. awk: Complex text processing and analysis. Use when you need more control over the processing logic, calculations, or working with specific patterns.
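
To make the comparison concrete, the sketch below performs the same task (extracting the username from a passwd-style line) with each tool, then adds a numeric condition that only awk handles comfortably. The sample records and variable names are invented for illustration.

```shell
# Made-up passwd-style records.
line="alice:x:1000:1000:Alice Example:/home/alice:/bin/bash"
records=$(printf '%s\n' "$line" "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin")

# Same result three ways: the first ':'-separated field
with_cut=$(printf '%s\n' "$line" | cut -d: -f1)
with_sed=$(printf '%s\n' "$line" | sed 's/:.*//')
with_awk=$(printf '%s\n' "$line" | awk -F: '{print $1}')

# Adding logic: only users whose numeric UID (field 3) is at least 1000
regular_users=$(printf '%s\n' "$records" | awk -F: '$3 >= 1000 {print $1}')

printf '%s\n' "$with_cut" "$with_sed" "$with_awk" "$regular_users"
```

For plain extraction the cut version is the shortest; once a condition or calculation is involved, awk expresses it in a single rule.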