awk

The awk command is one of the most frequently used commands in Linux/Unix-like operating systems. It is a powerful text processing tool that treats each line of a file as a record and each whitespace-separated word as a field. Designed for pattern scanning and processing, it excels at data extraction and reporting from text files.
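
For example, piping a single line through awk shows the record/field model directly:

echo "alpha beta gamma" | awk '{print NF, $2}'

This prints "3 beta": NF holds the number of fields in the current record, and $2 selects the second field.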

Quick Reference

Command Name: awk
Category: text processing
Platform: linux
Basic Usage: awk '{print $1}' filename.txt

Common Use Cases

  • Text processing: manipulate text data and extract information
  • Data analysis: perform complex data analysis and transformations
  • Scripting: use in shell scripts to process text data programmatically
  • Log analysis: analyze log files and extract relevant information
  • Report generation: generate reports based on processed data

Syntax

awk [options] 'program' file...
awk [options] -f program-file file...
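
The second form is convenient once a program outgrows one line. A minimal sketch, assuming a small program file named stats.awk (hypothetical):

# stats.awk: count records and sum the first field
BEGIN { sum = 0 }
      { sum += $1 }
END   { print NR " lines, total " sum }

Invoke it with awk -f stats.awk numbers.txt.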

Options

Option            Description
-F fs             Set field separator to fs (a single character or regular expression)
-f program-file   Read the awk program from a file instead of the command line
-v var=value      Assign value to variable var before program execution begins
-W option         Implementation-specific options (such as compatibility modes)
-m[fr] val        Set various memory limits to val (historical compatibility option)
-b                Treat all input data as single-byte characters (gawk)
-d[file]          Dump a sorted list of global variables to file for debugging (gawk)
-D[file]          Run the program under the interactive debugger (gawk)
-e 'program-text' Use program-text as source code; multiple -e options may be given (gawk)
-E file           Like -f, but disables further option processing; intended for #! scripts (gawk)
-g                Scan the program and generate a GNU .pot translation template file (gawk)
-o[file]          Write a pretty-printed version of the program to file (gawk)
-p[file]          Enable profiling and write the profile to file (gawk)
-P                Enable POSIX compatibility mode (gawk)
-r                Enable interval expressions in regular expressions (gawk)
-V                Print version information
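
Options combine naturally. For instance, -F and -v together can list login names whose UID is at or above a threshold (1000 is a common convention for regular user accounts on Linux, not a universal rule):

awk -F: -v min=1000 '$3 >= min {print $1}' /etc/passwd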

Examples

How to Use These Examples

The examples below show common ways to use the awk command. Try them in your terminal to see the results.


Basic Examples:

awk '{print $1}' file.txt

Prints the first field (column) of each line in file.txt. By default, awk splits input lines on whitespace, treating each resulting part as a field accessed via $1, $2, etc.

awk -F, '{print $1, $3}' data.csv

Processes a CSV file, printing the first and third columns. The -F option sets the field separator to a comma, making it suitable for CSV processing.

awk '/error/ {print $0}' logfile.txt

Prints lines containing the pattern "error" in logfile.txt. The pattern between slashes is a regular expression that finds matching lines, and $0 represents the entire line.

awk '{sum += $1} END {print "Sum:", sum}' numbers.txt

Calculates the sum of values in the first column of numbers.txt. The program accumulates values in the "sum" variable and prints the result after processing all lines.

Advanced Examples:

awk 'BEGIN {FS=","} {if ($3 > 1000) print $1, $3}' sales.csv

Prints the name and amount from a CSV file for transactions over $1000. The BEGIN block sets the field separator, and the conditional statement filters records based on the value in the third column.

awk '{for (i = 1; i <= NF; i++) count[$i]++} END {for (word in count) print word, count[word]}' textfile.txt

Counts word frequency in a text file. The loop visits every field on each line, an associative array tracks occurrences of each word, and the END block prints each unique word with its count.

ps aux | awk '{sum += $6} END {print "Total Memory Usage: " sum/1024 " MB"}'

Calculates total memory usage from ps output. This pipes the result of the ps command to awk, which sums the 6th column (RSS, the resident memory of each process in kilobytes) and divides by 1024 to report megabytes.

awk 'NR % 2 == 0 {print $0}' file.txt

Prints every even-numbered line from a file. The built-in variable NR (Number of Records) tracks the current line number, and the modulo operation selects even lines.

awk 'BEGIN {OFS=","} {$2=$2; print}' file.txt > output.csv

Converts space-delimited file to CSV format. The assignment $2=$2 forces awk to rebuild the entire record using the output field separator (OFS) which is set to comma.

awk -F: '{users[$3]=$1} END {for (uid in users) print uid, users[uid]}' /etc/passwd

Extracts and maps UIDs to usernames from /etc/passwd. This creates an associative array mapping numeric user IDs to username strings from the system passwd file.

Try It Yourself

Practice makes perfect! The best way to learn is by trying these examples on your own system with real files.


Notes

Built-in Variables:

  • $0 - The entire current line/record
  • $1, $2... - The 1st, 2nd, etc. fields of the current record
  • NF - Number of fields in the current record
  • NR - Record number of the current record (line number)
  • FNR - Record number in the current file
  • FS - Field separator (default is whitespace)
  • RS - Record separator (default is newline)
  • OFS - Output field separator (default is space)
  • ORS - Output record separator (default is newline)
  • FILENAME - Name of the current input file
  • ARGC - Number of command-line arguments
  • ARGV - Array of command-line arguments
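
A quick way to watch several of these variables at once is to print them for every record across two input files (the file names are hypothetical):

awk '{print FILENAME, FNR, NR, NF}' a.txt b.txt

FNR resets to 1 at the start of each file, while NR keeps counting across both; FILENAME changes when awk moves to the second file.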

Program Structure:

  • An awk program consists of patterns and actions: pattern { action }
  • Patterns are optional; without a pattern, the action applies to every line
  • Actions are optional; without an action, matching lines are printed
  • Special patterns: BEGIN (executed before processing), END (executed after processing)
  • Regular expressions are enclosed in slashes: /regex/
  • Relational expressions use standard operators: ==, !=, <, >, etc.
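
These pieces combine on a single line. A minimal sketch (logfile.txt is hypothetical):

awk 'BEGIN {print "scanning..."} /error/ {n++} END {print n+0, "lines matched"}' logfile.txt

The BEGIN action runs once before any input, /error/ {n++} is an ordinary pattern-action pair, and the END action reports the count; adding 0 to n forces a numeric 0 when nothing matched.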

Common Use Cases:

  • Text processing and data extraction
  • Field-based calculations and reporting
  • CSV and tabular data manipulation
  • Log file analysis
  • Simple data transformations
  • Quick one-line scripts for text manipulation
  • Simple statistical calculations on data sets
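
As a concrete instance of log analysis, the following counts HTTP status codes in an Apache-style access log, where the status is conventionally the ninth whitespace-separated field (log formats vary, and access.log is hypothetical):

awk '{count[$9]++} END {for (code in count) print code, count[code]}' access.log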

Data Types and Operators:

  • Numbers: integers and floating-point (automatic conversion)
  • Strings: sequences of characters
  • Arrays: associative arrays (hash tables) using string indices
  • Arithmetic operators: +, -, *, /, % (modulo), ^ (exponentiation)
  • String operators: space (concatenation)
  • Comparison operators: ==, !=, <, <=, >, >=, ~ (matches regex), !~ (doesn't match)
  • Logical operators: && (AND), || (OR), ! (NOT)
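
A one-liner can exercise most of these at once:

echo "alice 3" | awk '{id = $1 "-" $2; print id, $2 * 2, ($1 ~ /^a/)}'

This prints "alice-3 6 1": the space operator concatenates strings, $2 * 2 is evaluated numerically, and the ~ match yields 1 for true.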

Control Flow:

  • Conditional: if (condition) statement [else statement]
  • Loops: while (condition) statement, do statement while (condition), for (expr1; expr2; expr3) statement, for (var in array) statement
  • Other: break, continue, next (skip to next record), exit (jump to END)
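
For example, this sketch skips blank lines with next and scans every field with a for loop to find the largest value (it assumes non-negative numeric input; numbers.txt is hypothetical):

awk 'NF == 0 {next} {for (i = 1; i <= NF; i++) if ($i > max) max = $i} END {print "max:", max}' numbers.txt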

Implementations:

  • Various implementations exist: POSIX awk, nawk (new awk), gawk (GNU awk), mawk
  • GNU awk (gawk) has the most features and is common on Linux systems
  • Different implementations may have slight variations in features and behavior
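
To check which implementation is installed, GNU awk accepts a version flag (other implementations vary; mawk, for example, uses -W version):

awk --version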

Related Commands:

  • sed - Stream editor for filtering and transforming text
  • grep - Search for patterns in text
  • cut - Remove sections from each line of files
  • perl - A more powerful programming language with similar text processing capabilities
  • tr - Translate or delete characters

Tips & Tricks

1

Use the -F separator option to specify the field separator

2

Use the -v var=value option to set an awk variable

3

Use the -f scriptfile option to specify an awk script file

4

Use the -e program-text option to specify an awk program text

5

Use the -W interactive-debugger option to enable the interactive debugger
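
For example, to step through a program under the gawk debugger, reusing the hypothetical stats.awk from the Syntax section:

gawk -D -f stats.awk numbers.txt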

Learn By Doing

The best way to learn Linux commands is by practicing. Try out these examples in your terminal to build muscle memory and understand how the awk command works in different scenarios.
