awk

The awk command is one of the most frequently used commands in Linux/Unix-like operating systems. It is a powerful text-processing tool that treats each line of a file as a record and each whitespace-separated word as a field. It is designed for pattern scanning and processing, which makes it excellent for data extraction and reporting from text files.

Quick Reference

Command Name: awk

Category: text processing

Platform: linux

Basic Usage: awk '{print $1}' filename.txt

Common Use Cases

  1. Text processing: manipulate text data and extract information
  2. Data analysis: perform complex data analysis and transformations
  3. Scripting: use in shell scripts to process text data programmatically
  4. Log analysis: analyze log files and extract relevant information
  5. Report generation: generate reports based on processed data

Syntax

awk [options] 'program' file...
awk [options] -f program-file file...
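Both invocation forms can be sketched as follows; `/tmp/prog.awk` is a throwaway example path, not anything awk requires:

```shell
# First form: the program text is given inline, in single quotes
printf 'alpha 1\nbeta 2\n' | awk '{print $1}'
# prints "alpha" then "beta"

# Second form: the same program is read from a file with -f
printf '{print $1}\n' > /tmp/prog.awk   # throwaway example file
printf 'alpha 1\nbeta 2\n' | awk -f /tmp/prog.awk
```

Single quotes around the inline program keep the shell from expanding `$1` before awk sees it.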

Options

Option Description
-F fs Set the field separator to fs (a single character or regular expression)
-f program-file Read the awk program from a file instead of from the command line
-v var=value Assign value to variable var before program execution begins
-W option Pass implementation-specific options (such as compatibility modes)
-m[fr] val Set memory limits to val (historical Bell Labs awk option; gawk accepts and ignores it)
-b Treat input data as single-byte characters rather than multibyte (gawk)
-d[file] Dump a sorted list of global variables to file for debugging (gawk)
-D[file] Run the program under the gawk debugger; useful for developing complex awk programs
-e 'program-text' Use program-text as source code; multiple -e options may be given
-E file Like -f, but processed last and disables command-line variable assignments; intended for #! scripts (gawk)
-g Scan the program and generate a GNU gettext .pot translation template (gawk)
-o[file] Pretty-print the program to file (gawk)
-p[file] Enable profiling and write the profile to file (gawk)
-P Enable strict POSIX compatibility (gawk)
-r Enable interval expressions in regular expressions (gawk)
-V Print version information
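The two options used most often, -F and -v, can be sketched together like this:

```shell
# -F sets the field separator; -v assigns a variable before the program runs
printf 'root:x:0\ndaemon:x:1\n' | awk -F: -v label=uid '{print $1, label, $3}'
# prints "root uid 0" then "daemon uid 1"
```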

Examples

How to Use These Examples

The examples below show common ways to use the awk command. Try them in your terminal to see the results.

Basic Examples:

Print the first field (column) of each line
awk '{print $1}' file.txt
Process CSV file with comma as field separator
awk -F, '{print $1, $3}' data.csv
Print lines containing a specific pattern
awk '/error/ {print $0}' logfile.txt
Calculate sum of values in first column
awk '{sum += $1} END {print "Sum:", sum}' numbers.txt
Print specific fields with custom separator
awk -F: '{print $1, $3}' /etc/passwd

Advanced Examples:

Filter records based on condition
awk 'BEGIN {FS=","} {if ($3 > 1000) print $1, $3}' sales.csv
Count word frequency in a text file
awk '{count[$1]++} END {for (word in count) print word, count[word]}' textfile.txt
Calculate total memory usage from ps output
ps aux | awk '{sum += $6} END {print "Total Memory Usage: " sum/1024 " MB"}'
Print every even-numbered line
awk 'NR % 2 == 0 {print $0}' file.txt
Convert space-delimited to CSV format
awk 'BEGIN {OFS=","} {$2=$2; print}' file.txt > output.csv
Extract UIDs and usernames from /etc/passwd
awk -F: '{users[$3]=$1} END {for (uid in users) print uid, users[uid]}' /etc/passwd
Process multiple files and add filename
awk '{print FILENAME ":" $0}' file1.txt file2.txt
Use variables in awk script
awk -v threshold=1000 '$3 > threshold {print $1, $3}' data.txt
Format output with printf
awk '{printf "%-20s %10.2f\n", $1, $2}' formatted.txt

Try It Yourself

Practice makes perfect! The best way to learn is by trying these examples on your own system with real files.

Notes

Built-in Variables:

  • $0 - The entire current line/record
  • $1, $2... - The 1st, 2nd, etc. fields of the current record
  • NF - Number of fields in the current record
  • NR - Record number of the current record (line number)
  • FNR - Record number in the current file
  • FS - Field separator (default is whitespace)
  • RS - Record separator (default is newline)
  • OFS - Output field separator (default is space)
  • ORS - Output record separator (default is newline)
  • FILENAME - Name of the current input file
  • ARGC - Number of command-line arguments
  • ARGV - Array of command-line arguments
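A small sketch showing several of these variables at once:

```shell
# NR = record number, NF = field count, $NF = the last field
printf 'one two three\nfour five\n' | awk '{print NR, NF, $NF}'
# prints "1 3 three" then "2 2 five"

# OFS controls what the comma in print expands to on output
printf 'a b\n' | awk 'BEGIN {OFS="|"} {print $1, $2}'
# prints "a|b"
```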

Program Structure:

  • An awk program consists of patterns and actions: pattern { action }
  • Patterns are optional; without a pattern, the action applies to every line
  • Actions are optional; without an action, matching lines are printed
  • Special patterns: BEGIN (executed before processing), END (executed after processing)
  • Regular expressions are enclosed in slashes: /regex/
  • Relational expressions use standard operators: ==, !=, <, >, etc.
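The rules above can be sketched in one short program that uses every kind of pattern:

```shell
printf '2\n5\n9\n' | awk '
BEGIN  { print "start" }     # special pattern: runs before any input
$1 > 3 { print "big:", $1 }  # relational pattern with an action
/2/                          # pattern with no action: matching lines are printed
END    { print "end" }       # special pattern: runs after the last record
'
# prints "start", "2", "big: 5", "big: 9", "end" (one per line)
```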

Common Use Cases:

  • Text processing and data extraction
  • Field-based calculations and reporting
  • CSV and tabular data manipulation
  • Log file analysis
  • Simple data transformations
  • Quick one-line scripts for text manipulation
  • Simple statistical calculations on data sets

Data Types and Operators:

  • Numbers: integers and floating-point (automatic conversion)
  • Strings: sequences of characters
  • Arrays: associative arrays (hash tables) using string indices
  • Arithmetic operators: +, -, *, /, % (modulo), ^ (exponentiation)
  • String operators: space (concatenation)
  • Comparison operators: ==, !=, <, <=, >, >=, ~ (matches regex), !~ (doesn't match)
  • Logical operators: && (AND), || (OR), ! (NOT)
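A quick sketch of associative arrays, the ~ match operator, concatenation, and exponentiation:

```shell
# Associative arrays index by string; ~ tests a regex match
printf 'cat\ndog\ncat\n' | awk '
$0 ~ /^c/ { seen[$0]++ }                  # count lines starting with "c"
END { for (w in seen) print w, seen[w] }  # iterate over array keys
'
# prints "cat 2"

# A space concatenates strings; ^ is exponentiation
awk 'BEGIN { print "2^10=" 2^10 }'
# prints "2^10=1024"
```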

Control Flow:

  • Conditional: if (condition) statement [else statement]
  • Loops: while (condition) statement, do statement while (condition), for (expr1; expr2; expr3) statement, for (var in array) statement
  • Other: break, continue, next (skip to next record), exit (jump to END)
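The control-flow statements above can be combined in a short sketch:

```shell
printf '1\nskip\n3\n' | awk '
$1 !~ /^[0-9]+$/ { next }   # next: skip straight to the next record
{ total += $1 }
END {
  if (total > 3)                     # conditional
    for (i = 0; i < total; i += 2)   # C-style for loop
      marks = marks "*"
  print total, marks
}'
# prints "4 **"
```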

Implementations:

  • Various implementations exist: POSIX awk, nawk (new awk), gawk (GNU awk), mawk
  • GNU awk (gawk) has the most features and is common on Linux systems
  • Different implementations may have slight variations in features and behavior

Related Commands:

  • sed - Stream editor for filtering and transforming text
  • grep - Search for patterns in text
  • cut - Remove sections from each line of files
  • perl - A more powerful programming language with similar text processing capabilities
  • tr - Translate or delete characters

Tips & Tricks

  1. Use the -F separator option to specify the field separator
  2. Use the -v var=value option to set an awk variable before the program runs
  3. Use the -f scriptfile option to read the awk program from a file
  4. Use the -e program-text option to pass program text on the command line
  5. Use gawk's -D option to run your program under the interactive debugger
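Tips 2 and 3 combined in a short sketch; `/tmp/sum.awk` is a throwaway example file:

```shell
# Keep a longer program in its own file and read it with -f
cat > /tmp/sum.awk <<'EOF'
$1 >= min { sum += $1 }     # min is supplied from the command line with -v
END       { print sum }
EOF
printf '10\n2\n40\n' | awk -v min=5 -f /tmp/sum.awk
# prints "50"
```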

Learn By Doing

The best way to learn Linux commands is by practicing. Try out these examples in your terminal to build muscle memory and understand how the awk command works in different scenarios.
