Difference between revisions of "Linux command: awk"
Revisions by Rafahsolis (11 intermediate revisions by the same user not shown)
Latest revision as of 09:05, 16 November 2018
Command-line text processing
Examples:
awk -F, '{print NR, length($0)}' filename.txt #print line number and line length
awk '{print FILENAME " " length($0)}' */PRF* | uniq
awk 'BEGIN { FS = "," } ; { print $2 }' # Set the field separator to ","; -F, does the same. Reads standard input.
awk -F"," '$2~/^ABC$/' file # Print CSV lines whose second field is exactly ABC
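The two ways of setting the separator, and the anchored match on the second field, can be tried on a small inline sample. This is only a sketch: the CSV rows and the shell variable names are made up for illustration.

```shell
# Hypothetical sample data: three comma-separated fields per line.
csv='id1,ABC,10
id2,ABCD,20
id3,ABC,30'

# -F, and BEGIN { FS = "," } select the same second field.
with_flag=$(printf '%s\n' "$csv" | awk -F, '{ print $2 }')
with_begin=$(printf '%s\n' "$csv" | awk 'BEGIN { FS = "," } { print $2 }')

# Anchored regex: only lines whose second field is exactly ABC (not ABCD).
exact=$(printf '%s\n' "$csv" | awk -F, '$2 ~ /^ABC$/')

printf '%s\n' "$exact"
```

The line with ABCD in the second field is skipped because ^ and $ anchor the regex to the whole field.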
Print from 3rd field till end
awk '{
    for (i = 3; i <= NF; i++) {
        printf("%s ", $i)    # every field is followed by a space, including the last
    }
    printf("\n")
}'
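The loop above leaves a space after the last field. A possible variant, sketched here on a made-up input line, builds the output string first and puts the separator only between fields.

```shell
# Hypothetical input line with five fields.
line='a b c d e'

rest=$(printf '%s\n' "$line" | awk '{
    out = ""
    for (i = 3; i <= NF; i++) {
        # add a space before every field except the first one printed
        out = out (i > 3 ? " " : "") $i
    }
    print out
}')

printf '[%s]\n' "$rest"
```

The brackets in the final printf are only there to make it visible that no trailing space remains.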
Print with condition
awk '{ if ($3 == "" || $4 == "" || $5 == "") print "Some score for the student", $1, "is missing" }' student-marks
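To see the condition fire, the same program can be fed a few inline rows instead of the student-marks file. The rows below are invented for illustration and assume the layout name, id, then three scores; with whitespace-separated fields only scores missing at the end of a line can be detected this way.

```shell
# Hypothetical contents standing in for student-marks: name, id, three scores.
marks='Jones 2143 78 84 77
RinRao 2122 38 37
Edwin 2537 87 97 95'

missing=$(printf '%s\n' "$marks" | awk '{
    if ($3 == "" || $4 == "" || $5 == "")
        print "Some score for the student", $1, "is missing"
}')

printf '%s\n' "$missing"
```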
Print if a field is set and non-zero
$ echo "a b c d" | awk '($4){print "yes"}'
yes
$ echo "a b c d" | awk '($14){print "yes"}' ## prints nothing, no $14
$ echo "a b c 0" | awk '($4){print "yes"}' ## prints nothing, $4 is 0
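Because a field holding 0 counts as false, the bare ($4) test cannot tell "missing" from "zero". If only the presence of the field matters, comparing against the empty string or checking NF are two possible alternatives, sketched here:

```shell
# ($4) is false for a field that holds 0 ...
truthy=$(echo "a b c 0" | awk '($4) { print "yes" }')

# ... while a string comparison only asks whether the field is non-empty,
present=$(echo "a b c 0" | awk '$4 != "" { print "yes" }')

# and NF checks how many fields the record has.
enough=$(echo "a b c 0" | awk 'NF >= 4 { print "yes" }')

printf 'truthy=[%s] present=[%s] enough=[%s]\n' "$truthy" "$present" "$enough"
```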
Print columns as lines
ls -lR | awk '{for(x=1;$x;++x) print $x}'
awk '{for(x=1;$x;x++)print $x}'
          ___ __ ___
           |  |   |
           |  |   |-----> increment x by 1 at the end of each loop.
           |  |--------> run the loop as long as field number x is non-empty (and not 0)
           |------------> initialize x to 1
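Since the loop condition is the value of the field itself, the loop stops early at a field that is 0 or empty. A variant that counts up to NF avoids this; the input line below is made up to show the difference.

```shell
# Loop driven by the field value: stops at the 0 in the second field.
by_value=$(echo "5 0 7" | awk '{ for (x = 1; $x; x++) print $x }')

# Loop driven by the field count: prints every field.
by_count=$(echo "5 0 7" | awk '{ for (x = 1; x <= NF; x++) print $x }')

printf 'by value:\n%s\nby count:\n%s\n' "$by_value" "$by_count"
```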
Print from line x to line y
sed is the usual suggestion for this, but for the sake of completeness, awk can do it too:
awk 'NR >= 57890000 && NR <= 57890010' /path/to/file
To stop reading the file right after the last wanted line:
awk 'NR < 57890000 { next } { print } NR == 57890010 { exit }' /path/to/file
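The line numbers do not have to be hard-coded. As a sketch, they can be passed in with -v; the variable names first and last and the ten-line input are assumptions chosen for illustration.

```shell
first=3
last=5

slice=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | awk -v first="$first" -v last="$last" '
    NR < first { next }   # skip lines before the range
    { print }             # print lines inside the range
    NR == last { exit }   # stop reading once the range is done
')

printf '%s\n' "$slice"
```

The exit rule is what makes this form cheaper on large files: awk stops reading as soon as the last wanted line has been printed.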
Cheatsheet
Basics I
$1
Reference first column
awk '/pattern/ {action}' file
Run action on each line of file that matches pattern
;
Character that separates two statements or pattern-action pairs
print
Print the current record
$0
Reference current record line
Variables I
$2
Reference second column
FS
Field separator of input file (default whitespace)
NF
Number of fields in current record
NR
Number of the current record (the line number, counted across all input files)
Basics II
^
Match beginning of the string being tested (field or record)
~
Match operator
!~
Does-not-match operator
-F
Command line option to specify input field delimiter
BEGIN
Denotes block executed once at start
END
Denotes block executed once at end
str1 str2
Concatenate str1 and str2
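The entries above can be combined in one short program. This is a minimal sketch on made-up input: BEGIN and END run once each, the two match operators split the records, and the strings are joined by writing them side by side.

```shell
report=$(printf 'apple 3\nbanana 5\navocado 2\n' | awk '
    BEGIN      { print "start" }              # runs once, before any input
    $1 ~ /^a/  { print "a-word: " $1 }        # field 1 begins with a
    $1 !~ /^a/ { print "other: " $1 }         # field 1 does not begin with a
    END        { print "records: " NR }       # runs once, after all input
')

printf '%s\n' "$report"
```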
One-Line Exercises I
awk '{print $1}' file
Print first field for each record in file
awk '/regex/' file
Print only lines that match regex in file
awk '!/regex/' file
Print only lines that do not match regex in file
awk '$2 == "foo"' file
Print any line where field 2 is equal to "foo" in file
awk '$2 != "foo"' file
Print lines where field 2 is NOT equal to "foo" in file
awk '$1 ~ /regex/' file
Print line if field 1 matches regex in file
awk '$1 !~ /regex/' file
Print line if field 1 does NOT match regex in file
Variables II
FILENAME
Reference current input file
FNR
Reference number of the current record relative to current input file
OFS
Field separator of the output (default a single space)
ORS
Record separator of the outputted data (default newline)
RS
Record separator of input file (default newline)
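A small sketch of how the separator variables change input and output, using invented sample strings. Note that OFS only takes effect when print is given several arguments or when the record is rebuilt, which the assignment $1 = $1 forces here.

```shell
# OFS: rebuild each record so the fields are joined with "-".
joined=$(printf 'a b c\nd e f\n' | awk 'BEGIN { OFS = "-" } { $1 = $1; print }')

# RS: treat ";" instead of newline as the end of a record.
records=$(printf 'x;y;z' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')

# ORS: end every printed record with "|" instead of a newline.
lines=$(printf 'a\nb\n' | awk 'BEGIN { ORS = "|" } { print }')

printf '%s\n%s\n%s\n' "$joined" "$records" "$lines"
```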
Variables III
CONVFMT
Conversion format used when converting numbers to strings (default %.6g)
SUBSEP
Separates multiple subscripts (default "\034")
OFMT
Output format for numbers (default %.6g)
ARGC
Argument count, assignable
ARGV
Argument array, assignable
ENVIRON
Array of environment variables
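Three of these variables are easy to observe in a BEGIN block. The sketch below assumes a made-up environment variable named GREETING and a made-up array named grid.

```shell
out=$(GREETING=hi awk 'BEGIN {
    # A subscript written as [1, 2] is stored as the string 1 SUBSEP 2.
    grid[1, 2] = "x"
    for (key in grid) {
        split(key, part, SUBSEP)
        print part[1] "," part[2]
    }

    # OFMT controls how print shows non-integer numbers.
    OFMT = "%.2f"
    print 3.14159

    # ENVIRON gives read access to the environment of the awk process.
    print ENVIRON["GREETING"]
}')

printf '%s\n' "$out"
```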
Functions I
index(s,t)
Position in string s where string t occurs, 0 if not found
length(s)
Length of string s (or $0 if no arg)
rand()
Random number n with 0 <= n < 1
substr(s,index,len)
Return len-char substring of s that begins at index (counted from 1)
srand(x)
Set seed for rand() to x and return previous seed
int(x)
Truncate x to integer value
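The string and numeric functions above can be checked in a BEGIN block with no input at all. This is only a sketch; the sample string and the seed 42 are arbitrary choices.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    print index(s, "world")      # position of "world" in s, counted from 1
    print length(s)              # number of characters in s
    print substr(s, 7, 5)        # 5 characters starting at position 7
    print int(3.9), int(-3.9)    # truncation drops the fractional part

    # Seeding twice with the same value replays the same random number.
    srand(42); a = rand()
    srand(42); b = rand()
    same = (a == b && a >= 0 && a < 1)
    print same
}')

printf '%s\n' "$out"
```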
Functions II
split(s,a,fs)
Split string s into array a on separator fs, returning the number of elements
match(s,r)
Position in string s where regex r occurs, or 0 if not found; also sets RSTART and RLENGTH
sub(r,t,s)
Substitute t for first occurrence of regex r in string s (or $0 if s not given)
gsub(r,t,s)
Substitute t for all occurrences of regex r in string s (or $0 if s not given); returns the number of substitutions
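A short sketch of the four functions on made-up strings. Note that sub and gsub change their third argument in place and return a count rather than the new string.

```shell
out=$(awk 'BEGIN {
    # split returns the number of pieces and fills the array d.
    n = split("2018-11-16", d, "-")
    print n, d[1], d[3]

    # match returns the start position and also sets RSTART and RLENGTH.
    p = match("foobar", /o+/)
    print p, RSTART, RLENGTH

    # sub replaces the first match only.
    s = "aaa"
    sub(/a/, "b", s)
    print s

    # gsub replaces every match and returns how many it replaced.
    t = "aaa"
    c = gsub(/a/, "b", t)
    print c, t
}')

printf '%s\n' "$out"
```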
Functions III
system(cmd)
Execute cmd and return exit status
tolower(s)
String s to lowercase
toupper(s)
String s to uppercase
getline
Set $0 to the next input record from the current input file, updating NF, NR and FNR
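The following sketch exercises these functions on a two-line made-up input. The commands true and false are used for system only because their exit statuses (0 and 1) are predictable.

```shell
out=$(printf 'a\nb\n' | awk '
    NR == 1 {
        print toupper($0), tolower("AWK")

        # getline replaces $0 with the next record and advances NR.
        getline
        print "now at " $0 ", NR=" NR

        # system runs a shell command and returns its exit status.
        ok = system("true")
        bad = system("false")
        print ok, bad
    }')

printf '%s\n' "$out"
```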
One-Line Exercises II
awk 'NR!=1{print $1}' file
Print first field for each record in file excluding the first record
awk 'END{print NR}' file
Count lines in file
awk '/foo/{n++}; END {print n+0}' file
Print total number of lines that contain foo
awk '{total=total+NF};END{print total}' file
Print total number of fields in all lines
awk '/regex/{getline;print}' file
Print the line immediately after each line matching regex, but not the matching line itself
awk 'length > 32' file
Print lines with more than 32 characters in file
awk 'NR==12' file
Print line number 12 of file
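Several of these one-liners can be tried without a file by piping in a small sample. The four lines below are invented for illustration, and the length threshold is lowered from 32 to 8 so that the short sample produces a result.

```shell
# Hypothetical four-line sample.
sample='foo one
bar two three
foo four
baz'

# Number of lines that contain foo.
count=$(printf '%s\n' "$sample" | awk '/foo/ { n++ } END { print n + 0 }')

# Total number of fields over all lines.
fields=$(printf '%s\n' "$sample" | awk '{ total = total + NF } END { print total }')

# The line right after the one that matches bar.
after=$(printf '%s\n' "$sample" | awk '/bar/ { getline; print }')

# Lines longer than 8 characters.
long=$(printf '%s\n' "$sample" | awk 'length($0) > 8')

printf '%s\n' "$count" "$fields" "$after" "$long"
```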