
If adding up times on Linux is a hassle, use an awk script like the following, which totals hh:mm:ss values given one per line:

BEGIN { FS = ":"; OFS = ":" }
# Convert each hh:mm:ss record to seconds and add it to the running total.
{ total_seconds += $1 * 3600 + $2 * 60 + $3 }
END {
    # Split the total back into hours, minutes, and seconds.
    total_hours = int(total_seconds / 3600)
    total_minutes = int((total_seconds % 3600) / 60)
    total_seconds = total_seconds % 60
    print total_hours, total_minutes, total_seconds
}
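As a rough sketch of how a time-summing program like this could be tried out, the following pipes three made-up hh:mm:ss durations into an inline awk program. The sample values and the variable names total and secs are illustrative assumptions, and the zero-padded printf output is one possible formatting choice rather than the only one.

```shell
# Hypothetical sample durations in hh:mm:ss form, one per line.
total=$(printf '%s\n' '00:45:30' '01:20:45' '00:10:50' |
  awk -F: '
    # Convert every record to seconds and keep a running total.
    { secs += $1 * 3600 + $2 * 60 + $3 }
    END {
      # Format the total back as hh:mm:ss with zero padding.
      printf "%02d:%02d:%02d\n", int(secs / 3600), int(secs % 3600 / 60), secs % 60
    }')
echo "$total"
```

Working in seconds avoids carrying minutes and hours by hand on every line; the split back into hours, minutes, and seconds happens once, in END.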

Below we give you a detailed explanation of the Linux awk command

Introduction

awk is a powerful text analysis tool. Compared with the searching of grep and the editing of sed, awk is particularly strong at analyzing data and generating reports. In simple terms, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then applies various kinds of processing to the fields.

awk comes in three different versions: awk, nawk, and gawk. Unless stated otherwise, awk here refers to gawk, the GNU version of awk.

awk takes its name from the first letters of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan. awk is in fact a language of its own: the awk programming language, which its three creators formally define as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, process data, perform calculations on the input, generate reports, and much more.

Usage

awk '{pattern + action}' {filenames}

Although the operations can be complicated, the syntax is always the same: pattern is what awk looks for in the data, and action is a series of commands executed when a match is found. The curly braces ({}) do not have to appear in every program, but they are used to group a series of instructions under a specific pattern. A pattern is typically a regular expression enclosed in slashes.
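To make the pattern and action roles concrete, here is a minimal sketch that runs three small awk programs over the same input: one with both a pattern and an action, one with only a pattern, and one with only an action. The sample data and the shell variable names are made up for illustration.

```shell
# Made-up sample input: one "name score" pair per line.
data=$(printf '%s\n' 'alice 90' 'bob 72' 'carol 85')

# Pattern and action: print the first field of lines that match /a/.
with_both=$(printf '%s\n' "$data" | awk '/a/ {print $1}')

# Pattern only: the default action prints the whole matching line.
pattern_only=$(printf '%s\n' "$data" | awk '/bob/')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$data" | awk '{print $2}')

printf '%s\n' "$with_both" "$pattern_only" "$action_only"
```

Note the single quotes around each awk program: they keep the shell from expanding $1 and $2 before awk sees them.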

The most basic function of the awk language is to browse and extract information from files or strings according to specified rules. Once awk has extracted the information, other text operations can be performed on it. A complete awk script is usually used to format the information in a text file.

awk normally processes a file one line at a time: it receives a line, then executes the corresponding commands to process the text.

Calling awk

There are three ways to call awk:

1. Command line method

awk [-F field-separator] 'commands' input-file(s)

Here, commands is the actual awk program, and [-F field-separator] is optional. input-file(s) is the file or files to be processed.

In awk, each item on a line that is delimited by the field separator is called a field. If no -F field separator is specified, the default field separator is whitespace (spaces and tabs).
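The effect of the field separator can be seen in a small sketch like the one below, which splits the same record once on colons and once on the default whitespace. The record is a made-up line in the style of /etc/passwd, and the variable names are illustrative.

```shell
# A made-up colon-separated record in the style of /etc/passwd.
line='demo:x:1000:1000:Demo User:/home/demo:/bin/sh'

# With -F ":" the record splits on colons, so $1 is the account name.
by_colon=$(printf '%s\n' "$line" | awk -F ':' '{print $1}')

# Without -F the record splits on whitespace, so $1 runs up to the first space.
by_space=$(printf '%s\n' "$line" | awk '{print $1}')

echo "$by_colon"
echo "$by_space"
```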

2. Shell script method

Put all the awk commands into a file, make the file executable, and name the awk command interpreter on the first line of the script. The script can then be run by typing its name.

This is the equivalent of the first line of a shell script: #!/bin/sh

which is replaced with: #!/bin/awk -f (use the path where awk is installed on your system, often /usr/bin/awk)

3. Put all the awk commands into a separate file, then call:

awk -f awk-script-file input-file(s)
</pre>
</div>
<p>
Here, the -f option loads the awk script from awk-script-file, and input-file(s) is the same as above.
</p>
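<p>
As a small sketch of the script-file method, the following writes a two-line awk program to a temporary file and loads it with -f. The temporary file, the sample records, and the variable names are assumptions made for the example.
</p>

```shell
# Write a small awk program to a temporary file (the file name is arbitrary).
script=$(mktemp)
cat > "$script" <<'EOF'
BEGIN { FS = ":" }      # split records on colons
{ print $1 }            # print the first field of every record
EOF

# Load the program with -f and feed it two made-up records on standard input.
names=$(printf '%s\n' 'root:x:0:0' 'daemon:x:1:1' | awk -f "$script")
rm -f "$script"
echo "$names"
```

<p>
Note that lowercase -f names a program file, while uppercase -F sets the field separator.
</p>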
<p>
This chapter focuses on the command line method.
</p>
<p>
Getting started example
</p>
<p>
Suppose the output of last -n 5 is as follows:
</p>
<p>
# last -n 5 <== show only the first five lines
</p>
<p>
root pts/1 192.168.1.100 tue feb 10 11:21 still logged in
</p>
<p>
root pts/1 192.168.1.100 tue feb 10 00:46-02:28 (01:41)
</p>
<p>
root pts/1 192.168.1.100 mon feb 9 11:41-18:30 (06:48)
</p>
<p>
dmtsai pts/1 192.168.1.100 mon feb 9 11:41-11:41 (00:00)
</p>
<p>
root tty1 fri sep 5 14:09-14:10 (00:01)
</p>
<p>
To show only the five most recently logged-in accounts:
</p>
<p>
# last -n 5 | awk '{print $1}'
</p>
<p>
root
</p>
<p>
root
</p>
<p>
root
</p>
<p>
dmtsai
</p>
<p>
root
</p>
<p>
The awk workflow is as follows: read in one record, delimited by the "\n" newline character, split the record into fields at the specified field separator, and populate the fields: $0 holds the whole record, $1 the first field, and $n the nth field. The default field separator is a space or a tab, so $1 is the logged-in user, $3 is the user's IP address, and so on.
</p>
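<p>
The field variables can be explored with a sketch like the one below, which applies several one-line awk programs to a single record modeled on the last output. The record and the shell variable names are illustrative; $NF, the last field, relies on the built-in variable NF (the number of fields).
</p>

```shell
# One record modeled on the last output (fields separated by spaces).
record='dmtsai pts/1 192.168.1.100 Mon Feb 9 11:41'

whole=$(printf '%s\n' "$record" | awk '{print $0}')   # the entire record
user=$(printf '%s\n' "$record" | awk '{print $1}')    # the first field
ip=$(printf '%s\n' "$record" | awk '{print $3}')      # the third field
final=$(printf '%s\n' "$record" | awk '{print $NF}')  # the last field

printf '%s\n' "$whole" "$user" "$ip" "$final"
```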
<p>
To show only the accounts in /etc/passwd:
</p>
<p>
# cat /etc/passwd | awk -F ':' '{print $1}'
</p>
<p>
root
</p>
<p>
daemon
</p>
<p>
bin
</p>
<p>
sys
</p>
<p>
This is an example of awk + action, where the action {print $1} is executed for every line.
</p>
<p>
-F specifies the field separator as ":".
</p>
<p>
To show only the accounts in /etc/passwd and the shell for each account, separated by a tab:
</p>
<p>
# cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'
</p>
<p>
root    /bin/bash
</p>
<p>
daemon  /bin/sh
</p>
<p>
bin     /bin/sh
</p>
<p>
sys     /bin/sh
</p>
<p>
To show only the accounts in /etc/passwd and the shell for each account, separated by a comma, with a header line "name,shell" added at the top and "blue,/bin/nosh" added as the last line:
</p>
<p>
cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
</p>
<p>
name,shell
</p>
<p>
root,/bin/bash
</p>
<p>
daemon,/bin/sh
</p>
<p>
bin,/bin/sh
</p>
<p>
sys,/bin/sh
</p>
<p>
....
</p>
<p>
blue,/bin/nosh
</p>
<p>
The awk workflow is as follows: first execute the BEGIN block, then read the file. Read in one record, delimited by the "\n" newline character, split it into fields at the specified field separator, and populate the fields ($0 holds the whole record, $1 the first field, and $n the nth field); then execute the action that corresponds to the pattern. Then read the second record, and so on until all records have been read. Finally, execute the END block.
</p>
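<p>
The order of execution described here can be observed with a minimal sketch such as the following, in which each block announces itself. The two input lines and the message wording are made up for the example.
</p>

```shell
# BEGIN runs once before any input, the middle rule runs once per record,
# and END runs once after the last record has been read.
order=$(printf '%s\n' 'first' 'second' | awk '
  BEGIN { print "begin: before any input" }
        { print "record " NR ": " $0 }
  END   { print "end: after " NR " records" }')
echo "$order"
```

<p>
In END, NR still holds the number of the last record read, which is why it can be used as a total.
</p>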
<p>
Search for all lines in /etc/passwd that contain the keyword root:
</p>
<p>
# awk -F: '/root/' /etc/passwd
</p>
<p>
root:x:0:0:root:/root:/bin/bash
</p>
<p>
This is an example of using a pattern: only lines that match the pattern (here, root) execute the action. No action is specified, so each matching line is printed in full by default.
</p>
<p>
Searching supports regular expressions. For example, to find lines that begin with root: awk -F: '/^root/' /etc/passwd
</p>
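<p>
The difference between an unanchored and an anchored pattern shows up in a sketch like the one below. The three passwd-style records are made up so that one of them contains root without starting with it; the variable names are illustrative.
</p>

```shell
# Made-up passwd-style records; only the first one begins with "root".
sample=$(printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'operator:x:11:0:operator:/root:/sbin/nologin' \
  'daemon:x:1:1:daemon:/usr/sbin:/bin/sh')

# /root/ matches "root" anywhere in the line.
anywhere=$(printf '%s\n' "$sample" | awk -F: '/root/ {print $1}')

# /^root/ matches only lines that start with "root".
anchored=$(printf '%s\n' "$sample" | awk -F: '/^root/ {print $1}')

echo "$anywhere"
echo "$anchored"
```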
<p>
Search for all lines in /etc/passwd that contain the keyword root, and show the corresponding shell:
</p>
<p>
# awk -F: '/root/{print $7}' /etc/passwd
</p>
<p>
/bin/bash
</p>
<p>
Here the action {print $7} is specified.
</p>
<p>
awk built-in variables
</p>
<p>
awk has many built-in variables that hold environment information, and these variables can be changed. Some of the most commonly used ones are listed below.
</p>
<p>
ARGC number of command-line arguments
</p>
<p>
ARGV array of command-line arguments
</p>
<p>
ENVIRON array that gives access to the system environment variables
</p>
<p>
FILENAME name of the file awk is currently reading
</p>
<p>
FNR number of records read so far from the current file
</p>
<p>
FS input field separator, equivalent to the command-line -F option
</p>
<p>
NF number of fields in the current record
</p>
<p>
NR number of records read so far
</p>
<p>
OFS output field separator
</p>
<p>
ORS output record separator
</p>
<p>
RS input record separator
</p>
<p>
In addition, the $0 variable refers to the entire record.
$1 represents the first field of the current line, $2 represents the second field of the current line, and so on.
</p>
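<p>
As one possible illustration of how these variables interact, the sketch below sets FS and OFS in a BEGIN block and reports NR and NF for each record. The input records are made up. Assigning $1 to itself is a common awk idiom that forces $0 to be rebuilt with the output field separator.
</p>

```shell
# Read colon-separated fields and re-join them with "-" as the output separator.
converted=$(printf '%s\n' 'root:x:0' 'bin:x:2' |
  awk 'BEGIN { FS = ":"; OFS = "-" }
       { $1 = $1; print NR ": " $0 " (" NF " fields)" }')
echo "$converted"
```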
<p>
Print the file name, line number, number of columns, and full content of each line of /etc/passwd:
</p>
<p>
# awk -F ':' '{print "filename:" FILENAME ", linenumber:" NR ", columns:" NF ", linecontent:" $0}' /etc/passwd
</p>
<p>
filename:/etc/passwd, linenumber:1, columns:7, linecontent:root:x:0:0:root:/root:/bin/bash
</p>
<p>
filename:/etc/passwd, linenumber:2, columns:7, linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh
</p>
<p>
filename:/etc/passwd, linenumber:3, columns:7, linecontent:bin:x:2:2:bin:/bin:/bin/sh
</p>
<p>
filename:/etc/passwd, linenumber:4, columns:7, linecontent:sys:x:3:3:sys:/dev:/bin/sh
</p>
<p>
Using printf instead of print can make the code more concise and readable:
</p>
<p>
awk -F ':' '{printf("filename:%10s, linenumber:%s, columns:%s, linecontent:%s\n", FILENAME, NR, NF, $0)}' /etc/passwd
</p>
<p>
print and printf
</p>
<p>
awk provides both print and printf for output.
</p>
<p>
The arguments to print can be variables, numbers, or strings.
Strings must be enclosed in double quotes, and arguments are separated by commas.
Without the commas, the arguments are concatenated and cannot be told apart.
Here, the comma plays the role of the output field separator,
which by default is a single space.
</p>
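<p>
The role of the comma in print can be checked with a short sketch like this one. The input "a b" and the "|" separator are arbitrary choices for the example.
</p>

```shell
# With a comma, print joins its arguments with OFS (a single space by default).
with_comma=$(echo 'a b' | awk '{print $1, $2}')

# Without a comma, the two fields are simply concatenated.
without_comma=$(echo 'a b' | awk '{print $1 $2}')

# Changing OFS changes what the comma produces.
custom_ofs=$(echo 'a b' | awk 'BEGIN {OFS="|"} {print $1, $2}')

printf '%s\n' "$with_comma" "$without_comma" "$custom_ofs"
```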
<p>
The printf function is used in much the same way as printf in C and can format strings. When the output is complex, printf is more useful and makes the code easier to understand.
</p>
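<p>
A minimal sketch of printf formatting follows, assuming made-up "name:uid:shell" records. The format string pads the name to eight characters on the left-aligned side and right-aligns the number in three columns; the widths are arbitrary choices for illustration.
</p>

```shell
# Align made-up name:uid:shell records into fixed-width columns.
# Unlike print, printf adds no newline on its own, so the format ends with \n.
table=$(printf '%s\n' 'root:0:/bin/bash' 'daemon:1:/bin/sh' |
  awk -F: '{printf "%-8s %3d %s\n", $1, $2, $3}')
echo "$table"
```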
<p>
awk programming
</p>
<p>
Variables and assignments
</p>
<p>
In addition to its built-in variables, awk lets you define your own variables.
</p>
<p>
The following counts the number of accounts in /etc/passwd:
</p>
<p>
awk '{count++; print $0;} END {print "user count is", count}' /etc/passwd
</p>
<p>
root:x:0:0:root:/root:/bin/bash
</p>
<p>
...
</p>
<p>
user count is 40
</p>
<p>
count is a user-defined variable.
The earlier action {} blocks contained only a print, but print is just one statement.
An action {} can contain multiple statements, separated by semicolons (;).
</p>
<p>
count is not initialized here. Although it defaults to 0, it is good practice to initialize it to 0 explicitly:
</p>
<p>
awk 'BEGIN {count=0; print "[start] user count is", count} {count=count+1; print $0;} END {print "[end] user count is", count}' /etc/passwd
</p>
<p>
[start] user count is 0
</p>
<p>
root:x:0:0:root:/root:/bin/bash
</p>
<p>
...
</p>
<p>
[end] user count is 40
</p>
<p>
Count the number of bytes occupied by the files in a folder:
</p>
<p>
ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end] size is", size}'
</p>
<p>
[end] size is 8657198
</p>
<p>
To display the size in megabytes:
</p>
<p>
ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end] size is", size/1024/1024, "M"}'
</p>
<p>
[end] size is 8.25889 M
</p>
<p>
Note that the total does not include files in subdirectories.
</p>
<p>
Conditional statements
</p>
<p>
Conditional statements in awk are borrowed from the C language. They take the following forms:
</p>
<div>
<pre>
if (expression) {
  statement;
  statement;
  ...
}
if (expression) {
  statement;
} else {
  statement2;
}
if (expression) {
  statement1;
} else if (expression1) {
  statement2;
} else {
  statement3;
}
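
As a small sketch of the if / else if / else form in a real program, the following classifies made-up "name size" records. The thresholds, labels, and variable names are arbitrary assumptions chosen for the example.

```shell
# Classify made-up "name size" records with an if / else if / else chain.
classified=$(printf '%s\n' 'a.txt 120' 'b.log 4096' 'c.bin 900000' |
  awk '{
    if ($2 == 4096) {
      kind = "directory-sized"
    } else if ($2 > 100000) {
      kind = "large"
    } else {
      kind = "small"
    }
    print $1, kind
  }')
echo "$classified"
```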

Count the number of bytes occupied by the files in a folder, excluding entries whose size is 4096 (usually folders):

ls -l | awk 'BEGIN {size=0; print "[start] size is", size} {if ($5 != 4096) {size=size+$5;}} END {print "[end] size is", size/1024/1024, "M"}'
[end] size is 8.22339 M

Loop statements

awk's loop statements are also borrowed from the C language: while, do/while, for, break, and continue are supported, and their semantics are exactly the same as in C.
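
The loop keywords can be tried out with a sketch like the one below. The rows of numbers are made up, and the rules "skip negatives, stop at the first 0" exist only to exercise continue and break.

```shell
# Made-up rows of numbers. For each row, add up the fields with a for loop,
# skipping negative values with continue and stopping at the first 0 with break.
row_sums=$(printf '%s\n' '1 2 3' '4 -1 5' '6 0 7' |
  awk '{
    total = 0
    for (i = 1; i <= NF; i++) {
      if ($i < 0) continue
      if ($i == 0) break
      total += $i
    }
    print total
  }')

# A while loop written as it would be in C: sum the integers 1 through 4.
while_sum=$(awk 'BEGIN {
  i = 1
  while (i <= 4) {
    sum += i
    i++
  }
  print sum
}')

echo "$row_sums"
echo "$while_sum"
```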

Arrays

Because array subscripts in awk can be numbers or strings, an array index is often called a key. Keys and values are stored in an internal hash table. Because a hash table is not stored sequentially, the contents of an array may not be displayed in the order you expect. Like variables, arrays are created automatically when they are first used, and awk determines automatically whether an element holds a number or a string. In general, arrays in awk are used to collect information from records: to compute sums, count words, track how many times a pattern has been matched, and so on.
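
As a sketch of an array keyed by strings, the following counts how many made-up accounts use each login shell. The records and the array name uses are illustrative. Because the for (key in array) traversal order is not guaranteed, the output is piped through sort to make it predictable.

```shell
# Count how many made-up accounts use each login shell.
counts=$(printf '%s\n' 'root:/bin/bash' 'daemon:/bin/sh' 'bin:/bin/sh' |
  awk -F: '
    { uses[$2]++ }    # the shell path is the array key
    END { for (sh in uses) print sh, uses[sh] }' |
  sort)
echo "$counts"
```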

Show the accounts in /etc/passwd:

awk -F ':' 'BEGIN {count=0;} {name[count]=$1; count++;} END {for (i=0; i<NR; i++) print i, name[i]}' /etc/passwd

0 root

1 daemon

2 bin

3 sys

4 sync

5 games

...

Here a for loop iterates through the array.
