Linux Basics: AWK

麥德良 · 2016-09-22 10:04 · Linux干貨

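Contents:

1. awk output (print, printf)

2. awk variables (built-in and user-defined)

3. awk arrays

4. awk output redirection

5. awk operators

6. Common awk pattern types

7. awk control and loop statements

8. awk built-in functions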

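awk: (the name comes from the first letters of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan)

awk is a powerful report generator. Unlike sed and grep, its focus is on presenting text information well, and it is commonly used for statistics and formatted output.

awk is like a miniature shell with its own syntax: loops, arrays, conditionals, functions, built-in variables, and so on. It usually operates on plain-text files or plain-text streams.

On Linux, the awk in use is GNU awk (gawk); awk and gawk are in fact the same command: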

[19:48 root@Centos6.8~]# ll /bin/awk 
lrwxrwxrwx. 1 root root 4 Jul 20 02:11 /bin/awk -> gawk

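Syntax:

awk [options] 'script' file1 file2 …

awk [options] 'PATTERN { action }' file1 file2 …: the two forms are equivalent. The script consists of two parts, a pattern and an action; note that the action must be enclosed in {}.

-F: specifies the field separator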

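Processing:
awk reads the input line by line and splits each line into pieces, called fields, on a separator (by default, runs of spaces and tabs). The fields are referred to as $1, $2, and so on, and $0 is the entire line. Note that awk variables are referenced without a $ prefix.
How awk works:
1) awk reads one line of the file at a time
2) Each line is tested against the given patterns in order; when a pattern matches, its action is executed
3) If no pattern matches, nothing is done
4) In the syntax above, the pattern and the action are each optional, but at least one of them must be present
5) If no pattern is given, the action is executed for every input line
6) If no action is given, the default is to print the line
7) An empty action {} does nothing, which is different from omitting the action
8) Statements within an action are separated by semicolons (or newlines)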
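
The rules above can be sketched with a few one-liners. This is only an illustration: the sample data is made up and fed through a pipe rather than read from a real file.

```shell
# Sample input (made up for illustration): name and UID pairs.
data=$(printf 'root 0\nbin 1\nalice 500\n')

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$data" | awk '$2 >= 500')

# Action only: the action runs for every input line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

# Empty action {}: the line matches, but nothing is printed.
empty_action=$(printf '%s\n' "$data" | awk '$2 >= 500 {}')

printf '%s\n' "$pattern_only"      # alice 500
printf '%s\n' "$action_only"       # root, bin, alice (one per line)
printf '[%s]\n' "$empty_action"    # []
```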

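Execution steps:

Step 1: execute the statements in the BEGIN{action;… } block [this runs before any input is read, so a program consisting only of BEGIN needs no file argument]

Step 2: read a line from the file or from standard input (stdin) and execute the pattern{ action;… } blocks; awk scans the input line by line, repeating this from the first line to the last until everything has been read.

Step 3: when the end of the input stream is reached, execute the END{action;…} block [END runs only after the input has been consumed, so a file argument or stdin must be supplied]

The BEGIN block is executed before awk starts reading lines from the input stream. It is optional; statements such as variable initialization or printing a table header usually go here.

The END block is executed after awk has read all lines from the input stream. Summaries such as printing the analysis results for all lines are done here. It is also optional.

The main pattern { action } block is the most important part, and it too is optional. It is evaluated for every line awk reads. If a pattern is given without an action, the default action { print } is used, which prints each matching line.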
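
As a rough sketch of the three phases, the following program prints a header in BEGIN, handles each line in the main block, and summarizes in END. The colon-separated input is invented for the example.

```shell
report=$(printf 'root:0\nbin:1\ndaemon:2\n' | awk -F: '
BEGIN { print "NAME UID" }                        # before any input is read
      { print $1, $2; total += $2 }               # once per input line
END   { print "lines:", NR, "uid sum:", total }   # after the last line
')
printf '%s\n' "$report"
# NAME UID
# root 0
# bin 1
# daemon 2
# lines: 3 uid sum: 3
```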

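I. Common output statements:

1. print:

print item1, item2, …

. Items are separated by commas in the statement, and the output field separator (OFS) is placed between them in the output

. Each item can be a string or number, a field of the current record, a variable, or an awk expression; numbers are implicitly converted to strings for output

. Fields are referenced much like bash positional parameters; if the items after print are omitted, it is equivalent to print $0; to output an empty line, use print ""

Demonstration: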

[19:17 root@Centos6.8~]# awk -F: '{print $1,$3}' /etc/passwd
root 0
bin 1
daemon 2
adm 3
lp 4
sync 5
shutdown 6
halt 7
mail 8
uucp 10
operator 11
games 12
gopher 13
[19:34 root@Centos6.8~]# awk -F: '{print $1,$3,$7}' /etc/passwd
root 0 /bin/bash
bin 1 /sbin/nologin
daemon 2 /sbin/nologin
adm 3 /sbin/nologin
lp 4 /sbin/nologin
sync 5 /bin/sync
shutdown 6 /sbin/shutdown
halt 7 /sbin/halt
mail 8 /sbin/nologin
uucp 10 /sbin/nologin
operator 11 /sbin/nologin
games 12 /sbin/nologin
gopher 13 /sbin/nologin
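
A small sketch of how the comma matters in print, using one made-up passwd-style line: a comma inserts OFS between items, while items written side by side are concatenated.

```shell
line='root:x:0'

with_comma=$(printf '%s\n' "$line" | awk -F: '{ print $1, $3 }')
concatenated=$(printf '%s\n' "$line" | awk -F: '{ print $1 $3 }')
with_ofs=$(printf '%s\n' "$line" | awk -F: -v OFS='-' '{ print $1, $3 }')

printf '%s\n' "$with_comma"     # root 0
printf '%s\n' "$concatenated"   # root0
printf '%s\n' "$with_ofs"       # root-0
```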

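2. printf: formatted output

Syntax: printf format, item1, item2, …

Notes:

. A format must be specified

. No newline is added automatically; write \n where a line break is needed

. The format specifies the output format of each item that follows

Format specifiers all start with % followed by a character, as follows:

%c: prints a single character (a numeric argument is printed as the character with that code);

%d, %i: decimal integer;

%e, %E: number in scientific notation;

%f: floating-point number;

%g, %G: number in scientific or floating-point notation, whichever is shorter;

%s: string;

%u: unsigned integer;

%%: a literal %;

Modifiers:

N: field width;

-: left-justify;

+: always show the sign of a number;

Demonstration: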

[19:35 root@Centos6.8~]# awk -F: '{printf "%s%d",$1,$3}' /etc/passwd
root0bin1daemon2adm3lp4sync5shutdown6halt7mail8uucp10operator11games12gopher13ftp14nobody99dbus81usbmuxd113rpc32rtkit499avahi-autoipd170vcsa69abrt173rpcuser29nfsnobody65534haldaemon68ntp38apache48saslauth498postfix89mysql27gdm42pulse497sshd74tcpdump72hill500nihao501
[19:36 root@Centos6.8~]# awk -F: '{printf "%s%d\n",$1,$3}' /etc/passwd
root0
bin1
daemon2
adm3
lp4
sync5
shutdown6
halt7
mail8
uucp10
operator11
games12
gopher13
[19:36 root@Centos6.8~]# awk -F: '{printf "%-20s%-10s\n",$1,$3}' /etc/passwd
root                0         
bin                 1         
daemon              2         
adm                 3         
lp                  4         
sync                5         
shutdown            6         
halt                7         
mail                8         
uucp                10        
operator            11        
games               12        
gopher              13
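
The specifiers and modifiers can be combined, as in this short sketch. The values are arbitrary; it runs entirely in a BEGIN block, so no input file is needed.

```shell
formatted=$(awk 'BEGIN {
    # width 8 left-justified, width 5 right-justified, forced sign
    printf "[%-8s][%5d][%+d]\n", "root", 42, 7
    # two decimals, scientific notation, character from its code, literal %
    printf "[%.2f][%e][%c][%d%%]\n", 3.14159, 1234.5, 65, 90
}')
printf '%s\n' "$formatted"
# [root    ][   42][+7]
# [3.14][1.234500e+03][A][90%]
```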

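II. awk variables

1. Built-in variables: record variables

FS: field separator, the separator used to split input into fields (default: whitespace);

RS: record separator, the separator between input records (default: newline);

OFS: output field separator, placed between items in print output (default: a single space);

ORS: output record separator, appended after each print (default: newline);

Demonstration: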

[19:42 root@Centos6.8~]# awk 'BEGIN{FS=":"}{print $1,$3}' /etc/passwd
root 0
bin 1
daemon 2
adm 3
lp 4
sync 5
shutdown 6
halt 7
mail 8
uucp 10
operator 11
games 12
gopher 13

[19:43 root@Centos6.8~]# awk 'BEGIN{FS=":";OFS="#"}{print $1,$3}' /etc/passwd
root#0
bin#1
daemon#2
adm#3
lp#4
sync#5
shutdown#6
halt#7
mail#8
uucp#10
operator#11
games#12
gopher#13
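
RS and ORS are not covered by the demonstrations above, so here is a minimal sketch with invented input: records are separated by semicolons on input and joined with " | " on output.

```shell
# RS=";" splits the input into records at semicolons instead of newlines;
# ORS=" | " is written after each printed record instead of a newline.
joined=$(printf 'a:1;b:2;c:3' |
    awk 'BEGIN { FS=":"; RS=";"; ORS=" | " } { print $1 }')
printf '%s\n' "$joined"    # a | b | c | 
```

Note that ORS is appended after every record, including the last one, which is why the output ends with a trailing separator.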

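2. Built-in variables: data variables

NR: the number of input records, i.e. the number of records awk has processed so far; with multiple files, the count continues across all of them;

NF: number of fields in the current record;

FNR: unlike NR, FNR is the record number within the file currently being processed; it restarts for each file;

ARGV: array holding the command-line arguments; for awk '{print $0}' a.txt b.txt, ARGV[0] is awk and ARGV[1] is a.txt (the program text is not included);

ARGC: the number of command-line arguments (including ARGV[0]);

FILENAME: the name of the file awk is currently processing;

ENVIRON: associative array of the current shell's environment variables and their values;

Demonstration:

[19:45 root@Centos6.8~]# awk -F: '{print NR,$1}' /etc/issue /etc/passwd # NR counts lines across all files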
1 CentOS release 6.8 (Final)
2 Kernel \r on an \m
3 $(hostname)
4 `date`
5 root
6 bin
7 daemon
8 adm
9 lp
10 sync
11 shutdown
12 halt
13 mail
14 uucp
15 operator
16 games
17 gopher

[19:48 root@Centos6.8~]# awk  '{print NF,$1}' /etc/issue # show the number of fields on each line
4 CentOS
5 Kernel
1 $(hostname)
1 `date`
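
To see NR, FNR, FILENAME, ARGC, and ARGV side by side, the sketch below creates two throwaway files in a temporary directory. The file names one.txt, two.txt, a.txt, and b.txt are purely illustrative.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n'    > "$tmpdir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counts=$(cd "$tmpdir" && awk '{ print FILENAME, NR, FNR, NF }' one.txt two.txt)

# A BEGIN-only program never opens its file arguments, so they need not exist.
args=$(awk 'BEGIN { print ARGC; for (i = 1; i < ARGC; i++) print i, ARGV[i] }' a.txt b.txt)

printf '%s\n' "$counts"
# one.txt 1 1 1
# one.txt 2 2 1
# two.txt 3 1 1
printf '%s\n' "$args"
# 3
# 1 a.txt
# 2 b.txt
rm -rf "$tmpdir"
```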

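3. User-defined variables

Variables can be defined on the command line with the -v option

gawk lets users define their own variables for use in the program. Naming rules match most programming languages: only letters, digits, and underscores are allowed, and a name cannot start with a digit. gawk variable names are case-sensitive.

Demonstration:

[19:52 root@Centos6.8~]# awk -F: -v A="nihao" '{print A,$1}' /etc/passwd  # also shows that variables are referenced without $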
nihao root
nihao bin
nihao daemon
nihao adm
nihao lp
nihao sync
nihao shutdown
nihao halt
nihao mail
nihao uucp
nihao operator
nihao games
nihao gopher
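
A common use of -v is passing a shell variable into the awk program. In this sketch the names threshold and limit, and the sample data, are invented for illustration.

```shell
threshold=500

# -v copies the shell value into the awk variable "limit" before the program runs.
labels=$(printf 'root:0\nalice:500\n' |
    awk -F: -v limit="$threshold" '{ print $1, ($2 >= limit ? "common" : "system") }')

printf '%s\n' "$labels"
# root system
# alice common
```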

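III. awk arrays

Defining arrays

1: Numbers can be used as array indexes (subscripts)

array[1]="hello awk"

array[2]="9527"

2: Strings can be used as array indexes (subscripts)

array["first"]="hello "

array["last"]="awk"

array["birth"]="9527"

In use, print array[1] yields "hello awk", while print array[2] and print array["birth"] both yield "9527".

Example:

split(string, array [, fieldsep [, seps ] ])

Purpose: splits the string string on the separator fieldsep and stores the pieces in the array named array (indexed from 1); returns the number of pieces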

[16:44 root@Centos6.8~]#awk 'BEGIN{info="it is a test";split(info,tA," ");for(k in tA){print k,tA[k];}}'
4 test
1 it
2 is
3 a

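About the for…in output: awk arrays are associative and therefore unordered, so the traversal order of for…in is unspecified. To process the elements in order, access them by index.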

[16:49 root@Centos6.8~]# awk 'BEGIN{info="it is a test";tlen=split(info,tA," ");for(k=1;k<=tlen;k++){print k,tA[k];}}' 
1 it
2 is
3 a
4 test

Filtering duplicate entries (using a true/false test; !arry[$1]++ is true only the first time a given $1 is seen):

[11:14 root@centos6.8~]# cat test 
1.1.1.1,1aaa
1.1.1.1,1bbb
1.1.1.1,1ccc
2.2.2.2,2aaa
2.2.2.2,2bbb
3.3.3.3,3aaa
[11:14 root@centos6.8~]# awk -F"," '!arry[$1]++' test 
1.1.1.1,1aaa
2.2.2.2,2aaa
3.3.3.3,3aaa
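
The same associative-array idea can count occurrences instead of filtering them. This is a sketch with made-up data; the array name seen is arbitrary, and the output is piped through sort because for…in order is unspecified.

```shell
counts=$(printf '1.1.1.1,a\n1.1.1.1,b\n2.2.2.2,c\n' |
    awk -F, '{ seen[$1]++ } END { for (ip in seen) print ip, seen[ip] }' |
    sort)

printf '%s\n' "$counts"
# 1.1.1.1 2
# 2.2.2.2 1
```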

IV. Output redirection

print items > output-file: write to a file

print items >> output-file: append to a file

print items | command: pipe the output to a command for processing

Demonstration:

[20:02 root@Centos6.8~]# awk -F: -v A="nihao" '{print A,$1 > "/root/awktest"}' /etc/passwd
[20:03 root@Centos6.8~]# cat awktest 
nihao root
nihao bin
nihao daemon
nihao adm
nihao lp
nihao sync
nihao shutdown
nihao halt
nihao mail
nihao uucp
nihao operator
nihao games
nihao gopher
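
The pipe form is not shown above, so here is a small sketch that writes the same lines both to a file and to a sort process. The temporary file and the variable name out are assumptions for the example.

```shell
outfile=$(mktemp)

sorted=$(printf 'bin\nroot\nadm\n' | awk -v out="$outfile" '
    { print $1 > out }       # the file is truncated once, then stays open
    { print $1 | "sort" }    # every line goes to one shared sort process
    END { close(out); close("sort") }
')
saved=$(cat "$outfile")
rm -f "$outfile"

printf '%s\n' "$sorted"   # adm, bin, root (sorted, one per line)
printf '%s\n' "$saved"    # bin, root, adm (original order, one per line)
```

Inside awk, > truncates the file only when it is first opened, so later prints within the same run keep appending to it.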

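V. awk operators

1. Arithmetic operators:

-x: negation

+x: convert to a number;

x^y: exponentiation

x**y: exponentiation

x*y: multiplication

x/y: division

x+y: addition

x-y: subtraction

x%y: modulo (remainder)

2. Assignment operators:

=

+=

-=

*=

/=

%=

^=

**=

++

--

Demonstration: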

[21:41 root@Centos6.8~]# awk '{ x+=$2+$3 }{print $0,x}' test 
3 5 6 7 11
2 3 1 0 15
4 5 6 9 26
2 3 4 4 33
2 2 1 0 36
4 5 0 9 41
[21:41 root@Centos6.8~]# cat test 
3 5 6 7
2 3 1 0
4 5 6 9
2 3 4 4
2 2 1 0
4 5 0 9

Note: if a pattern is the = sign itself, writing /=/ may cause a syntax error (it can be parsed as the /= operator); use /[=]/ instead;
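
A quick sketch of the arithmetic and assignment operators with arbitrary values. It uses ^ rather than ** because ^ is the form specified by POSIX awk.

```shell
results=$(awk 'BEGIN {
    x = 7; y = 2
    print x + y, x - y, x * y, x / y, x % y, x ^ y
    x += 3; y++
    print x, y, -x, "12abc" + 0   # adding 0 converts the leading digits to a number
}')
printf '%s\n' "$results"
# 9 5 14 3.5 1 49
# 10 3 -10 12
```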

3. Comparison operators

x < y

x <= y

x > y

x >= y

x == y

x != y

x ~ y

x !~ y

4. Logical operators

&&

||

5. Ternary expression

selector?if-true-exp:if-false-exp

6. Function calls:

function_name (para1,para2)

Demonstration:

Check whether the UID is greater than or equal to 500; if so, display "common user", otherwise display "system user"

[20:10 root@Centos6.8~]# awk -F: '{$3<500?A="system user":A="common user"}{print $1,A}' /etc/passwd # ternary expression
root system user
bin system user
daemon system user
adm system user
lp system user
sync system user
shutdown system user
halt system user
mail system user
uucp system user
operator system user
games system user
gopher system user
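
A slightly more conventional way to write the same test is to assign the result of the ternary expression once. This is a sketch on two invented passwd-style lines, so that both branches are visible.

```shell
kinds=$(printf 'root:x:0\nhill:x:500\n' |
    awk -F: '{ kind = ($3 < 500) ? "system user" : "common user"; print $1, kind }')

printf '%s\n' "$kinds"
# root system user
# hill common user
```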

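VI. Common pattern types:

1. Regexp: regular expression, written /regular expression/

2. expression: an expression; the condition holds when its value is non-zero or a non-empty string, e.g. $1 ~ /foo/ [regex match] or $1 == "VALUE" [exact match]; the operators ~ (matches) and !~ (does not match) are used for regex tests.

3. Ranges: a range of lines, written pat1,pat2; it matches from a line matching pat1 through the next line matching pat2

4. BEGIN/END: special patterns that run once before any input is read and once after all input has been read (commonly used for headers and closing summaries)

5. Empty (empty pattern): matches every input line;

Demonstration:

[20:11 root@Centos6.8~]# awk -F: '/^r/{print $1,$3}' /etc/passwd # using a regular expression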
root 0
rpc 32
rtkit 499
rpcuser 29

[20:16 root@Centos6.8~]# awk -F: 'BEGIN{print "TEST BEGIN"}/^r/{print $1,$3}END{print "TEST OVER"}' /etc/passwd # BEGIN and END in use
TEST BEGIN
root 0
rpc 32
rtkit 499
rpcuser 29
TEST OVER
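
The range and expression pattern types are not demonstrated above. The following sketch tries each on a few made-up lines.

```shell
sample=$(printf 'root:0\nbin:1\ndaemon:2\nadm:3\n')

# Range: from the first line matching /^bin/ through the next line matching /^daemon/.
range=$(printf '%s\n' "$sample" | awk -F: '/^bin/,/^daemon/ { print $1 }')
# Exact comparison against a string.
exact=$(printf '%s\n' "$sample" | awk -F: '$1 == "adm" { print $2 }')
# Regex match on a single field.
fuzzy=$(printf '%s\n' "$sample" | awk -F: '$1 ~ /^[ab]/ { print $1 }')
# Expression on the line number, with the default print action.
byline=$(printf '%s\n' "$sample" | awk 'NR >= 2 && NR <= 3')

printf '%s\n' "$range"    # bin, daemon
printf '%s\n' "$exact"    # 3
printf '%s\n' "$fuzzy"    # bin, adm
printf '%s\n' "$byline"   # bin:1, daemon:2
```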

VII. Control statements

if-else

Syntax: if (condition) {then-body} else {[ else-body ]}

Demonstration:

[10:10 root@centos6.8~]# df -P
Filesystem                   1024-blocks    Used Available Capacity Mounted on
/dev/mapper/VolGroup-lv_root    51475068 2740208  46113420       6% /
tmpfs                            1954768       0   1954768       0% /dev/shm
/dev/sda1                         487652   40654    421398       9% /boot
/dev/mapper/VolGroup-lv_home    10190136   36884   9628968       1% /home
[10:11 root@centos6.8~]# df -P|awk '{i=$5+0;if (i > 5){print $1,$5}}'
/dev/mapper/VolGroup-lv_root 6%
/dev/sda1 9%

Note: $N+0 forces $N to be treated as a number; here it turns a value such as 6% into 6, so the comparison is numeric

[20:24 root@Centos6.8~]# awk -F: '{if ($3<500) {print $1,"system user"} else {print $1,"common user"}}' /etc/passwd
root system user
bin system user
daemon system user
adm system user
lp system user
sync system user
shutdown system user
halt system user
mail system user
uucp system user
operator system user
games system user
gopher system user

while

Syntax: while (condition){statement1; statement2; …}

[20:24 root@Centos6.8~]# awk -F: '{i=1;while (i<=3) {print $i;i++}}' /etc/passwd # print the first three fields of each line, one per line
root
x
0
bin
x
1
daemon
x
2
adm
x
3
lp
x
4
sync
x
5
shutdown
x
6
halt
x
7
mail
x
8
uucp
x
10
operator
x
11
games
x
12
gopher
x
13

for

Syntax: for ( variable assignment; condition; iteration process) { statement1; statement2; …}

Demonstration:

[20:25 root@Centos6.8~]# awk -F: '{for(i=1;i<=3;i++) print $i}' /etc/passwd
root
x
0
bin
x
1
daemon
x
2
adm
x
3
lp
x
4
sync
x
5
shutdown
x
6
halt
x
7
mail
x
8
uucp
x
10
operator
x
11
games
x
12
gopher
x
13

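A for loop can also be used to iterate over array elements:

Syntax: for (i in array) {statement1; statement2; …}

switch-case

Syntax: switch (expression) { case VALUE or /REGEXP/: statement1; statement2; … default: statement1; …}

break and continue

Commonly used in loops; break is also used in switch statements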
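
A minimal sketch of continue and break inside a for loop. The loop bounds are arbitrary, and everything runs in a BEGIN block.

```shell
picked=$(awk 'BEGIN {
    out = ""
    for (i = 1; i <= 10; i++) {
        if (i % 2 == 0) continue    # skip even numbers, go to the next iteration
        if (i > 7) break            # leave the loop entirely once i exceeds 7
        sep = (out == "") ? "" : " "
        out = out sep i
    }
    print out
}')
printf '%s\n' "$picked"    # 1 3 5 7
```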

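next

Ends processing of the current line early and moves on to the next line; for example, the following command displays users whose ID number is odd:

Demonstration: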

[20:29 root@Centos6.8~]# awk -F: '{if($3%2==0) next;print $1,$3}' /etc/passwd
bin 1
adm 3
sync 5
halt 7
operator 11
gopher 13

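VIII. awk built-in functions

split(string, array [, fieldsep [, seps ] ])

Purpose: splits the string string on the separator fieldsep and stores the pieces in the array named array

length([string])

Purpose: returns the number of characters in string (in $0 when the argument is omitted);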

[20:35 root@Centos6.8~]# awk -F: '{print $1,length($1)}' /etc/passwd
root 4
bin 3
daemon 6
adm 3
lp 2
sync 4
shutdown 8
halt 4
mail 4
uucp 4
operator 8
games 5
gopher 6

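substr(string, start [, length])

Purpose: returns the substring of string that starts at position start and is length characters long (through the end of the string if length is omitted); positions are counted from 1;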
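
A short sketch of substr together with length and toupper, on an arbitrary string:

```shell
parts=$(awk 'BEGIN {
    s = "hello awk"
    print length(s)                # 9 characters, counting the space
    print substr(s, 1, 5)          # five characters starting at position 1
    print substr(s, 7)             # from position 7 to the end
    print toupper(substr(s, 7))    # function calls can be nested
}')
printf '%s\n' "$parts"
# 9
# hello
# awk
# AWK
```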

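system(command)

Purpose: runs the system command and returns its exit status to awk; the command's own output goes directly to standard output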
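
The sketch below shows that system() hands back the command's exit status rather than its output. The two commands are chosen only for illustration.

```shell
status_report=$(awk 'BEGIN {
    rc = system("echo from-the-shell")   # the echo output goes straight to stdout
    print "exit status:", rc
    rc = system("exit 3")                # the shell exits with status 3
    print "exit status:", rc
}')
printf '%s\n' "$status_report"
# from-the-shell
# exit status: 0
# exit status: 3
```

To capture a command's output inside awk, the usual tool is command | getline rather than system().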

systime()

Purpose: returns the current system time as the number of seconds since the epoch

[20:30 root@Centos6.8~]#awk -F: '{print $1,systime()}' /etc/passwd
root 1472992355
bin 1472992355
daemon 1472992355
adm 1472992355
lp 1472992355
sync 1472992355
shutdown 1472992355
halt 1472992355
mail 1472992355
uucp 1472992355
operator 1472992355
games 1472992355
gopher 1472992355

tolower(s)

Purpose: converts all letters in s to lowercase

toupper(s)

Purpose: converts all letters in s to uppercase

[20:33 root@Centos6.8~]# awk -F: '{print $1,toupper($7)}' /etc/passwd
root /BIN/BASH
bin /SBIN/NOLOGIN
daemon /SBIN/NOLOGIN
adm /SBIN/NOLOGIN
lp /SBIN/NOLOGIN
sync /BIN/SYNC
shutdown /SBIN/SHUTDOWN
halt /SBIN/HALT
mail /SBIN/NOLOGIN
uucp /SBIN/NOLOGIN
operator /SBIN/NOLOGIN
games /SBIN/NOLOGIN
gopher /SBIN/NOLOGIN

Original article by 麥德良. If you repost it, please credit the source: http://www.www58058.com/48407
