grep, egrep, fgrep and Regular Expressions Explained

老鼠上了貓 | 2015-07-01 10:19 | Linux Essentials, System Operations

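Outline
1. grep variants
   1.1 Basic definitions
   1.2 Common options
   1.3 Less common options
2. Regular expressions
   2.1 Basic definition
   2.2 Regular expression syntax
      2.2.1 Basic regular expressions
      2.2.2 Extended regular expressions
      2.2.3 Fast (fixed-string) matching
3. Examples
   3.1 grep option examples
   3.2 Regular expression examples
      3.2.1 Basic regular expression examples
      3.2.2 Extended regular expression examples
      3.2.3 Fast (fixed-string) matching examples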

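1. grep variants

1.1 Basic definitions:
      grep (Global search Regular Expression and Print out the line) searches its input for lines that
      match a regular expression and prints them. It is a powerful text search tool that works one line
      at a time: a line is printed in full whenever any part of it matches the pattern.

      egrep: equivalent to grep -E; it filters text using extended regular expressions.

      fgrep: equivalent to grep -F; it treats the pattern as a fixed string and does not interpret
      regular expressions. Recent GNU grep releases (3.8 and later) warn that egrep and fgrep are
      obsolescent, so grep -E and grep -F are the preferred spellings.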
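
The difference between the three modes is easiest to see on a tiny input. The following is a minimal sketch; the temporary file and its three lines are made up for illustration and are not part of the original article.

```shell
# Hypothetical sample data: three short lines in a temporary file.
sample=$(mktemp)
printf '%s\n' 'a.c' 'abc' 'a+c' > "$sample"

# Basic regex (the default): "." matches any single character.
bre_out=$(grep 'a.c' "$sample")

# Fixed string (-F, what fgrep does): "." is just a dot.
fixed_out=$(grep -F 'a.c' "$sample")

# Extended regex (-E, what egrep does): "+" is a quantifier.
ere_out=$(grep -E 'ab+c' "$sample")

printf 'BRE:\n%s\n-F:\n%s\n-E:\n%s\n' "$bre_out" "$fixed_out" "$ere_out"
rm -f "$sample"
```

The default mode selects all three lines, -F selects only the line that literally contains "a.c", and -E selects only "abc".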

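1.2 Common options:
-E: use extended regular expressions; equivalent to egrep
-F: treat the patterns as a list of fixed strings; equivalent to fgrep
-G: use basic regular expressions (the default)

-n: prefix each matching line with its line number
-i: ignore case
-y: obsolete synonym for -i
-v: invert the match and select the lines that do not match
-w: match whole words only; the match must not be adjacent to letters, digits, or underscores
-c: print the number of matching lines instead of the lines themselves
-o: print only the matched part of each line
-A NUM: also print NUM lines after each matching line
-B NUM: also print NUM lines before each matching line
-C NUM: also print NUM lines before and after each matching line
--color=auto: highlight the matched text when writing to a terminal
--help: show usage help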
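
As a quick illustration of how several of these options behave, here is a small sketch. The sample file and its contents are assumptions chosen for the demonstration, not data from the article.

```shell
# Hypothetical sample data resembling passwd-style lines.
sample=$(mktemp)
printf '%s\n' 'root:x:0' 'Root shell' 'aroot:x:1' 'bin:x:2' > "$sample"

numbered=$(grep -n 'root' "$sample")    # matching lines with line numbers
nocase=$(grep -ic 'root' "$sample")     # case-insensitive count of matching lines
words=$(grep -w 'root' "$sample")       # "root" as a whole word only
inverted=$(grep -vc 'root' "$sample")   # count of lines that do NOT match

printf '%s\n' "$numbered" "$nocase" "$words" "$inverted"
rm -f "$sample"
```

Note that -w rejects "aroot" because the match is preceded by a letter, while -i additionally accepts "Root".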

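1.3 Less common options:
-x: match only when the pattern matches the entire line
-l: --files-with-matches; print only the names of files that contain a match
-L: --files-without-match; print only the names of files that contain no match
-f FILE: read patterns from FILE, one per line; an empty file contains no patterns and therefore matches nothing
-e PATTERN: use PATTERN as the pattern; repeat -e to search for several patterns, or use it for a pattern that begins with "-"
-q: quiet mode; write nothing to standard output and exit immediately with status 0 as soon as a match is found
-s: suppress error messages about nonexistent or unreadable files
-H: prefix each matching line with the file name; this is the default when more than one file is searched
-h: do not prefix matching lines with the file name; this is the default when a single file is searched
-b: prefix each output line with its byte offset within the input
-m NUM: stop reading a file after NUM matching lines
-a, --text: process a binary file as if it were text
-R, -r, --recursive: search directories recursively (in GNU grep, -R also follows all symbolic links)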
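
The options -q, -l, -L and -m are mostly useful in scripts. The sketch below shows one way they might be combined; the directory, file names and contents are hypothetical.

```shell
# Hypothetical working directory with two small files.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/one.txt"
printf 'gamma\n' > "$dir/two.txt"

# -q prints nothing; only the exit status tells us whether a match exists.
if grep -q 'beta' "$dir/one.txt"; then found=yes; else found=no; fi

# -l lists files with a match, -L lists files without one.
with=$(cd "$dir" && grep -l 'alpha' one.txt two.txt)
without=$(cd "$dir" && grep -L 'alpha' one.txt two.txt)

# -m 1 stops after the first matching line.
first=$(grep -m 1 'a' "$dir/one.txt")

printf '%s %s %s %s\n' "$found" "$with" "$without" "$first"
rm -rf "$dir"
```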

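2. Regular expressions

2.1 Basic definition:
A regular expression (often abbreviated RE) is a notation for describing sets of strings. With the help of a
few special characters it lets you search, delete, and replace text. grep, vim, awk, sed and many other tools support REs.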

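2.2 Regular expression syntax
2.2.1 Basic regular expressions
a) Anchors
^ : start-of-line anchor
$ : end-of-line anchor
\<: start-of-word anchor
\>: end-of-word anchor
\b: word-boundary anchor; before a word it acts like \<, after a word it acts like \>
^$: matches an empty line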

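b) Character and repetition matching
.: matches any single character
*: matches zero or more repetitions of the character before the asterisk
[]: matches any one character from the listed set
[^]: matches any one character that is not in the listed set
\{m\}: the preceding item repeated exactly m times
\{m,n\}: the preceding item repeated at least m and at most n times
\(\): grouping; the text matched by each group can be referenced later as \1, \2, \3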
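
The anchors and repetition operators above can be tried out on a small file. This is a minimal sketch with made-up sample lines; -x is used in one case so that the pattern has to cover the whole line.

```shell
# Hypothetical sample data, including one empty line.
sample=$(mktemp)
printf '%s\n' 'root' 'rot' 'rt' 'foo bar foo' '' 'aroot' > "$sample"

starts=$(grep '^ro' "$sample")               # lines beginning with "ro"
blank=$(grep -c '^$' "$sample")              # number of empty lines
two_o=$(grep 'ro\{2\}t' "$sample")           # "r", exactly two "o", then "t"
star=$(grep -x 'ro*t' "$sample")             # whole line is r, any number of o, t
backref=$(grep '^\(foo\).*\1$' "$sample")    # starts and ends with the same group

printf '%s\n' "$starts" "$blank" "$two_o" "$star" "$backref"
rm -f "$sample"
```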

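c) POSIX character classes (used inside a bracket expression, for example [[:digit:]])
[:alnum:]: letters and digits, [0-9a-zA-Z]
[:alpha:]: letters, [a-zA-Z]
[:cntrl:]: control characters
[:digit:]: digits
[:graph:]: visible characters, that is, printable characters other than space
[:lower:]: lowercase letters
[:print:]: printable characters, including space
[:punct:]: punctuation characters
[:space:]: whitespace characters such as space and tab
[:upper:]: uppercase letters
[:xdigit:]: hexadecimal digits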
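
A class name only works inside a bracket expression, hence the doubled brackets. The sketch below counts or prints lines for a few classes; the sample lines are assumptions for illustration.

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' '15379111' 'this is test file' 'THIS IS TEST FILE' 'End.' > "$sample"

digits=$(grep -c '[[:digit:]]' "$sample")         # lines containing a digit
upper=$(grep -c '[[:upper:]]' "$sample")          # lines containing an uppercase letter
punct=$(grep '[[:punct:]]' "$sample")             # lines containing punctuation
nospace=$(grep -c '^[^[:space:]]*$' "$sample")    # lines with no whitespace at all

printf '%s %s %s %s\n' "$digits" "$upper" "$punct" "$nospace"
rm -f "$sample"
```

The last pattern shows that a class can be negated like any other bracket expression by placing ^ right after the outer bracket.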

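2.2.2 Extended regular expressions
Extended regular expressions are used in the same way as basic ones. The difference is that the grouping and
repetition operators need no backslash (the word anchors still do), and a few new operators are added:
a) Operators written without a backslash
(): grouping, equivalent to \(\) in grep
{m}: equivalent to \{m\} in grep; exactly m repetitions
{m,n}: equivalent to \{m,n\} in grep; at least m and at most n repetitions
\<: start-of-word anchor
\>: end-of-word anchor
\b: word-boundary anchor; before a word it acts like \<, after a word it acts like \>
+: one or more repetitions of the preceding item
?: zero or one repetition of the preceding item (a line with more repetitions still matches, because grep only needs some part of the line to match)
|: alternation; a|b matches a or b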
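
To make the effect of + and ? visible, the pattern has to be anchored or matched against the whole line; otherwise a shorter substring match hides the difference. A minimal sketch with assumed sample lines:

```shell
# Hypothetical sample data: "r", a varying number of "o", then "t".
sample=$(mktemp)
printf '%s\n' 'rt' 'rot' 'root' 'rooot' > "$sample"

plus=$(grep -E -x 'ro+t' "$sample")       # one or more "o"
opt=$(grep -E -x 'ro?t' "$sample")        # zero or one "o"
ere=$(grep -E -x 'ro{2,3}t' "$sample")    # interval, extended syntax
bre=$(grep -x 'ro\{2,3\}t' "$sample")     # same interval, basic syntax

printf '%s\n' "$plus" "$opt" "$ere" "$bre"
rm -f "$sample"
```

The last two commands select the same lines, which shows that the basic and extended forms differ only in the backslashes.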

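2.2.3 Fast (fixed-string) matching
fgrep (grep -F) accepts the same common and less common options as grep, but treats every pattern as a literal string.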

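3. Examples (for convenience, an alias adds --color=auto to grep by default)

3.1 grep options
a) Common options

# -n: show line numbers. Print every line of /etc/passwd that contains the string "root", prefixed with its line number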
[root@localhost tmp]# grep -n "root" /etc/passwd 
1:root:x:0:0:root:/root:/bin/bash
11:operator:x:11:0:operator:/root:/sbin/nologin
89:roota:x:33130:33130::/home/roota:/bin/bash
90:aroot:x:33131:33131::/home/aroot:/bin/bash

# -i: ignore case. Print every line of /etc/rc0.d/K80kdump that contains "PRO", regardless of case.
[root@localhost ~]# grep -i "PRO" /etc/rc0.d/K80kdump 
# Provides: kdump 
# Description:  The kdump init script provides the support necessary for
KDUMP_IDE_NOPROBE_COMMANDLINE=""

# -y: same as -i; ignore case

# -v: invert the match. Print the lines of /etc/passwd that do not contain the string "root"
[root@localhost ~]# grep -v "root" /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

# -w: whole-word match. Print the lines of /etc/passwd that contain "root" as a word. Unlike the -n example above,
#    the roota and aroot accounts no longer match
[root@localhost tmp]# grep -w "root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

# -c: count matching lines. Count the lines of /etc/passwd that contain "root"; the -n example above already
#    shows that there are 4
[root@localhost tmp]# grep -c "root" /etc/passwd
4

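# -o: print only the matched text. The aroot entry contains "aroot" twice (user name and home directory), so it is printed twice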
[root@localhost tmp]# grep -o "aroot" /etc/passwd
aroot
aroot

# -A NUM: also print NUM lines after each matching line
[root@localhost tmp]# grep -A1 "apache" /etc/passwd
apache:x:48:48:Apache:/var/www:/sbin/nologin
saslauth:x:498:76:"Saslauthd user":/var/empty/saslauth:/sbin/nologin

# -B NUM: also print NUM lines before each matching line
[root@localhost tmp]# grep -B1 "apache" /etc/passwd
ntp:x:38:38::/etc/ntp:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin

# -C NUM: also print NUM lines before and after each matching line
[root@localhost tmp]# grep -C1 "apache" /etc/passwd
ntp:x:38:38::/etc/ntp:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
saslauth:x:498:76:"Saslauthd user":/var/empty/saslauth:/sbin/nologin
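
The three context options can be compared side by side on a file whose lines are easy to tell apart. This is a small sketch; the five-line sample file is made up.

```shell
# Hypothetical sample data: one word per line.
sample=$(mktemp)
printf '%s\n' one two three four five > "$sample"

after=$(grep -A1 'three' "$sample")     # match plus 1 line after
before=$(grep -B1 'three' "$sample")    # match plus 1 line before
around=$(grep -C1 'three' "$sample")    # match plus 1 line on each side

printf '%s\n---\n%s\n---\n%s\n' "$after" "$before" "$around"
rm -f "$sample"
```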

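b) Less common options:

# -x: whole-line match. The pattern must match the entire line; this example finds the line of shell.sh that is
#    exactly "#!/bin/bash" (single quotes keep bash from treating ! as a history expansion)
[root@localhost scripts]# grep -x '#!/bin/bash' shell.sh 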
#!/bin/bash

# -l: --files-with-matches; print only the names of files that contain a match. If /etc/passwd contains the string
#    "root", its name is printed; otherwise nothing is shown
[root@localhost scripts]# grep -l "root" /etc/passwd
/etc/passwd

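# -L: --files-without-match; print only the names of files that contain no match, the opposite of -l. If /etc/passwd
#    does not contain the string "ro0ot", its name is printed; otherwise nothing is shown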
[root@localhost scripts]# grep -L "ro0ot" /etc/passwd
/etc/passwd

# -f: read patterns from a file; an empty file matches nothing. Create test.txt containing aroot and roota, then
#    use it as the pattern list to select the matching lines of /etc/passwd
[root@localhost tmp]# cat test.txt 
aroot
roota
[root@localhost tmp]# grep -f test.txt /etc/passwd
roota:x:33130:33130::/home/roota:/bin/bash
aroot:x:33131:33131::/home/aroot:/bin/bash
# The pattern file can also be created with a redirection:
[root@localhost tmp]# cat > test.in
aroot
broot       
#==> press Ctrl+D to end the input
[root@localhost tmp]# grep -f test.in /etc/passwd
aroot:x:33131:33131::/home/aroot:/bin/bash
broot:x:33132:33132::/home/broot:/bin/bash

# -e: specify a pattern; repeat -e to give several patterns. Print the lines of /etc/passwd that contain
#    aroot or roota
[root@localhost tmp]# grep -e aroot -e roota /etc/passwd
roota:x:33130:33130::/home/roota:/bin/bash
aroot:x:33131:33131::/home/aroot:/bin/bash
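
Repeating -e and reading the same patterns with -f should select the same lines, and -e is also a way to pass a pattern that starts with a dash. A minimal sketch, using hypothetical temporary files:

```shell
# Hypothetical sample data and pattern file.
sample=$(mktemp)
patterns=$(mktemp)
printf '%s\n' 'aroot:1' 'roota:2' 'bin:3' > "$sample"
printf '%s\n' 'aroot' 'roota' > "$patterns"

from_file=$(grep -f "$patterns" "$sample")         # patterns read from a file
from_args=$(grep -e aroot -e roota "$sample")      # same patterns on the command line

# Without -e, grep would parse "-v" as an option instead of a pattern.
dash=$(printf '%s\n' '-v flag' 'other' | grep -e '-v')

printf '%s\n' "$from_file" "$from_args" "$dash"
rm -f "$sample" "$patterns"
```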

# -q: quiet mode; print nothing to standard output and exit immediately with status 0 as soon as a match is found
[root@localhost tmp]# grep -q "root" /etc/passwd
[root@localhost tmp]# echo $?
0

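# -s: suppress error messages about nonexistent or unreadable files. Matching lines are still printed, and a search without matches prints nothing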
[root@localhost tmp]# grep -s "aroot" /etc/passwd
aroot:x:33131:33131::/home/aroot:/bin/bash
[root@localhost tmp]# grep -s "ro0ot" /etc/passwd

# -H: prefix each matching line with the file name (the default when several files are searched); combined with -e here
[root@localhost tmp]# grep -H -e aroot -e roota /etc/passwd
/etc/passwd:roota:x:33130:33130::/home/roota:/bin/bash
/etc/passwd:aroot:x:33131:33131::/home/aroot:/bin/bash

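# -h: do not prefix matching lines with the file name (the default when a single file is searched). With several
#    files the names are shown by default; adding -h hides them, as the two runs below show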
[root@localhost ~]# grep "root" /tmp/aroot.txt /tmp/roota.txt 
/tmp/aroot.txt:aroot
/tmp/roota.txt:roota
[root@localhost ~]# grep -h "root" /tmp/aroot.txt /tmp/roota.txt 
aroot
roota

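# -b: prefix each output line with its byte offset in the input. The wc results below confirm the offsets: the
#    first line is 32 bytes long, the first two lines total 65 bytes, and the first three total 105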
[root@localhost ~]# grep -b bin  /etc/passwd  --color=auto
0:root:x:0:0:root:/root:/bin/bash
32:bin:x:1:1:bin:/bin:/sbin/nologin
65:daemon:x:2:2:daemon:/sbin:/sbin/nologin
105:adm:x:3:4:adm:/var/adm:/sbin/nologin

[root@localhost ~]# head -n 1 /etc/passwd | wc -m
32
[root@localhost ~]# head -n 2 /etc/passwd | tail -n 1 | wc -m
33
[root@localhost ~]# head -n 3 /etc/passwd | tail -n 1 | wc -m
40
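
The same check can be automated: the offset of each line is the running total of the lengths of the lines before it, newline included. The sketch below compares grep -b with such a running total computed by awk; the sample file is hypothetical and assumed to contain only single-byte characters.

```shell
# Hypothetical sample data (ASCII only, so characters and bytes coincide).
sample=$(mktemp)
printf '%s\n' 'root:x:0:0' 'bin:x:1:1' 'daemon:x:2:2' > "$sample"

# Byte offsets as reported by grep: keep only the field before the first colon.
offsets=$(grep -b 'x' "$sample" | cut -d: -f1)

# Running total of line lengths, adding 1 per line for the newline.
expected=$(awk 'BEGIN { total = 0 } { print total; total += length($0) + 1 }' "$sample")

printf 'grep -b:\n%s\nrunning total:\n%s\n' "$offsets" "$expected"
rm -f "$sample"
```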

# -m NUM: stop reading the file after NUM matching lines
[root@localhost ~]# grep -m 2 "root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

# -R, -r, --recursive: search directories recursively
[root@localhost ~]# grep -r "passwd" /etc
Binary file /etc/prelink.cache matches
/etc/rpc:yppasswdd      100009  yppasswd
/etc/rpc:nispasswd      100303  rpc.nispasswdd
Binary file /etc/vmware-tools/plugins/vmsvc/libgrabbitmqProxy.so matches
/etc/default/nss:#  If set to TRUE, the passwd routines in the NIS NSS module will not

3.2 Regular expression examples
3.2.1 Basic regular expression examples

a) Anchors

# ^ : start-of-line anchor. Find the lines that begin with root
[root@localhost ~]# grep "^root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
roota:x:33130:33130::/home/roota:/bin/bash

# $ : end-of-line anchor. Find the lines that end with nologin
[root@localhost ~]# grep "nologin$" /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

# \<: start-of-word anchor. Find the lines in which a word begins with root
[root@localhost ~]# grep "\<root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
roota:x:33130:33130::/home/roota:/bin/bash

# \>: end-of-word anchor. Find the lines in which a word ends with root
[root@localhost ~]# grep "root\>" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
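aroot:x:33131:33131::/home/aroot:/bin/bash

# \b: before a word it acts like \<, after a word it acts like \>; anchoring both ends is equivalent to the -w option.
#    Match the lines of /etc/passwd that contain the word "root"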
[root@localhost ~]# grep "\broot\b" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

b) Character and repetition matching

# .: matches any single character
[root@localhost ~]# grep "ar..t" /etc/passwd
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
aroot:x:33131:33131::/home/aroot:/bin/bash

# *: zero or more repetitions of the character before the asterisk. "ro*t" matches rt, rot, root, rooot and so on in /etc/passwd
[root@localhost ~]# grep "ro*t" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
# To require at least two o's between r and t using *, write rooo*t
[root@localhost ~]# grep "rooo*t" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

# []: matches any one character from the set. Match the lines of /etc/passwd that contain aroot or broot
[root@localhost ~]# grep "[ab]root" /etc/passwd
aroot:x:33131:33131::/home/aroot:/bin/bash
broot:x:33132:33132::/home/broot:/bin/bash

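# [^]: negated set. "[^root]" does not mean "lines without root": it matches any single character other than r, o,
#    or t, so lines that contain root still match because they also contain plenty of other characters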
[root@localhost tmp]# grep "[^root]" /etc/passwd 
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
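# Anchored with ^, it selects the lines whose first character is not r, o, or t (to exclude lines that begin with root, use grep -v "^root")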
[root@localhost tmp]# grep "^[^root]" /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
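
The difference between a negated bracket expression and an inverted match is easy to miss, so here is a small sketch that puts them next to each other. The sample lines are assumptions for illustration.

```shell
# Hypothetical sample data: first characters r, o, b and t.
sample=$(mktemp)
printf '%s\n' 'root:0' 'operator:11' 'bin:1' 'tty:5' > "$sample"

# First character is none of r, o, t: only "bin:1" qualifies.
first_char=$(grep '^[^root]' "$sample")

# Line does not begin with the string "root": everything except "root:0".
no_prefix=$(grep -v '^root' "$sample")

printf '%s\n---\n%s\n' "$first_char" "$no_prefix"
rm -f "$sample"
```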

# \{m\}: exactly m repetitions. Match the lines of /etc/passwd in which the letter o appears twice in a row
[root@localhost ~]# grep "o\{2\}"  /etc/passwd
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin

# \{m,n\}: at least m and at most n repetitions. Match r followed by two to four o's
[root@localhost ~]# grep "ro\{2,4\}"  /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

# \(\): grouping with back-references \1, \2, \3. Match the lines of test.conf that begin with 15 and end with 15
[root@localhost tmp]# grep "^\(15\).*\1$" test.conf 
15:this is test file 15

c) POSIX character classes

# [:alnum:]: letters and digits, [0-9a-zA-Z]
[root@localhost tmp]# grep "[[:alnum:]]" test.conf 
15379111
this is test file 
THIS IS TEST FILE
This is test file

# [:alpha:]: letters, [a-zA-Z]
[root@localhost tmp]# grep "[[:alpha:]]" test.conf 
this is test file 
THIS IS TEST FILE
This is test file

# [:digit:]: digits
[root@localhost tmp]# grep "[[:digit:]]" test.conf 
15379111

# [:lower:]: lowercase letters
[root@localhost tmp]# grep "[[:lower:]]" test.conf 
this is test file 
This is test file

# [:upper:]: uppercase letters
[root@localhost tmp]# grep "[[:upper:]]" test.conf 
THIS IS TEST FILE
This is test file

# [:punct:]: punctuation characters
[root@localhost tmp]# grep "[[:punct:]]" test.conf 
This is test file.

# [:space:]: whitespace characters
[root@localhost tmp]# grep "[[:space:]]" test.conf 
1537911    1
this is test file

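3.2.2 Extended regular expression examples
Extended regular expressions are used in the same way as basic ones. The difference is that the grouping and repetition operators need no backslash (the word anchors still do), and a few new operators are added:
a) Operators written without a backslash

# (): grouping, equivalent to \(\) in grep. Match the lines of test.conf that begin with 15 and end with 15
[root@localhost tmp]# egrep "^(15).*\1$" test.conf 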
15:THIS IS TEST FILE 15

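# {m}, {m,n}: used exactly as in grep but, like (), written without backslashes

# +: one or more repetitions of the preceding item. Match the lines of /etc/passwd that contain r followed by at least one o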
[root@localhost tmp]# egrep "ro+" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
rtkit:x:499:497:RealtimeKit:/proc:/sbin/nologin

# ?: zero or one repetition of the preceding item. "roo?" matches ro or roo
[root@localhost tmp]# egrep "roo?" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
rtkit:x:499:497:RealtimeKit:/proc:/sbin/nologin

# |: alternation; a|b matches a or b. Match aroot or broot in /etc/passwd (note that | must be inside a group, not inside brackets)
[root@localhost tmp]# egrep "(a|b)root" /etc/passwd
aroot:x:33131:33131::/home/aroot:/bin/bash
broot:x:33132:33132::/home/broot:/bin/bash
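
Inside a bracket expression the | character is an ordinary character, so "[a|b]" is a set of three characters rather than an alternation. The following sketch, using made-up sample lines, shows where the two forms disagree.

```shell
# Hypothetical sample data, including a line that starts with a literal "|".
sample=$(mktemp)
printf '%s\n' 'aroot' 'broot' '|root' 'croot' > "$sample"

bracket=$(grep -E '^[a|b]root$' "$sample")   # set of a, | and b
group=$(grep -E '^(a|b)root$' "$sample")     # real alternation: a or b

printf '%s\n---\n%s\n' "$bracket" "$group"
rm -f "$sample"
```

For single characters the plain set "[ab]root" is the simpler choice; the group form becomes necessary once the alternatives are longer than one character.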

3.2.3 Fast (fixed-string) matching examples

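[root@chenss test]# man gcc | tr -cs "[:alpha:]" "\n" > out.conf        
#                    ==> build a file of plain words, one per line, for grep to use as its pattern list
[root@chenss test]# time `man gcc | grep -F -f out.conf > /dev/null`   
#                    ==> time how long fgrep (grep -F) takes to match the man gcc output against the patterns in out.conf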
real    0m1.264s
user    0m1.235s
sys    0m0.128s
[root@chenss test]# time `man gcc | grep -f out.conf > /dev/null`      
#                    ==> time how long plain grep takes to match the man gcc output against the same patterns
real    12m26.280s
user    12m25.121s
sys    0m1.559s
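# The comparison shows that for plain-string matching, fgrep was far faster than grep in this test (about 1.3 seconds versus more than 12 minutes).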
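
When the patterns contain no regular expression operators, -F and the default mode should select exactly the same lines; only the matching strategy differs. The sketch below checks that on a tiny, made-up word list. Timings depend heavily on the grep version and the data, so none are asserted here.

```shell
# Hypothetical word list and text.
words=$(mktemp)
text=$(mktemp)
printf '%s\n' alpha beta gamma > "$words"
printf '%s\n' 'alpha one' 'delta two' 'gamma three' > "$text"

fixed=$(grep -F -f "$words" "$text")   # patterns taken as literal strings
regex=$(grep -f "$words" "$text")      # same patterns taken as basic regexes

printf '%s\n---\n%s\n' "$fixed" "$regex"

# To compare speed on your own data, time each variant, for example:
#   time grep -F -f "$words" "$text" > /dev/null
#   time grep -f "$words" "$text" > /dev/null
rm -f "$words" "$text"
```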

Original article by 老鼠上了貓. If you repost it, please credit the source: http://www.www58058.com/5609


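Comments (3)

stanley 2015-07-01 10:20

The layout is a little rough in places; it might be worth another pass.
- 老鼠上了貓 2015-07-02 09:03
  
  @stanley: I lost edit access once the post was approved, so I can only be more careful next time.
Pavel86 2015-07-28 22:40

Very detailed, I learned a lot.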
