pcre

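Perl Compatible Regular Expressions (or pcre) are not part of POSIX grep. Some implementations add them as an extension (GNU grep has a -P option when it is built with PCRE support), and there is a pcregrep that supports them, but pcre cannot be relied on in traditional grep across both FreeBSD and Linux.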

This table compares the single-letter pcre character classes to their POSIX equivalents:

| Description | POSIX | PCRE Single-Letter |
|---|---|---|
| Word boundaries | See below list for support | \b |
| Digits (0-9) | [[:digit:]] or [0-9] | \d |
| Non-Digits | [^[:digit:]] or [^0-9] | \D |
| Whitespace characters | [[:space:]] or [ \t\r\n\v\f] | \s |
| Non-whitespace characters | [^[:space:]] or [^ \t\r\n\v\f] | \S |
| Alpha-numeric and underscore | [[:alnum:]_] or [A-Za-z0-9_] | \w |
| Non-word characters | [^[:alnum:]_] or [^A-Za-z0-9_] | \W |
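
As a quick illustration of the POSIX column, here is a minimal sketch that uses the bracket classes directly in awk in place of \d and \w. The helper names has_digit and first_word are hypothetical and exist only for this example; it assumes an awk that supports POSIX character classes.

```shell
# Hypothetical helper: keep only lines that contain at least one digit.
# [[:digit:]] is the POSIX spelling of the pcre shorthand \d.
has_digit() {
    awk '/[[:digit:]]/'
}

# Hypothetical helper: print the first "word" on each line.
# [[:alnum:]_]+ stands in for the pcre pattern \w+.
first_word() {
    awk 'match($0, /[[:alnum:]_]+/) { print substr($0, RSTART, RLENGTH) }'
}

printf '%s\n' 'abc' 'a1c' 'x y' | has_digit         # prints: a1c
printf '%s\n' '  foo_bar baz' '--x9--' | first_word # prints: foo_bar, then x9
```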

Word boundary support:

  • awk can use (^|[^_[:alnum:]]|$)

  • grep can use \(^\|[^_[:alnum:]]\|$\) or \(\<\|\>\)

  • egrep and grep -E can use (^|[^_[:alnum:]]|$) or (\<|\>)

In pcre, the word boundary test (\b) works on either the left or the right side of a word. This differs from the \< and \> word boundary sequences supported by grep, each of which must be used on the appropriate side of a word. This chapter focuses on emulating the pcre \b word boundary in awk. For information on \< and \> support for awk, see the previous chapter.
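
To make the boundary emulation concrete, the following is a minimal sketch of matching "cat" as a whole word in awk, roughly what \bcat\b would do in pcre. On the left side only the ^ and non-word alternatives can apply, and on the right side only the non-word and $ alternatives, so the sketch drops the alternative that can never match on each side. The function name whole_word_cat is hypothetical.

```shell
# Hypothetical helper: print lines where "cat" appears as a whole word.
# Left boundary:  start of line OR a non-word character.
# Right boundary: a non-word character OR end of line.
whole_word_cat() {
    awk '/(^|[^_[:alnum:]])cat([^_[:alnum:]]|$)/'
}

# "concatenate" and "cat_food" are rejected because a word character
# (letter or underscore) touches "cat"; the other two lines are printed.
printf '%s\n' 'cat' 'concatenate' 'a cat sat' 'cat_food' | whole_word_cat
```

One difference worth keeping in mind: pcre's \b is zero-width, while this emulation consumes the neighbouring non-word character when there is one, which matters if the matched text is later extracted or replaced.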

Single-letter pcre support can be implemented in awk for all platforms using:

    #!/usr/bin/awk -f

    # Map one pcre single-letter sequence to its POSIX equivalent.
    # Any other string is returned unchanged.
    function expand(seq)
    {
        return seq == "\\b" ? "(^|[^_[:alnum:]]|$)" : \
            seq == "\\d" ? "[[:digit:]]" : \
            seq == "\\D" ? "[^[:digit:]]" : \
            seq == "\\s" ? "[[:space:]]" : \
            seq == "\\S" ? "[^[:space:]]" : \
            seq == "\\w" ? "[[:alnum:]_]" : \
            seq == "\\W" ? "[^[:alnum:]_]" : \
            seq
    }

    # Expand every unescaped \b, \d, \D, \s, \S, \w, and \W in re.
    # head, repl, tail, and rstr are local variables.
    function pcre(re,        head, repl, tail, rstr)
    {
        tail = re
        while (match(tail, "\\\\[bdDsSwW]"))
        {
            head = substr(tail, 1, RSTART - 1) # text before match
            repl = substr(tail, RSTART, RLENGTH) # match to replace
            tail = substr(tail, RSTART + RLENGTH) # text after match
            # Count the run of backslashes ending at the letter. An odd
            # count is a real escape; an even count is a series of escaped
            # backslashes followed by an ordinary letter.
            if ((match(head, /\\+$/) ? RLENGTH + 1 : 1) % 2 == 1)
                repl = expand(repl)
            rstr = rstr head repl
        }
        return rstr tail
    }

    # Test code for processing sample regex from stdin or file argument
    { print $0 " -> " pcre($0) }
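
The parity check inside pcre() decides whether a sequence such as \d is a real escape or just a letter that follows an escaped backslash. The following is a minimal sketch of that rule in isolation, not the script itself. The function name classify is hypothetical, and it inspects only the first backslash run that is followed by one of the supported letters.

```shell
# Hypothetical helper: report whether the first \b, \d, \D, \s, \S, \w, or \W
# candidate on each line would be expanded or kept as-is.
classify() {
    awk '{
        # RLENGTH covers the backslash run plus the letter, so subtract 1.
        n = match($0, /\\+[bdDsSwW]/) ? RLENGTH - 1 : 0
        # Odd run: the last backslash escapes the letter, so expand it.
        # Even run: the backslashes pair up as literals, so keep the text.
        print $0 " -> " (n % 2 == 1 ? "expand" : "keep")
    }'
}

printf '%s\n' '\d' '\\d' '\\\d' | classify
# \d -> expand
# \\d -> keep
# \\\d -> expand
```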
