# pcre

*Perl Compatible Regular Expressions* (or pcre) are not part of POSIX `grep`. A separate `pcregrep` utility supports them, and GNU `grep` accepts `-P` when it has been built with pcre support, but pcre cannot be relied on in the stock `grep` across both FreeBSD and Linux.

This table compares the single-letter pcre character classes to their POSIX equivalents:

| Description                  | POSIX                              | PCRE Single-Letter |
| ---------------------------- | ---------------------------------- | ------------------ |
| Word boundaries              | See the list below                 | `\b`               |
| Digits (0-9)                 | `[[:digit:]]` or `[0-9]`           | `\d`               |
| Non-Digits                   | `[^[:digit:]]` or `[^0-9]`         | `\D`               |
| Whitespace characters        | `[[:space:]]` or `[ \t\r\n\v\f]`   | `\s`               |
| Non-whitespace characters    | `[^[:space:]]` or `[^ \t\r\n\v\f]` | `\S`               |
| Alpha-numeric and underscore | `[[:alnum:]_]` or `[A-Za-z0-9_]`   | `\w`               |
| Non-word characters          | `[^[:alnum:]_]` or `[^A-Za-z0-9_]` | `\W`               |
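
As a quick illustration of the table, the pcre pattern `^\d+\s+\w+$` can be written by hand with the POSIX column. This is a minimal sketch; the function name `posix_filter` and the sample lines are made up for the example.

```sh
# Hand-translated equivalent of the pcre pattern ^\d+\s+\w+$
#   \d -> [[:digit:]]   \s -> [[:space:]]   \w -> [[:alnum:]_]
# posix_filter is a hypothetical helper name used only in this sketch.
posix_filter() {
    awk '/^[[:digit:]]+[[:space:]]+[[:alnum:]_]+$/'
}

# Only the lines made of digits, whitespace, then word characters are printed.
printf '%s\n' '42 apples' 'x 42' '7 pears_2' '42  ' | posix_filter
```

With the sample input above, only `42 apples` and `7 pears_2` should pass the filter.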

Word boundary support:

* `awk` can use `(^|[^_[:alnum:]]|$)`
* `grep` can use `\(^\|[^_[:alnum:]]\|$\)` or `\(\<\|\>\)`
* `egrep` and `grep -E` can use `(^|[^_[:alnum:]]|$)` or `(\<|\>)`
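
The `awk` form from the list can be tried on a single word as follows. This is a sketch only: the function name `word_match` and the word `cat` are illustrative, and on each side the sketch drops the anchor that could never match there (`$` before the word, `^` after it).

```sh
# Match "cat" only as a whole word, using the POSIX word-boundary emulation.
# word_match is a hypothetical helper name used only in this sketch.
word_match() {
    awk '/(^|[^_[:alnum:]])cat([^_[:alnum:]]|$)/'
}

# "concatenate" and "bobcat_cat" contain "cat" only inside a longer word.
printf '%s\n' 'the cat sat' 'concatenate' 'cat' 'bobcat_cat' | word_match
```

Only `the cat sat` and `cat` should be printed, because in the other two lines `cat` is always adjacent to a letter or an underscore.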

In pcre, the word boundary test (`\b`) works on either the left or the right side of a word. This differs from the `\<` and `\>` word boundary sequences supported by `grep`, which must be used on the appropriate side of a word. This chapter focuses on pcre `\b` word bounding for `awk`. For information on `\<` and `\>` support for `awk`, see the previous chapter.

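Single-letter pcre support can be approximated in `awk` on all platforms with the script below. Unlike pcre's zero-width `\b`, the replacement for `\b` consumes the non-word character next to the word, so that character becomes part of the match.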

```
#!/usr/bin/awk -f

# Map one single-letter pcre sequence to its POSIX equivalent;
# any other input is returned unchanged.
function expand(seq)
{
    return seq == "\\b" ? "(^|[^_[:alnum:]]|$)" : \
        seq == "\\d" ? "[[:digit:]]" : \
        seq == "\\D" ? "[^[:digit:]]" : \
        seq == "\\s" ? "[[:space:]]" : \
        seq == "\\S" ? "[^[:space:]]" : \
        seq == "\\w" ? "[[:alnum:]_]" : \
        seq == "\\W" ? "[^[:alnum:]_]" : \
        seq
}

# Translate every unescaped single-letter sequence in re.
# The parameters after re serve as local variables.
function pcre(re,        head, repl, tail, rstr)
{
    tail = re
    while (match(tail, "\\\\[bdDsSwW]"))
    {
        head = substr(tail, 1, RSTART - 1) # text before match
        repl = substr(tail, RSTART, RLENGTH) # match to replace
        tail = substr(tail, RSTART + RLENGTH) # text after match
        # Count the backslashes ending head plus the one in repl; an odd
        # total means the sequence itself is not an escaped backslash
        if ((match(head, /\\+$/) ? RLENGTH + 1 : 1) % 2 == 1)
            repl = expand(repl)
        rstr = rstr head repl
    }
    return rstr tail
}

# Test code for processing sample regex from stdin or file argument
{ print $0 " -> " pcre($0) }
```
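
Once a pattern has been translated, the result is an ordinary POSIX extended regular expression that `awk` can apply as a dynamic regex. The sketch below assumes the translated string for `\b\d+`, which the script above turns into `(^|[^_[:alnum:]]|$)[[:digit:]]+`. The helper name `match_translated` and the variable `RE` are hypothetical, and the pattern is passed through the environment rather than `-v` because `-v` would interpret backslash escapes in the value.

```sh
# Filter stdin with an already-translated pattern.
# match_translated and RE are hypothetical names used only in this sketch.
match_translated() {
    # ENVIRON keeps the pattern string exactly as given (no escape processing)
    RE="$1" awk '$0 ~ ENVIRON["RE"]'
}

# Translated form of the pcre pattern \b\d+
printf '%s\n' 'id 7' 'abc' 'x9' '42' |
    match_translated '(^|[^_[:alnum:]]|$)[[:digit:]]+'
```

Here `id 7` and `42` should be printed, while `x9` is rejected because its digit directly follows a word character.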
