2.4.5.3. - Regular Expression Functions
Regular expressions are sequences of special characters used to search for patterns in strings. spec implements extended regular expressions using the C library regcomp() and regexec() functions, which have a somewhat platform-dependent implementation. See the regular expression man page (man 7 regex on Linux and man re_format on OS X) for details of regular expression syntax.
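As a rough illustration of the two C library calls named above, the following C sketch compiles an extended regular expression and tests a string against it. The helper name ere_contains is hypothetical and is not part of spec; how spec itself calls these functions is not described here, so treat this only as an example of the underlying POSIX interface.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of spec.
 * Returns 1 if `text` contains a match for the extended regular
 * expression `pattern`, 0 if it does not, and -1 if `pattern`
 * fails to compile. */
int ere_contains(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_EXTENDED selects extended syntax; REG_NOSUB skips
     * recording match offsets, since only yes/no is needed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);   /* release memory allocated by regcomp() */
    return rc == 0 ? 1 : 0;
}
```

For instance, ere_contains("^(foo|bar)$", "bar") returns 1, while an unbalanced pattern such as "(unclosed" is rejected at compile time and returns -1.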
The names and usage of the following spec functions resemble those used in the UNIX awk (or gawk) utility. (These functions were added in spec release 6.03.04.)
rsplit(str, arr, regex)-
Similar to split() above, but the optional delimiter argument can be a regular expression. The string str is split into elements that are delimited by the regular expression regex, and the resulting substrings are assigned to successive elements of the array arr, starting with element 0. The delimiting characters are eliminated. Returns the number of elements assigned.
sub(regex, sub, str)-
Replaces the first instance of the regular expression regex in the source string str with the substitute string sub. An & in the substitute string is replaced with the text that was matched by the regular expression. A \& (which must be typed as "\\&") will produce a literal &. Returns the modified string.
gsub(regex, sub, str)-
Replaces all instances of the regular expression regex in the source string str with the substitute string sub. An & in the substitute string is replaced with the text that was matched by the regular expression. A \& (which must be typed as "\\&") will produce a literal &. Returns the modified string.
gensub(regex, sub, which, str)-
Replaces instances of the regular expression regex in the source string str with the substitute string sub based on the value of which. If which is a string beginning with G or g (for global), all instances that match are replaced. Otherwise, which is a positive integer that indicates which match to replace. For example, a 2 means replace the second match.
In addition, the substitute text may contain the sequences \N (which must be typed as "\\N"), where N is a digit from 0 to 9. That sequence will be replaced with the text that matches the Nth parenthesized subexpression in regex. A \0 is replaced with the text that matches the entire regular expression. Returns the modified string.
match(str, regex [, arr])-
Returns the position in the source string str that matches the regular expression regex. The first position is 1. Returns 0 if there is no match or -1 if the regular expression is invalid. If the associative array arr is provided, its contents are cleared and new elements are assigned based on the consecutive matching parenthesized subexpressions in regex. The zeroth element, arr[0], is assigned the entire matching text, while arr[0]["start"] is assigned the starting position of the match and arr[0]["length"] is assigned the length of the match. Elements from 1 onward are assigned matches, positions and lengths of the corresponding matching parenthesized subexpressions in regex.
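The return-value and array conventions described for match() can be sketched at the C level with regexec() and its regmatch_t offsets. This is only an approximation under stated assumptions, not spec's implementation: the helper name ere_match, the MAX_GROUPS limit, and the use of two plain integer arrays in place of spec's associative array are all hypothetical, and the sketch assumes that subexpression start positions follow the same 1-based convention as the return value.

```c
#include <regex.h>
#include <stddef.h>

#define MAX_GROUPS 10   /* whole match plus up to 9 subexpressions */

/* Hypothetical helper, not part of spec.
 * Returns the 1-based position of the first match of `pattern`
 * in `str`, 0 if there is no match, or -1 if `pattern` is invalid.
 * On a match, start[i] (1-based, assumed convention) and length[i]
 * describe group i, where group 0 is the entire match. Groups that
 * did not take part in the match get start 0 and length 0. */
int ere_match(const char *str, const char *pattern,
              int start[MAX_GROUPS], int length[MAX_GROUPS])
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    int i, rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&re, str, MAX_GROUPS, m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;

    for (i = 0; i < MAX_GROUPS; i++) {
        if (m[i].rm_so < 0) {
            /* regexec() marks unused groups with an offset of -1 */
            start[i] = 0;
            length[i] = 0;
        } else {
            /* convert the 0-based byte offset to a 1-based position */
            start[i] = (int)m[i].rm_so + 1;
            length[i] = (int)(m[i].rm_eo - m[i].rm_so);
        }
    }
    return start[0];
}
```

With the string "run: scan 42" and the pattern "([a-z]+) ([0-9]+)", this sketch returns 6, with a length of 7 for the whole match, start 6 and length 4 for the first subexpression, and start 11 and length 2 for the second.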
