spec

Software for Diffraction

2.4.5.3. - Regular Expression Functions

Regular expressions are sequences of ordinary and special characters that describe patterns to search for in strings. spec implements extended regular expressions using the C library regcomp() and regexec() functions, whose implementation is somewhat platform dependent. See the regular expression man page (man 7 regex on Linux and man re_format on OS X) for details of regular expression syntax. The names and usage of the following spec functions resemble those used in the UNIX awk (or gawk) utility. (These functions were added in spec release 6.03.04.)


rsplit(str, arr, regex)
Similar to split() above, but the delimiter argument can be a regular expression. The string str is split into elements that are delimited by matches of the regular expression regex, and the resulting substrings are assigned to successive elements of the array arr, starting with element 0. The delimiting characters are discarded. Returns the number of elements assigned.
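
The splitting rule described above can be sketched in Python. This is only an illustration of the semantics, not spec code: the name spec_rsplit is hypothetical, and Python's re module is not the POSIX regcomp()/regexec() engine that spec uses, so the two can differ for patterns beyond simple cases such as the one shown.

```python
import re

def spec_rsplit(s, regex):
    """Split s wherever regex matches; return (count, arr) with arr keyed from 0."""
    # Assumption: regex has no parenthesized groups. Python's re.split()
    # would otherwise also return the group text, which rsplit() does not.
    parts = re.split(regex, s)
    arr = {i: part for i, part in enumerate(parts)}
    return len(arr), arr

# Delimiter: a comma or semicolon followed by any number of spaces.
n, arr = spec_rsplit("10.5, 20.25;30", "[,;] *")
print(n, arr)   # 3 {0: '10.5', 1: '20.25', 2: '30'}
```

The delimiters themselves do not appear in the result, and the returned count matches the number of elements assigned.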

sub(regex, sub, str)
Replaces the first instance of the regular expression regex in the source string str with the substitute string sub. An & in the substitute string is replaced with the text that was matched by the regular expression. A \& (which must be typed as "\\&") will produce a literal &. Returns the modified string.

gsub(regex, sub, str)
Replaces all instances of the regular expression regex in the source string str with the substitute string sub. An & in the substitute string is replaced with the text that was matched by the regular expression. A \& (which must be typed as "\\&") will produce a literal &. Returns the modified string.
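
The & rule shared by sub() and gsub() can be sketched in Python as follows. The helper names (expand_ampersand, spec_sub, spec_gsub) are hypothetical, and Python's re module stands in for the C library regular expression functions, so this shows the intended behavior on simple patterns rather than spec's actual implementation.

```python
import re

def expand_ampersand(sub, matched):
    # Build the replacement for one match: a backslash followed by "&"
    # gives a literal "&", a bare "&" gives the matched text, and every
    # other character is copied unchanged.
    out = []
    i = 0
    while i < len(sub):
        if sub.startswith("\\&", i):
            out.append("&")
            i += 2
        elif sub[i] == "&":
            out.append(matched)
            i += 1
        else:
            out.append(sub[i])
            i += 1
    return "".join(out)

def spec_sub(regex, sub, s):
    # Replace only the first match, as sub() is described to do.
    return re.sub(regex, lambda m: expand_ampersand(sub, m.group(0)), s, count=1)

def spec_gsub(regex, sub, s):
    # Replace every match, as gsub() is described to do.
    return re.sub(regex, lambda m: expand_ampersand(sub, m.group(0)), s)

print(spec_sub("[0-9]+", "<&>", "scan 12 of 34"))    # scan <12> of 34
print(spec_gsub("[0-9]+", "<&>", "scan 12 of 34"))   # scan <12> of <34>
print(spec_gsub("and", "\\&", "a and b"))            # a & b
```

As in spec, the escaped form is typed as "\\&" inside a quoted string so that a single backslash reaches the substitution step.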

gensub(regex, sub, which, str)
Replaces instances of the regular expression regex in the source string str with the substitute string sub based on the value of which. If which is a string beginning with G or g (for global), all instances that match are replaced. Otherwise, which is a positive integer that indicates which match to replace. For example, a 2 means replace the second match.

In addition, the substitute text may contain the sequence \N (which must be typed as "\\N"), where N is a digit from 0 to 9. Each such sequence is replaced with the text that matched the Nth parenthesized subexpression in regex. A \0 is replaced with the text that matched the entire regular expression. Returns the modified string.
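
A Python sketch of the which argument and the \N sequences may help make the rules concrete. The name spec_gensub is hypothetical, Python's re module replaces the C library engine, and two details are assumptions not stated above: a \N that refers to a missing or unmatched subexpression expands to an empty string, and & handling is left out.

```python
import re

def spec_gensub(regex, sub, which, s):
    pattern = re.compile(regex)

    def expand(m):
        # Replace each backslash-digit pair with the matching group's text;
        # group 0 is the entire match.
        def group_text(ref):
            n = int(ref.group(1))
            if n > pattern.groups:
                return ""            # assumption: no such subexpression
            return m.group(n) or ""  # assumption: unmatched group is empty
        return re.sub(r"\\([0-9])", group_text, sub)

    # A string starting with G or g means "replace every match".
    if isinstance(which, str) and which[:1] in ("G", "g"):
        return pattern.sub(expand, s)

    # Otherwise which selects one match by its 1-based ordinal.
    target = int(which)
    seen = 0

    def replace_nth(m):
        nonlocal seen
        seen += 1
        return expand(m) if seen == target else m.group(0)

    return pattern.sub(replace_nth, s)

print(spec_gensub("([a-z]+)=([0-9]+)", "\\2=\\1", "g", "x=1 y=2"))  # 1=x 2=y
print(spec_gensub("([a-z]+)=([0-9]+)", "\\2=\\1", 2, "x=1 y=2"))    # x=1 2=y
print(spec_gensub("[0-9]+", "[\\0]", 1, "a1b22"))                   # a[1]b22
```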

match(str, regex [, arr])
Returns the position in the source string str that matches the regular expression regex. The first position is 1. Returns 0 if there is no match or -1 if the regular expression is invalid. If the associative array arr is provided, its contents are cleared and new elements are assigned based on the consecutive matching parenthesized subexpressions in regex. The zeroth element, arr[0], is assigned the entire matching text, while arr[0]["start"] is assigned the starting position of the match and arr[0]["length"] is assigned the length of the match. Elements from 1 onward are assigned matches, positions and lengths of the corresponding matching parenthesized subexpressions in regex.
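
The return values and array layout described above can be sketched in Python. The name spec_match is hypothetical and Python's re module is used in place of the C library functions. Because a Python dictionary cannot hold both arr[0] and arr[0]["start"], the sketch stores the second index as part of a tuple key, which is purely a choice made for this illustration. Subexpressions that take no part in the match are simply skipped, which is also an assumption.

```python
import re

def spec_match(s, regex):
    """Return (position, arr): 1-based position, 0 for no match, -1 for a bad regex."""
    try:
        pattern = re.compile(regex)
    except re.error:
        return -1, {}
    m = pattern.search(s)
    if m is None:
        return 0, {}
    arr = {}
    for n in range(pattern.groups + 1):
        text = m.group(n)
        if text is None:
            continue                          # group took no part in the match
        arr[n] = text                         # like arr[n]
        arr[(n, "start")] = m.start(n) + 1    # like arr[n]["start"], counted from 1
        arr[(n, "length")] = len(text)        # like arr[n]["length"]
    return m.start() + 1, arr

pos, arr = spec_match("scan 42 done", "([a-z]+) ([0-9]+)")
print(pos, arr[0], arr[2], arr[(2, "start")], arr[(2, "length")])
# 1 scan 42 42 6 2
```

Element 0 describes the whole match, and elements 1 and 2 describe the first and second parenthesized subexpressions.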