array
declare a data array
Description
Unlike the "associative" arrays that have always been part of spec, data arrays (introduced in spec release 4) are more like those used in math and in programming. While associative arrays are indexed by arbitrary strings or numbers and can store either strings or numbers, data arrays are indexed by consecutive integers (starting from zero, as is the C convention) and hold a specific data type, such as short integer, float, double, etc.
Data arrays must be specifically declared and dimensioned using the array keyword (unlike associative arrays, which can come into existence when used in an expression). The arrays can have one or two dimensions. The mca_get() and image_get() functions can directly fill the arrays with data from one- or two-dimensional detectors.
Data arrays can be used in expressions containing the standard arithmetic operators to perform simultaneous operations on each element of the array. In addition, a subarray syntax provides a method for performing assignments, operations and functions on only portions of the array.
The functions array_dump(), array_fit(), array_op(), array_copy(), array_plot(), array_pipe() and array_read() handle special array operations. The functions fmt_read() and fmt_write() provide a means to transfer array data to and from binary-format data files. Also, the functions mca_get(), mca_sget(), mca_put(), mca_sput(), image_get() and image_put() accept array arguments.
The print command will print data arrays in a concise format on the screen, giving a count of repeated columns and rows, rather than printing each array element.
Array data can be placed in shared memory, making the data accessible to other processes, such as image-display or data-crunching programs. The shared arrays can be both read and written by the other processes. The implementation includes a number of special aids for making the processes work smoothly with spec.
Usage
Data arrays must be declared with the array keyword. One- and two-dimensional arrays are declared as:
    [shared] [type] array var[cols]
    [shared] [type] array var[rows][cols]
On platforms that support System V Interprocess Communication (IPC) calls, the shared keyword causes spec to place the array in shared memory (see below). The type keyword specifies the storage type of the array and may be one of byte, ubyte, short, ushort, long, ulong, long64, ulong64, float, double or string. An initial u denotes the "unsigned" version of the data type. The short and ushort types are 16-bit (two-byte) integers. The long and ulong types are 32-bit (four-byte) integers. The long64 and ulong64 types are 64-bit (eight-byte) integers. The float type uses four bytes per element. The double type uses eight bytes per element. The default data type is double.
The array name var is an ordinary spec variable name. Arrays are global by default, although they may also be declared local within statement blocks.
Unlike traditional spec associative arrays, which can store and be indexed by arbitrary strings or numbers, a data array is indexed by consecutive integers (starting from zero), and can hold only numbers, or in the case of string arrays, only strings.
Operations on these arrays can be performed on all elements of the array at once, or on one or more blocks of elements. Consider the following example:
    array a[20]
    a = 2
    a[3] = 3
    a[10:19] = 4
    a[2,4,6,10:15] = 5
The first assignment sets all twenty elements of the array to 2. The second assigns 3 to the single element a[3]. The third assigns 4 to elements 10 through 19, that is, the eleventh through the last element. The final assignment sets the listed elements (2, 4, 6 and 10 through 15) to 5.
A negative number as an array index counts elements from the end of the array, with a[-1] referring to the last element of a.
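The indexing rules above can be modeled outside of spec. The following Python sketch is only an illustration, not spec code: the helper names expand() and assign() are hypothetical, and treating a missing lower bound as zero is an assumption. Note that spec ranges such as 10:19 include both end points, unlike Python slices.

```python
def expand(index, n):
    """Expand a spec-style index string into a list of element positions."""
    index = index.strip()
    if not index:                      # empty index, as in a[], selects everything
        return list(range(n))
    positions = []
    for part in index.split(","):
        if ":" in part:
            lo_text, hi_text = part.split(":")
            lo = int(lo_text) if lo_text.strip() else 0       # assumed default
            hi = int(hi_text) if hi_text.strip() else n - 1   # "2:" runs to the last element
        else:
            lo = hi = int(part)
        # Negative indices count from the end of the array.
        if lo < 0:
            lo += n
        if hi < 0:
            hi += n
        # Both end points are included; a descending range walks backwards.
        step = 1 if hi >= lo else -1
        positions.extend(range(lo, hi + step, step))
    return positions


def assign(a, index, value):
    """Model of a[index] = value for a one-dimensional array."""
    for k in expand(index, len(a)):
        a[k] = value


a = [0] * 20                   # array a[20]
assign(a, "", 2)               # a = 2
assign(a, "3", 3)              # a[3] = 3
assign(a, "10:19", 4)          # a[10:19] = 4
assign(a, "2,4,6,10:15", 5)    # a[2,4,6,10:15] = 5
print(a)
```

In this model expand("-1", 20) yields position 19, matching the statement that a[-1] refers to the last element.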
As per the usual conventions, the first index is row and the second is column. Note, however, spec considers arrays declared with one dimension to be a single row. For example,
array a[20]
is a one-row, twenty-column array. Use
array a[20][1]
to declare a 20-row, one-column array.
Array values can be initialized in the declaration (as of spec release 6.10.02):
    array a[10] = [ 1, 2, 3 ]
    string array a[20] = [ "this is a test" ]
    ushort array a[2][3] = [ 1, 2, 3, 4, 5, 6 ]
Unassigned array elements are set to zero. Surplus initializers are ignored. For 2D arrays, elements are assigned row by row. In the last example above, the first row becomes { 1, 2, 3 } and the second row is { 4, 5, 6 }. If a single value appears on the right hand side with no square brackets, each element of the array is assigned that value.
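As a rough model of the initializer rules just described, the Python sketch below fills a rows-by-cols array row by row, zeroes what is left over and drops surplus values. The function name init_array() is hypothetical and the code only mirrors the text; it is not how spec implements initialization.

```python
def init_array(rows, cols, initializers):
    """Return a rows x cols nested list initialized as the text describes."""
    flat = [0] * (rows * cols)                      # unassigned elements are zero
    for k, value in enumerate(initializers[: rows * cols]):
        flat[k] = value                             # surplus initializers are ignored
    # Elements are assigned row by row.
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


print(init_array(2, 3, [1, 2, 3, 4, 5, 6]))   # [[1, 2, 3], [4, 5, 6]]
print(init_array(1, 10, [1, 2, 3]))
```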
Also note well, all operations between two arrays are defined as element-by-element operations, not matrix operations, which are currently unimplemented in spec. In the following example:
    array a[5][5], b[5][5], c[5][5]
    c = a * b
c[i][j] is the product a[i][j] * b[i][j] for each i and j.
When two array operands have different dimensions the operations are performed on the elements that have dimensions in common. In the case
    array a[5][5], b[5], c[5][5]
    c = a * b
only the first row of c will have values assigned, since b only has one row. The remaining elements of c are unchanged by the assignment.
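The rule for operands of different dimensions can be pictured with a small Python model. The helper multiply_into() is a hypothetical name, and the sketch assumes that "dimensions in common" means the overlapping rows and columns of all arrays involved.

```python
def multiply_into(c, a, b):
    """Element-by-element product over the region common to a, b and c."""
    rows = min(len(c), len(a), len(b))
    cols = min(len(c[0]), len(a[0]), len(b[0]))
    for i in range(rows):
        for j in range(cols):
            c[i][j] = a[i][j] * b[i][j]
    return c                       # elements outside the common region are untouched


a = [[2] * 5 for _ in range(5)]    # array a[5][5], all elements 2
b = [[3] * 5]                      # array b[5] is a single row
c = [[-1] * 5 for _ in range(5)]   # array c[5][5], preset to -1 to show what changes
multiply_into(c, a, b)
print(c[0])   # [6, 6, 6, 6, 6]
print(c[1])   # [-1, -1, -1, -1, -1]
```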
Portions of the array can be accessed using the subarray syntax, which uses colons and commas, as in the following examples:
    array a[10][10]
    a[1]                  # second row of a
    a[2:4][]              # rows 2 to 4
    a[][2:]               # all rows, cols 2 to last
    a[1,3,5,7,9][3:7]     # odd rows and cols 3 to 7
The elements of an array can be accessed in reverse order, as in:
a = x[-1:0]
which will assign to a the reversed elements of x. Note, though, that presently, an assignment such as x = x[-1:0] will not work properly, as spec will not make a temporary copy of the elements. However, x = x[-1:0]+0 will work.
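The text does not say exactly how x = x[-1:0] goes wrong. One plausible picture, shown in the Python sketch below, is an element-by-element copy that reads from the same storage it is writing to. Both function names are hypothetical and the failure mode is an assumption, not documented spec behavior.

```python
def reverse_without_temp(x):
    """Copy x reversed onto itself one element at a time, with no temporary copy."""
    n = len(x)
    for i in range(n):
        x[i] = x[n - 1 - i]        # later reads see elements already overwritten
    return x


def reverse_with_temp(x):
    """Model of x = x[-1:0] + 0: the expression result is a separate array."""
    temp = [x[len(x) - 1 - i] + 0 for i in range(len(x))]
    x[:] = temp
    return x


print(reverse_without_temp([1, 2, 3, 4]))   # [4, 3, 3, 4]
print(reverse_with_temp([1, 2, 3, 4]))      # [4, 3, 2, 1]
```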
The functions fabs(), int(), cos(), acos(), sin(), asin(), tan(), atan(), deg(), rad(), exp(), exp10(), log(), log10(), pow() and sqrt() can all take arrays as an argument. The functions perform the operation on each element of the array argument and return the results in an array of the same dimension as the argument array.
The operations <, <=, !=, ==, > and >= can be used with array arguments. The Boolean result (0 or 1) will be assigned to each element of an array returned as the result of the operation, based on the element-by-element comparison of the operands.
The bit-wise operators ~, |, &, >> and << can also be used with array operands.
Note, spec generally uses double-precision floating point for storing intermediate values and for mathematical operations. Double-precision floating point stores only 52 bits for the significand (the remaining 12 bits are for sign and exponent), giving 53 bits of precision when the implicit leading bit is counted. Thus, for most operations the 64-bit integer types will not maintain their full 64 bits of significance. (The 64-bit integer types were added in spec release 6.01.)
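The practical consequence can be checked in any language that uses IEEE 754 doubles. The Python lines below are only an illustration of the floating-point limit, not spec code: integers up to 2^53 survive a round trip through a double, while larger odd integers do not.

```python
limit = 2 ** 53

# Every integer up to 2**53 is exactly representable as a double.
print(int(float(limit - 1)) == limit - 1)   # True

# 2**53 + 1 is not representable; it rounds to the neighboring value 2**53.
print(int(float(limit + 1)) == limit + 1)   # False
print(float(limit + 1) == float(limit))     # True
```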
Although global data array names, data types and dimensions are saved in the user state file to be restored when restarting spec, data array values are not saved between sessions.
String Arrays
Arrays of type string are identical to byte arrays in terms of storage requirements and behavior in most operations. However, when used as described below, the string arrays do behave differently.
If a row of a string array represents a number and is used in a conditional expression, then the value of the conditional expression will be the number. For example, the strings "0.00" or "0x000" will evaluate as zero or false in a conditional expression. In contrast, for number arrays, a conditional evaluates as zero only if every element of the array is zero.
Functions that take string arguments, such as on(), length(), unix(), etc., will allow a row of a string array to be used as an argument. Use of a number array is invalid and produces an error.
The print command will print string arrays as ASCII text, while byte arrays display each byte as a number.
In assignments to a row of a string array, the right hand side is copied to the byte elements of the string array as a string, even if the right hand side is a number. Any remaining elements of the string array row are set to zero. Thus, the results differ in the assignments below:
    string array arr_string[20]
    arr_string = 3.14159
    print arr_string
    3.14159

    byte array arr_byte[20]
    arr_byte = 3.14159
    print arr_byte
    {3 <20 repeats>}
In the first example, the string representation of the number is copied to the row of the string array, while in the second, each element of the array is assigned the (truncated) value of the number.
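The difference between the two assignments can be modeled at the byte level. In the Python sketch below the function names are hypothetical, and formatting the number with "%g" is an assumption standing in for whatever string spec's print command would produce.

```python
def assign_string_row(width, value):
    """Model of assigning to a row of a string array: copy the text, zero the rest."""
    text = ("%g" % value) if isinstance(value, (int, float)) else str(value)
    data = text.encode("ascii")[:width]
    return data + bytes(width - len(data))     # remaining elements are set to zero


def assign_byte_row(width, value):
    """Model of assigning a number to a byte array: every element gets the truncated value."""
    return [int(value)] * width


print(assign_string_row(20, 3.14159))
print(assign_byte_row(20, 3.14159))
```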
Row-wise and Column-wise Sense
For the functions array_dump(), array_fit(), array_pipe(), array_plot() and array_read() it matters whether each row or each column of a two-dimensional array corresponds to a data point. By default, spec takes the larger dimension to correspond to point number, and if both dimensions are the same, to use the rows as data points. The "row_wise" and "col_wise" arguments to array_op(), described below, can be used to force the sense of an array one way or the other, regardless of the array dimensions. If an array has row-wise sense, the contents of each row correspond to a data point, and one might then plot the contents of column two of each row versus column one, for example.
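The default rule can be written down in a few lines. The Python sketch below is a model only: default_sense() and data_points() are hypothetical names, and the strings "row_wise" and "col_wise" are borrowed from the array_op() arguments purely as labels.

```python
def default_sense(rows, cols):
    """Larger dimension is the point number; equal dimensions use rows as points."""
    return "row_wise" if rows >= cols else "col_wise"


def data_points(array, sense=None):
    """Return the list of data points of a 2D array for the given (or default) sense."""
    rows, cols = len(array), len(array[0])
    if sense is None:
        sense = default_sense(rows, cols)
    if sense == "row_wise":
        return [list(row) for row in array]        # each row is one data point
    return [list(col) for col in zip(*array)]      # each column is one data point


data = [[1, 2, 3],
        [4, 5, 6]]                      # 2 rows, 3 columns: column-wise by default
print(data_points(data))                # [[1, 4], [2, 5], [3, 6]]
print(data_points(data, "row_wise"))    # [[1, 2, 3], [4, 5, 6]]
```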
Although the built-in functions can work with either row-wise or column-wise arrays, the standard plotting and data analysis macros only work correctly with row-wise arrays.
Built-in Functions
- array_op("cmd", a [, args ...])
Performs one of the following operations on the array a, as selected by cmd:
- "fill"
Fills the array a with values. For a two-dimensional array,
array_op("fill", a, u, v)
produces
a[i][j] = u * i + v * j
With subarrays, i and j refer to the subarray index. Also, i and j always increase, even for reversed subarrays, so
array_op("fill", a[-1:0][-1:0], 1, 1)
fills a in reverse order.
- "contract"
- With array_op("contract", a, u, v), returns a new array with dimensions contracted by a factor of u in rows and v in columns. Elements of the new array are formed by averaging every u elements of each row with every v elements of each column. If there are leftover rows or columns, they are averaged also.
- "every"
- With array_op("every", arr, n), returns a new array consisting of every n-th element of arr. If arr is a 2D array, returns every n-th row. With a 2D array, array_op("every", arr, n, m) returns a new array consisting of every n-th row and m-th column.
- "min" or "gmin"
- Returns the minimum value contained in the array.
- "max" or "gmax"
- Returns the maximum value contained in the array.
- "i_at_min" or "i_at_gmin"
- Returns the index number of the minimum value of the array. For a two-dimensional array dimensioned as D[N][M], the index number of element D[i][j] is (i * M) + j. If a is a subarray, the index is with respect to the full array, although the minimum is the minimum value in the specified subarray.
- "i_at_max" or "i_at_gmax"
- Returns the index number of the maximum value of the array. See "i_at_min" for subarray considerations.
- "row_at_min" or "rmin"
- Returns the row number containing the minimum value of the array. If a is a subarray, the row is with respect to the full array, although the minimum is the minimum value in the specified subarray.
- "row_at_max" or "rmax"
- Returns the row number containing the maximum value of the array. If a is a subarray, the row is with respect to the full array, although the maximum is the maximum value in the specified subarray.
- "col_at_min" or "cmin"
- Returns the column number containing the minimum value of the array. If a is a subarray, the column is with respect to the full array, although the minimum is the minimum value in the specified subarray.
- "col_at_max" or "cmax"
- Returns the column number containing the maximum value of the array. If a is a subarray, the column is with respect to the full array, although the maximum is the maximum value in the specified subarray.
- "i_<=_value"
- Returns the index number of the nearest element of the array with a value at or less than u. For a two-dimensional array dimensioned as D[N][M], the index number of element D[i][j] is (i * M) + j. Unlike "i_at_min", "i_at_max", etc., if a is a subarray, the index is with respect to the subarray.
- "i_>=_value"
- Returns the index number of the nearest element of the array with a value at or greater than u, starting from the last element. For a two-dimensional array dimensioned as D[N][M], the index number of element D[i][j] is (i * M) + j. Unlike "i_at_min", "i_at_max", etc., if a is a subarray, the index is with respect to the subarray.
- "sum" or "gsum"
Returns the sum of the elements of the array. If there is a third argument greater than zero, the array is considered as a sequence of frames, with the third argument the number of rows in each frame. The return value is a new array with that number of rows and the same number of columns as the original array. Each element of the returned array is the sum of the corresponding elements of each frame. For example, if the original array is dimensioned as data[N][M], the return value for:
a = array_op("sum", data, R)
is a new array of dimension a[N/R][M], where each element a[i][j] is the sum of k from 0 to R - 1 of data[i + k * N / R][j].
- "sumsq"
- Returns the sum of the squares of the elements of the array. If there is a third argument and it is greater than zero, the interpretation is the same as above for "sum", except the elements in the returned array are sums of squares of the elements in the original array.
- "transpose"
- Returns a new array of the same type with the rows and columns switched.
- "updated?"
- Returns nonzero if the data in the array has been accessed for writing since the last check, otherwise returns zero.
- "rows"
- Returns the number of rows in the array.
- "cols"
- Returns the number of columns in the array.
- "type"
- Returns a string containing the array data type. Possible return values are "byte", "ubyte", "short", "ushort", "long", "ulong", "long64", "ulong64", "float", "double" and "string".
- "row_wise"
- With a nonzero third argument, forces the array_dump(), array_fit(), array_pipe(), array_plot() and array_read() functions to treat the array as row-wise, meaning each row corresponds to a data point. With only two arguments, returns nonzero if the array is already set to row-wise mode.
- "col_wise"
- As above, but sets or indicates the column-wise sense of the array.
- "sort"
- Returns an ascending sort of the array. A 2D array is sorted by rows (as of spec release 6.08.08). When sorting by rows, the value in the first column is used by default. A different column can be specified as an optional argument. Column numbers start at zero.
- "merge"
- Merges rows of a 2D array by averaging values in each column when values in a particular column of adjacent rows are identical or within a threshold. The default threshold value is zero, but can be set via the first optional argument. The column number used for comparison defaults to zero, but can be specified by a second optional argument. A threshold of zero means only rows with identical values in the merge column will be averaged together. The return value is a new array with data type double. The returned array will have the same number of columns as the input array, but possibly fewer rows. This feature can be useful for merging overlapping scans. (This option added in spec release 6.08.08.)
- "swap"
- Swaps the bytes of the named array. The command can change big-endian integer data to little-endian and vice versa. Works with 16-, 32- and 64-bit types. For most built-in data collection, spec automatically swaps bytes as appropriate, but this function is available for other cases that may come up.
- "frame_size"
- The number of rows in a frame. The frame size is part of the shared array header and may be useful to auxiliary programs, although the value is maintained for non-shared arrays. Note, setting the frame size to zero will clear the "frames" tag. Setting the frame size to a non-zero value will set the "frames" tag.
- "latest_frame"
- The most recently updated frame. The latest frame is part of the shared array header and may be useful to auxiliary programs, although the value is maintained for non-shared arrays.
- "tag"
- Shared arrays can be tagged with a type that will be available to other processes accessing the array. Usage is array_op("tag", arr, arg) where arr is the array and arg is "mca", "image", "frames", "scan" or "info".
- "untag"
- Removes tag information.
- "info"
- Returns or sets the info field of a shared array segment. The field can contain up to 512 bytes of arbitrary text. When setting the field, if the string argument is longer than 512 bytes, the first 512 bytes will be copied. The function returns the number of bytes copied, -1 if a is not a shared array or 0 if a is a shared array that doesn't support the "info" field. The "info" field is included in SHM_VERSION version 6 headers, added in spec release 6.00.08.
- "meta"
- Returns or sets the meta area of a shared array segment. With spec, the field can contain up to 8,192 bytes of arbitrary text. When setting the field, if the string argument is longer than 8,192 bytes, the first 8,192 bytes will be copied. The function returns the number of bytes copied, -1 if a is not a shared array or 0 if a is a shared array that doesn't support the "meta" field. The "meta" field is included in SHM_VERSION version 6 headers, added in spec release 6.00.08.
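Two of the formulas in the list above, the "fill" rule a[i][j] = u * i + v * j and the index number (i * M) + j used by "i_at_min" and "i_at_max", can be modeled in Python as follows. All function names are hypothetical, and returning the first occurrence when values tie is an assumption the text does not address.

```python
def fill(rows, cols, u, v):
    """Model of array_op("fill", a, u, v) on a full rows x cols array."""
    return [[u * i + v * j for j in range(cols)] for i in range(rows)]


def fill_reversed(rows, cols, u, v):
    """Model of array_op("fill", a[-1:0][-1:0], u, v): i and j still count upward."""
    a = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a[rows - 1 - i][cols - 1 - j] = u * i + v * j
    return a


def index_number(d, pick_smaller):
    """Index number (i * M) + j of the extreme value of a 2D array d."""
    m = len(d[0])
    best_i, best_j = 0, 0
    for i, row in enumerate(d):
        for j, value in enumerate(row):
            better = value < d[best_i][best_j] if pick_smaller else value > d[best_i][best_j]
            if better:                 # strict comparison keeps the first occurrence
                best_i, best_j = i, j
    return best_i * m + best_j


a = fill(2, 3, 10, 1)
print(a)                              # [[0, 1, 2], [10, 11, 12]]
print(index_number(a, False))         # 5, the index number of a[1][2]
print(fill_reversed(2, 2, 1, 1))      # [[2, 1], [1, 0]]
```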
The following operations can be called with one or two array arguments, which represent a single row or single column. If called with one array argument, a temporary internal array formed using the index numbers of the argument array becomes the first array, while the argument array becomes the second array in the operation. If the array argument is a subarray, the index values are with respect to the original array. spec release 6.11.01 added the single array functionality.
- "fwhm"
- Returns the full-width in the first array at half the maximum value of the second array.
- "cfwhm"
- Returns the center of the full-width in the first array at half the maximum value of the second array.
- "uhmx"
- Returns the value in the first array corresponding to half the maximum value in the second array and at a higher index.
- "lhmx"
- Returns the value in the first array corresponding to half the maximum value in the second array and at a lower index.
- "com"
- Returns the center of mass in the first array with respect to the second array. The value is the sum of the products of each element of the first array and the corresponding element of the second array, divided by the number of points.
- "x_at_min"
- Returns the value of the element in the first array that corresponds to the minimum value in the second array.
- "x_at_max"
- Returns the value of the element in the first array that corresponds to the maximum value in the second array.
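The one-array and two-array calling forms can be pictured with a short Python model of "x_at_max" and "x_at_min". The function names simply reuse the option names for readability, and picking the first occurrence when the extreme value repeats is an assumption.

```python
def x_at_max(x, y=None):
    """Value in the first array where the second array is largest."""
    if y is None:
        # Single-array form: the index numbers stand in for the first array.
        x, y = list(range(len(x))), x
    return x[y.index(max(y))]


def x_at_min(x, y=None):
    """Value in the first array where the second array is smallest."""
    if y is None:
        x, y = list(range(len(x))), x
    return x[y.index(min(y))]


positions = [10, 20, 30]
counts = [1, 5, 2]
print(x_at_max(positions, counts))   # 20
print(x_at_max(counts))              # 1, the index of the largest count
```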
- array_copy(a, b [, c, ...])
Fills consecutive bytes in the destination array (or subarray) a with bytes from the arrays or strings in the subsequent arguments. The arrays can be of different types, which allows creating a binary data stream of mixed types. If a source argument is not a data array, the string value of the argument is copied.
As an example of how array_copy() might be useful, consider a device that sends and receives a binary stream consisting of four floats followed by two integers then seven more floats. Here is how to prepare a byte array containing the mixed binary data types:
    float array float_d[11]
    ulong array long_d[2]
    ubyte array ubyte_d[52]
    # ... assign values to float_d and long_d, then ...
    array_copy(ubyte_d, float_d[0:3], long_d, float_d[4:])
    sock_put("host", ubyte_d)
The destination array is not erased prior to the copy. The above assignment could also be carried out as follows:
    array_copy(ubyte_d[0:15,24:], float_d)
    array_copy(ubyte_d[16:], long_d)
Four floats consume sixteen bytes. Two integers consume eight bytes. The subarray notation selects the first sixteen bytes of ubyte_d for the first four floats, then skips eight bytes for where the integers will go, then copies the remainder of the floats. Since only as much data will be copied as is contained in the source array and since the source arrays are fixed size, it is not necessary to specify the final byte position in the destinations.
If the returned data uses the same format, floats and integers can be extracted using similar syntax:
    sock_get("host", ubyte_d)
    array_copy(float_d, ubyte_d[0:15], ubyte_d[24:])
    array_copy(long_d, ubyte_d[16:23])
The function will only copy as many bytes as fit into the preallocated space of the destination array.
If the source arguments are not data arrays, spec will take the string value of the argument and copy the ASCII value of each byte to corresponding bytes in the destination. The terminating null byte is not copied. If the argument is a number, the string value of the number is what one would see on the display with the print command.
The function returns the updated array a. If a is a subarray, the full array is returned. A -1 is returned if a is not a data array.
Note, this function allows arbitrary bytes to be copied to the elements of float and double arrays, which can result in undefined or not a number (NaN) values for those elements.
The array_copy() function appeared in spec release 6.00.07.
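For a program on the other end of such a connection, the 52-byte layout used in the example (four floats, two 32-bit unsigned integers, seven floats) can be reproduced with Python's struct module. This is a sketch of the byte layout only: the sample values are made up, and native byte order without padding is an assumption about what the device expects.

```python
import struct

# "=" selects native byte order with standard sizes and no padding:
# 4 floats (16 bytes) + 2 unsigned 32-bit integers (8 bytes) + 7 floats (28 bytes).
LAYOUT = "=4f2I7f"

float_d = [i + 0.5 for i in range(11)]   # made-up values, exact in single precision
long_d = [7, 9]

packet = struct.pack(LAYOUT, *float_d[0:4], *long_d, *float_d[4:])
print(len(packet))                        # 52

# Unpacking mirrors the second pair of array_copy() calls in the text:
# bytes 0-15 and 24-51 hold the floats, bytes 16-23 hold the integers.
fields = struct.unpack(LAYOUT, packet)
floats_back = list(fields[0:4]) + list(fields[6:])
longs_back = list(fields[4:6])
```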
- array_fit(pars, a [, b, ...])
- Performs a linear fit of the data in the array a. The fitted parameters are returned in the array pars. The function returns the chi-squared value of the fit, if the fit was successful. A -1 is returned if the covariance matrix is singular. The fit algorithm is along the same lines as the lfit() routine in Numerical Recipes (W.H. Press, et al., Cambridge University Press, 1986, page 512).
- array_dump([file, ] a [, b, ...] [, options, ...])
Efficiently writes the data in the array a and optionally arrays b, ..., etc. If the initial optional file argument is given, the output is just to the named file or device. Otherwise, output is to all "on" output devices. If the file argument is the null string "", the formatted output is returned as a string, and nothing is written to the screen. The additional options arguments are strings that contain a percent sign as follows:
An optional format argument can specify a printf()-style format for the values. The default format is "%.9g", which prints nine digits of precision using fixed point or exponential format, choosing whichever is more appropriate to the value's magnitude. Recognized alternate format characters are e or E (exponential), f or F (fixed point), g or G (fixed or exponential based on magnitude), d or i (signed decimal integer), u (unsigned decimal integer), o (octal integer), x or X (hexadecimal integer), a or A (hexadecimal floating point). All formats accept standard options such as precision and field width. For example, "%15.8f" uses fixed-point format with eight digits after the decimal point and a fifteen-character-wide field. For the integer formats, double values will be converted to integers. Also, initial characters can be included in the format string, for example, "0x%08x" is valid.
The option "%D=c", specifies an alternate delimiter character c to replace the default space character delimiter that is placed between each element in a row of output. For example, one might use a comma, a colon or the tab character with "%D=,", "%D=:" or "%D=\t", respectively. Use "%D=" for no delimiter.
Also, by default, the output is one data row per line. Thus, for one-dimensional column-wise arrays, all elements will be printed on one line, while one-dimensional row-wise arrays will have just one data element per line. For two-dimensional arrays, each line will contain one row of data, unless modified as described next.
The number of elements per line can be controlled with the option "%#[C|W]". For one-dimensional arrays, the number # is the number of elements to print per line. For two-dimensional arrays, # is the number of rows to print per line. If an optional W is added, the number becomes the number of elements to print per line, which can split two-dimensional arrays at different points in the rows. If an optional C is added to the option string, a backslash will be added to each line where a row is split. (The C-PLOT scans.4 user function can properly interpret such "continued" lines for one-dimensional MCA-type array data.)
For example, for a 1D array,
array_dump(data, "%10C")
will print 10 elements per line, adding a backslash at the end of each line (except the last) if there are more than 10 elements in the array.
Finally, the various options can be combined in a single string. For example,
a = array_dump(data, "%15.4f", "%D=:", "%8W")
and
a = array_dump(data, "%15.4f%D=:%8W")
work the same.
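The value formats follow printf() conventions, so their effect can be previewed in any language with printf-style formatting. The Python sketch below covers only the per-value format and the delimiter; it does not attempt to reproduce how array_dump() splits lines, and dump_row() is a hypothetical helper.

```python
def dump_row(values, fmt="%.9g", delimiter=" "):
    """Format one row of values with a printf-style format and a delimiter."""
    return delimiter.join(fmt % v for v in values)


print(dump_row([3.14159265358979, 0.000012345]))   # default "%.9g" format
print(dump_row([1.5, 2.25], "%15.4f", ":"))        # like "%15.4f" with "%D=:"
print(dump_row([255], "0x%08x"))                   # leading characters in the format
```

With the default format, 3.14159265358979 prints as 3.14159265 and 0.000012345 switches to exponential notation as 1.2345e-05.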
- array_plot(a [, b, ...])
- Plots the data in the array a. Depending on whether a is a row-wise or column-wise array, the first column or first row elements are used for x. Subsequent elements (up to a maximum of 64) are plotted along the y axis. If preceded by a call of plot_cntl("addpoint") and the ranges have not changed, only the last point in the array is drawn. If preceded by a call of plot_cntl("addline") the current plot will not be erased, and the plot ranges will not be changed. The plotting area is not automatically erased by a call of array_plot() - use plot_cntl("erase") for that. The axis ranges are set using the plot_range() function. See plot_cntl() for other options that affect drawing the plot.
- array_read(file_name, a [, options])
Reads data from the ASCII text file file_name, and stuffs the data into the array a. For a row-wise array, the values on each line of the file are assigned into successive columns for each row of the array. If there are more items on a line in the file than columns in the array, or if there are more points in the file than rows in the array, the extra values are ignored. For a column-wise array, each row of the data file is assigned to successive columns of the array.
If a is a string array, successive bytes from each line of the file are assigned to elements of the array (as of spec release 6.04.05).
Lines beginning with the # character are ignored, except for the case where a is a string array. There is no limit on the length of the input line. Prior to spec release 6.03.05, the maximum length was 2,048 characters.
The only currently recognized option is a "C=#", where # is the starting column number in the file to use when making assignments (as of spec release 6.03.05).
Returns -1 if the file can't be opened, otherwise returns the number of points (bytes in the case of a string array) read and assigned.
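The row-wise reading rules can be summarized with a small Python model that works on a list of text lines rather than a file. The function name is hypothetical, and several details are assumptions not stated in the text: values are taken to be separated by white space, blank lines are skipped, and the returned count is the number of rows assigned.

```python
def read_row_wise(lines, rows, cols):
    """Model of array_read() filling a row-wise, non-string rows x cols array."""
    a = [[0.0] * cols for _ in range(rows)]
    points = 0
    for line in lines:
        if line.startswith("#"):        # comment lines are ignored
            continue
        fields = line.split()
        if not fields:                  # assumption: blank lines carry no point
            continue
        if points >= rows:              # more points in the file than rows: ignored
            break
        for j, field in enumerate(fields[:cols]):   # extra items on a line are ignored
            a[points][j] = float(field)
        points += 1
    return a, points


text = ["# x y z", "1 2 3", "4 5 6", "7 8 9"]
array, count = read_row_wise(text, rows=2, cols=2)
print(array, count)   # [[1.0, 2.0], [4.0, 5.0]] 2
```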
- fmt_read(file, format, array [, header [, flags]])
- Reads data from file using the indicated format into array, possibly assigning elements to header which can be an associative array or a string. Possible values for flags are below.
- fmt_write(file, format, array [, header [, flags]])
Saves data from array to file using the indicated format, possibly also writing elements of header, which can be an associative array or string. Possible values for flags are below.
The standard spec distribution includes support for the following binary formats:
- "raw"
- The array data is read or written as is.
- "pgm"
- Binary Portable PixMap output formatted for gray scale images is read or created. Only works with 8-bit or 16-bit array elements.
- "tiff"
- Creates a TIFF file with fmt_write(). Contains code for reading TIFF files with fmt_read(), but the fmt_tiff.c file needs to be recompiled with the code activated and spec needs to be relinked to include the TIFF library.
The header argument is implementation dependent. None of the above format types use it.
The flags argument is a string, currently recognizing the following options. Multiple options are separated by commas or spaces.
- "append"
- Append data to the file.
- "overwrite"
- Overwrite the file with the data.
- "#num"
- The value of num is passed to the implementation's write or read function.
The default behavior for the above formats, "raw", "pgm" and "tiff", is to overwrite the file with the new data. Note, a place-holder argument is needed for header if using flags.
Contact CSS directly for more information on the fmt_read() and fmt_write() functions. Also, examine the C files fmt_*.c included in the spec distribution. Besides the source files that implement the above formats, files named fmt_esrf.c and fmt_immcat.c implement more complicated formats used at particular sites. The file fmt_test.c contains some simple usage examples.
- fmt_close(file, format)
- Calls the file close routine associated with format for file.