gbget - Basic data extraction and manipulation tool


gbget [options] 'filename[index](C,R)trans'


Print slices of tabular data from files and apply transformations. Data are read from text
files with fields separated by spaces or tabs (use option -F to specify a different
separator). Inside a data file, data-blocks are separated by two empty lines. Files can be
compressed with zlib (.gz).
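
The block layout described above can be illustrated with a short Python sketch. It is
not gbget's own code: the function name split_blocks is hypothetical, and splitting on
runs of whitespace is assumed to approximate the default field separator.

```python
import re

def split_blocks(text, sep=None):
    """Split text into data-blocks separated by two empty lines.

    Each block is a list of rows and each row a list of string fields.
    sep=None splits on runs of whitespace (an assumption, see above).
    """
    blocks = []
    # Two consecutive empty lines show up as three newlines in a row.
    for chunk in re.split(r"\n[ \t]*\n[ \t]*\n", text.strip("\n")):
        rows = [line.split(sep) for line in chunk.split("\n") if line.strip()]
        if rows:
            blocks.append(rows)
    return blocks

sample = "1 2\n3 4\n\n\n5 6\n7 8\n"
blocks = split_blocks(sample)
print(len(blocks))   # 2
print(blocks[1])     # [['5', '6'], ['7', '8']]
```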

filename is the input file. If not specified, it defaults to stdin or to the last
specified filename, if any.

index stands for a data-block index.

C,R stands for the column and row specs, each given as "min:max:skip" to select from
"min" to "max" every "skip" steps. If negative, min and max are counted from the end. By
default all data are printed ("1:-1:1"). If min>max then the count is reversed and skip
must be negative (-1 by default). Different specs are separated by a semicolon ';'
and considered sequentially.
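
As a sketch of how a single "min:max:skip" spec could be resolved into 1-based
positions, consider the Python function below. The name resolve_spec is hypothetical,
and two points are assumptions rather than documented behaviour: a lone number is taken
to select a single position, and an inconsistent skip raises an error.

```python
def resolve_spec(spec, n):
    """Return the 1-based positions selected by 'min:max:skip' among n items."""
    fields = spec.split(":") if spec else []
    lo = int(fields[0]) if len(fields) > 0 and fields[0] else 1
    if len(fields) == 1:
        hi = lo  # assumption: a lone number selects one position
    else:
        hi = int(fields[1]) if len(fields) > 1 and fields[1] else -1
    # Negative positions count from the end: -1 is the last item.
    if lo < 0:
        lo += n + 1
    if hi < 0:
        hi += n + 1
    if len(fields) > 2 and fields[2]:
        step = int(fields[2])
    else:
        step = 1 if lo <= hi else -1  # reversed count defaults to -1
    if step == 0 or (hi - lo) * step < 0:
        raise ValueError("skip does not lead from min to max")
    return list(range(lo, hi + (1 if step > 0 else -1), step))

print(resolve_spec("", 5))         # [1, 2, 3, 4, 5]
print(resolve_spec("1:-1:2", 5))   # [1, 3, 5]
print(resolve_spec("3:1", 5))      # [3, 2, 1]
print(resolve_spec("-10:-1", 25))  # positions 16 to 25
```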

trans is a list of transformations applied to the selected data: 'd' take the diff of
subsequent columns; 'D' remove all rows with at least one Not-A-Number (NAN) entry;
'f' flatten the output piling all columns; 'l' take the log of all entries; 'P' print
all entries collected as a data-block; 't' transpose the matrix of data; 'z'
subtract from the entries in each column their mean; 'Z' replace the entries in each
column with their z-scores; 'w' divide the entries in each column by their mean.
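
Three of these transformations ('D', 'z' and 'w') can be sketched in Python as
follows. This only illustrates the arithmetic on a list of rows; the function names are
hypothetical and the code is not taken from gbget.

```python
import math

def drop_nan_rows(rows):
    """'D': keep only rows without any NaN entry."""
    return [r for r in rows if not any(math.isnan(v) for v in r)]

def column_means(rows):
    """Mean of each column of a non-empty rectangular list of rows."""
    return [sum(col) / len(col) for col in zip(*rows)]

def demean(rows):
    """'z': subtract from each entry the mean of its column."""
    means = column_means(rows)
    return [[v - m for v, m in zip(r, means)] for r in rows]

def divide_by_mean(rows):
    """'w': divide each entry by the mean of its column."""
    means = column_means(rows)
    return [[v / m for v, m in zip(r, means)] for r in rows]

data = [[1.0, 10.0], [float("nan"), 20.0], [3.0, 30.0]]
clean = drop_nan_rows(data)
print(clean)                  # [[1.0, 10.0], [3.0, 30.0]]
print(demean(clean))          # [[-1.0, -10.0], [1.0, 10.0]]
print(divide_by_mean(clean))  # [[0.5, 0.5], [1.5, 1.5]]
```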

'<..;..>' functions separated by semicolons in angle brackets can be used for
generic data transformation; each function is computed for each row of data.
Variable names are 'x' followed by the number of the column and optionally by 'l'
and the number of lags. For instance 'x2+x3l1' means the sum of the entries in the
2nd column plus the entries in the 3rd column in the previous row. 'x0' stands for
the row number and 'x' is equal to 'x1'.
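
The meaning of lagged variables can be made concrete with a small Python sketch. The
helper apply_rowwise and its accessor x(column, lag) are hypothetical stand-ins for the
'xNlM' syntax, and skipping the first rows that lack enough history is an assumption,
not documented behaviour.

```python
def apply_rowwise(rows, func, max_lag=0):
    """Evaluate func once per row.

    func receives an accessor x(c, lag): the entry of column c (1-based)
    taken lag rows back; x(0) is the 1-based row number (lag is ignored
    for the row number, which is an assumption).
    Rows without max_lag previous rows are skipped (assumption).
    """
    out = []
    for i in range(max_lag, len(rows)):
        def x(c=1, lag=0, i=i):
            return float(i + 1) if c == 0 else rows[i - lag][c - 1]
        out.append(func(x))
    return out

rows = [[1.0, 10.0, 100.0],
        [2.0, 20.0, 200.0],
        [3.0, 30.0, 300.0]]

# Counterpart of 'x2+x3l1': 2nd column plus 3rd column of the previous row.
print(apply_rowwise(rows, lambda x: x(2) + x(3, 1), max_lag=1))  # [120.0, 230.0]
```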

'<@..;..>' if the function specification starts with a '@', the functions are
computed recursively along the columns. In this case the number after the 'x' is
the relative column counted starting from the one considered at each step.
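
One way to picture the column-wise evaluation is a window sliding along each row, as in
the Python sketch below. The function apply_along_columns is hypothetical, and the
assumption that a step is emitted only when all referenced columns exist is not
confirmed by the text above.

```python
def apply_along_columns(rows, func, width):
    """For each row, evaluate func on every window of `width` adjacent columns.

    Inside func, window[0] plays the role of 'x1', window[1] of 'x2', and
    so on, counted from the column considered at each step.
    """
    out = []
    for r in rows:
        out.append([func(r[j:j + width]) for j in range(len(r) - width + 1)])
    return out

rows = [[1.0, 2.0, 4.0],
        [10.0, 20.0, 40.0]]

# Counterpart of '<@x1+x2>': sum of two subsequent columns.
print(apply_along_columns(rows, lambda w: w[0] + w[1], width=2))
# [[3.0, 6.0], [30.0, 60.0]]
```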

'{...}' a function in curly brackets can be used to select data: only rows that
return a non-negative value are retained.
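
The selection rule amounts to keeping the rows on which the function is greater than or
equal to zero. A minimal Python sketch, with the hypothetical name select_rows:

```python
def select_rows(rows, func):
    """Keep only the rows for which func(row) is non-negative."""
    return [r for r in rows if func(r) >= 0]

rows = [[1.0, 1.5, 7.0],
        [2.0, 2.0, 8.0],
        [3.0, 3.5, 9.0]]

# Counterpart of '{x2-2}': keep rows whose second field is not lower than 2.
print(select_rows(rows, lambda r: r[1] - 2))
# [[2.0, 2.0, 8.0], [3.0, 3.5, 9.0]]
```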


-F set the input fields separators (default ' \t')

-o set the output format (default '%12.6e')

-e set the output format for empty fields (default '%13s')

-s set the output separation string (default ' ')

-t define global transformations applied before each output (default '')

-v verbose mode
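
The default values of -o, -e and -s are printf-style formats and a separator string.
The Python sketch below shows what they would produce for one row; format_row is a
hypothetical helper, and representing an empty field as None is an assumption made for
the illustration.

```python
def format_row(values, fmt="%12.6e", empty_fmt="%13s", sep=" "):
    """Format one output row with printf-style formats.

    None marks an empty field (assumption) and is rendered with empty_fmt.
    """
    cells = [(empty_fmt % "") if v is None else (fmt % v) for v in values]
    return sep.join(cells)

print(format_row([1.0, 2.5]))          # 1.000000e+00 2.500000e+00
print(repr(format_row([1.0, None])))   # second field is 13 blanks
```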


gbget 'file(1:3)ld'
select the first three columns in 'file', take the log and the difference of
successive columns;

gbget 'file(2,-10:-1)<x^2>'
select the last ten elements of the second column of 'file' and print their squares

gbget '[2]()' '[1]()' < ...
select the second and first data block from the standard input.

gbget 'file(1:3)<x1*x2-x3>'
select the first three columns in 'file' and in each row multiply the first two
entries and subtract the third.

gbget 'file()<@x1+x2>'
print the sum of two subsequent columns

gbget 'file(1:3){x2-2}'
select the first three columns in 'file' for the rows whose second field is not
lower than 2

