Calculations and manipulation


The directive CALCULATE allows arithmetic calculations on the values of any numeric data structure; logical tests can also be done on numerical and textual values. Functions and operators are available for a very wide range of calculations on matrices and tables. Another general directive is EQUATE, which allows values to be copied from one set of data structures to another; the structures must store values of the same mode (for example, numbers or text), but need not be of the same type. Structure values can be deleted to save space within GenStat; attributes can also be deleted so that the structure can be redefined, for example as another type. Contents of data structures can be compared, to see if they contain the same distinct items, or whether the distinct values in one structure are a subset of those in another.


CALCULATE
performs arithmetic and logical calculations

DELETE
allows values and attributes of data structures to be deleted

EQUATE
copies values between sets of data structures

SETRELATE
compares the sets of values in two data structures
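
The set comparisons described above can be pictured with a small conceptual sketch. It is written in Python, not in GenStat's command language, and the helper names (distinct, same_items, is_subset) are hypothetical; it only illustrates what it means for two structures to contain the same distinct items, or for one to be a subset of the other.

```python
# Conceptual illustration only: Python, not GenStat syntax.
# All function names here are hypothetical helpers.

def distinct(values):
    """Return the set of distinct items stored in a sequence."""
    return set(values)


def same_items(a, b):
    """True if both sequences contain the same distinct items."""
    return distinct(a) == distinct(b)


def is_subset(a, b):
    """True if every distinct item of a also occurs in b."""
    return distinct(a) <= distinct(b)


x = [1, 2, 2, 3]
y = [3, 1, 2]
z = [1, 3]

print(same_items(x, y))  # repeated values and ordering are ignored
print(is_subset(z, x))
print(is_subset(x, z))
```

Repeated values and the order of the units play no part in either comparison; only the distinct items matter.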


There are several general directives for manipulating vectors (variates, factors or texts). Units of vectors can be sorted into systematic order or into random order. Boolean arithmetic can be performed on their contents, or you can form all the ways of partitioning them into subsets. A "restriction" can be associated with a vector, so that subsequent statements operate on only a subset of its units. A default length and labelling can be defined for vectors formed later in the job. Facilities for specific types of vector allow interpolation of values for variates, monotonic regression, generation of factor values, concatenation and editing of text.


SORT
sorts units of vectors into alphabetic or numerical order of an index vector, or forms a factor from a variate or text

SETCALCULATE
performs Boolean set calculations on the contents of vectors and pointers

SETALLOCATIONS
runs through all ways of allocating a set of objects to subsets

RESTRICT
defines a "restriction" on the units of a vector

UNITS
defines default length or labelling for vectors defined subsequently in the job

INTERPOLATE
calculates variates of interpolated values

MONOTONIC
fits an increasing monotonic regression

GROUPS
forms a factor (or grouping variable) from a variate or text, together with the set of distinct values that occur

CONCATENATE
concatenates lines of text vectors

EDIT
line editor for units of text vectors
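
The idea of a "restriction" can be sketched as a mask that is stored alongside the values of a vector and consulted by later operations. The following is a conceptual Python illustration under that assumption, not GenStat syntax; the class RestrictedVector and its methods are hypothetical.

```python
# Conceptual illustration only: Python, not GenStat syntax.
# RestrictedVector is a hypothetical class used to show the idea.

class RestrictedVector:
    def __init__(self, values):
        self.values = list(values)
        # Initially every unit is included.
        self.mask = [True] * len(self.values)

    def restrict(self, condition):
        """Keep only the units for which condition(value) is true."""
        self.mask = [bool(condition(v)) for v in self.values]

    def clear(self):
        """Remove the restriction so that all units are used again."""
        self.mask = [True] * len(self.values)

    def active(self):
        """Return the values of the units in the restricted set."""
        return [v for v, keep in zip(self.values, self.mask) if keep]

    def mean(self):
        """Mean over the currently included units only."""
        units = self.active()
        if not units:
            raise ValueError("restriction excludes every unit")
        return sum(units) / len(units)


height = RestrictedVector([150, 162, 171, 180, 195])
overall = height.mean()
height.restrict(lambda v: v > 160)
restricted = height.mean()
print(overall, restricted)
```

The stored values are never altered by the restriction; clearing it restores the full set of units.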


Other facilities for vectors are provided by the procedures in the manipulation module of the GenStat Procedure Library, including


APPEND
appends a list of vectors of the same type

FACAMEND
permutes the levels and labels of a factor

FACDIVIDE
represents a factor by factorial combinations of a set of factors

FACPRODUCT
forms a factor with a level for every combination of other factors

FACSORT
sorts the levels of a factor according to an index vector

FMFACTORS
forms a pointer of factors representing a multiple-response

FRESTRICTEDSET
forms vectors with the restricted subset of a list of vectors

FSTRING
forms a single string from a list of strings in a text

FTEXT
forms a text structure from a variate

GRANDOM
generates pseudo-random numbers from probability distributions

GRMULTINORMAL
generates multivariate normal pseudo-random numbers

JOIN
joins or merges two sets of vectors together, based on classifying keys

MVFILL
replaces missing values in a vector with the previous non-missing value

ORTHPOLYNOMIAL
calculates orthogonal polynomials

PARTIALCORRELATIONS
calculates partial correlations for a list of variates

QUANTILE
calculates quantiles of the values in a variate

RANK
produces ranks, from the values in a variate, allowing for ties

SAMPLE
samples from a set of units, possibly stratified by factors

SPLINE
calculates a set of basis functions for M-, B- or I-splines

NCSPLINE
calculates natural cubic spline basis functions (for use e.g. in REML)

STACK
combines several data sets by "stacking" the corresponding vectors

STANDARDIZE
standardizes columns of a data matrix to have mean 0 and variance 1

SUBSET
forms vectors containing subsets of the values in other vectors

UNSTACK
splits vectors into individual vectors according to levels of a factor

VEQUATE
equates across numerical structures

VINTERPOLATE
performs linear and inverse linear interpolation between variates

VMATRIX
copies values and row/column labels from a matrix to variates or texts
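
As an illustration of one of these operations, replacing each missing value by the previous non-missing value (as described for MVFILL) can be sketched as follows. This is a conceptual Python sketch, not GenStat syntax: the function name fill_forward is hypothetical, missing values are represented by None, and leaving leading missing values unchanged is an assumption of the sketch rather than documented behaviour.

```python
# Conceptual illustration only: Python, not GenStat syntax.
# fill_forward is a hypothetical helper; None stands for "missing".

def fill_forward(values):
    """Replace each missing value with the previous non-missing value.

    Assumption of this sketch: missing values at the start of the
    sequence have no previous value and are left as missing.
    """
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


readings = [None, 2.0, None, None, 5.0, None]
print(fill_forward(readings))
```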


Tables can be formed containing summaries of values in variates: totals, minimum and maximum values, quantiles, numbers of missing and non-missing values, means and variances. Manipulations of multi-way structures include the ability to add various types of marginal summaries to tables, and to combine "slices" of tables, of matrices or of variates. Directives are also available for eigenvalue and singular-value decompositions of matrices, and to form the values of SSPM structures.


TABULATE
forms tables of summaries of the values of a variate

MARGIN
calculates or deletes margins of tables

COMBINE
combines or omits "slices" of tables, matrices or variates

FLRV
calculates latent roots and vectors (that is, eigenvalues and eigenvectors)

SVD
calculates singular-value decompositions of matrices

FSSPM
calculates values for SSPM structures (sums of squares and products, means, etc.)
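
The kind of summary table described above can be pictured with a conceptual Python sketch that forms the mean of a variate within each level of a classifying factor. It is not GenStat syntax; the function tabulate_means is hypothetical, and excluding missing values (represented here by None) from the means is an assumption of the sketch.

```python
# Conceptual illustration only: Python, not GenStat syntax.
from collections import defaultdict


def tabulate_means(values, groups):
    """Return {level: mean of the non-missing values at that level}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        if v is None:  # assumption: missing values are skipped
            continue
        sums[g] += v
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in counts}


yields = [4.0, 6.0, 5.0, 7.0, None]
treatment = ["A", "A", "B", "B", "B"]
print(tabulate_means(yields, treatment))
```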


Procedures in the Library for operating on tables and matrices include


FHADAMARDMATRIX
forms Hadamard matrices

FPROJECTIONMATRIX
forms a projection matrix for a set of model terms

FVCOVARIANCE
forms the variance-covariance matrix for a list of variates

GINVERSE
calculates the generalized inverse of a matrix

LINDEPENDENCE
finds the linear relations associated with matrix singularities

MPOWER
forms integer powers of a square matrix

MEDIANTETRAD
gives robust identification of multiple outliers in 2-way tables

MTABULATE
tabulates data classified by multiple-response factors

PERCENT
expresses the body of a table as percentages of one of its margins

SVCALIBRATE
performs generalized calibration of survey data

SVREWEIGHT
modifies survey weights, adjusting them so that their overall sum remains unchanged

SVSTRATIFIED
analyses stratified random surveys by expansion or ratio raising

SVTABULATE
tabulates data from random surveys, including multistage surveys and surveys with unequal probabilities of selection

SVWEIGHT
forms survey weights

TABMODE
forms summary tables of modes of values

TABSORT
sorts tables so their margins are in ascending or descending order
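
To make one of these table operations concrete, the following conceptual Python sketch expresses the body of a two-way table as percentages of its row margin, in the spirit of PERCENT. It is not GenStat syntax; the function percent_of_row_totals is hypothetical, and the choice of the row margin and the handling of a zero total are assumptions of the sketch.

```python
# Conceptual illustration only: Python, not GenStat syntax.

def percent_of_row_totals(table):
    """Express each cell as a percentage of the total of its row.

    table is a list of rows, each a list of numbers. A row whose total
    is zero yields NaN for every cell (an assumption of this sketch).
    """
    result = []
    for row in table:
        total = sum(row)
        if total == 0:
            result.append([float("nan")] * len(row))
        else:
            result.append([100.0 * cell / total for cell in row])
    return result


counts = [[10, 30],
          [20, 20]]
print(percent_of_row_totals(counts))
```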


Directives are available for adding and removing branches of trees, and to assist in the construction of trees.


BASSESS
assesses potential splits for regression and classification trees

BCUT
cuts a tree at a defined node, discarding nodes and information below it

BJOIN
extends a tree by joining another tree to a terminal node

BGROW
adds new branches to a node of a tree


There are also procedures for displaying and pruning trees. These provide basic utilities for tree-based analysis, and are used by the existing procedures for classification trees, identification keys and regression trees (BCLASSIFICATION, BKEY and BREGRESSION).


BCONSTRUCT
constructs a tree

BGRAPH
plots a tree

BPRINT
displays a tree

BPRUNE
prunes a tree using minimal cost complexity


Formulae can be interpreted, revised or constructed automatically from the contents of pointers.


FCLASSIFICATION
forms classification sets for the terms in a formula or breaks a formula up into separate formulae (one for each term)

REFORMULATE
modifies a formula or an expression to operate on a different set of data structures

SET2FORMULA
forms a model formula using structures supplied in a pointer


Values can be assigned to dummies and pointers.


ASSIGN
sets values of dummies and pointers


Aspects of the "environment" of the current job can be modified, such as whether or not GenStat starts output from a statistical analysis at the top of a new page, or whether it should pause during interactive output. New defaults can be set for options and parameters. Details of the environmental settings can be copied into GenStat data structures. Attributes of data structures can also be accessed.


SET
sets details of the "environment" of a GenStat job

SETOPTION
sets or modifies defaults of options of GenStat directives or procedures

SETPARAMETER
sets or modifies defaults of parameters of GenStat directives or procedures

GET
gets details of the "environment" of a GenStat job

GETATTRIBUTE
accesses attributes of data structures


There are also various specialist mathematical facilities.


GALOIS
forms addition and multiplication tables for a Galois finite field

NCONVERT
converts integers between base 10 and other bases

PERMUTE
forms all possible permutations of the integers 1...n

PRIMEPOWER
decomposes a positive integer into its constituent prime powers
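
The decomposition of a positive integer into prime powers can be sketched conceptually as follows. This is Python, not GenStat syntax; the function prime_powers is hypothetical and uses simple trial division, which is one standard approach and not necessarily the method used by PRIMEPOWER.

```python
# Conceptual illustration only: Python, not GenStat syntax.

def prime_powers(n):
    """Return [(prime, exponent), ...] whose product of powers is n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:
        # Whatever remains is a single prime larger than sqrt(original n).
        factors.append((n, 1))
    return factors


print(prime_powers(360))
```

Multiplying the returned powers back together recovers the original integer, which gives a simple check on the result.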