Calculations and manipulation
The directive CALCULATE allows arithmetic calculations on the values of any numeric
data structure; logical tests can also be done on numerical and textual values. Functions and
operators are available for a very wide range of calculations on matrices and tables. Another
general directive is EQUATE, which allows values to be copied from one set of data
structures to another; the structures must store values of the same mode (for example, numbers
or text), but need not be of the same type. Structure values can be deleted to save space within
GenStat; attributes can also be deleted so that the structure can be redefined, for example as
another type. Contents of data structures can be compared, to see if they contain the same
distinct items, or whether the distinct values in one structure are a subset of those in another.
performs arithmetic and logical calculations
allows values and attributes of data structures to be deleted
copies values between sets of data structures
compares the sets of values in two data structures
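The comparison of distinct items described above can be illustrated outside GenStat. The following is a minimal Python sketch, not GenStat syntax; the function names `same_distinct_items` and `is_subset` are hypothetical and chosen only for this illustration.

```python
def same_distinct_items(a, b):
    """True if both structures contain exactly the same distinct items."""
    return set(a) == set(b)


def is_subset(a, b):
    """True if every distinct item of a also occurs in b."""
    return set(a) <= set(b)


first = ["red", "blue", "red", "green"]
second = ["green", "blue", "red"]
third = ["red", "blue"]

print(same_distinct_items(first, second))  # True: repeats and order are ignored
print(is_subset(third, first))             # True
print(is_subset(first, third))             # False: "green" is not in third
```

Only the distinct values matter here, so repeated units and their order have no effect on the result.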
There are several general directives for manipulating vectors (variates, factors or texts). Units of
vectors can be sorted into systematic order or into random order. Boolean arithmetic can be
performed on their contents, or you can form all the ways of partitioning them into subsets. A
"restriction" can be associated with a vector, so that subsequent statements operate on only a
subset of its units. A default length and labelling can be defined for vectors formed later in the job.
Facilities for specific types of vector allow interpolation of values for variates, monotonic
regression, generation of factor values, concatenation and editing of text.
sorts units of vectors into alphabetic or numerical order of an index
vector, or forms a factor from a variate or text
performs Boolean set calculations on the contents of
vectors and pointers
runs through all ways of allocating a set of objects to
subsets
defines a "restriction" on the units of a vector
defines default length or labelling for vectors defined subsequently
in the job
calculates variates of interpolated values
fits an increasing monotonic regression
forms a factor (or grouping variable) from a variate or text,
together with the set of distinct values that occur
concatenates lines of text vectors
line editor for units of text vectors
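As an illustration of what an increasing monotonic regression produces, here is a minimal Python sketch using the pool-adjacent-violators approach with equal weights. This is an assumption made for the example: the text does not say which algorithm GenStat uses, and the function name `monotonic_regression` is hypothetical.

```python
def monotonic_regression(y):
    """Least-squares non-decreasing fit to y (equal weights),
    computed by pooling adjacent values that violate the ordering."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


print(monotonic_regression([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The values 3 and 2 are out of order, so they are pooled and both replaced by their mean, 2.5; values already in increasing order are left unchanged.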
Other facilities for vectors are provided by the procedures in the manipulation module of the
GenStat Procedure Library, including
appends a list of vectors of the same type
permutes the levels and labels of a factor
represents a factor by factorial combinations of a set of
factors
forms a factor with a level for every combination of other
factors
sorts the levels of a factor according to an index vector
forms a pointer of factors representing a multiple-response
forms vectors with the restricted subset of a list of
vectors
forms a single string from a list of strings in a text
forms a text structure from a variate
generates pseudo-random numbers from probability
distributions
generates multivariate normal pseudo-random
numbers
joins or merges two sets of vectors together, based on classifying
keys
replaces missing values in a vector with the previous non-missing
value
calculates orthogonal polynomials
calculates partial correlations for a list of
variates
calculates quantiles of the values in a variate
produces ranks, from the values in a variate, allowing for ties
samples from a set of units, possibly stratified by factors
calculates a set of basis functions for M-, B- or I-splines
calculates natural cubic spline basis functions (for use e.g. in
REML)
combines several data sets by "stacking" the corresponding
vectors
standardizes columns of a data matrix to have mean 0 and
variance 1
forms vectors containing subsets of the values in other vectors
splits vectors into individual vectors according to levels of a
factor
equates across numerical structures
performs linear and inverse linear interpolation between
variates
copies values and row/column labels from a matrix to variates
or texts
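Two of the operations listed above, ranking with ties and replacing missing values with the previous non-missing value, can be sketched in Python as follows. This is an illustration only, not the Library code: the function names are hypothetical, tied values are assumed to share the average of their ranks, and missing values are assumed to be represented by `None`.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def fill_from_previous(values):
    """Replace each None with the most recent non-missing value before it."""
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


print(average_ranks([10, 20, 10, 30]))                    # [1.5, 3.0, 1.5, 4.0]
print(fill_from_previous([None, 1, None, None, 4, None]))  # [None, 1, 1, 1, 4, 4]
```

In this sketch a missing value at the start of the vector stays missing, because there is no earlier value to copy.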
Tables can be formed containing summaries of values in variates: totals, minimum and maximum
values, quantiles, numbers of missing and non-missing values, means and variances.
Manipulations of multi-way structures include the ability to add various types of marginal
summaries to tables, and to combine "slices" of tables, of matrices or of variates. Directives are
also available for eigenvalue and singular-value decompositions of matrices, and to form the
values of SSPM structures.
forms tables of summaries of the values of a variate
calculates or deletes margins of tables
combines or omits "slices" of tables, matrices or variates
calculates latent roots and vectors (that is, eigenvalues and
eigenvectors)
calculates singular-value decompositions of matrices
calculates values for SSPM structures (sums of squares and
products, means, etc.)
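To make the idea of adding marginal summaries concrete, the Python sketch below appends margins of totals to a two-way table stored as a list of rows. It is an illustration under stated assumptions, not GenStat syntax: the function name `add_margins` is hypothetical, and totals are only one of the types of marginal summary mentioned above.

```python
def add_margins(table):
    """Return a copy of a two-way table with a totals column and a totals row."""
    with_row_totals = [row + [sum(row)] for row in table]
    # Summing the extended columns also gives the grand total in the corner.
    column_totals = [sum(column) for column in zip(*with_row_totals)]
    return with_row_totals + [column_totals]


for row in add_margins([[1, 2], [3, 4]]):
    print(row)
# [1, 2, 3]
# [3, 4, 7]
# [4, 6, 10]
```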
Procedures in the Library for operating on tables and matrices include
forms Hadamard matrices
forms a projection matrix for a set of model terms
forms the variance-covariance matrix for a list of
variates
calculates the generalized inverse of a matrix
finds the linear relations associated with matrix
singularities
forms integer powers of a square matrix
gives robust identification of multiple outliers in 2-way
tables
tabulates data classified by multiple-response factors
expresses the body of a table as percentages of one of its
margins
performs generalized calibration of survey data
modifies survey weights, adjusting them to ensure that their overall
sum remains unchanged
analyses stratified random surveys by expansion or ratio
raising
tabulates data from random surveys, including multistage
surveys and surveys with unequal probabilities of selection
forms survey weights
forms summary tables of modes of values
sorts tables so their margins are in ascending or descending
order
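As an example of one of the matrix operations listed above, integer powers of a square matrix can be formed by repeated squaring. The Python sketch below is an assumption-laden illustration, not the Library procedure: the function names are hypothetical, matrices are plain lists of rows, and only non-negative powers are handled.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_power(matrix, exponent):
    """Raise a square matrix to a non-negative integer power."""
    if exponent < 0:
        raise ValueError("this sketch handles non-negative powers only")
    n = len(matrix)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # identity
    base = [row[:] for row in matrix]
    while exponent > 0:
        if exponent & 1:          # this binary digit of the exponent is set
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        exponent >>= 1
    return result


print(mat_power([[1, 1], [1, 0]], 5))  # [[8, 5], [5, 3]]
```

Repeated squaring needs a number of matrix products proportional to the number of binary digits in the exponent, rather than to the exponent itself.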
Directives are available for adding and removing branches of trees, and to assist in the
construction of trees.
assesses potential splits for regression and classification
trees
cuts a tree at a defined node, discarding nodes and information
below it
extends a tree by joining another tree to a terminal node
adds new branches to a node of a tree
There are also procedures for displaying and pruning trees. These provide basic utilities for
tree-based analysis, and are used by the existing procedures for classification trees, identification
keys and regression trees (BCLASSIFICATION, BKEY and
BREGRESSION).
constructs a tree
plots a tree
displays a tree
prunes a tree using minimal cost complexity
Formulae can be interpreted, revised or constructed automatically from the contents of pointers.
forms classification sets for the terms in a formula or
breaks a formula up into separate formulae (one for each term)
modifies a formula or an expression to operate on a different
set of data structures
forms a model formula using structures supplied in a
pointer
Values can be assigned to dummies and pointers.
sets values of dummies and pointers
Aspects of the "environment" of the current job can be modified, such as whether or not GenStat
starts output from a statistical analysis at the top of a new page, or whether it should pause during
interactive output. New defaults can be set for options and parameters. Details of the
environmental settings can be copied into GenStat data structures. Attributes of data structures
can also be accessed.
sets details of the "environment" of a GenStat job
sets or modifies defaults of options of GenStat directives or
procedures
sets or modifies defaults of parameters of GenStat
directives or procedures
gets details of the "environment" of a GenStat job
accesses attributes of data structures
There are also various specialist mathematical facilities.
forms addition and multiplication tables for a Galois finite field
converts integers between base 10 and other bases
forms all possible permutations of the integers 1...n
decomposes a positive integer into its constituent prime
powers
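Three of these facilities are easy to illustrate. The Python sketch below shows base conversion, the permutations of the integers 1...n, and decomposition into prime powers. It is an illustration only, not GenStat syntax: the function names are hypothetical, digits are assumed to be held as a list with the most significant digit first, and trial division is assumed for the factorization.

```python
from itertools import permutations


def to_base(n, base):
    """Digits of a non-negative integer n in the given base, most significant first."""
    if n < 0 or base < 2:
        raise ValueError("n must be non-negative and base at least 2")
    digits = []
    while True:
        n, remainder = divmod(n, base)
        digits.append(remainder)
        if n == 0:
            break
    return digits[::-1]


def from_base(digits, base):
    """Base-10 integer represented by a list of digits in the given base."""
    value = 0
    for digit in digits:
        value = value * base + digit
    return value


def prime_powers(n):
    """Map each prime factor of a positive integer n to its power."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = {}
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors[divisor] = factors.get(divisor, 0) + 1
            n //= divisor
        divisor += 1
    if n > 1:  # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


print(to_base(10, 2))                     # [1, 0, 1, 0]
print(from_base([1, 0, 1, 0], 2))         # 10
print(list(permutations(range(1, 4))))    # the 6 orderings of 1, 2, 3
print(prime_powers(360))                  # {2: 3, 3: 2, 5: 1}
```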