STATISTICS command
| Syntax: |
STATISTICS x { s1\keyword { s2\keyword ... }}
|
| Qualifiers: | \MESSAGES, \WEIGHTS, \MOMENTS, \PEARSON |
| Defaults: | \MESSAGES, \-WEIGHTS |
| Examples: |
STATISTICS X
|
The STATISTICS command calculates various statistics
for the input variable x, which can be
a vector or a matrix. To store a specific statistic, append its qualifier keyword
to the name of an output scalar with a backslash, \ (for example, XMEAN\MEAN). All
vectors must be the same size.
Table 1 below shows the parameter qualifier keywords and corresponding output values for extrema. Table 2 shows the parameter qualifier keywords and corresponding output values for central measures. Table 3 shows the parameter qualifier keywords and corresponding output values for dispersion and skewness.
| Keyword | Output Value |
\MAX | maximum value of x |
\IMAX | index of the maximum if x is a vector; row index of the maximum if x is a matrix |
\JMAX | column index of the maximum if x is a matrix |
\MIN | minimum value of x |
\IMIN | index of the minimum if x is a vector; row index of the minimum if x is a matrix |
\JMIN | column index of the minimum if x is a matrix |
| Table 1: Extrema keywords |
| Keyword | Output Value |
\SUM | arithmetic sum (unweighted) |
\MEAN | arithmetic mean |
\GMEAN | geometric mean |
\MEDIAN | median value |
\RMS | root-mean-square |
| Table 2: Central measure keywords |
| Keyword | Output Value |
\VARIANCE | variance |
\SDEV | standard deviation |
\ADEV | average deviation |
\KURTOSIS | kurtosis |
\SKEWNESS | skewness |
| Table 3: Dispersion and skewness keywords |
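To make the index keywords in Table 1 concrete, the following Python sketch locates the extrema of a small matrix. It is an illustration only: the helper name `extrema` is hypothetical, indices are assumed to be 1-based (consistent with the vector example at the end of this page, where the first element has index 1), and reporting the first occurrence when values tie is an assumption, not documented behaviour.

```python
def extrema(matrix):
    """Return ((max, i, j), (min, i, j)) using 1-based indices (assumed)."""
    best_max = best_min = None
    for i, row in enumerate(matrix, start=1):
        for j, value in enumerate(row, start=1):
            # Strict comparisons keep the first occurrence on ties (assumed rule).
            if best_max is None or value > best_max[0]:
                best_max = (value, i, j)
            if best_min is None or value < best_min[0]:
                best_min = (value, i, j)
    return best_max, best_min


m = [[3.0, 9.5, 1.0],
     [4.2, -2.0, 7.1]]
(xmax, imax, jmax), (xmin, imin, jmin) = extrema(m)
print(xmax, imax, jmax)  # 9.5 1 2   -> \MAX, \IMAX, \JMAX
print(xmin, imin, jmin)  # -2.0 2 2  -> \MIN, \IMIN, \JMIN
```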
Informational messages
The default is to display all the calculated statistics. If the
\-MESSAGES command qualifier is used, and if at least one output scalar is entered,
then the statistics values will not be displayed.
Weights
| Syntax: |
STATISTICS\WEIGHTS w x { s1\keyword { s2\keyword ... }}
|
You must use the \WEIGHTS
qualifier to indicate that a weight vector is present. Weights cannot be
applied to matrix data.
A weighting factor, w[i] ≥ 0,
could be the frequency, the probability, the mass, the reliability, or some
other multiplier. The lengths of w and x must be equal.
Definitions
Suppose that x is a vector with N elements.
If a weight vector, w, is entered, remember to use the
\WEIGHTS command qualifier. The
length of w is assumed to also be N. If no weights are entered,
let $w_i$ default to 1, for $i = 1, 2, \ldots, N$.
Define the total weight: $W = w_1 + w_2 + \cdots + w_N$
Sum
The sum is defined by $x_1 + x_2 + \cdots + x_N$
Mean value
The mean value, M, is defined by
$$M = \frac{1}{W}\left[w_1 x_1 + w_2 x_2 + \cdots + w_N x_N\right]$$
Geometric mean
The geometric mean, $G_x$, is defined if each $x_i > 0$
by:
$$G_x = \exp\left\{\frac{1}{W}\left[w_1\log(x_1) + w_2\log(x_2) + \cdots + w_N\log(x_N)\right]\right\}$$
Median
The median is the element of x which has equal numbers of values above
it and below it. If N is even, the median is the average of the
two central values.
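The central measures defined so far can be sketched in Python as follows. This is a minimal illustration, not the program's own implementation: the function names are hypothetical, the geometric mean is taken as the exponential of the weighted mean of the logarithms (the usual definition, which requires every value to be strictly positive), and the median is shown for unweighted data only because no weighted median is defined here.

```python
import math


def central_measures(x, w=None):
    """Return (sum, mean, geometric mean) of x with optional weights w."""
    if w is None:
        w = [1.0] * len(x)          # default weights w_i = 1
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    W = sum(w)                      # total weight
    total = sum(x)                  # the sum is unweighted
    mean = sum(wi * xi for wi, xi in zip(w, x)) / W
    # math.log raises ValueError if any x_i <= 0
    gmean = math.exp(sum(wi * math.log(xi) for wi, xi in zip(w, x)) / W)
    return total, mean, gmean


def median(x):
    """Median of unweighted data: middle value, or mean of the two central values."""
    s = sorted(x)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


x = [1.2, 2.1, 3.2, 4.5, 5, 6, 7]
total, mean, gmean = central_measures(x)
print(f"{mean:.6f}", median(x))  # 4.142857 4.5
```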
Root-mean-square
The root-mean-square, RMS, is defined by
$$\mathrm{RMS} = \sqrt{\frac{1}{W}\left[w_1 x_1^2 + w_2 x_2^2 + \cdots + w_N x_N^2\right]}$$
Variance
The variance, μ, is defined by
$$\mu = \frac{N}{W(N-1)}\left[w_1(x_1-M)^2 + w_2(x_2-M)^2 + \cdots + w_N(x_N-M)^2\right]$$
Standard deviation
The standard deviation, σ, is defined by $\sigma = \sqrt{\mu}$
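As a cross-check on the root-mean-square, variance, and standard deviation definitions, here is a small Python sketch. The function name `dispersion` is hypothetical, and the variance prefactor is read as N / (W (N - 1)), which reduces to the familiar 1 / (N - 1) when every weight is 1.

```python
import math


def dispersion(x, w=None):
    """Return (rms, variance, standard deviation) of x with optional weights w."""
    n = len(x)
    if n < 2:
        raise ValueError("the variance needs at least two values")
    if w is None:
        w = [1.0] * n               # default weights w_i = 1
    if len(w) != n:
        raise ValueError("w and x must have the same length")
    W = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / W
    rms = math.sqrt(sum(wi * xi * xi for wi, xi in zip(w, x)) / W)
    # Prefactor N / (W (N - 1)); equals 1 / (N - 1) for unit weights.
    variance = n / (W * (n - 1)) * sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x))
    return rms, variance, math.sqrt(variance)


rms, var, sdev = dispersion([1.0, 2.0, 3.0, 4.0])
print(round(var, 6))  # 1.666667
```

With this prefactor, multiplying every weight by the same constant leaves the variance unchanged, so `dispersion(x, [2, 2, 2, 2])` gives the same result as the unweighted call.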
Average deviation
The average deviation, or mean deviation, δ, is defined by
$$\delta = \frac{1}{W}\left[w_1|x_1-M| + w_2|x_2-M| + \cdots + w_N|x_N-M|\right]$$
Skewness
The skewness, or third moment, skew, is a nondimensional quantity that
characterizes the degree of asymmetry of a distribution around its mean. The
skewness is a pure number that characterizes only the shape of the
distribution, and is defined by
$$\mathrm{skew} = \frac{1}{W}\left\{w_1\left[\frac{x_1-M}{\sigma}\right]^3 + w_2\left[\frac{x_2-M}{\sigma}\right]^3 + \cdots + w_N\left[\frac{x_N-M}{\sigma}\right]^3\right\}$$
A positive value of skewness signifies a distribution with an asymmetric tail extending out towards more positive x; a negative value signifies a distribution whose tail extends out towards more negative x.
Kurtosis
The kurtosis, kurt, is a nondimensional quantity which measures the
relative peakedness or flatness of a distribution, relative to a normal
distribution. A distribution with positive kurtosis is termed leptokurtic;
a distribution with negative kurtosis is termed platykurtic. An in-between
distribution is termed mesokurtic. The kurtosis is defined by
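$$\mathrm{kurt} = \frac{1}{W}\left\{w_1\left[\frac{x_1-M}{\sigma}\right]^4 + w_2\left[\frac{x_2-M}{\sigma}\right]^4 + \cdots + w_N\left[\frac{x_N-M}{\sigma}\right]^4\right\} - 3$$
where the -3 term makes the value zero for a normal distribution.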
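The average deviation, skewness, and kurtosis can be sketched together in Python. The function name `shape_statistics` is hypothetical. The sketch assumes that the kurtosis sum carries the same 1/W normalisation as the skewness (needed for the -3 term to give zero for a normal distribution), and that σ is the standard deviation defined above, including its N / (W (N - 1)) prefactor.

```python
import math


def shape_statistics(x, w=None):
    """Return (average deviation, skewness, kurtosis) of x with optional weights w."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two values")
    if w is None:
        w = [1.0] * n               # default weights w_i = 1
    if len(w) != n:
        raise ValueError("w and x must have the same length")
    W = sum(w)
    M = sum(wi * xi for wi, xi in zip(w, x)) / W
    variance = n / (W * (n - 1)) * sum(wi * (xi - M) ** 2 for wi, xi in zip(w, x))
    sigma = math.sqrt(variance)     # zero if all values are equal: division below fails
    adev = sum(wi * abs(xi - M) for wi, xi in zip(w, x)) / W
    skew = sum(wi * ((xi - M) / sigma) ** 3 for wi, xi in zip(w, x)) / W
    # Assumed 1/W normalisation, mirroring the skewness definition.
    kurt = sum(wi * ((xi - M) / sigma) ** 4 for wi, xi in zip(w, x)) / W - 3.0
    return adev, skew, kurt


adev, skew, kurt = shape_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
print(round(adev, 3), round(kurt, 3))  # 1.2 -1.912
```

For this symmetric sample the skewness is zero to within rounding, and the negative kurtosis marks the flat, evenly spaced data as platykurtic.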
Moments
| Syntax: |
STATISTICS\MOMENTS w x n { s }
|
If the \MOMENTS command qualifier is used, the nth
moment of vector x, with weight w, is calculated and optionally
stored in output scalar s. The moment number, n, can be any integer
> 0.
$$s = \frac{1}{W}\left[w_1 x_1^n + w_2 x_2^n + \cdots + w_N x_N^n\right]$$
Linear correlation coefficient
| Syntax: |
STATISTICS\PEARSON x y { r p }
|
Pearson's r, or the linear correlation coefficient, is widely used as
a measure of association between variables that are continuous. For pairs
of quantities (xi,yi), for i = 1,2,...,N, the
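linear correlation coefficient r is given by the formula:
$$r = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^2}}$$
where $\bar{x}$ is the mean of x, and $\bar{y}$ is the mean of y.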
The value of r lies between -1 and +1, inclusive. It
takes on a value of +1 when the data points lie on a straight line
with positive slope, that is, when x and y increase together. The value
+1 holds independent of the magnitude of this slope. If the data
points lie on a straight line with negative slope, so that y decreases as
x increases, then r has the value -1. A value of
r near zero indicates that the variables x and y are
linearly uncorrelated.
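The behaviour of r described above can be checked with a short Python sketch of the standard Pearson formula (sum of products of deviations from the means, divided by the square root of the product of the summed squared deviations). The function name `pearson_r` is hypothetical, and the sketch says nothing about how the command itself is implemented.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient of paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Raises ZeroDivisionError if either variable is constant.
    return sxy / math.sqrt(sxx * syy)


x = [1.0, 2.0, 3.0, 4.0]
print(round(pearson_r(x, [2.0, 4.0, 6.0, 8.0]), 12))      # 1.0
print(round(pearson_r(x, [10.0, 20.0, 30.0, 40.0]), 12))  # 1.0 (slope does not matter)
print(round(pearson_r(x, [8.0, 6.0, 4.0, 2.0]), 12))      # -1.0
```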
r is a way of summarizing the strength of a correlation which is
known to be significant, but it is a poor statistic for deciding whether an
observed correlation is statistically significant, and/or whether one observed
correlation is significantly stronger than another. The reason is that
r is ignorant of the individual distributions of x and
y, so there is no universal way to compute its distribution in the
case of the null hypothesis.
The STATISTICS\PEARSON command returns Pearson's r in the scalar variable
r. It also returns scalar p, the significance
level at which the null hypothesis of zero correlation is disproved.
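A small value of p indicates a significant correlation.
For the standard Student's t test of r, with $\nu = N - 2$ degrees of freedom:
$$p = I_{\frac{\nu}{\nu+t^2}}\left(\frac{\nu}{2}, \frac{1}{2}\right)$$
where I is the incomplete Beta function and t is defined by:
$$t = r\sqrt{\frac{N-2}{1-r^2}}$$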
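The significance level can be sketched in Python under stated assumptions: that t = r sqrt((N - 2) / (1 - r²)) follows Student's t distribution with ν = N - 2 degrees of freedom, and that p is the two-sided probability, which equals the regularized incomplete Beta function evaluated at ν / (ν + t²) with parameters ν/2 and 1/2. The names `betai`, `_betacf`, and `pearson_p` are hypothetical; the continued-fraction evaluation is a textbook method and may differ from what the command uses internally.

```python
import math


def _betacf(a, b, x, max_iter=200, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (tiny if abs(d) < tiny else d)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (tiny if abs(d) < tiny else d)
        c = 1.0 + aa / c
        c = tiny if abs(c) < tiny else c
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (tiny if abs(d) < tiny else d)
        c = 1.0 + aa / c
        c = tiny if abs(c) < tiny else c
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b  # symmetry relation


def pearson_p(r, n):
    """Two-sided significance of Pearson's r for n pairs (assumed t-test form)."""
    if n < 3:
        raise ValueError("need at least three pairs")
    if abs(r) >= 1.0:
        return 0.0                  # perfect correlation: t is infinite
    nu = n - 2
    t = r * math.sqrt(nu / (1.0 - r * r))
    return betai(0.5 * nu, 0.5, nu / (nu + t * t))


print(round(pearson_p(0.9, 4), 6))  # 0.1
```

For N = 4 the expression reduces analytically to p = 1 - |r|, which gives a convenient check on the sketch.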
Examples
Suppose you have a vector X=[1.2;2.1;3.2;4.5;5;6;7]. Entering
STATISTICS X
displays all of the calculated statistics for X.
If you want to use the values for the maximum, minimum and mean of X, enter:
STATISTICS X XMEAN\MEAN XMIN\MIN XMAX\MAX
and you will have the scalars: XMAX=7, XMIN=1.2, and
XMEAN=4.142857.
If you also want the index values for the maximum and the minimum of X, enter:
STATISTICS X XMEAN\MEAN XMIN\MIN XMAX\MAX IMX\IMAX IMN\IMIN
and you will also have scalars: IMX=7 and IMN=1.
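The scalar values quoted in these examples can be reproduced independently with a few lines of Python. This is only a cross-check of the arithmetic; the lowercase variable names mirror the output scalars above, and the `+ 1` converts Python's 0-based positions to the 1-based indices the examples report.

```python
x = [1.2, 2.1, 3.2, 4.5, 5, 6, 7]

xmean = sum(x) / len(x)
xmax, xmin = max(x), min(x)
imx = x.index(xmax) + 1   # 1-based index of the maximum
imn = x.index(xmin) + 1   # 1-based index of the minimum

print(f"{xmean:.6f}", xmin, xmax, imx, imn)  # 4.142857 1.2 7 7 1
```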