The standard basic Zipfian LNRE model.
input
text.spc: frequency spectrum
options
-h: display on-line help
-mW: number of ranks in fit is set to W (default: 15)
-kX: number of chunks for interpolation is set to X (default: 20)
-KY: number of chunks for extrapolation is set to Y (default: 20)
-EZ: extrapolation sample size is set to Z (default: 2N_0)
-H: input files do not have a header (default: header is presupposed)
-sS: calculate only the expected spectrum for S ranks, output on text_Z.fsp
output
text_Z.spc: observed and expected frequency spectrum
m: m (frequency)
Vm: V(m,N) (frequency at sample size N)
EVm: E[V(m,N)] (expected frequency at sample size N)
text_Z.fsp: expected frequency spectrum
m: m (frequency)
EVm: E[V(m,N)] (expected frequency at sample size N)
text_Z.sp2: expected frequency spectrum at 2N
m: m (frequency)
EVm2N: E[V(m,2N)] (expected frequency at sample size 2N)
text_Z.ev2: vocabulary size statistics
V: V(N) (observed vocabulary size at N)
EV: E[V(N)] (expected vocabulary size at N)
EV2N: E[V(2N)] (expected vocabulary size at 2N)
text_Z.int, text_Z.ext: interpolation and extrapolation statistics
N: N (number of tokens)
E[V(N)]: E[V(N)] (expected number of types)
Alpha1: E[alpha(1)] (E[V(1,N)]/E[V(N)])
EV1-5: E[V(1-5,N)] (expected spectrum elements)
GV: E[V(N+1)] - E[V(N)] (token-unit growth rate)
text_Z.sum: summary statistics and estimated parameters
N: N (number of tokens)
V(N): V(N) (observed number of types)
E[V(N)]: E[V(N)] (expected number of types)
V(1,N): V(1,N) (observed number of hapax legomena)
E[V(1,N)]: E[V(1,N)] (expected number of hapax legomena)
Z: Z (parameter)
S: S (population number of types)
technical details
The integrals of the extended Zipf model are evaluated by Romberg integration over the interval [0.000001, 10000.0], using the subroutine qromb of Press et al. (1988). Parameters are estimated with the downhill simplex minimization method, using the subroutine amoeba of Press et al. (1988).
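
For readers unfamiliar with Romberg integration, the following is a minimal Python sketch of the general technique: repeated halving of the trapezoid rule combined with Richardson extrapolation. It is not the qromb routine itself, and the function name romberg, its parameters tol and max_levels, and the test integrand are assumptions made for illustration only; the model's actual integrand is not reproduced here.

```python
import math


def romberg(f, a, b, tol=1e-10, max_levels=20):
    """Approximate the integral of f over [a, b] by Romberg integration.

    Illustrative sketch only: trapezoid estimates are refined by interval
    halving, and each new row of the tableau is extrapolated with
    Richardson's formula.
    """
    h = b - a
    # Row 0 of the tableau: one-panel trapezoid estimate.
    prev = [0.5 * h * (f(a) + f(b))]
    for k in range(1, max_levels + 1):
        h /= 2.0
        # Refine the trapezoid estimate using only the new midpoints.
        n_new = 2 ** (k - 1)
        mid_sum = sum(f(a + (2 * i + 1) * h) for i in range(n_new))
        row = [0.5 * prev[0] + h * mid_sum]
        # Richardson extrapolation along the row.
        for j in range(1, k + 1):
            factor = 4.0 ** j
            row.append((factor * row[j - 1] - prev[j - 1]) / (factor - 1.0))
        # Stop once successive diagonal entries agree to the tolerance.
        if k >= 3 and abs(row[-1] - prev[-1]) <= tol * max(1.0, abs(row[-1])):
            return row[-1]
        prev = row
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical smooth integrand, used only to exercise the routine.
    print(romberg(math.exp, 0.0, 1.0))
```

On an interval as wide as [0.000001, 10000.0], a routine of this kind may need many halving levels before it converges, so the tolerance and the maximum number of levels would have to be chosen with the integrand in mind.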