lnreZipf: extended Zipf


lnreZipf (-h -mW -kX -KY -EZ -H -sS) text.spc

The extended Zipf LNRE model.

input

    text.spc: frequency spectrum
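
As an illustration of what a frequency spectrum encodes, the sketch below recovers the token count N and the type count V(N) from the pairs (m, V(m,N)), using N = sum of m * V(m,N) and V(N) = sum of V(m,N). The file layout is an assumption: two whitespace-separated columns with a one-line header "m Vm", mirroring the column names listed under output. The function names are hypothetical and not part of lnreZipf.

```python
def read_spectrum(lines, header=True):
    """Parse (m, Vm) pairs from an iterable of text lines.

    Assumed layout (not confirmed by the manual): two whitespace-separated
    integer columns, optionally preceded by a one-line header.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    if header:
        rows = rows[1:]  # skip the header line, e.g. "m Vm"
    return [(int(m), int(vm)) for m, vm in rows]


def token_and_type_counts(spectrum):
    """Return (N, V): number of tokens and number of types."""
    n_tokens = sum(m * vm for m, vm in spectrum)  # N = sum_m m * V(m,N)
    n_types = sum(vm for _, vm in spectrum)       # V = sum_m V(m,N)
    return n_tokens, n_types


# Toy spectrum: 5 hapax legomena, 3 types seen twice, 1 type seen three times.
sample = ["m Vm", "1 5", "2 3", "3 1"]
spectrum = read_spectrum(sample)
N, V = token_and_type_counts(spectrum)
print(N, V)
```

For a file without a header (the -H case), the same helper would be called with header=False.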

options

    -h: display on-line help

    -mW: number of ranks in fit is set to W (default: 15)

    -kX: number of chunks for interpolation is set to X (default: 20)

    -KY: number of chunks for extrapolation is set to Y (default: 20)

    -EZ: extrapolation sample size is set to Z (default: 2N_0)

    -H: input files do not have a header (default: header is presupposed)

    -sS: calculate only the expected spectrum for S ranks, written to text_Z.fsp
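
The options above can be assembled into a command line programmatically. The following is a minimal sketch, assuming that each value is attached directly to its flag with no space (as the synopsis "-mW -kX ..." suggests) and that the spectrum file comes last. The helper name and its keyword arguments are hypothetical; only the flags themselves come from this page.

```python
def build_lnrezipf_command(spectrum_file, ranks=None, interp_chunks=None,
                           extrap_chunks=None, extrap_size=None,
                           no_header=False, spectrum_only_ranks=None):
    """Build an argument list for lnreZipf (hypothetical helper).

    Options left as None are omitted, so the program's defaults apply.
    """
    args = ["lnreZipf"]
    if ranks is not None:
        args.append(f"-m{ranks}")                # number of ranks in fit
    if interp_chunks is not None:
        args.append(f"-k{interp_chunks}")        # chunks for interpolation
    if extrap_chunks is not None:
        args.append(f"-K{extrap_chunks}")        # chunks for extrapolation
    if extrap_size is not None:
        args.append(f"-E{extrap_size}")          # extrapolation sample size
    if no_header:
        args.append("-H")                        # input has no header
    if spectrum_only_ranks is not None:
        args.append(f"-s{spectrum_only_ranks}")  # expected spectrum only
    args.append(spectrum_file)
    return args


cmd = build_lnrezipf_command("text.spc", ranks=10, no_header=True)
print(" ".join(cmd))
```

If the lnreZipf binary is installed, the resulting list could be handed to subprocess.run; the sketch itself only constructs the arguments and does not execute anything.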

output

    text_Z.spc: observed and expected frequency spectrum

          m: m (frequency)

          Vm: V(m,N) (observed number of types with frequency m at sample size N)

          EVm: E[V(m,N)] (expected number of types with frequency m at sample size N)

    text_Z.fsp: expected frequency spectrum

          m: m (frequency)

          EVm: E[V(m,N)] (expected number of types with frequency m at sample size N)

    text_Z.sp2: expected frequency spectrum at 2N

          m: m (frequency)

          EVm2N: E[V(m,2N)] (expected number of types with frequency m at sample size 2N)

    text_Z.ev2: vocabulary size statistics

          V: V(N) (observed vocabulary size at N)

          EV: E[V(N)] (expected vocabulary size at N)

          EV2N: E[V(2N)] (expected vocabulary size at 2N)

    text_Z.int, text_Z.ext: interpolation and extrapolation statistics

          N: N (number of tokens)

          E[V(N)]: E[V(N)] (expected number of types)

          Alpha1: E[alpha(1)] (E[V(1,N)]/E[V(N)])

          EV1-5: E[V(1-5,N)] (expected spectrum elements)

          GV: E[V(N+1)] - E[V(N)] (token-unit growth rate)

    text_Z.sum: summary statistics and estimated parameters

          N: N (number of tokens)

          V(N): V(N) (observed number of types)

          E[V(N)]: E[V(N)] (expected number of types)

          V(1,N): V(1,N) (observed number of hapax legomena)

          E[V(1,N)]: E[V(1,N)] (expected number of hapax legomena)

          Z: Z (parameter)

          S: S (population number of types)

technical details

The integrals of the extended Zipf model are evaluated by Romberg integration over the interval [0.000001, 10000.0], using the subroutine qromb of Press et al. (1988). Parameters are estimated by downhill simplex minimization, using the subroutine amoeba of Press et al. (1988).
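
To illustrate the integration technique named above, here is a minimal sketch of Romberg integration: successive trapezoid estimates with halved step sizes, combined by Richardson extrapolation. It is the textbook tableau form, not the qromb routine itself, and it does not reproduce the model's integrands. The function name, tolerance, and level limit are assumptions chosen for the example.

```python
import math


def romberg(f, a, b, max_levels=20, tol=1e-10):
    """Approximate the integral of f over [a, b] by Romberg integration."""
    h = b - a
    # Row 0 of the tableau: the one-panel trapezoid estimate.
    prev = [0.5 * h * (f(a) + f(b))]
    for k in range(1, max_levels):
        h /= 2.0
        # Refine the trapezoid rule by evaluating f at the new midpoints only.
        n_new = 2 ** (k - 1)
        midsum = sum(f(a + (2 * i + 1) * h) for i in range(n_new))
        row = [0.5 * prev[0] + h * midsum]
        # Richardson extrapolation: cancel successive error terms in h^2.
        for j in range(1, k + 1):
            factor = 4.0 ** j
            row.append((factor * row[j - 1] - prev[j - 1]) / (factor - 1.0))
        # Stop when the most extrapolated estimates of two rows agree.
        if abs(row[-1] - prev[-1]) <= tol * max(1.0, abs(row[-1])):
            return row[-1]
        prev = row
    return prev[-1]


area = romberg(math.sin, 0.0, math.pi)  # exact value is 2
print(area)
```

Smooth integrands such as the one above converge in a handful of levels. An interval as wide as [0.000001, 10000.0] may need many more levels, or a change of variable, depending on the integrand.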
