mcprofile: developmental profile statistics


mcprofile (-kV -pW -H -aX -tY -cZ -z) text.zvc

Monte Carlo based analysis of the developmental profile.

input

    text.zvc: text in vector format with Zipf ranks

options

    -kV: number of text chunks (default: 20)

    -pW: number of permutation runs (default: 0)

    -aX: Zipf rank of first word to be traced (default: 4)

    -tY: Zipf rank of second word to be traced (default: 1)

    -cZ: confidence level of the Monte Carlo interval, in percent (default: 95)

    -z: use local instead of global p* for Zipf's law

    -H: input files do not have a header

output

    text.mcm: table with Monte Carlo means

    text.mcl: table with the lower limits of the 95% Monte Carlo confidence interval

    text.mch: table with the upper limits of the 95% Monte Carlo confidence interval

    text.mco: table with observed values

    Each table lists the values of 19 measures (columns) for each of the specified text chunks (rows):

          N: N (number of tokens)

          K: K (Yule's characteristic constant)

          D: D (Simpson's diversity index)

          V: V (number of types)

          V1: V(1,N) (number of hapax legomena)

          V2: V(2,N) (number of dis legomena)

          V3: V(3,N) (number of tris legomena)

          V4: V(4,N) (number of types with frequency 4)

          V5: V(5,N) (number of types with frequency 5)

          R: R (Guiraud's constant)

          W: W (Brunet's constant)

          S: S (Sichel's constant)

          H: H (Honoré's constant)

          C: C (Herdan's constant)

          E: E (sample entropy)

          lM: mu (mean log frequency)

          lSt: sigma (standard deviation of log frequency)

          b: b (parameter of Sichel's model)

          c: c (parameter of Sichel's model)

          a1: alpha(1,N) (relative number of hapax legomena)

          Z: Z (parameter of extended Zipf's law)

          fa: frequency of the first traced word (Zipf rank set with -a)

          fthe: frequency of the second traced word (Zipf rank set with -t)

          sLmean: sample mean of lognormal model

          sLstdev: sample standard deviation of lognormal model
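As an illustration of what some of these columns contain, the sketch below computes a handful of the measures from a plain list of word tokens using their usual textbook definitions. It is not taken from the mcprofile source: the function name lexical_measures is hypothetical, the input is an ordinary Python list rather than a .zvc file, and the exact formulas mcprofile uses (for instance the scaling of K) are assumed rather than confirmed.

```python
import math
from collections import Counter

def lexical_measures(tokens):
    """Compute a few of the listed measures (hypothetical helper).

    Assumes at least two tokens and at least two types, so that the
    divisions and logarithms below are well defined.
    """
    freq = Counter(tokens)             # word type -> token frequency
    spectrum = Counter(freq.values())  # m -> V(m,N), number of types with frequency m
    n = len(tokens)                    # N, number of tokens
    v = len(freq)                      # V, number of types
    sum_m2 = sum(m * m * vm for m, vm in spectrum.items())
    return {
        "N": n,
        "V": v,
        "V1": spectrum.get(1, 0),                     # hapax legomena
        "V2": spectrum.get(2, 0),                     # dis legomena
        "K": 1e4 * (sum_m2 - n) / n**2,               # Yule's K (assumed 10^4 scaling)
        "D": sum(vm * (m / n) * ((m - 1) / (n - 1))   # Simpson's D
                 for m, vm in spectrum.items()),
        "R": v / math.sqrt(n),                        # Guiraud's R
        "C": math.log(v) / math.log(n),               # Herdan's C
        "S": spectrum.get(2, 0) / v,                  # Sichel's S
    }

if __name__ == "__main__":
    print(lexical_measures("the cat saw the dog and the cat".split()))
```

In a developmental profile these values would be recomputed at the end of each text chunk, giving one row per chunk.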

technical details

The implementation supports at most 220000 word tokens, 20000 word types, 5000 permutation runs, and 40 text chunks.
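The relationship between the observed, mean, lower, and upper tables can be sketched as follows. This is only one plausible reading of a Monte Carlo developmental profile: the word tokens are randomly permuted, a measure is recomputed at the end of each chunk, and the mean and percentile limits are taken over the runs. The randomization scheme mcprofile actually uses is not described here, all function names are hypothetical, and only the number of types V is traced to keep the example short.

```python
import random

def type_count_profile(tokens, k):
    """Number of types V seen at the end of each of k (nearly) equal chunks.

    Assumes 1 <= k <= len(tokens).
    """
    ends = {len(tokens) * i // k for i in range(1, k + 1)}
    seen, profile = set(), []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i in ends:
            profile.append(len(seen))
    return profile

def monte_carlo_profile(tokens, k=20, runs=1000, conf=95, seed=0):
    """Observed profile plus Monte Carlo mean and percentile limits per chunk."""
    rng = random.Random(seed)
    observed = type_count_profile(tokens, k)
    shuffled = list(tokens)
    samples = [[] for _ in range(k)]           # one list of values per chunk
    for _ in range(runs):
        rng.shuffle(shuffled)                  # random permutation of the tokens
        for j, value in enumerate(type_count_profile(shuffled, k)):
            samples[j].append(value)
    lo_idx = int((100 - conf) / 200 * runs)    # index of the lower percentile
    hi_idx = runs - 1 - lo_idx                 # index of the upper percentile
    means, lower, upper = [], [], []
    for values in samples:
        values.sort()
        means.append(sum(values) / runs)
        lower.append(values[lo_idx])
        upper.append(values[hi_idx])
    return observed, means, lower, upper

if __name__ == "__main__":
    text = "a b a c a b d a e a b c".split()
    obs, mean, low, high = monte_carlo_profile(text, k=4, runs=200)
    for row in zip(obs, mean, low, high):
        print(row)
```

Under this reading, the four return values correspond to the .mco, .mcm, .mcl, and .mch tables, restricted to a single column.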
