Monte Carlo-based dispersion analysis.
input
text.zvc: text in vector format with Zipf ranks
text.spc: the corresponding frequency spectrum
options
-kX: number of text chunks is set to X
-pY: number of permutation runs is set to Y
-sZ: seed for the random number generator is set to Z
-H: input files do not have the standard header
output
text.mcd: list giving, for each word type:
z: the Zipf rank z
Frequency: the frequency f(i,N) of the type in the text
Obs: observed dispersion d_i
Exp: expected dispersion E[d_i] using the binomial model
StDev: the corresponding standard deviation
Z: the corresponding Z-score
MCperc: proportion of simulation runs with dispersion <= d_i.
text.fik: list of word types and their frequencies for each text chunk
technical details
The maximum text length currently implemented is 100000 word tokens, the maximum number of types is 20000, and the maximum number of text chunks is 100.
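
The Obs, Exp and MCperc columns described under "output" can be illustrated with a small Python sketch. It is not the program's own code, and it rests on assumptions that the text above does not state: dispersion d_i is taken to be the number of chunks in which a type occurs, chunks are taken to be of (near-)equal size, each permutation run is taken to be a random shuffle of the token sequence, and ties in frequency are ranked alphabetically. The function names are hypothetical. StDev and Z are left out because their formulas are not given here.

```python
import random
from collections import Counter


def dispersion(tokens, k):
    """Map each type to the number of chunks (out of k) in which it occurs.

    Assumption: dispersion means "number of chunks containing the type".
    """
    n = len(tokens)
    counts = Counter()
    for c in range(k):
        # Chunk c covers tokens[c*n//k : (c+1)*n//k]; sizes differ by at most 1.
        chunk = tokens[c * n // k:(c + 1) * n // k]
        counts.update(set(chunk))
    return counts


def expected_dispersion(f, k):
    """Expected number of chunks containing a type of frequency f.

    Assumption: each of the f tokens falls independently into one of
    k equally likely chunks, so a chunk stays empty with prob. (1 - 1/k)**f.
    """
    return k * (1.0 - (1.0 - 1.0 / k) ** f)


def mc_dispersion(tokens, k, runs, seed):
    """Return rows (z, type, frequency, Obs, Exp, MCperc), one per type."""
    rng = random.Random(seed)          # fixed seed makes runs reproducible
    observed = dispersion(tokens, k)
    freq = Counter(tokens)
    at_or_below = Counter()
    shuffled = list(tokens)
    for _ in range(runs):
        rng.shuffle(shuffled)          # one permutation run
        simulated = dispersion(shuffled, k)
        for w in freq:
            if simulated[w] <= observed[w]:
                at_or_below[w] += 1
    # Rank types by decreasing frequency (ties broken alphabetically).
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (z, w, f, observed[w], expected_dispersion(f, k), at_or_below[w] / runs)
        for z, (w, f) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    text = "a a b c d e f g".split()
    for row in mc_dispersion(text, k=4, runs=1000, seed=1):
        print(row)
```

A low MCperc value marks a type that is spread over fewer chunks than in nearly all permuted versions of the text, that is, a type that is clumped.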