distributions. The obvious test (I think) to use is the Kolmogorov-Smirnov

two sample test (provided in R as the function ks.test in package ctest).

The KS test is for continuous variables and this obviously includes length,

weight etc. However, limitations in measuring (e.g. length to the nearest

cm/mm, weight to the nearest g/mg etc.) have the obvious effect of

"discretising" real data.
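As a minimal sketch of this effect (the lengths below are made-up values for illustration, not real measurements), rounding distinct "true" values to the nearest mm is enough to create ties:

```r
# Hypothetical "true" lengths in mm; these values are invented for illustration
true_len <- c(123.1, 123.4, 122.8, 130.2)

# Recording to the nearest mm, as a ruler would force us to do
recorded <- round(true_len)

# anyDuplicated() returns 0 when all values are distinct,
# otherwise the index of the first duplicated value
ties_before <- anyDuplicated(true_len) > 0
ties_after  <- anyDuplicated(recorded) > 0

ties_before
ties_after
```

Here the unrounded values are all distinct, while three of the four recorded values coincide.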

The ks.test function checks for the presence of ties, noting in the help page

that "continuous distributions do not generate them". Given the problem of

"measuring to the nearest..." noted above, I frequently find that my data have

ties and ks.test generates a warning.

I was interested to note that the example of a two-sample KS test given in

Sokal & Rohlf's "Biometry" (I have the 2nd edition where the example is on

p.441) has exactly the same problem:

A <- c(104,109,112,114,116,118,118,117,121,123,125,126,126,128,128,128)

B <- c(100,105,107,107,108,111,116,120,121,123)

ks.test(A,B)

Two-sample Kolmogorov-Smirnov test

data: A and B

D = 0.475, p-value = 0.1244

alternative hypothesis: two.sided

Warning message:

cannot compute correct p-values with ties in: ks.test(A, B)
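For what it is worth, the statistic itself can be checked by hand. The sketch below (my own check, not how ks.test is implemented internally) compares the two empirical distribution functions at every observed value and lists the values that occur more than once in the pooled sample, which I assume are what trigger the warning. The names pooled, D, tab and tied are just for illustration.

```r
A <- c(104,109,112,114,116,118,118,117,121,123,125,126,126,128,128,128)
B <- c(100,105,107,107,108,111,116,120,121,123)

# Evaluate both empirical CDFs at every distinct observed value
pooled <- sort(unique(c(A, B)))
D <- max(abs(ecdf(A)(pooled) - ecdf(B)(pooled)))
D   # 0.475, agreeing with the ks.test output

# Values that appear more than once in the pooled sample, i.e. the ties
tab  <- table(c(A, B))
tied <- as.numeric(names(tab)[tab > 1])
tied
```

So D is well defined despite the ties; it is only the p-value that the warning calls into question.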

In their chapter 2, "Data in Biology", Sokal & Rohlf note "any given reading

of a continuous variable ... is therefore an approximation to the exact

reading, which is in practice unknowable. However, for the purposes of

computation these approximations are usually sufficient..."

I am interested to know whether this can be made more exact. Are there

methods to test that data are measured at an appropriate scale so as to be

regarded as sufficiently continuous for a KS test, or is a common-sense choice

of measurement precision widely regarded as sufficient?

Any comments/references would be appreciated!

David Middleton
