On Fri, 8 Mar 2002, dechao wang wrote:

> Many thanks to all who have joined the discussion.
>
> From the instructive discussion, it seems there may not be a command
> or function that deals DIRECTLY with the comparison between two items,
> such as
>
> x1<-c(weight1, ...weightn, height1,...heightn)
> x2<-c(weight1, ...weightn, height1,...heightn)

It's not the lack of a command, it's the question of a *method*. There
are, as I said before, fields of statistics dedicated to measuring the
similarity/difference between vectors of measures. Chapters 10 and 11 in
Venables and Ripley's _Modern Applied Statistics with S-Plus_ might be a
good place for you to start. But it's most definitely *not* the right idea
to simply decide that cor() sounds like a nice command so you'll use it,
regardless of whether it has any validity.

> However, this may be quite a common question in the real world. As we
> have already seen, correlation analysis could be used to address this
> issue, except that the resolution rate is not good. According to the
> theory of gray systems, several measures can be taken to increase the
> compatibility of different items which contain different units and
> measurements. Take the trees, for example: after the data were
> normalised, the relation degree between tree1 and tree2 is 0.9997,
> while the relation degree between tree1 and tree3 is 0.4988.

Well, if you'd like to define the term "relation degree" to mean "the
meaningless correlation between the various measures across cases" or
something to that effect, you're free to do so. But you'd need to make a
case that the correlation coefficient between trees is a statistically
appropriate way to measure the degree of similarity between cases, which
is what you're really asking. Given that lots of people have worked lots
of years to develop appropriate methods for measuring similarity between
cases, I suspect your research using this measure would be, er... poorly
received.

> tree1<-c(1, 2, 3, 100, 200, 300)
> tree2<-c(1.1,2.8,3.3, 108, 209, 303)
> tree3<-c(3.8,6.8,5.3, 108, 209, 303)
> trees<-cbind(tree1,tree2,tree3)
> cor(trees,trees)
>          tree1     tree2     tree3
> tree1 1.000000 0.9996549 0.9997620
> tree2 0.999655 1.0000000 0.9999687
> tree3 0.999762 0.9999687 1.0000000
>
> tree1<-c(tree1[1:3]/6.8, tree1[4:6]/303)
> tree2<-c(tree2[1:3]/6.8, tree2[4:6]/303)
> tree3<-c(tree3[1:3]/6.8, tree3[4:6]/303)
> trees<-cbind(tree1,tree2,tree3)
> cor(trees,trees)
>           tree1     tree2     tree3
> tree1 1.0000000 0.9918951 0.4988191
> tree2 0.9918951 1.0000000 0.5806924
> tree3 0.4988191 0.5806924 1.0000000

All you've shown here is that it's possible to calculate a
correlation. But R won't tell you whether it's a good idea or not -- R's
job is to calculate. It assumes you know what you're doing. In this case,
I submit, you do not. The fact that a correlation coefficient can be
calculated does not mean that it says anything at all about the similarity
between cases, which is your real question here. As an example, I give
you Joe and Jane, two rather different individuals; Joe is 75 inches tall
and weighs 90kg. He has two eyes, and is 45 years old. Jane, by contrast,
is only 48 inches tall, and weighs only 57.6kg. She, too, has two eyes, but
is only 32 years old.

people
     Height Weight NrEyes Age
joe      75   90.0      2  45
jane     48   57.6      2  32

Nevertheless, according to your metric, they are very similar:

cor(joe,jane)

[1] 0.998295
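For anyone who wants to reproduce the figure, here is a minimal sketch. The vectors are copied from the people table in column order; the `giant` vector is a hypothetical third case added purely for illustration and is not part of the original example. It shows why the coefficient is uninformative here: correlation ignores scale, so a case that is ten times larger on every measure still comes out as "perfectly similar".

```r
# Values copied from the people table: inches, kg, number of eyes, years
joe  <- c(75, 90.0, 2, 45)
jane <- c(48, 57.6, 2, 32)

# Correlation across the four unrelated measures (about 0.9983)
r_joe_jane <- cor(joe, jane)
print(r_joe_jane)

# Hypothetical case: ten times Joe on every measure (illustration only)
giant <- 10 * joe

# Correlation is unchanged by rescaling, so this is 1 up to rounding
r_joe_giant <- cor(joe, giant)
print(r_joe_giant)

# A plain Euclidean distance does register that the cases differ,
# although it depends on the units chosen for each measure
d_joe_jane  <- sqrt(sum((joe - jane)^2))
d_joe_giant <- sqrt(sum((joe - giant)^2))
print(c(d_joe_jane, d_joe_giant))
```

The high coefficient mostly reflects that the four measures have very different magnitudes in both people, not that the people resemble each other.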

My strong advice is that you give up on trying to use correlation
coefficients to measure the similarity between cases and consider methods
that are actually suited to that task.
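As one possible starting point, and only as a sketch: a common way to compare cases is to put them in rows, standardise each variable, and compute a distance matrix with dist(). The choice of Euclidean distance on standardised variables is an assumption made for illustration, not a recommendation for this particular data set; the values are the original (un-normalised) tree measurements quoted above.

```r
# Cases as rows, measures as columns (original tree values from the thread)
cases <- rbind(
  tree1 = c(1,   2,   3,   100, 200, 300),
  tree2 = c(1.1, 2.8, 3.3, 108, 209, 303),
  tree3 = c(3.8, 6.8, 5.3, 108, 209, 303)
)

# Standardise each measure to mean 0 and sd 1 so that no single
# variable dominates because of its units (an assumed, conventional choice)
z <- scale(cases)

# Pairwise Euclidean distances between cases
d <- dist(z)
print(d)

# Same information as a full symmetric matrix, indexable by case name
dm <- as.matrix(d)
```

With only three cases the standardisation is crude; the point is the shape of the computation (cases in rows, distance between rows), which is also the input expected by clustering functions such as hclust().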

----------------------------------------------------------------------

Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-

r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch

_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._