Grokbase Groups R r-help March 2002
Dear R help users:

I have set up an r-help mailing list archive based on MySQL which supports
full-text search and auto-update.

Please visit http://www.baidao.net/r/maillist/index.cgi . I hope you can
send me bug reports and suggestions.

I will add the r_dev and r_announce mailing lists as soon as possible.

Thanks in advance!


eLan
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


  • Friedrich Leisch at Mar 7, 2002 at 9:12 am

    Wow, this looks great. I'll put links to it on CRAN once you have
    support for the other two lists, too. Please let me know when you're
    ready.

    Thanks a lot for this effort!

    All the best,
    Fritz

    --
    -------------------------------------------------------------------
    Friedrich Leisch
    Institut für Statistik Tel: (+43 1) 58801 10715
    Technische Universität Wien Fax: (+43 1) 58801 10798
    Wiedner Hauptstraße 8-10/1071 Friedrich.Leisch at ci.tuwien.ac.at
    A-1040 Wien, Austria http://www.ci.tuwien.ac.at/~leisch
    -------------------------------------------------------------------

  • Laurent Gautier at Mar 7, 2002 at 11:22 am
    Dear Chen,


    I just tried it too.

    It looks really, really helpful (highlighting the query words in the
    body of the mail is nice)! Thanks!

    Is there a plan to include older archives? (I tried the time constraint
    'all posts', but it seems that it does not go too far into the past.)




    Regards,



    Laurent




    Laurent Gautier CBS, Building 208, DTU
    PhD. Student D-2800 Lyngby,Denmark
    tel: +45 45 25 24 85 http://www.cbs.dtu.dk/laurent
  • Chen Huashan at Mar 7, 2002 at 1:27 pm
    Hi Laurent,

    The archive currently contains messages from 2002-2.
    I will try to add all messages from the three lists as soon as possible.

    btw: how about the server's transfer rate?

    Best wishes

    Chen
  • Dechao wang at Mar 7, 2002 at 10:33 am
    Hi, I have checked statistics textbooks about
    correlations, but I am still not sure about correlation
    analysis with different units. For example,

    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    the unit of the first 3 numbers is cm
    the unit of the last 3 numbers is kg

    cor(x1,x2)=0.999655

    Can I interpret the correlation coefficient as usual, as if
    all numbers had the same unit?

    Secondly, if I keep the three large numbers unchanged
    and change only the three small numbers, the coefficient
    changes little. This means that the variation of the three
    small numbers is hidden by the three larger numbers.
    Is there any solution in R to solve this issue?

    Thanks,

    Dechao

  • Joerg Maeder at Mar 7, 2002 at 1:31 pm
    Hello

    dechao wang wrote:
    > Hi, I have checked statistic textbooks about
    > correlations, but I am still not sure the correlation
    > analysis with different units, for example,
    >
    > x1<-c(1, 2, 3, 100, 200, 300)
    > x2<-c(1.1,2.8,3.3, 108, 209, 303)
    > the unit of the first 3 numbers is cm
    > the unit of the last 3 numbers is kg
    >
    > cor(x1,x2)=0.999655
    >
    > Can I explain the correlation coefficient as normal in
    > which all numbers have the same unit?

    No, that will give different results. The unit must be the same for all
    values. Which unit isn't important, but it must be the same.

    > Secondly, if keep the three large numbers unchanged,
    > just change the three small numbers, the coefficient
    > changes little, this means that the variation of three
    > small numbers is hidden by the three larger numbers.
    > Is there any solution in R to solve this issue?

    If you have a vector with the units, you can use it to bring all values
    to the same unit

    e.g. (for two different units; if there are more it will be more
    complicated)
    xu <- c('m','m','m','cm','cm','cm') #units
    cor(ifelse(xu=='m',100,1)*x1,ifelse(xu=='m',100,1)*x2)
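
    For more than two units, one possible approach is a named lookup
    vector of conversion factors instead of nested ifelse() calls. This is
    only a sketch: the name to_cm and the 'mm' entry are illustrative
    assumptions, not part of the original post.

    ```r
    # Hypothetical conversion table: factor that turns each unit into cm
    to_cm <- c(m = 100, cm = 1, mm = 0.1)

    x1 <- c(1, 2, 3, 100, 200, 300)
    x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)
    xu <- c('m', 'm', 'm', 'cm', 'cm', 'cm')  # unit of each element

    # Indexing the named vector by unit yields one factor per element
    x1_cm <- x1 * unname(to_cm[xu])
    x2_cm <- x2 * unname(to_cm[xu])

    cor(x1_cm, x2_cm)
    ```

    This only makes sense when all elements measure the same kind of
    quantity (here, lengths), so that a conversion factor exists at all.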

    gruess

    joerg
    --
    Joerg Maeder .:|:||:..:.||.:: maeder at atmos.umnw.ethz.ch
    Tel: +41 1 633 36 25 .:|:||:..:.||.::
    http://www.iac.ethz.ch/staff/maeder
    PhD student at INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE (IACETH)
    ETH ZÜRICH Switzerland
  • Andrew Perrin at Mar 7, 2002 at 2:45 pm

    On Thu, 7 Mar 2002, Joerg Maeder wrote:

    > dechao wang wrote:
    > > Can I explain the correlation coefficient as normal in
    > > which all numbers have the same unit?
    >
    > No, that will give different results. The unit must be the same for all
    > values. Which unit isn't important, but it must be the same

    OOPS - I apologize, I misread the question, I understood the OP to be
    saying that x1 was in cm and x2 was in kg.

    What on earth would a correlation mean between two vectors, each of which
    is made up of two entirely different measures? (These aren't just
    different units, they're measures of entirely different phenomena.)


    ----------------------------------------------------------------------
    Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    Assistant Professor of Sociology, U of North Carolina, Chapel Hill
    269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA




  • Andrew Perrin at Mar 7, 2002 at 2:07 pm

    On Thu, 7 Mar 2002, dechao wang wrote:

    > Hi, I have checked statistic textbooks about
    > correlations, but I am still not sure the correlation
    > analysis with different units, for example,
    >
    > x1<-c(1, 2, 3, 100, 200, 300)
    > x2<-c(1.1,2.8,3.3, 108, 209, 303)
    > the unit of the first 3 numbers is cm
    > the unit of the last 3 numbers is kg
    >
    > cor(x1,x2)=0.999655
    >
    > Can I explain the correlation coefficient as normal in
    > which all numbers have the same unit?

    I don't think the correlation depends on the units; it's a ratio, not an
    absolute. Consider the case of converting the centimeters into meters:
    x1m<-x1 / 100
    cor(x1m,x2)
    [1] 0.999655

    The correlation doesn't change.

    > Secondly, if keep the three large numbers unchanged,
    > just change the three small numbers, the coefficient
    > changes little, this means that the variation of three
    > small numbers is hidden by the three larger numbers.
    > Is there any solution in R to solve this issue?

    I'm not sure what you mean by "hidden"; in your case, the correlations
    between the vectors are similar for both first and second halves:
    cor(x1[4:6],x2[4:6])
    [1] 0.9997853
    cor(x1[1:3],x2[1:3])
    [1] 0.953821

    so removing either half isn't going to change the result much.
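
    The unit-invariance point can be checked a little more generally. The
    sketch below assumes nothing beyond base R; the scale factor 2.54 and
    the shift 10 are arbitrary illustrative numbers.

    ```r
    x1 <- c(1, 2, 3, 100, 200, 300)
    x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)

    r <- cor(x1, x2)

    # Rescaling one vector (cm -> m) leaves the coefficient unchanged
    r_m <- cor(x1 / 100, x2)

    # So does any positive scale factor combined with a shift
    r_shift <- cor(2.54 * x1 + 10, x2)

    # A negative factor only flips the sign
    r_neg <- cor(-x1, x2)

    all.equal(r, r_m)       # TRUE
    all.equal(r, r_shift)   # TRUE
    all.equal(r, -r_neg)    # TRUE
    ```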


    ----------------------------------------------------------------------
    Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    Assistant Professor of Sociology, U of North Carolina, Chapel Hill
    269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA




  • Dechao wang at Mar 7, 2002 at 3:06 pm
    Thanks Andrew,

    Consider the following example:
    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    x3<-c(2.8,3.8,5.3, 108, 209, 303)
    cor(x1,x2)
    [1] 0.999655
    cor(x1,x3)
    [1] 0.9997286

    You can see that when x2 is changed to x3, with only the first
    three numbers changing, the coefficients (x1,x2) and
    (x1,x3) differ very little. I thought this may be because
    the last three numbers were in different units.

    Consider another example:
    y1<-c(1, 2, 3, 4, 5, 6)
    y2<-c(1.1,2.8,3.3, 4.4, 5.5, 6.6)
    y3<-c(2.8,3.8,5.3, 4.5, 5.5, 6.6)
    cor(y1,y2)
    [1] 0.9934715
    cor(y1,y3)
    [1] 0.9254707

    You can see that the coefficients (y1,y2) and (y1,y3)
    differ when the first three numbers change.
    From the two examples, we can see that the resolution
    of compatibility between items that contain different
    units is lower (as shown in the first example) than
    that of items that are all on the same
    scale (as shown in example 2).

    The result of the first example is not what we want,
    is it? So I think it would be better to pre-process
    data that contain different units before
    regression analysis. I do not think it is difficult to
    write R code to do that. My question is: does a
    command already exist to do that, before I write the code?




  • Andrew Perrin at Mar 7, 2002 at 3:12 pm

    On Thu, 7 Mar 2002, dechao wang wrote:

    > Thanks Andrew,
    >
    > Consider the following example:
    > x1<-c(1, 2, 3, 100, 200, 300)
    > x2<-c(1.1,2.8,3.3, 108, 209, 303)
    > x3<-c(2.8,3.8,5.3, 108, 209, 303)
    > cor(x1,x2)
    > [1] 0.999655
    > cor(x1,x3)
    > [1] 0.9997286
    >
    > You can see that as x2 changed to x3 with only first
    > three numbers changing, the coefficients (x1, x2) and
    > (x1,x3) changed little. I thought this may be because
    > the last three numbers were in different units.

    It's not because they're different units -- it's because they're different
    measures altogether! Can you state, in words (e.g., not in mathematical
    terms) what you think a correlation would *mean* between these two
    vectors? R is happily telling you, as any statistical package would, what
    the correlation is between two vectors of numbers. But that correlation
    doesn't necessarily mean anything at all; its meaning is based on what the
    vectors measure.

    ----------------------------------------------------------------------
    Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    Assistant Professor of Sociology, U of North Carolina, Chapel Hill
    269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA




  • Dechao wang at Mar 7, 2002 at 3:29 pm
    --- Andrew Perrin wrote:
    > It's not because they're different units -- it's because they're different
    > measures altogether! Can you state, in words (e.g., not in mathematical
    > terms) what you think a correlation would *mean* between these two
    > vectors? R is happily telling you, as any statistical package would, what
    > the correlation is between two vectors of numbers. But that correlation
    > doesn't necessarily mean anything at all; its meaning is based on what the
    > vectors measure.

    There are lots of examples. Let us consider the first
    three numbers representing three branches of an apple
    tree, the last three numbers representing the
    corresponding branching angles of the branches. So x1,
    x2, x3 represents three different trees. Maybe we can
    ask which tree is similar to which tree?

  • Andrew Perrin at Mar 7, 2002 at 3:41 pm

    On Thu, 7 Mar 2002, dechao wang wrote:

    > There are lots of examples. Let us consider the first
    > three numbers representing three branches of an apple
    > tree, the last three numbers representing the
    > corresponding branching angles of the branches. So x1,
    > x2, x3 represents three different trees. Maybe we can
    > ask which tree is similar to which tree?

    In which case you probably shouldn't be storing the data in vectors
    (although you can), but you certainly shouldn't be using correlations to
    measure similarity among vectors where each vector represents one unit of
    analysis. There are various ways of classifying the "similarity" among
    vectors (indeed, Brian Ripley of Venables and Ripley fame is an expert in
    this field) but correlation is not one of them.

    You could ask, in your example, whether the length of a branch is
    correlated with its angle; in that case, you'd want something like:
    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    x3<-c(2.8,3.8,5.3, 108, 209, 303)
    # one row per tree, one column per measurement
    x.df<-as.data.frame(t(data.frame(x1,x2,x3)))
    colnames(x.df)<-c('l1','l2','l3','a1','a2','a3')
    attach(x.df)
    cor(l1,a1)

    which returns:
    [1] 0.5421936

    or the correlation between length 1 (l1) and angle 1 (a1). That's a
    suitable (although not very sophisticated) use of correlation. But
    measuring the correlation between cases using different measures is not a
    useful, or even meaningful, exercise, IMNSHO.
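
    For the "which tree is similar to which tree" question, one common
    alternative to correlation is a distance between cases. The sketch
    below is only one option among many, not a method named in this
    thread: it assumes Euclidean distance on column-standardized data,
    and the object name trees is illustrative.

    ```r
    x1 <- c(1, 2, 3, 100, 200, 300)
    x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)
    x3 <- c(2.8, 3.8, 5.3, 108, 209, 303)

    # One row per tree, one column per measurement
    trees <- rbind(x1, x2, x3)
    colnames(trees) <- c('l1', 'l2', 'l3', 'a1', 'a2', 'a3')

    # Standardize each column so that lengths and angles contribute on a
    # comparable scale, then compute Euclidean distances between trees
    d <- dist(scale(trees))
    d
    ```

    With only three trees the standardization is very rough; the point is
    the layout (cases in rows, measures in columns), not the numbers.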


    ----------------------------------------------------------------------
    Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    Assistant Professor of Sociology, U of North Carolina, Chapel Hill
    269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA




  • Scott, Uriel at Mar 7, 2002 at 3:12 pm
    Whether the two variables have the same units does not matter. Moreover,
    even if there were some way of converting cm to kg, the correlation would
    still be the same, because the correlation is invariant under unit conversion,
    just as it is invariant under multiplication of its arguments by a positive
    constant.

    As for your second question, the correlation estimator is a continuous
    function of each of the individual data points, so perturbing the values of
    any of them by a sufficiently small amount will only perturb the correlation
    by a small amount.
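
    The continuity argument can be seen numerically. This is a minimal
    sketch using the vectors from the original question; the perturbation
    sizes in eps are arbitrary illustrative values.

    ```r
    x1 <- c(1, 2, 3, 100, 200, 300)
    x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)

    r0 <- cor(x1, x2)

    # Perturb the first element of x2 by shrinking amounts
    eps <- c(1, 0.1, 0.01, 0.001)
    delta <- sapply(eps, function(e) {
      x2p <- x2
      x2p[1] <- x2p[1] + e
      abs(cor(x1, x2p) - r0)
    })

    # The change in the coefficient shrinks along with the perturbation
    data.frame(eps, delta)
    ```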
  • Setzer Woodrow at Mar 7, 2002 at 3:16 pm
    Perhaps I've led a sheltered life, but my own experience leads me to
    question the logic behind an analysis that leads me to want to compute
    correlations between vectors in which the elements have different units;
    cm and kg are not generally interconvertible!

    R. Woodrow Setzer, Jr.              Phone: (919) 541-0128
    Experimental Toxicology Division    Fax: (919) 541-5394
    Pharmacokinetics Branch
    NHEERL MD-74; US EPA; RTP, NC 27711



  • Michaell Taylor at Mar 7, 2002 at 6:05 pm
    Uh???

    A good deal of this thread leaves me perplexed.

    Of course you can correlate vectors of differing units. Correlations
    are covariances expressed in a standardized unit, i.e. differing units
    are the reason for correlation coefficients in the first place.
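
    As a quick check of that relationship, the sketch below recomputes
    the coefficient from the covariance and the two standard deviations,
    using the vectors from the original question (r_manual is just an
    illustrative name).

    ```r
    x1 <- c(1, 2, 3, 100, 200, 300)
    x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)

    # Covariance carries the units of both variables
    cov(x1, x2)

    # Dividing by both standard deviations removes them
    r_manual <- cov(x1, x2) / (sd(x1) * sd(x2))

    all.equal(r_manual, cor(x1, x2))   # TRUE
    ```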

    Of course you can correlate measures of different phenomena, e.g.
    economic growth is correlated with the percentage of voters voting for the
    incumbent in the next election. Correlation of two different measures
    of the same phenomenon is called a test of reliability.

    Of course you can correlate cm and kg. I would be perfectly comfortable
    stating that a person's weight in kg is correlated with their height in
    cm. Anyone disagree?

    Obviously one has to be careful in extracting substantive meaning from
    correlations - just like every statistic that I can think of.

    In terms of the big number/small number thing: the major source of your
    observed correlations is there being a set of small numbers
    and a set of big numbers. Think of these things as points on a graph.
    In your example,
    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    x3<-c(2.8,3.8,5.3, 108, 209, 303)
    cor(x1,x2)
    [1] 0.999655
    cor(x1,x3)
    [1] 0.9997286
    the minor fluctuations in these series among observations 1, 2, 3 and
    among 4, 5, 6 are totally dwarfed by the difference between 3 and 4. It is
    this jump between (3,3.3) and (100,108) which drives your correlations.
    Comparatively, the other changes are a wash.
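
    One way to see how much the jump dominates is to deliberately break
    the relationship inside the small group. In this sketch x2_rev is a
    made-up variant of x2 with its first three values reversed; it is not
    data from the thread.

    ```r
    x1 <- c(1, 2, 3, 100, 200, 300)
    x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)

    # Reverse the order of the three small values only
    x2_rev <- c(3.3, 2.8, 1.1, 108, 209, 303)

    # Within the small group the relationship is now negative...
    cor(x1[1:3], x2_rev[1:3])

    # ...but the overall coefficient barely moves, because the gap between
    # the small and the large values dominates the variance
    cor(x1, x2)
    cor(x1, x2_rev)
    ```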


    ============
    Michaell Taylor
    Senior Economist, Reis, New York, USA
    Associate Professor, NTNU, Norway
    Adjunct Professor, UD, South Africa


  • Andrew Perrin at Mar 7, 2002 at 7:14 pm

    On 7 Mar 2002, Michaell Taylor wrote:
    Uh???

    A good deal of this thread leaves me perplexed.

    Of course you can correlate vectors of differing units. Correlations
    are covariances expressed in a standardized unit. I.e. differing units
    is the reason for correlation coefficients in the first place.
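
    The claim that a correlation is a covariance expressed in standardized
    units can be checked directly. The following is a minimal sketch in R;
    the height and weight values and the variable names are made up for
    illustration and do not come from the thread.

```r
# Hypothetical measurements for five people (illustrative values only)
height_cm <- c(160, 165, 172, 180, 188)
weight_kg <- c(55, 62, 70, 79, 90)

# Pearson correlation as computed by R
r_builtin <- cor(height_cm, weight_kg)

# The same quantity by hand: covariance divided by both standard deviations.
# The unit cm*kg in the numerator cancels against cm and kg in the denominator.
r_by_hand <- cov(height_cm, weight_kg) / (sd(height_cm) * sd(weight_kg))

print(c(builtin = r_builtin, by_hand = r_by_hand))
```

    Both values agree, which is why the units of the two variables drop
    out of the coefficient.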

    Of course you can correlate measures of different phenomena - e.g.
    economic growth is correlated with the percentage of voters voting for the
    incumbent in the next election. Correlation of two different measures
    of the same phenomenon is called a test of reliability.

    Of course you can correlate cm and kg. I would be perfectly comfortable
    stating that a person's weight in kg is correlated with their height in
    cm. Anyone disagree?

    You're missing something basic, which is what I missed too when the OP
    first posted. He's not correlating two variables, one of which is in cm
    and one of which is in kg. He's correlating two *vectors* of six variables
    each; three of these variables are in cm and three are in kg. So he's
    treating a *case* (in his example, an apple tree) as a variable, and
    asking for the correlation between two cases (apple trees).
    Obviously one has to be careful in extracting substantive meaning from
    correlations - just like every statistic that I can think of.

    In terms of the big number, small number thing: the major source of your
    observed correlations is there being a set of small numbers
    and a set of big numbers. Think of these things as points on a graph.
    In your example,
    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    x3<-c(2.8,3.8,5.3, 108, 209, 303)
    cor(x1,x2)
    [1] 0.999655
    cor(x1,x3)
    [1] 0.9997286
    The minor fluctuations in these series within observations 1, 2, 3 and
    within 4, 5, 6 are totally dwarfed by the difference between observations
    3 and 4. It is this jump between (3, 3.3) and (100, 108) that drives your
    correlations. Comparatively, the other changes are a wash.
    ... and the reason for these jumps is that the "small" numbers (the first
    three in each vector) are centimeters, while the "large" numbers (the
    latter three) are kilograms. That's the essence of the problem, and the
    reason why the very exercise is inappropriate.
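
    One way to see how the jump between the two blocks dominates is to
    compare the correlation within each block with the pooled correlation.
    The sketch below borrows the reversed vector from Woodrow Setzer's
    follow-up in this thread; the names x3_rev, small and large are
    introduced here only for illustration.

```r
x1     <- c(1, 2, 3, 100, 200, 300)
x3_rev <- c(3, 2, 1, 108, 209, 303)   # first three values run in the opposite direction

small <- 1:3   # the "cm" block
large <- 4:6   # the "kg" block

within_small <- cor(x1[small], x3_rev[small])   # perfectly negative relationship
within_large <- cor(x1[large], x3_rev[large])
pooled       <- cor(x1, x3_rev)                 # all six values together

print(c(within_small = within_small,
        within_large = within_large,
        pooled       = pooled))
```

    Inside the first block the two vectors move in opposite directions, yet
    the pooled coefficient stays close to 1 because it mostly reflects the
    gap between the small and the large numbers.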


    ----------------------------------------------------------------------
    Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    Assistant Professor of Sociology, U of North Carolina, Chapel Hill
    269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA



  • Dechao wang at Mar 8, 2002 at 11:08 am
    Many thanks to all who have joined the discussion.
    From this instructive discussion, it seems there may be no command or
    function that deals DIRECTLY with the comparison between two items such as
    x1<-c(weight1, ...weightn, height1,...heightn)
    x2<-c(weight1, ...weightn, height1,...heightn)

    However, this may be quite a common question in the real world. As we
    have already seen, correlation analysis could be used to address this
    issue, except that its resolution is poor. According to the theory of
    gray systems, several measures can be taken to increase the
    compatibility of items that contain different units and measurements.
    Take the trees as an example: after the data were normalised, the
    relation degree between tree1 and tree2 is 0.9919, while the relation
    degree between tree1 and tree3 is 0.4988.

    tree1<-c(1, 2, 3, 100, 200, 300)
    tree2<-c(1.1,2.8,3.3, 108, 209, 303)
    tree3<-c(3.8,6.8,5.3, 108, 209, 303)
    trees<-cbind(tree1,tree2,tree3)
    cor(trees,trees)
             tree1     tree2     tree3
    tree1 1.000000 0.9996549 0.9997620
    tree2 0.999655 1.0000000 0.9999687
    tree3 0.999762 0.9999687 1.0000000
    tree1<-c(tree1[1:3]/6.8, tree1[4:6]/303)
    tree2<-c(tree2[1:3]/6.8, tree2[4:6]/303)
    tree3<-c(tree3[1:3]/6.8, tree3[4:6]/303)
    trees<-cbind(tree1,tree2,tree3)
    cor(trees,trees)
              tree1     tree2     tree3
    tree1 1.0000000 0.9918951 0.4988191
    tree2 0.9918951 1.0000000 0.5806924
    tree3 0.4988191 0.5806924 1.0000000


  • Andrew Perrin at Mar 8, 2002 at 2:22 pm

    On Fri, 8 Mar 2002, dechao wang wrote:

    Many thanks to all who have joined the discussion.
    From this instructive discussion, it seems there may be no command or
    function that deals DIRECTLY with the comparison between two items such as
    x1<-c(weight1, ...weightn, height1,...heightn)
    x2<-c(weight1, ...weightn, height1,...heightn)
    It's not the lack of a command, it's the question of a *method*. There
    are, as I said before, fields of statistics dedicated to measuring the
    similarity/difference between vectors of measures. Chapters 10 and 11 in
    Venables and Ripley's _Modern Applied Statistics with S-Plus_ might be a
    good place for you to start. But it's most definitely *not* the right idea
    to simply decide that cor() sounds like a nice command so you'll use it,
    regardless of whether it has any validity.
    However, this may be quite a common question in the real world. As we
    have already seen, correlation analysis could be used to address this
    issue, except that its resolution is poor. According to the theory of
    gray systems, several measures can be taken to increase the
    compatibility of items that contain different units and measurements.
    Take the trees as an example: after the data were normalised, the
    relation degree between tree1 and tree2 is 0.9919, while the relation
    degree between tree1 and tree3 is 0.4988.
    Well, if you'd like to define the term "relation degree" to mean "the
    meaningless correlation between the various measures across cases" or
    something to that effect, you're free to do so. But you'd need to make a
    case that the correlation coefficient between trees is a statistically
    appropriate way to measure the degree of similarity between cases, which
    is what you're really asking. Given that lots of people have worked lots
    of years to develop appropriate methods for measuring similarity between
    cases, I suspect your research using this measure would be, er... poorly
    received.
    tree1<-c(1, 2, 3, 100, 200, 300)
    tree2<-c(1.1,2.8,3.3, 108, 209, 303)
    tree3<-c(3.8,6.8,5.3, 108, 209, 303)
    trees<-cbind(tree1,tree2,tree3)
    cor(trees,trees)
             tree1     tree2     tree3
    tree1 1.000000 0.9996549 0.9997620
    tree2 0.999655 1.0000000 0.9999687
    tree3 0.999762 0.9999687 1.0000000
    tree1<-c(tree1[1:3]/6.8, tree1[4:6]/303)
    tree2<-c(tree2[1:3]/6.8, tree2[4:6]/303)
    tree3<-c(tree3[1:3]/6.8, tree3[4:6]/303)
    trees<-cbind(tree1,tree2,tree3)
    cor(trees,trees)
              tree1     tree2     tree3
    tree1 1.0000000 0.9918951 0.4988191
    tree2 0.9918951 1.0000000 0.5806924
    tree3 0.4988191 0.5806924 1.0000000
    All you've shown here is that it's possible to calculate a
    correlation. But R won't tell you whether it's a good idea or not -- R's
    job is to calculate. It assumes you know what you're doing. In this case,
    I submit, you do not. The fact that a correlation coefficient can be
    calculated does not mean that it says anything at all about the similarity
    between cases, which is your real question here. As an example, I give
    you Joe and Jane, two rather different individuals. Joe is 75 inches tall
    and weighs 90kg. He has two eyes, and is 45 years old. Jane, by contrast,
    is only 48 inches tall, and weighs only 57.6kg. She, too, has two eyes, but
    is only 32 years old.
    people
         Height Weight NrEyes Age
    joe      75   90.0      2  45
    jane     48   57.6      2  32

    Nevertheless, according to your metric, they are very similar:
    cor(joe,jane)
    [1] 0.998295
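
    For anyone who wants to reproduce the number above, here is a
    self-contained sketch. It assumes that joe and jane are the two rows of
    the people table taken as numeric vectors; how they were actually
    created is not shown in the message, so the construction below is an
    assumption.

```r
# Rebuild the table from the values quoted in the message
people <- data.frame(
  Height = c(75, 48),     # inches
  Weight = c(90, 57.6),   # kg
  NrEyes = c(2, 2),
  Age    = c(45, 32),     # years
  row.names = c("joe", "jane")
)

# Treat each person (a case) as if it were a variable, as in the thread
joe  <- unlist(people["joe", ])
jane <- unlist(people["jane", ])

r <- cor(joe, jane)
print(round(r, 6))
```

    The coefficient is close to 1 only because both rows share the same
    pattern of large and small entries across unrelated measurements.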

    My strong advice is that you give up on trying to use correlation
    coefficients to measure the similarity between cases and consider methods
    that are actually suited to that task.
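
    As one hedged illustration of a method aimed at similarity between
    cases, the sketch below puts the three trees in rows, standardizes each
    measurement across trees, and computes Euclidean distances between the
    rows. This is only one of many possible dissimilarity measures, it is
    not something proposed in the message above, and with three cases it is
    purely illustrative. The column names are invented.

```r
# Cases (trees) in rows, measurements in columns
trees <- rbind(
  tree1 = c(1,   2,   3,   100, 200, 300),
  tree2 = c(1.1, 2.8, 3.3, 108, 209, 303),
  tree3 = c(3.8, 6.8, 5.3, 108, 209, 303)
)
colnames(trees) <- c("m1", "m2", "m3", "m4", "m5", "m6")  # hypothetical names

# Standardize every column so that no measurement dominates
# merely because of the unit it happens to be recorded in
z <- scale(trees)

# Pairwise Euclidean distances between the standardized cases
d <- dist(z)
print(round(as.matrix(d), 2))
```

    Smaller distances indicate more similar trees. Whether this particular
    distance is appropriate depends on the substantive question.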

    ----------------------------------------------------------------------
    Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    Assistant Professor of Sociology, U of North Carolina, Chapel Hill
    269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA



  • Allan Strand at Mar 19, 2002 at 1:07 pm
    Hi all,

    I have developed a routine to classify observations based upon
    clustering. In my current case there are 5 classes, so the data at
    the end of the classification look like this:

    obs class
    1 2
    2 2
    3 1
    4 4
    5 4
    6 3
    7 5
    8 5
    . .
    . .

    I always know the numbers of classes a priori. I wanted to see how
    well my approach is performing so I wrote a simulation to generate
    observations in a fairly realistic manner. I then run the simulated
    observations through my scheme. The "known" simulated data have the
    same form as the results of the classification, but the class
    identifiers may differ. In other words, a class of observations may be
    constructed correctly by my approach, but the "name" of the class may
    change.

    I would like to compare the results of my scheme to the "known"
    simulated classes and assess its error rate. As a start, I would just
    like to know the number of observations that were mis-classified. No
    doubt this is a brain-dead question to those who work in this field,
    but this is my first foray into such analyses. Ultimately I was
    wondering if there is an R package that performs such analyses out of
    the box or if anyone who does these kinds of analyses routinely has a
    code snippet I could use as an example.

    Cheers,
    a.
    --
    Allan Strand, Biology http://linum.cofc.edu
    College of Charleston Ph. (843) 953-8085
    Charleston, SC 29424 Fax (843) 953-5453

  • Evgenia Dimitriadou at Mar 19, 2002 at 1:32 pm
    If I understood right, you want to match the class labels to the
    clustering labels. Try matchClasses in library(e1071).
    best,
    -e

    ************************************************************************
    * Evgenia Dimitriadou *
    ************************************************************************
    * Institut für Statistik * Tel: (+43 1) 58801 10773 *
    * Technische Universität Wien * Fax: (+43 1) 58801 10798 *
    * Wiedner Hauptstr. 8-10/1071 * Evgenia.Dimitriadou at ci.tuwien.ac.at *
    * A-1040 Wien, Austria * http://www.ci.tuwien.ac.at/~dimi*
    ************************************************************************
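
    To illustrate the idea of matching class labels to cluster labels, here
    is a base-R sketch that does not call matchClasses() itself. It
    cross-tabulates known and assigned classes and tries every relabelling
    to find the one with the most agreement. The data, the helper perms()
    and all variable names are hypothetical, and the brute-force search is
    only practical for a small number of classes such as the five
    mentioned in the question.

```r
# Hypothetical known classes and assigned clusters (same partition idea,
# but the cluster "names" differ from the true class names)
truth <- c(1, 1, 2, 2, 3, 3, 3, 1)
pred  <- c(2, 2, 3, 3, 1, 1, 2, 2)
k     <- 3

# Cross-tabulation: rows are true classes, columns are assigned clusters
tab <- table(factor(truth, levels = 1:k), factor(pred, levels = 1:k))

# All permutations of a vector (illustrative helper, fine for small k)
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# For each one-to-one mapping class i -> cluster p[i], count agreements
best <- 0
for (p in perms(seq_len(k))) {
  agree <- sum(tab[cbind(seq_len(k), p)])
  best  <- max(best, agree)
}

misclassified <- length(truth) - best
print(misclassified)   # 1 for the data above
```

    With the best relabelling, only the seventh observation ends up in the
    wrong class.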


  • David Meyer at Mar 19, 2002 at 3:26 pm
    Allan Strand wrote:

    Apart from matchClasses(), you might look at classAgreement(), also in
    e1071.

    g.
    -d

    --
    Mag. David Meyer Wiedner Hauptstrasse 8-10
    Vienna University of Technology A-1040 Vienna/AUSTRIA
    Department for Tel.: (+431) 58801/10772
    Statistics and Probability Theory mail: david.meyer at ci.tuwien.ac.at
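
    Agreement measures that ignore the class names altogether are another
    option. As I understand its documentation, classAgreement() reports
    such measures, including the Rand index. The sketch below computes a
    plain Rand index in base R rather than calling e1071. The function
    name rand_index and the data are hypothetical.

```r
# Rand index: share of observation pairs on which two partitions agree,
# i.e. both put the pair together or both put it apart
rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs  <- upper.tri(same_a)        # each unordered pair once
  mean(same_a[pairs] == same_b[pairs])
}

truth      <- c(1, 1, 2, 2, 3, 3, 3, 1)
pred       <- c(2, 2, 3, 3, 1, 1, 2, 2)
relabelled <- c(5, 5, 9, 9, 7, 7, 5, 5)   # same partition as pred, other names

print(rand_index(truth, pred))
print(rand_index(truth, relabelled))      # identical to the line above
```

    Because only pair memberships are compared, renaming the clusters does
    not change the result.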
  • Allan Strand at Mar 20, 2002 at 1:41 pm
    R-help comes through again. I had several responses to my question
    yesterday (all within an hour of posting). I have found two similar
    solutions to my problem. The first was suggested by Ron Wehrens and
    is found in his CompClus package (http://www-cac.sci.kun.nl/cac/).
    The other came from David Meyer and is found below.

    Thanks everyone,
    a.

    David Meyer <david.meyer@ci.tuwien.ac.at> writes:
    Allan Strand wrote:

    Apart from matchClasses(), you might look at classAgreement(), also in
    e1071.

    g.
    -d
    Hi all,

    I have developed a routine to classify observations based upon
    clustering. In my current case there are 5 classes, so the data at
    the end of the classification look like this:
    snip
    I would like to compare the results of my scheme to the "known"
    simulated classes and assess its error rate. As a start, I would just
    like to know the number of observations that were mis-classified. No
    doubt this is a brain-dead question to those who work in this field,
    but this is my first foray into such analyses. Ultimately I was
    wondering if there is an R package that performs such analyses out of
    the box or if anyone who does these kinds of analyses routinely has a
    code snippet I could use as an example.
    snip
    --
    Mag. David Meyer Wiedner Hauptstrasse 8-10
    Vienna University of Technology A-1040 Vienna/AUSTRIA
    Department for Tel.: (+431) 58801/10772
    Statistics and Probability Theory mail: david.meyer at ci.tuwien.ac.at
    --
    Allan Strand, Biology http://linum.cofc.edu
    College of Charleston Ph. (843) 953-8085
    Charleston, SC 29424 Fax (843) 953-5453
  • Setzer Woodrow at Mar 7, 2002 at 4:01 pm
    To follow up, consider the vector x3 <- c(3,2,1,108, 209, 303) with the
    same units as before.
    cor(x1,x3)
    [1] 0.9995864

    Now express the first three values as microns instead of cm:
    x4 <- x3
    x4[1:3] <- 10000 * x4[1:3]
    cor(x1,x4)
    [1] -0.7461934

    Just changing the units changes the whole sense of the correlation.

    R. Woodrow Setzer, Jr.                Phone: (919) 541-0128
    Experimental Toxicology Division      Fax: (919) 541-5394
    Pharmacokinetics Branch
    NHEERL MD-74; US EPA; RTP, NC 27711



    From: Andrew Perrin <andrew_perrin at unc.edu>
    Sent by: owner-r-help at stat.math.ethz.ch
    To: Joerg Maeder <maeder@atmos.umnw.ethz.ch>
    cc: dechao wang <dechwang@yahoo.co.uk>, "'R-help at lists.R-project.org'" <r-help@stat.math.ethz.ch>
    Subject: Re: [R] linear correlation?
    Date: 03/07/02 09:45 AM
    Please respond to andrew_perrin





    On Thu, 7 Mar 2002, Joerg Maeder wrote:

    Hello

    dechao wang wrote:
    Hi, I have checked statistics textbooks about
    correlations, but I am still not sure how to do correlation
    analysis with different units. For example,

    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    the unit of the first 3 numbers is cm
    the unit of the last 3 numbers is kg

    cor(x1,x2)=0.999655

    Can I interpret the correlation coefficient as usual,
    as if all numbers had the same unit?
    No, that will give different results. The unit must be the same for all
    values. Which unit isn't important, but it must be the same
    OOPS - I apologize, I misread the question, I understood the OP to be
    saying that x1 was in cm and x2 was in kg.

    What on earth would a correlation mean between two vectors, each of
    which is made up of two entirely different measures? (These aren't just
    different units, they're measures of entirely different phenomena.)


    ----------------------------------------------------------------------
    Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    Assistant Professor of Sociology, U of North Carolina, Chapel Hill
    269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA




  • David O. Nelson at Mar 7, 2002 at 6:11 pm
    I keep on my bulletin board, in plain view of clients at all times, a paper
    by David Brillinger with the title "Does Anyone Know When the Correlation
    Coefficient is Useful? A Study of the Times of Extreme River Flows"
    (Technometrics, 2001). In it he quotes Tukey as follows:

    "I frequently hold to the position that correlation coefficients are
    justified in two and only two circumstances, when they are regression
    coefficients, or when the measurement of one or both variables on a
    determinate scale is hopeless."

    Neither seems to apply here.

    David O Nelson, Ph.D. (daven at llnl.gov)
    Lawrence Livermore National Laboratory
    Box 808, L-441
    Livermore CA 94551

    ph: +1.925.423.8898
    fax: +1.925.422.2282
    -----Original Message-----
    From: owner-r-help at stat.math.ethz.ch
    [mailto:owner-r-help at stat.math.ethz.ch]On Behalf Of
    Setzer.Woodrow at epamail.epa.gov
    Sent: Thursday, March 07, 2002 8:02 AM
    To: dechao wang
    Cc: 'R-help at lists.R-project.org'
    Subject: Re: [R] linear correlation?



    To follow up, consider the vector x3 <- c(3,2,1,108, 209, 303) with the
    same units as before.
    cor(x1,x3)
    [1] 0.9995864

    Now express the first three values as microns instead of cm:
    x4 <- x3
    x4[1:3] <- 10000 * x4[1:3]
    cor(x1,x4)
    [1] -0.7461934

    Just changing the units changes the whole sense of the correlation.

    R. Woodrow Setzer, Jr.                Phone: (919) 541-0128
    Experimental Toxicology Division      Fax: (919) 541-5394
    Pharmacokinetics Branch
    NHEERL MD-74; US EPA; RTP, NC 27711
  • Scott, Uriel at Mar 7, 2002 at 5:28 pm
    Sorry, I also misread your original question and thought x1 was in cm and x2
    in kg.

    I don't think it makes any sense for some values of x1 (or x2) to be in cm
    and others in kg. How can they represent samples from the same population?
    It would be okay if, say, some were in cm and others in km as they are
    equivalent units, and you could simply convert to the same unit, but
    otherwise I don't see how some members of a population are in cm and others
    in kg.
    -----Original Message-----
    From: Scott, Uriel [SMTP:uriel.scott at mirant.com]
    Sent: Thursday, March 07, 2002 10:12 AM
    To: 'dechao wang'; r-help at stat.math.ethz.ch
    Subject: RE: [R] linear correlation?


    Whether the two variables have the same units does not matter. Moreover,
    even if there were some way of converting cm to kg the correlation would
    still be the same, because the correlation is invariant under unit
    conversion: it is invariant under multiplication of either argument by a
    positive constant.

    As for your second question, the correlation estimator is a continuous
    function of each of the individual data points, so perturbing the values
    of any of them by a sufficiently small amount will only perturb the
    correlation by a small amount.
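
    The invariance described above holds when a whole vector is rescaled,
    not when only part of it is. A short sketch with the vectors from the
    original question; the factor 10000 and the name x2_mixed are chosen
    here for illustration.

```r
x1 <- c(1, 2, 3, 100, 200, 300)
x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)

r_original <- cor(x1, x2)

# Rescaling the entire vector (a genuine unit conversion) changes nothing
r_rescaled <- cor(x1, 10000 * x2)

# Rescaling only the first three entries is not a unit conversion of the
# variable, and the coefficient is no longer protected by invariance
x2_mixed      <- x2
x2_mixed[1:3] <- 10000 * x2_mixed[1:3]
r_mixed       <- cor(x1, x2_mixed)

print(c(original = r_original, rescaled = r_rescaled, mixed = r_mixed))
```

    The first two values coincide, while the third differs markedly, which
    is the situation that arises when one vector mixes two kinds of
    measurement.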
    -----Original Message-----
    From: dechao wang [SMTP:dechwang at yahoo.co.uk]
    Sent: Thursday, March 07, 2002 5:34 AM
    To: r-help at stat.math.ethz.ch
    Subject: [R] linear correlation?

    Hi, I have checked statistics textbooks about
    correlations, but I am still not sure how to do correlation
    analysis with different units. For example,

    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    the unit of the first 3 numbers is cm
    the unit of the last 3 numbers is kg

    cor(x1,x2)=0.999655

    Can I interpret the correlation coefficient as usual,
    as if all numbers had the same unit?

    Secondly, if I keep the three large numbers unchanged
    and change only the three small numbers, the coefficient
    changes little. This means that the variation of the three
    small numbers is hidden by the three larger numbers.
    Is there any way in R to address this issue?

    Thanks,

    Dechao

  • Scott, Uriel at Mar 7, 2002 at 5:37 pm
    If you have 3 numbers a1, a2, a3 representing the lengths of 3 branches and
    b1, b2, b3 representing their branching angles then this should be
    represented as { (a1,b1), (a2,b2), (a3,b3) } rather than { a1, a2, a3, b1,
    b2, b3 }. I.e., you have 3 samples from a bivariate random variable
    (length, angle), not 6 samples from a univariate RV.
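
    In R terms, the layout suggested above is a table with one row per
    branch and one column per variable. A minimal sketch using the values
    of the second vector from the thread, on the assumption that its first
    three numbers are branch lengths and its last three the matching
    angles; the column names are invented.

```r
# One row per branch, one column per measured variable
tree2 <- data.frame(
  length = c(1.1, 2.8, 3.3),   # branch lengths
  angle  = c(108, 209, 303)    # corresponding branching angles
)

# Three bivariate observations, so this correlation has a clear meaning:
# do longer branches tend to have larger angles within this tree?
r <- cor(tree2$length, tree2$angle)
print(r)
```

    With only three observations the estimate is very imprecise, but at
    least it relates two variables measured on the same units of analysis.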
    -----Original Message-----
    From: dechao wang [SMTP:dechwang at yahoo.co.uk]
    Sent: Thursday, March 07, 2002 10:29 AM
    To: andrew_perrin at unc.edu
    Cc: r-help at stat.math.ethz.ch
    Subject: Re: [R] linear correlation?

    --- Andrew Perrin wrote:
    On Thu, 7 Mar 2002, dechao wang wrote:
    Thanks Andrew,

    Consider the following example:
    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    x3<-c(2.8,3.8,5.3, 108, 209, 303)
    cor(x1,x2)
    [1] 0.999655
    cor(x1,x3)
    [1] 0.9997286

    You can see that as x2 changed to x3, with only the first
    three numbers changing, the coefficients (x1, x2) and
    (x1,x3) changed little. I thought this may be because
    the last three numbers were in different units.

    It's not because they're different units -- it's because they're
    different measures altogether! Can you state, in words (i.e., not in
    mathematical terms) what you think a correlation would *mean* between
    these two vectors? R is happily telling you, as any statistical package
    would, what the correlation is between two vectors of numbers. But that
    correlation doesn't necessarily mean anything at all; its meaning is
    based on what the vectors measure.

    There are lots of examples. Let us consider the first
    three numbers representing three branches of an apple
    tree, the last three numbers representing the
    corresponding branching angles of the branches. So x1,
    x2, x3 represent three different trees. Maybe we can
    ask which tree is similar to which tree?

  • Liu Chunhua at Mar 8, 2002 at 5:02 pm
    What if we measure the height (x1 in cm) and weight (x2 in kg) of
    a sample of people from some population? It seems to me that it makes
    sense to compute the correlation between x1 and x2.

    Charlie Liu,
    Intern at ECO/EPA.

    From: "Scott, Uriel" <uriel.scott at mirant.com>
    Sent by: owner-r-help at stat.math.ethz.ch
    Date: 03/07/02 12:28 PM
    To: "Scott, Uriel" <uriel.scott@mirant.com>, 'dechao wang' <dechwang@yahoo.co.uk>, r-help at stat.math.ethz.ch
    cc:
    Subject: RE: [R] linear correlation?

    Sorry, I also misread your original question and thought x1 was in cm
    and x2 in kg.

    I don't think it makes any sense for some values of x1 (or x2) to be in
    cm and others in kg. How can they represent samples from the same
    population? It would be okay if, say, some were in cm and others in km,
    as they are equivalent units and you could simply convert to the same
    unit, but otherwise I don't see how some members of a population are in
    cm and others in kg.
    -----Original Message-----
    From: Scott, Uriel [SMTP:uriel.scott at mirant.com]
    Sent: Thursday, March 07, 2002 10:12 AM
    To: 'dechao wang'; r-help at stat.math.ethz.ch
    Subject: RE: [R] linear correlation?


    Whether the two variables have the same units does not matter. Moreover,
    even if there were some way of converting cm to kg the correlation would
    still be the same because the correlation is invariant under unit
    conversion, as it is invariant under multiplication of its arguments by
    a constant.

    As for your second question, the correlation estimator is a continuous
    function of each of the individual data points, so perturbing the values
    of any of them by a sufficiently small amount will only perturb the
    correlation by a small amount.
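
Both claims above can be checked numerically. This is a minimal sketch using the vectors from the original question and base R's `cor`; the scaling factors `100` and `2.5` and the perturbation size `eps` are arbitrary illustrative choices, not values from the thread. Note that the invariance holds for multiplication by a positive constant; a negative constant flips the sign of the coefficient.

```r
x1 <- c(1, 2, 3, 100, 200, 300)
x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)

r <- cor(x1, x2)

# Unit conversion is multiplication by a positive constant (e.g. cm -> m).
# The factors below are arbitrary and only illustrate the invariance.
r_scaled <- cor(x1 / 100, x2 * 2.5)

# A negative factor flips the sign but keeps the magnitude
r_negated <- cor(-x1, x2)

# Perturb a single data point by a small amount (eps is an assumption)
eps <- 1e-3
x2_perturbed <- x2
x2_perturbed[1] <- x2_perturbed[1] + eps
r_perturbed <- cor(x1, x2_perturbed)

print(all.equal(r, r_scaled))    # same coefficient after rescaling
print(all.equal(r, -r_negated))  # same magnitude, opposite sign
print(abs(r_perturbed - r))      # small change for a small perturbation
```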
    -----Original Message-----
    From: dechao wang [SMTP:dechwang at yahoo.co.uk]
    Sent: Thursday, March 07, 2002 5:34 AM
    To: r-help at stat.math.ethz.ch
    Subject: [R] linear correlation?

    Hi, I have checked statistics textbooks about
    correlations, but I am still not sure about correlation
    analysis with different units, for example,

    x1<-c(1, 2, 3, 100, 200, 300)
    x2<-c(1.1,2.8,3.3, 108, 209, 303)
    the unit of the first 3 numbers is cm
    the unit of the last 3 numbers is kg

    cor(x1,x2)=0.999655

    Can I interpret the correlation coefficient as usual,
    as if all numbers had the same unit?

    Secondly, if I keep the three large numbers unchanged
    and change only the three small numbers, the coefficient
    changes little. This means that the variation of the three
    small numbers is hidden by the three larger numbers.
    Is there any way in R to address this issue?

    Thanks,

    Dechao

  • Andrew Perrin at Mar 9, 2002 at 7:44 pm
    All I can say is, please read the whole thread. He's not correlating two
    variables across cases, as you assume in your message. He's correlating
    two cases across variables, and getting meaningless (but attractively
    high) numbers.
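
The distinction between the two kinds of correlation can be made concrete with a data layout. The sketch below is only an illustration: it arranges the three vectors from the thread as rows of a matrix (one row per tree, following dechao's description), and the row and column labels are hypothetical names introduced here. Base R's `cor` applied to a matrix correlates its columns, so transposing the matrix switches between the two readings.

```r
x1 <- c(1, 2, 3, 100, 200, 300)
x2 <- c(1.1, 2.8, 3.3, 108, 209, 303)
x3 <- c(2.8, 3.8, 5.3, 108, 209, 303)

# One row per case (tree), one column per variable.
# Row and column names are hypothetical labels for illustration.
trees <- rbind(tree1 = x1, tree2 = x2, tree3 = x3)
colnames(trees) <- c("branch1", "branch2", "branch3",
                     "angle1", "angle2", "angle3")

# Variables across cases: cor() on a matrix correlates its columns
by_variable <- cor(trees)     # 6 x 6 matrix

# Cases across variables: correlate the rows instead
by_case <- cor(t(trees))      # 3 x 3 matrix

print(dim(by_variable))
print(dim(by_case))

# The thread's cor(x1, x2) is one entry of the case-by-case matrix
print(all.equal(by_case["tree1", "tree2"], cor(x1, x2)))
```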

    ap

    ---------------------------------------------------------
    Andrew J. Perrin - Assistant Professor of Sociology
    University of North Carolina, Chapel Hill
    269 Hamilton Hall CB#3210, Chapel Hill, NC 27599-3210 USA
    andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
    On Fri, 8 Mar 2002 Liu.Chunhua at epamail.epa.gov wrote:


    What if we measure the height (x1 in cm) and weight (x2 in kg) of
    a sample of people from some population? It seems to me that it makes
    sense to compute the correlation between x1 and x2.

  • Chen Huashan at Mar 9, 2002 at 5:51 pm
    Dear R help users:

    All posts to the r-help list before 2001-12-28 have been added to the database!
    The address is: http://www.baidao.net/r/maillist/archive/index.cgi

    The old address (http://www.baidao.net/r/maillist/index.cgi) is still being tested.

    The other two lists will be added once the r-help list archive is considered stable enough.

    All the best!

    Chen Huashan
  • Chen Huashan at Apr 10, 2002 at 5:06 am
    Dear R users:

    The searchable list archives (R Help, Dev, Announce) are here:

    http://www.baidao.net/r/archives/index.cgi

    The database will be updated monthly.

    The R Help archive (http://www.baidao.net/r/maillist/index.cgi) is updated
    every day and contains posts from the last 3 months.

    Special thanks to Patrick, Laurent, Friedrich!




    Chen Huashan


  • Friedrich Leisch at Apr 10, 2002 at 7:14 am

    On Wed, 10 Apr 2002 13:06:10 +0800,
    Chen Huashan (CH) wrote:
    Dear R users:
    The searchable list archives (R Help, Dev, Announce) are here:
    http://www.baidao.net/r/archives/index.cgi
    The database will be updated monthly.
    The R Help archive (http://www.baidao.net/r/maillist/index.cgi) is updated
    every day and contains posts from the last 3 months.
    Special thanks to Patrick, Laurent, Friedrich!

    Thanks a lot!

    Could you create a single page with links to the two archives
    explaining them (basically a web page with the text of your mail)? I
    would then link from CRAN to that page.

    Best,
    Fritz

    --
    -------------------------------------------------------------------
    Friedrich Leisch
    Institut für Statistik Tel: (+43 1) 58801 10715
    Technische Universität Wien Fax: (+43 1) 58801 10798
    Wiedner Hauptstraße 8-10/1071 Friedrich.Leisch at ci.tuwien.ac.at
    A-1040 Wien, Austria http://www.ci.tuwien.ac.at/~leisch
    -------------------------------------------------------------------

  • Chen Huashan at Apr 13, 2002 at 8:19 am
    Hi Fritz,

    I merged those two URLs yesterday to make things easier for
    most people. The URL is now http://www.baidao.net/r/archives/index.cgi .

    The archive now contains two parts:
    1, Monthly updates from CRAN, covering all 3 lists.
    2, Daily updates of the R Help list. (I will add the other 2 lists if necessary.)

    I noticed that there was an argument about a newsgroup. I think this
    archive is suitable for those who don't want to receive many emails, since
    everyone has a browser and can read every new post in the archive.

    I am planning to add these features:
    1, Registered users can mark threads as personal favorites for later use.
    2, Registered users can receive replies by email to threads they are interested in,
    so they don't have to receive all posts from the list.

    Best regards!

    Chen Huashan

    -----Original Message-----
    Thanks a lot!

    Could you create a single page with links to the two archives
    explaining them (basically a web page with the text of your mail)? I
    would then link from CRAN to that page.

    Best,
    Fritz
