Grokbase Groups R r-help August 2006
Hello,
I am using logistic discriminant analysis to check whether a known
classification Yobs can be predicted from a few continuous variables X.

What I do is predict class probabilities with multinom() from the nnet
package, obtain a predicted classification Ypred, and then compute the
percentage P(obs) of objects classified the same in Yobs and Ypred.

My problem now is to figure out whether P(obs) is significantly higher than
chance.

I opted for a crude permutation approach: compute P(perm) over 10000 random
permutations of Yobs (i.e., refit the multinom() model 10000 times randomly
permuting Yobs) and consider P(obs) as significantly higher than chance if
higher than the 95th percentile of the P(perm) distribution.
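
A minimal sketch of the permutation scheme described above, for readers who want to try it. The data are simulated (an assumption, since the original data were not posted), the helper name accuracy() is hypothetical, and the number of permutations is cut from 10000 to 200 so the sketch runs quickly.

```r
library(nnet)  # provides multinom()

set.seed(1)

# Simulated stand-in data (assumption: three balanced classes, two predictors)
n    <- 150
Yobs <- factor(rep(c("a", "b", "c"), each = n / 3))
X    <- data.frame(x1 = rnorm(n, mean = 2 * as.integer(Yobs)),
                   x2 = rnorm(n))

# Hypothetical helper: fit multinom() and return the share of objects whose
# predicted class equals the label the model was fitted to
accuracy <- function(y, X) {
  d    <- data.frame(y = y, X)
  fit  <- multinom(y ~ ., data = d, trace = FALSE)
  pred <- predict(fit, newdata = d, type = "class")
  mean(as.character(pred) == as.character(y))
}

P_obs  <- accuracy(Yobs, X)

# Refit the model on randomly permuted labels (10000 times in the post)
n_perm <- 200
P_perm <- replicate(n_perm, accuracy(sample(Yobs), X))

# Decision rule from the post: compare P(obs) with the 95th percentile
threshold <- unname(quantile(P_perm, 0.95))

# Permutation p-value, counting the observed value as one permutation
p_value <- (1 + sum(P_perm >= P_obs)) / (1 + n_perm)

cat("P(obs) =", P_obs, " 95th percentile =", threshold,
    " p =", p_value, "\n")
```

Note that accuracy() is evaluated on the same data used for fitting, as in the post, so both P(obs) and P(perm) are optimistic.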

Now, the problem is that the mode of P(perm) is always really close to
P(obs); e.g., if P(obs)=1 (perfect discrimination), the most likely
P(perm) value is also 1!

I figured out that this is due to the fact that, with my data, randomly
permuted classifications are highly likely to strongly agree with the
observed classification Yobs, but, probably since my machine learning
background is almost 0, I am kind of lost about how to proceed at this
point.
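
One possible way to quantify this agreement, assuming it stems from unequal class sizes (the post does not give the class frequencies, so the counts below are made up): under a random permutation, the expected proportion of objects that keep their original label is the sum of the squared class proportions.

```r
set.seed(1)

# Hypothetical, strongly unbalanced class sizes (not from the post)
Yobs <- factor(rep(c("a", "b", "c"), times = c(90, 6, 4)))
p    <- as.numeric(prop.table(table(Yobs)))

# Expected agreement between Yobs and a random permutation of Yobs
expected_agreement <- sum(p^2)

# Check by simulation: average agreement over many random permutations
simulated_agreement <- mean(replicate(2000, mean(sample(Yobs) == Yobs)))

# Accuracy of a rule that always predicts the largest class
majority_rate <- max(p)

cat("expected:", expected_agreement,
    " simulated:", simulated_agreement,
    " majority rule:", majority_rate, "\n")
```

With class sizes this unequal, a model that carries no information about the classes can still reach a high percentage of matching labels, which is why a raw percentage is hard to interpret without a chance baseline.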

I would greatly appreciate a comment on this.

Thanks
Bruno

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Bruno L. Giordano, Ph.D.
CIRMMT
Schulich School of Music, McGill University
555 Sherbrooke Street West
Montréal, QC H3A 1E3
Canada
http://www.music.mcgill.ca/~bruno/

  • Bruno L. Giordano at Aug 11, 2006 at 3:07 am
    Well,
    If posting a possible solution to one's own problem is not part of the
    netiquette of this list please correct me.

    Following Titus et al. (1984) one might use Cohen's kappa to have a
    chance-corrected measure of agreement between the original and reproduced
    classification:

    Kappa() in package vcd
    kappa2() in package irr
    ckappa() in package psy
    cohen.kappa() in package concord......
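
For reference, Cohen's kappa can also be computed directly from the cross-table of observed and predicted classes as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected from the marginal frequencies. The sketch below uses base R only; cohen_kappa() is a hypothetical helper written for illustration, not one of the package functions listed above, and the example labels are made up.

```r
# Hypothetical helper: unweighted Cohen's kappa from two label vectors
cohen_kappa <- function(obs, pred) {
  lev <- union(levels(factor(obs)), levels(factor(pred)))
  tab <- table(factor(obs, levels = lev), factor(pred, levels = lev))
  n   <- sum(tab)
  p_o <- sum(diag(tab)) / n                      # observed agreement
  p_e <- sum(rowSums(tab) * colSums(tab)) / n^2  # chance agreement
  (p_o - p_e) / (1 - p_e)
}

# Made-up example: 90 objects in class "a", 10 in class "b"
Yobs <- rep(c("a", "b"), times = c(90, 10))

# A rule that always predicts "a" matches 90% of the labels ...
Ypred_majority <- rep("a", 100)
raw_agreement  <- mean(Yobs == Ypred_majority)

# ... but its chance-corrected agreement is zero
kappa_majority <- cohen_kappa(Yobs, Ypred_majority)

# Perfect prediction gives kappa = 1
kappa_perfect <- cohen_kappa(Yobs, Yobs)
```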

    Bruno

    Titus, K., Mosher, J. A., & Williams, B. K. (1984). Chance-corrected
    classification for use in discriminant analysis: Ecological applications.
    American Midland Naturalist, 111(1), 1-7.


    ----- Original Message -----
    From: "Bruno L. Giordano" <[email protected]>
    To: <[email protected]>
    Sent: Thursday, August 10, 2006 6:18 PM
    Subject: [R] logistic discrimination: which chance performance??

    ______________________________________________
    R-help at stat.math.ethz.ch mailing list
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide
    http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.
  • Frank E Harrell Jr at Aug 11, 2006 at 3:33 pm

    Bruno L. Giordano wrote:
    > My problem now is to figure out whether P(obs) is significantly higher
    > than chance.

    The most powerful approach, and one that is automatically corrected for
    chance, is to use the likelihood ratio test for the global null
    hypothesis for the whole model.
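
A minimal sketch of such a global likelihood ratio test for a multinom() fit, comparing the full model with an intercept-only model. The data are simulated and the variable names are assumptions made for illustration.

```r
library(nnet)  # provides multinom()

set.seed(1)

# Simulated stand-in data (assumption: three classes, two predictors)
n <- 150
d <- data.frame(Yobs = factor(rep(c("a", "b", "c"), each = n / 3)))
d$x1 <- rnorm(n, mean = as.integer(d$Yobs))
d$x2 <- rnorm(n)

full <- multinom(Yobs ~ x1 + x2, data = d, trace = FALSE)
null <- multinom(Yobs ~ 1,       data = d, trace = FALSE)

# Likelihood ratio statistic: drop in deviance (-2 log-likelihood)
lr_stat <- null$deviance - full$deviance

# Degrees of freedom: difference in the number of estimated parameters
df <- full$edf - null$edf

# Global null hypothesis: none of the predictors is related to the classes
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)

cat("LR chi-square =", lr_stat, " df =", df, " p =", p_value, "\n")
```

The same comparison should also be available through anova(null, full), which nnet implements for multinom fits.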

    With classification proportions you not only lose power and have trouble
    correcting for chance, but you have arbitrariness in what constitutes a
    positive prediction.

    Frank Harrell

    --
    Frank E Harrell Jr, Professor and Chair
    Department of Biostatistics, School of Medicine, Vanderbilt University
