Grokbase Groups R r-help June 2016
FAQ
Dear list,


I am using the lrm function from the rms package to estimate a logistic
model with weights. The c-statistic (or area under the curve) is part of
the lrm output.


To understand how the weights enter the computation of the c-statistics, I
looked at the script of lrm and lrm.fit but I am out of luck because it is
making a call to a Fortran routine and I don't know Fortran.


     z <- .Fortran("lrmfit", coef = initial, nx, 1:nx, x,
             y, offset, u = double(nvi), double(nvi * (nvi + 1)),
             double(1), n, nx, sumw, nvi, v = double(nvi * nvi),
             double(nvi), double(2 * nvi), double(nvi), integer(nvi),
             opts = opts, ftable, penmat, weights, PACKAGE = "rms")




Can somebody help me figure out how the weights from the regression are
used in the computation of the c-statistic? Here is a small example that
shows that the c-statistic computed from the rms package and using the pROC
packages are not the same (not even close) when calculated from a weighted
logistic regression.


set.seed(1233)
x <- rnorm(100)
w <- runif(100)
y <- rbinom(100, 1, .5)
require(rms)
# unweighted model
umod <- lrm(y~x)
umod$stat # c-statistic is 0.5776796
# weighted model
wmod <- lrm(y~x, weight = w)
wmod$stat # c-statistic is 0.65625
# using pROC
require(pROC)
umod2 <- glm(y~x, family = binomial)
auc(y, predict(umod2)) # 0.5769
wmod2 <- glm(y~x, weights = w, family = binomial)
auc(y, predict(wmod2)) # 0.5769


BTW results from umod and umod2 and from wmod and wmod2 are identical so
the discrepancy in c-statistics in not due to using lrm vs. glm.


Best regards,
MP


  [[alternative HTML version deleted]]

Search Discussions

  • David Winsemius at Jun 16, 2016 at 2:49 pm

    On Jun 15, 2016, at 6:44 PM, Marie-Pierre Sylvestre wrote:

    Dear list,

    I am using the lrm function from the rms package to estimate a logistic
    model with weights. The c-statistic (or area under the curve) is part of
    the lrm output.

    To understand how the weights enter the computation of the c-statistics, I
    looked at the script of lrm and lrm.fit but I am out of luck because it is
    making a call to a Fortran routine and I don't know Fortran.

    z <- .Fortran("lrmfit", coef = initial, nx, 1:nx, x,
    y, offset, u = double(nvi), double(nvi * (nvi + 1)),
    double(1), n, nx, sumw, nvi, v = double(nvi * nvi),
    double(nvi), double(2 * nvi), double(nvi), integer(nvi),
    opts = opts, ftable, penmat, weights, PACKAGE = "rms")


    Can somebody help me figure out how the weights from the regression are
    used in the computation of the c-statistic? Here is a small example that
    shows that the c-statistic computed from the rms package and using the pROC
    packages are not the same (not even close) when calculated from a weighted
    logistic regression.

    set.seed(1233)
    x <- rnorm(100)
    w <- runif(100)
    y <- rbinom(100, 1, .5)
    require(rms)
    # unweighted model
    umod <- lrm(y~x)
    umod$stat # c-statistic is 0.5776796
    # weighted model
    wmod <- lrm(y~x, weight = w)
    wmod$stat # c-statistic is 0.65625
    # using pROC
    require(pROC)
    umod2 <- glm(y~x, family = binomial)
    auc(y, predict(umod2)) # 0.5769
    wmod2 <- glm(y~x, weights = w, family = binomial)
    auc(y, predict(wmod2)) # 0.5769

    BTW results from umod and umod2 and from wmod and wmod2 are identical so
    the discrepancy in c-statistics in not due to using lrm vs. glm.

    Your output appears to imply that the `auc` function is ignoring the weights, whereas the lrm function is honoring them. The `lrm` documentation implies these are handled as possibly fractional case weights.


    --
    David.
    Best regards,
    MP

    [[alternative HTML version deleted]]

    ______________________________________________
    R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.

    David Winsemius
    Alameda, CA, USA

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupr-help @
categoriesr
postedJun 16, '16 at 1:44a
activeJun 16, '16 at 2:49p
posts2
users2
websiter-project.org
irc#r

People

Translate

site design / logo © 2017 Grokbase