[R] relative risk regression with survey data

I have been asked to look at options for doing relative risk regression
on some survey data. I have a binary DV and several predictor /
adjustment variables. In R, would this be as "simple" as using the
survey package to set up an appropriate design object and then running
svyglm with family=binomial(log)? Any other suggestions for covariate
adjustment of relative risk estimates? Any and all suggestions welcomed.
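
For concreteness, here is a minimal sketch of the kind of call I have in
mind (the design variables psu, stratum, pweight and the model variables
y, x1, x2 in mydata are placeholders):

  library(survey)
  ## survey design object with clusters, strata, and sampling weights
  des <- svydesign(ids = ~psu, strata = ~stratum, weights = ~pweight,
                   data = mydata)
  ## log-binomial fit, as asked (the replies suggest quasibinomial(log) instead)
  fit <- svyglm(y ~ x1 + x2, design = des, family = binomial(log))
  exp(coef(fit))   # exponentiated slopes are adjusted relative risks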

Dan

--
Daniel Nordlund
Bothell, WA USA


  • Thomas Lumley at Sep 14, 2010 at 2:40 am

    If the fitted values don't get too close to 1, then svyglm(..., family = quasibinomial(log)) will do it.

    The log-binomial model is very non-robust when the fitted values get close to 1, and there is some controversy over the best approach. You can still use svyglm(..., family = quasibinomial(log)), but you will probably need to set the number of iterations much higher (perhaps 200).

    Alternatively, you can use nonlinear least squares [svyglm(..., family = gaussian(log))] or other quasilikelihood approaches, such as family = quasipoisson(log). These are all consistent for the same parameter if the model is correctly specified and are much more robust to x-outliers. I rather like nonlinear least squares, because it's easy to explain.
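
    A minimal sketch of these options, with placeholder names (des, mydata,
    y, x1, x2); maxit is handed on to the underlying glm call, and the log
    link usually needs explicit starting values to converge:

      library(survey)
      des <- svydesign(ids = ~psu, strata = ~stratum, weights = ~pweight,
                       data = mydata)

      ## log-binomial model, with more iterations and starting values
      ## (intercept at the log of the overall proportion, slopes at zero)
      fit_qb <- svyglm(y ~ x1 + x2, design = des,
                       family = quasibinomial(log), maxit = 200,
                       start = c(log(mean(mydata$y)), 0, 0))

      ## more robust alternatives that estimate the same log-risk parameters
      fit_qp <- svyglm(y ~ x1 + x2, design = des, family = quasipoisson(log))
      fit_ls <- svyglm(y ~ x1 + x2, design = des, family = gaussian(log),
                       start = c(log(mean(mydata$y)), 0, 0))

      exp(coef(fit_qb))   # adjusted relative risks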

    -thomas


    Thomas Lumley
    Professor of Biostatistics
    University of Washington, Seattle
  • Daniel Nordlund at Sep 14, 2010 at 3:45 am
    Thanks to Thomas Lumley and David Winsemius for their responses. I
    have read a number of papers by Thomas and have ordered his book on
    survey analysis, but I wanted some confirmation so that I could get
    started before the book arrives. Thanks again.

    Dan

    Daniel Nordlund
    Bothell, WA USA
  • Ravi Varadhan at Sep 15, 2010 at 1:37 pm
    Dear Thomas,

    You said, "the log-binomial model is very non-robust when the fitted values
    get close to 1, and there is some controversy over the best approach."
    Could you please point me to a paper that discusses the issues?

    I have written some code to do maximum likelihood estimation for
    relative, additive, and mixed risk regression models with a binomial
    outcome, and I have been able to obtain good convergence. I have used
    the bootstrap to get standard errors. However, I am not sure whether
    these standard errors are valid when the fitted values are close to 0
    or 1. It seems to me that when the fitted probabilities are close to 0
    or 1, there is not a good way to estimate standard errors.
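
    A rough, generic sketch of this kind of bootstrap, for an ordinary
    log-binomial relative-risk fit on simulated placeholder data (dat, y,
    x1, x2 are made up for illustration):

      ## simulated data with a known log-risk model
      set.seed(1)
      n <- 500
      dat <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.4))
      dat$y <- rbinom(n, 1, pmin(0.9, exp(-1.5 + 0.3 * dat$x1 + 0.4 * dat$x2)))

      ## log-binomial ML fit, with starting values to help convergence
      fit_rr <- function(d) {
        coef(glm(y ~ x1 + x2, family = binomial(log), data = d,
                 start = c(log(mean(d$y)), 0, 0), maxit = 200))
      }

      ## nonparametric bootstrap of the coefficients
      B <- 500
      boot_coefs <- replicate(B, {
        b <- dat[sample(nrow(dat), replace = TRUE), ]
        tryCatch(fit_rr(b), error = function(e) rep(NA_real_, 3))
      })
      apply(boot_coefs, 1, sd, na.rm = TRUE)   # bootstrap standard errors

    With survey data, the resampling would have to respect the sampling
    design (for example via replicate weights in the survey package)
    rather than the simple case resampling above.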


    Thanks,
    Ravi.

  • Thomas Lumley at Sep 15, 2010 at 3:57 pm

    There's a technical report at
    http://www.bepress.com/uwbiostat/paper293/
    with simulations, some theory, and references. It's under review at the moment, after being forgotten for a few years.

    The distribution of the parameter estimates when the true parameter is on the boundary of the parameter space is a separate mess.
    Theoretically, it is the intersection of the multivariate Normal with
    the parameter space, and if the parameter space has a piecewise-linear
    boundary, the log likelihood ratio has a chi-squared mixture
    distribution. In practice, if there isn't a hard edge to the covariate
    distribution, it's not going to be easy to get a good approximation to
    the distribution of the parameter estimates. As an example of the
    complications, the sampling distributions for fixed and random design
    matrices can be very different, because a random design matrix means
    that the estimated edge of the parameter space moves from one
    realization to another.
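
    For reference, the boundary result here is the classical
    chi-bar-squared limit (Chernoff 1954; Self and Liang 1987): when the
    true parameter sits on a boundary that is locally a convex cone, the
    likelihood ratio statistic satisfies

      -2 \log \Lambda \;\xrightarrow{d}\; \bar{\chi}^2
        \;=\; \sum_{j=0}^{k} w_j \, \chi^2_j,
      \qquad w_j \ge 0, \quad \sum_{j=0}^{k} w_j = 1,

    a mixture of chi-squared distributions (with \chi^2_0 a point mass at
    zero) whose weights depend on the local geometry of the cone.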

    -thomas

    Thomas Lumley
    Professor of Biostatistics
    University of Washington, Seattle
