Dear Thomas,

You said, "the log-binomial model is very non-robust when the fitted values

get close to 1, and there is some controversy over the best approach."

Could you please point me to a paper that discusses the issues?

I have written some code to do maximum likelihood estimation for relative,

additive, and mixed risk regression models with a binomial model. I have been

able to obtain good convergence. I have used the bootstrap to get standard

errors. However, I am not sure if these standard errors are valid when

the fitted values are close to 0 or 1. It seems to me that when the fitted

probabilities are close to 0 or 1, there is not a good way to estimate

standard errors.

Thanks,

Ravi.

-----Original Message-----

From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On

Behalf Of Thomas Lumley

Sent: Monday, September 13, 2010 10:41 PM

To: Daniel Nordlund

Cc: r-help at r-project.org

Subject: Re: [R] relative risk regression with survey data

On Mon, 13 Sep 2010, Daniel Nordlund wrote:

> I have been asked to look at options for doing relative risk regression on

> some survey data. I have a binary DV and several predictor / adjustment

> variables. In R, would this be as "simple" as using the survey package to

> set up an appropriate design object and then running svyglm with

> family=binomial(log) ? Any other suggestions for covariate adjustment of

> relative risk estimates? Any and all suggestions welcomed.

If the fitted values don't get too close to 1 then

svyglm( ,family=quasibinomial(log)) will do it.

The log-binomial model is very non-robust when the fitted values get close

to 1, and there is some controversy over the best approach. You can still

use svyglm( ,family=quasibinomial(log)) but you will probably need to set

the number of iterations much higher (perhaps 200).
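
A minimal sketch of the quasibinomial(log) fit described above, under stated assumptions: the data are simulated, and the names d, x, y, w, des and the starting values are hypothetical, not taken from the survey in this thread. The control argument is passed through to glm to raise the iteration limit, and start is supplied so that the first fitted values are valid probabilities.

```r
# Sketch only: simulated data, not the survey discussed in this thread.
library(survey)

set.seed(1)
n <- 2000
d <- data.frame(x = rbinom(n, 1, 0.5))        # hypothetical binary exposure
d$y <- rbinom(n, 1, exp(-1.5 + 0.4 * d$x))    # true risks of about 0.22 and 0.33
d$w <- runif(n, 1, 3)                         # made-up sampling weights

# Simple weighted design; a real survey would also declare clusters and strata.
des <- svydesign(ids = ~1, weights = ~w, data = d)

fit <- svyglm(y ~ x, design = des,
              family = quasibinomial(link = "log"),
              start = c(-1, 0),                     # valid start: exp(-1) < 1
              control = glm.control(maxit = 200))   # raise the iteration limit

rr <- unname(exp(coef(fit)["x"]))   # estimated relative risk for x
rr
exp(confint(fit))   # design-based intervals; the intercept row is the baseline risk
```

With a single binary covariate the model is saturated, so the estimated relative risk should match the ratio of the weighted proportions in the two groups, which gives an easy sanity check.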

Alternatively, you can use nonlinear least squares

[svyglm( ,family=gaussian(log))] or other quasilikelihood approaches, such as

family=quasipoisson(log). These are all consistent for the same parameter

if the model is correctly specified and are much more robust to x-outliers.

I rather like nonlinear least squares, because it's easy to explain.
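
To illustrate the point that these estimators target the same parameter, here is a rough sketch that fits all three families to one simulated data set generated from a correctly specified log-linear risk model. Everything in it is an assumption for illustration: the data, the names d, x, y, w, des, fams, est, the true coefficients, and the starting values. Starting values are supplied because gaussian(log) cannot initialise itself when the response contains zeros.

```r
# Sketch only: simulated data with a known log relative risk of 0.4 per unit x.
library(survey)

set.seed(2)
n <- 5000
d <- data.frame(x = runif(n, -1, 1))          # hypothetical continuous covariate
d$y <- rbinom(n, 1, exp(-1.5 + 0.4 * d$x))    # risks stay well below 1
d$w <- runif(n, 1, 3)                         # made-up sampling weights
des <- svydesign(ids = ~1, weights = ~w, data = d)

# Three estimating equations for the same mean model log E[y] = b0 + b1 * x.
fams <- list(quasibinomial = quasibinomial(link = "log"),
             quasipoisson  = quasipoisson(link = "log"),
             gaussian      = gaussian(link = "log"))   # nonlinear least squares

est <- sapply(fams, function(f)
  coef(svyglm(y ~ x, design = des, family = f,
              start = c(-1, 0),
              control = glm.control(maxit = 200))))

round(est, 3)   # one column per family; rows are (Intercept) and x
```

The three columns should agree up to sampling error here because the mean model is correct; the estimators differ only in how they weight observations, which is also why they differ in sensitivity to outlying covariate values.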

-thomas

Thomas Lumley

Professor of Biostatistics

University of Washington, Seattle

______________________________________________

R-help at r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.