
[R] Problem with glm, gaussian family with log-link

Florian Weiler
Nov 26, 2012 at 12:33 pm
Dear all,

I am using the book "Generalized Linear Models and Extensions" by Hardin and
Hilbe (second edition, 2007) at the moment. The authors suggest that
instead of OLS models, "the log link is generally used for response data
that take only positive values on the continuous scale". Of course they
also suggest residual plots to check whether a "normal" linear model using
an identity link can still be used.

I am trying to replicate in R what they do in the book in STATA. Indeed, I
have no problems in STATA with the log link. However, when calling the same
model using R's glm function, but specifying family=gaussian(link="log"), I
am asked to provide starting values. When I set them all equal to zero, I
always get the message that the algorithm did not converge. Picking other
values the message is sometimes the same, but more often I get:
*
*
*Error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : *
* NA/NaN/Inf in 'x' *
*
*
As I said, in STATA I can run these models without setting starting values
and without errors. I tried many different models, and different datasets,
but the problem is always the same (unless I only include one single
independent variable). Could anyone tell me why this is the case, or what I
do wrong, or why the suggested models from the book might not be
appropriate? I'd appreciate any help!

Best,
Florian

1 response

  • Ilai at Nov 27, 2012 at 5:16 am

    On Mon, Nov 26, 2012 at 5:33 AM, Florian Weiler wrote:

    Dear all,

    I am using the book "Generalized Linear Models and Extensions" by Hardin and
    Hilbe (second edition, 2007) at the moment. The authors suggest that
    instead of OLS models, "the log link is generally used for response data
    that take only positive values on the continuous scale".

    <snip>
    specifying family=gaussian(link="log") I
    am asked to provide starting values. When I set them all equal to zero, I
    always get the message that the algorithm did not converge. Picking other
    values the message is sometimes the same, but more often I get:

    Error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, :
      NA/NaN/Inf in 'x'

    As I said, in STATA I can run these models without setting starting values
    and without errors. I tried many different models, and different datasets,

    And yet you've failed to provide even one of them together with your code
    as a reproducible example ...

    # This works without starting values:
    set.seed(2341)
    x <- rep(1:10,3) ; y <- jitter(rpois(30,5+x))
    plot(x,y)
    (gausslog <- glm(y~x,family=gaussian(link='log')))
    exp(coef(gausslog))

    # This works only with starting values
    set.seed(2341)
    x <- rep(1:10,3) ; y <- jitter(rpois(30,x))
    plot(x,y) ; summary(y) # yes, yes, some y < 0, just trying to reproduce the error...
    (gausslog <- glm(y~x,family=gaussian(link='log'))) # fails: asks for starting values
    (gausslog <- glm(y~x,family=gaussian(link='log'),start=c(0,0)))

    # also
    set.seed(2341)
    x <- rep(1:10,3) ; y <- rnorm(30,0+0.1*x)
    plot(x,y) ; summary(y)
    (gausslog <- glm(y~x,family=gaussian(link='log'),start=c(0,0)))

    So really this is a non issue without the offending data set and code.
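
    For what it's worth, the demand for starting values in the second example
    seems to come from the non-positive responses: as far as I can tell, the
    gaussian family's initialization uses y itself as the starting mean and
    stops when the link is log and any y <= 0. A minimal sketch on made-up data
    (y_pos and y_bad are hypothetical names, not the poster's data):

    ```r
    set.seed(2341)
    x     <- rep(1:10, 3)
    y_pos <- exp(0.2 * x) + runif(30)   # simulated, strictly positive response
    y_bad <- y_pos
    y_bad[1] <- 0                       # a single non-positive value

    # Strictly positive y: no starting values needed
    fit_pos <- glm(y_pos ~ x, family = gaussian(link = "log"))

    # One y <= 0: glm() stops before iterating; capture the message instead of failing
    msg <- tryCatch(
      glm(y_bad ~ x, family = gaussian(link = "log")),
      error = function(e) conditionMessage(e)
    )
    msg
    ```

    So the first thing to check in the offending data is summary(y).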

    but the problem is always the same (unless I only include one single
    independent variable).

    Oh, more information... way to build up the suspense

    set.seed(2341)
    x <- rep(1:10,3) ; xx <- rep(seq(20,50,length.out=5),6) ; y <- rnorm(30,5+3*x-2*xx)
    (gausslog <- glm(y~x+xx,family=gaussian(link='log'),start=c(0,0,0)))

    No joy. Still fits.

    Could anyone tell me why this is the case, or what I
    do wrong,

    No

    or why the suggested models from the book might not be
    appropriate? I'd appreciate any help!

    Personally I don't care for reproducing some results from STATA and have
    no comment on the validity of the above, but maybe someone on the list will
    have a better answer if you repost.
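
    If the response is strictly positive, one common way to get starting values
    (a sketch only, not necessarily what STATA does internally) is to take the
    coefficients of an ordinary lm() fit to log(y). The data below are
    simulated; x1, x2 and start_vals are names made up for the example.

    ```r
    set.seed(2341)
    x1 <- rep(1:10, 3)
    x2 <- rep(seq(20, 50, length.out = 5), 6)
    # Simulated response with a log-linear mean and additive Gaussian noise
    y  <- exp(0.5 + 0.10 * x1 + 0.02 * x2) + rnorm(30, sd = 0.3)

    # log(y) is only defined for positive y, so check first
    stopifnot(all(y > 0))

    # OLS on the log scale gives rough coefficients on the link scale
    start_vals <- coef(lm(log(y) ~ x1 + x2))

    # Pass them to glm() as starting values for the log-link Gaussian fit
    fit <- glm(y ~ x1 + x2, family = gaussian(link = "log"), start = start_vals)
    fit$converged
    coef(fit)
    ```

    When some y <= 0 this does not apply directly, since log(y) is undefined;
    glm() also accepts mustart and etastart as alternatives to start.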




    Best,
    Florian

    Also this:
    [[alternative HTML version deleted]]

    ______________________________________________
    r-help@r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide
    http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.
