Grokbase Groups R r-help March 2012
Hi all,

I am doing a bootstrap with logistic regression using the glm function, and I
get these errors:

glm.fit: fitted probabilities numerically 0 or 1 occurred and
glm.fit: algorithm did not converge

I have read some things about this issue in the mailing list, and I can guess
what the problem is. My data contains one or maybe two outliers. Does the
error occur because of these extreme values, or because of something else,
such as the MLE? Is there any way to fix this problem?

Regards,

Ufuk



  • S Ellison at Mar 21, 2012 at 11:30 am

    I get the errors;
    glm.fit: fitted probabilities numerically 0 or 1 occurred and
    glm.fit: algorithm did not converge
    .....
    Is there any way to fix this problem?
    There are two separate issues.
    One is the appearance of fitted values at 0 or 1.
    The other is the lack of convergence.

    The first is usually not fatal; it means that the probabilities are so close to 0 or 1 that a double precision value can't distinguish them from 0 or 1. Often that's a transient condition during iteration and the final fitted values are inside (0,1), but final fitted values can also be out there if you have continuous predictor values a long way out; by itself, that usually won't stop a glm.

    The second is a bit more problematic. Sometimes it's just that you need to increase the maximum number of iterations (see the control= argument and ?glm.control). That is always worth a try - use some absurdly high number like 1000 instead of the default 25 and go find some coffee while it thinks about it. If that solves your problem then you're OK, or at least as OK as you can be with a data set that hard to fit.
    But if you're bootstrapping with some anomalous values it is also possible that some of your bootstrapped sets have too high a proportion of anomalies, and under those conditions it's possible that there could be no sensible optimum within reach.

    One way of dealing with that in a bootstrap or other simulation context is to check the 'converged' value and if it's FALSE, return an NA for your statistic. But of course that is a form of censoring; if you have a high proportion of such instances you'd be on very thin ice drawing conclusions.
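    In R that pattern might look like the sketch below (the data and the function name are illustrative, not from the original post):

    ```r
    # Illustrative bootstrap sketch: raise maxit, and return NA for the
    # statistic whenever the fit reports non-convergence.
    set.seed(1)
    d <- data.frame(x = rnorm(50))
    d$y <- rbinom(50, 1, plogis(d$x))

    boot_slope <- function(data) {
      idx <- sample(nrow(data), replace = TRUE)
      fit <- suppressWarnings(
        glm(y ~ x, family = binomial, data = data[idx, ],
            control = glm.control(maxit = 1000))   # default maxit is 25
      )
      if (!fit$converged) NA_real_ else unname(coef(fit)["x"])
    }

    boots <- replicate(200, boot_slope(d))
    mean(is.na(boots))   # proportion of censored (non-converged) replicates
    ```

    If that proportion comes out high, the censoring caveat above applies.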

    S Ellison

  • Ufuk beyaztas at Mar 21, 2012 at 2:22 pm
    Dear Ellison,
    Many thanks for your reply.

    The information you typed is clear and now I know what to do. Your
    suggestion about finding some coffee while running simulation is so good =)

    Regards

  • Ted Harding at Mar 21, 2012 at 3:35 pm

    On 21-Mar-2012 S Ellison wrote:
    [... S Ellison's reply quoted in full; snipped, see above ...]
    There is a further general issue. The occurrence of

    glm.fit: fitted probabilities numerically 0 or 1 occurred and
    glm.fit: algorithm did not converge

    is usually a symptom of what is called "linear separation".
    The simplest instance of this is when there is a single
    predictor variable X whose values are, say, ordered as
    X[1] < X[2] < ... < X[n-1] < X[n].

    If it should happen that the values Y[1] ... Y[n] of the 0/1
    response variable Y are all equal to 0 for X[1],...,X[k]
    and all equal to 1 for X[k+1],...,X[n], and you are fitting
    a glm(Y~X,family=binomial) (which is what you use for logistic
    regression), then the formula being fitted is

    log(P(Y=1|X)/P(Y=0|X)) = (X - m)/s for some values of m and s.

    In the circumstances described, for any value of m between

    X[k] < m < X[k+1]

    the fit can force P[Y=1|X] --> 0 for X = X[1],...,X[k]
    and can force P[Y=1|X] --> 1 for X = X[k+1],...,X[n] by
    pushing s --> 0 (so then the first set of (X - m)/s --> -Inf
    and the second set of (X - m)/s --> +Inf), thus forcing the
    probabilities predicted by the model towards agreeing exactly
    with the frequencies observed in the data.

    In other words, any value of m satisfying X[k] < m < X[k+1]
    *separates* the Y=0 responses from the Y=1 responses: all the
    Y=0 responses are for X-values less than m, and all the Y=1
    responses are for values of X greater than m. Since such a
    value of m is not unique (any value between X[k] and X[k+1]
    will do), failure of convergence will be observed.
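    A tiny separated example (my own toy data, not from the thread) shows the effect; run without suppressWarnings() it typically emits the very messages from the original post:

    ```r
    # Toy data with complete separation: Y is 0 for X <= 3 and 1 for X >= 4.
    X <- 1:6
    Y <- c(0, 0, 0, 1, 1, 1)
    fit <- suppressWarnings(glm(Y ~ X, family = binomial))
    # The slope estimate (1/s) is driven towards +Inf as s --> 0, and the
    # fitted probabilities sit numerically at 0 and 1.
    coef(fit)["X"]
    range(fitted(fit))
    ```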

    This can certainly arise from excessively "outlying" data,
    i.e. the difference X[k+1] - X[k] is much greater than the
    true value of the logistic scale parameter s, so that it
    is almost certain that all responses for X <= X[k] will be 0,
    and all responses for X >= X[k+1] will be 1.

    But it can also readily arise when there are only a few data,
    i.e. n is small; and also, even with many data, if the scale
    parameter is not large compared with the spacing between
    X values. If you run the following code a few times, you
    will see that "separation" occurs with about 30% probability.

    n <- 6
    X <- (1:n); m <- (n + 1)/2; s <- 2
    count <- numeric(100)
    for (i in 1:100) {
      # simulate a logistic response with location m and scale s
      Y <- 1 * (runif(n) <= exp((X - m)/s) / (1 + exp((X - m)/s)))
      # TRUE when all Y=0 X-values lie below all Y=1 X-values (separation)
      count[i] <- (max(X[Y == 0]) < min(X[Y == 1]))
    }
    sum(count)

    Similarly, with n <- 10, it occurs about 20% of the time;
    for n <- 15 about 15%, and it never gets much below 15%
    no matter how large n is.

    The more general situation of "linear separation" is where
    you have many predictor variables, when there can be a high
    probability that the random outcomes Y=0 label one set of
    cases, the Y=1 label another set of cases, and the X-vectors
    for Y=0 can be separated from the X-vectors for Y=1 by a
    linear form, i.e. there is a vector of coefficients beta
    such that

    X[for any Y==0]%*%beta < X[for any Y==1]%*%beta

    This also can arise quite readily in practice.
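    One rough post-fit diagnostic (my own illustration, not something proposed in this thread) is to check whether every fitted probability has collapsed to the boundary on its own side:

    ```r
    # Rough separation check (illustrative): after fitting, did the fitted
    # probabilities collapse to ~0 for all Y=0 cases and ~1 for all Y=1 cases?
    is_separated <- function(fit, eps = 1e-6) {
      p <- fitted(fit)
      all(p[fit$y == 0] < eps) && all(p[fit$y == 1] > 1 - eps)
    }

    # Example on perfectly separated toy data:
    d <- data.frame(x = 1:6, y = c(0, 0, 0, 1, 1, 1))
    fit <- suppressWarnings(glm(y ~ x, family = binomial, data = d))
    is_separated(fit)
    ```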

    There is no definitive procedure for coping with this situation
    when it arises. What the fitting procedure is doing is just
    what it is supposed to do, namely predicting P(Y=1)=0 for
    every X such that the observed Y=0, and predicting P(Y=1)=1
    for every X such that the observed Y=1 whenever it can (i.e.
    whenever there is separation). So indeed this is a result
    of maximum-likelihood estimation, in that it is seeking values
    of the parameters that maximise the likelihood; under separation
    the likelihood keeps increasing as s --> 0, so no finite
    maximum is ever attained.

    There are work-arounds which are variants of penalising small
    values of the scale parameter s, so that what is being
    maximised is not the plain likelihood L but L moderated
    by the penalty, so that for the smaller values of s the
    penalised likelihood will be less than the plain likelihood
    (and, for really small s, much less).

    However, choosing a penalising strategy is not well-defined
    from a theoretical point of view, and will involve an element
    of arbitrary choice, i.e. a penalisation function which
    appropriately reflects your objectives.
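    As a base-R illustration of the idea (a ridge penalty of my own choosing, purely a sketch, not a recommendation of any particular penalisation): adding lambda * b^2 to the negative log-likelihood penalises large slopes b = 1/s, so a finite optimum exists even under complete separation.

    ```r
    # Sketch: ridge-penalised logistic log-likelihood maximised with optim().
    # The penalty lambda * par[2]^2 penalises small s (large slope b = 1/s),
    # giving a finite optimum even for perfectly separated data.
    X <- 1:6
    Y <- c(0, 0, 0, 1, 1, 1)   # complete separation

    neg_pen_loglik <- function(par, lambda = 1) {
      eta <- par[1] + par[2] * X
      -sum(Y * eta - log1p(exp(eta))) + lambda * par[2]^2
    }

    fit <- optim(c(0, 0), neg_pen_loglik, method = "BFGS")
    fit$par   # finite (intercept, slope), unlike the unpenalised MLE
    ```

    In practice, ready-made Firth-type penalised fits are available in contributed packages such as logistf and brglm2, which make a more principled choice of penalty.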

    Hoping this helps,
    Ted.

    PS Try searching for "linear separation" and "linear separability"
    in the searchable R-help archives at

    http://tolstoy.newcastle.edu.au/R/

    (the old R Site Help and Archive Search at http://finzi.psych.upenn.edu/
    is no longer of much use).

    -------------------------------------------------
    E-Mail: (Ted Harding) <ted.harding@wlandres.net>
    Date: 21-Mar-2012 Time: 15:35:00
    This message was sent by XFMail
  • Ufuk beyaztas at Mar 22, 2012 at 8:08 pm
    Dear Ted, thanks for your help. Now everything is clearer. I read up on
    the "linear separation" you mentioned, and my data set fits this problem
    well. But there is a question that confuses me:

    Because I am bootstrapping, I cannot fully control the process, so these
    warnings will arise. If I interpret the results despite these warnings,
    is that valid? I can at least deal with the "glm.fit: algorithm did not
    converge" warning via glm.control, but "fitted probabilities numerically
    0 or 1 occurred" will certainly still arise. Is interpreting the results
    under this warning valid?

    Many thanks.

    Best regards

    Ufuk

Discussion Overview
group: r-help
categories: r
posted: Mar 20, '12 at 11:22p
active: Mar 22, '12 at 8:08p
posts: 5
users: 3
website: r-project.org
irc: #r
