Sorry if this is a duplicate: this is a re-post because the PDFs mentioned
below did not go through.

Hello,

I'm newish to R and very new to glm. I've read a lot about my issue:
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

...including:

http://tolstoy.newcastle.edu.au/R/help/05/07/7759.html
http://r.789695.n4.nabble.com/glm-fit-quot-fitted-probabilities-numerically-0-or-1-occurred-quot-td849242.html
(Note that I never found "MASS4 pp. 197-8"; however, Ted's post was quite
helpful.)

This is a common question, sorry. Because it is a common issue, I am posting
everything I know about it and how I think I am not falling into the same
trap as the others (but I must be, for some reason I am not yet aware of).
From the two links above I gather that my warning "glm.fit: fitted
probabilities numerically 0 or 1 occurred" arises from a "perfect fit"
(complete separation) situation, i.e. one where all the high-value x's
(predictor values) have Y=1 (response = 1), or the other way around. I don't
feel my data has this issue. Please point out how it does!
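
For reference, here is a minimal sketch (made-up data, not my l_yx) of the
complete-separation case that I understand this warning to indicate:

# Complete separation: every x above a threshold has y = 1, so the
# fitted probabilities saturate at 0 and 1 and the slope estimate
# diverges (often with a non-convergence warning as well).
set.seed(1)
x_demo <- c(runif(50, 0, 1), runif(50, 2, 3))
y_demo <- rep(0:1, each = 50)
glm(y_demo ~ x_demo, family = binomial(link = "logit"))
# Warning message:
# glm.fit: fitted probabilities numerically 0 or 1 occurred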

The list posting instructions state that I can attach PDFs, so I attached
plots of my data taken right before the call to glm.

The attachments are plots of my data stored in the variable l_yx (as can be
seen in the axis names):
My response (vertical axis) by row index (horizontal axis):
plot(l_yx[,1],type='h')
My predictor variable (vertical axis) by row index (horizontal axis):
plot(l_yx[,2],type='h')

So here is more info on my data frame/data (in case you can't see my PDF
attachments):

unique(l_yx[,1])
[1] 0 1
mean(l_yx[,2])
[1] 0.01123699
max(l_yx[,2])
[1] 14.66518
min(l_yx[,2])
[1] 0
attributes(l_yx)
$dim
[1] 690303 2

$dimnames
$dimnames[[1]]
NULL

$dimnames[[2]]
[1] "y" "x"


With the above data I do:
l_logit = glm(y~x, data=as.data.frame(l_yx),
family=binomial(link="logit"))
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

Why am I getting this warning when I have data points of varying values for
y=1 and y=0? In other words, I don't think I have the linear separation
issue discussed in one of the links I provided.

PS - Then I do this and I get an odds ratio of a crazy size:
l_sm = summary(l_logit) # coef pval is $coefficients[8], log odds is $coefficients[2]
l_exp_coef = exp(l_logit$coefficients)[2] # exponentiate the coefficients
l_exp_coef
x
3161.781

So for a one-unit increase in the predictor variable the odds are multiplied
by 3161.781, i.e. a 3160.781-fold increase over the original odds
(3161.781 - 1 = 3160.781)? That can't be correct either. How do I correct
for this issue? (I tried multiplying the predictor variable by a constant
and the odds ratio goes down, but the warning above still persists; and
shouldn't the odds ratio be independent of the predictor's scale?)
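
(A quick sketch with simulated data, unrelated to l_yx, suggests the answer
to that last question is no: rescaling x divides the slope by the same
constant, so exp(slope), the odds ratio per one unit, changes with the unit
of measurement.)

# Same response, x rescaled by 10: the slope divides by 10 exactly,
# so the per-unit odds ratio becomes the 10th root of the original.
set.seed(42)
x_sim <- rnorm(200)
y_sim <- rbinom(200, 1, plogis(2 * x_sim))
f1 <- glm(y_sim ~ x_sim, family = binomial)
f2 <- glm(y_sim ~ I(10 * x_sim), family = binomial)
exp(coef(f1)[2])   # odds ratio per 1 unit of x_sim
exp(coef(f2)[2])   # equals exp(coef(f1)[2] / 10)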

Thank you for your help!

Ben


  • Peter Dalgaard at Dec 1, 2011 at 6:55 pm

    On Dec 1, 2011, at 18:54 , Ben quant wrote:

    Sorry if this is a duplicate: this is a re-post because the PDFs mentioned
    below did not go through.
    Still not there. Sometimes it's because your mailer doesn't label them with the appropriate mime-type (e.g. as application/octet-stream, which is "arbitrary binary"). Anyways, see below

    [snip]
    With the above data I do:
    l_logit = glm(y~x, data=as.data.frame(l_yx),
    family=binomial(link="logit"))
    Warning message:
    glm.fit: fitted probabilities numerically 0 or 1 occurred

    Why am I getting this warning when I have data points of varying values for
    y=1 and y=0? In other words, I don't think I have the linear separation
    issue discussed in one of the links I provided.
    I bet that you do... You can get the warning without that effect (one of my own examples is the probability of menarche in a data set that includes infants and old age pensioners), but not with a huge odds ratio as well. Take a look at

    d <- as.data.frame(l_yx)
    with(d, y[order(x)])

    if it comes out as all zeros followed by all ones or vice versa, then you have the problem.
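
    A compact way to run that check on all 690,303 rows, rather than eyeballing the printed vector (a sketch, assuming d as defined above):

    # Exactly 2 runs of 0s and 1s in y ordered by x means complete
    # separation; many runs means the classes are interleaved in x.
    length(rle(with(d, y[order(x)]))$lengths)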


    --
    Peter Dalgaard, Professor,
    Center for Statistics, Copenhagen Business School
    Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    Phone: (+45)38153501
    Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
  • Ben quant at Dec 1, 2011 at 8:32 pm
    Thank you for the feedback, but my data looks fine to me. Please tell me if
    I'm not understanding.

    I followed your instructions, and here is a sample of the first 500 values (info on 'd' is below that):
    d <- as.data.frame(l_yx)
    x = with(d, y[order(x)])   # (this stores the y values, ordered by x, in a variable named x)
    x[1:500] # I have 1's and 0's dispersed throughout
    [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0
    [101] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    [201] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0
    0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
    1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 0
    [301] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    [401] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
    1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0

    # I get the warning still
    l_df = as.data.frame(l_yx)
    l_logit = glm(y~x, data=l_df, family=binomial(link="logit"))
    Warning message:
    glm.fit: fitted probabilities numerically 0 or 1 occurred

    # some info on 'd' above:
    d[1:500,1]
    [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    [101] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    [201] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    [301] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    [401] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    d[1:500,2]
    [1] 3.023160e-03 7.932130e-02 0.000000e+00 1.779657e-02 1.608374e-01
    0.000000e+00 5.577064e-02 7.753926e-03 4.018553e-03 4.760918e-02
    2.080511e-02 1.642404e-01 3.703720e-03 8.901981e-02 1.260415e-03
    [16] 2.202523e-02 3.750940e-02 4.441975e-04 9.351171e-03 8.374567e-03
    0.000000e+00 8.440448e-02 5.081017e-01 2.538640e-05 1.806017e-02
    2.954641e-04 1.434859e-03 6.964976e-04 0.000000e+00 1.202162e-02
    [31] 3.420300e-03 4.276100e-02 1.457324e-02 4.140121e-03 1.349180e-04
    1.525292e-03 4.817502e-02 9.515717e-03 2.232953e-02 1.227908e-01
    3.293581e-02 1.454352e-02 1.176011e-03 6.274138e-02 2.879205e-02
    [46] 6.900926e-03 1.414648e-04 3.446349e-02 8.807174e-03 3.549332e-02
    2.828509e-03 2.935003e-02 7.162872e-03 5.650050e-03 1.221191e-02
    0.000000e+00 2.126334e-02 2.052020e-02 7.542409e-02 2.586269e-04
    [61] 5.258664e-02 1.213126e-02 1.493275e-02 8.152657e-03 1.774966e-02
    2.433198e-02 1.371257e-02 6.732339e-02 9.747677e-03 8.180594e-03
    1.882473e-01 1.682596e-03 1.410170e-02 8.429676e-03 3.520589e-01
    [76] 1.865160e-02 2.389941e-02 3.301558e-02 4.896605e-03 6.826244e-03
    9.151337e-02 1.722270e-02 1.764039e-01 3.554063e-02 7.338532e-03
    9.577793e-03 3.632366e-03 1.422400e-02 3.020497e-02 1.797096e-02
    [91] 2.577899e-03 1.338365e-01 8.202223e-03 1.088335e-02 2.137259e-02
    6.829797e-03 3.104860e-04 8.927160e-03 5.626881e-02 1.197682e-02
    6.140553e-02 2.729364e-02 1.377913e-02 3.018277e-03 1.188304e-02
    [106] 2.029268e-03 2.108815e-02 1.765503e-03 2.449253e-02 3.677046e-03
    1.013463e-02 4.286642e-02 1.238931e-02 3.072090e-03 1.509613e-02
    3.609885e-02 5.537268e-03 3.151614e-02 3.703148e-03 1.409401e-01
    [121] 1.473365e-02 9.160461e-03 1.035099e-01 3.005865e-02 1.332199e-02
    6.936535e-03 2.433539e-02 4.935373e-03 4.822740e-03 1.597643e-03
    3.805619e-03 1.092683e-02 1.760635e-01 5.565614e-03 7.739213e-04
    [136] 4.529198e-03 5.110359e-03 7.391258e-02 5.018207e-03 2.071417e-02
    1.825787e-02 9.141405e-03 1.078386e-01 2.171470e-04 7.164583e-03
    2.522077e-02 1.994428e-03 6.044653e-03 1.778078e-04 5.797706e-03
    [151] 1.526778e-02 1.595897e-02 1.995627e-01 1.946735e-03 6.711582e-02
    2.722350e-02 3.127499e-02 1.074904e-01 2.477414e-03 5.015375e-02
    3.417810e-02 2.046643e-02 1.644832e-02 5.220166e-03 1.217752e-02
    [166] 1.187352e-02 1.634016e-02 2.349568e-03 3.451811e-02 2.593540e-03
    6.868595e-03 1.311236e-02 6.457757e-03 2.942391e-04 1.628352e-02
    8.288831e-03 3.170856e-04 1.251331e+00 1.706954e-02 1.063723e-03
    [181] 1.374416e-02 2.140507e-02 2.817009e-02 2.272793e-02 4.365562e-02
    6.089414e-03 2.498083e-02 1.360471e-02 1.884079e-02 1.448660e-02
    2.341314e-02 8.167064e-03 4.109117e-02 2.660633e-02 7.711723e-03
    [196] 9.590278e-03 2.515490e-03 1.978033e-02 3.454990e-02 8.072748e-03
    4.718885e-03 1.621131e-01 4.547743e-03 1.081195e-02 9.572051e-04
    1.790391e-02 1.618026e-02 1.910230e-02 1.861914e-02 3.485475e-02
    [211] 2.844890e-03 1.866889e-02 1.378208e-02 2.451514e-02 2.535044e-03
    3.921364e-04 1.557266e-03 3.315892e-03 1.752821e-03 6.786187e-03
    1.360921e-02 9.550702e-03 8.114506e-03 5.068741e-03 1.729822e-02
    [226] 1.902033e-02 8.196564e-03 2.632880e-03 1.587969e-02 8.354079e-04
    1.050023e-03 4.236195e-04 9.181120e-03 4.995919e-04 1.092234e-02
    1.207544e-02 2.187243e-01 3.251349e-02 1.269134e-03 1.557751e-04
    [241] 1.232498e-02 2.654449e-02 1.049324e-03 8.442729e-03 6.331691e-03
    1.715609e-02 1.017800e-03 9.230006e-03 1.331373e-02 5.596195e-02
    1.296551e-03 5.272687e-03 2.805640e-02 4.790665e-02 2.043011e-02
    [256] 1.047226e-02 1.866499e-02 9.323001e-03 8.920536e-03 1.582911e-03
    2.776238e-03 2.914762e-02 4.402356e-03 9.555274e-04 1.681966e-03
    7.584319e-04 6.758914e-02 1.505431e-02 2.213308e-02 1.329330e-02
    [271] 7.284363e-03 2.687818e-02 2.997535e-03 7.470007e-03 2.070569e-03
    3.441944e-02 1.717768e-02 4.523364e-02 1.003558e-02 1.365111e-02
    1.906845e-02 1.676223e-02 3.506809e-04 9.164257e-02 9.008416e-03
    [286] 1.073903e-02 4.855937e-03 8.618043e-03 2.529247e-02 1.059375e-02
    5.834253e-03 2.004309e-02 1.460387e-02 2.899190e-02 5.867984e-03
    1.983956e-02 6.834339e-03 1.925821e-03 9.231870e-03 6.839616e-03
    [301] 1.029972e-02 2.009769e-02 9.458785e-03 1.183901e-02 8.911549e-03
    1.264745e-02 2.995451e-03 7.657983e-04 5.315853e-03 1.325039e-02
    1.044103e-02 2.307236e-02 2.780789e-02 1.735145e-02 9.053126e-03
    [316] 5.847638e-02 3.815715e-03 5.087690e-03 1.040513e-02 4.475672e-02
    6.564791e-02 3.233571e-03 1.076193e-02 8.283819e-02 5.370256e-03
    3.533256e-02 1.302812e-02 1.896783e-02 2.055282e-02 3.572239e-03
    [331] 5.867681e-03 5.864974e-04 9.715807e-03 1.665469e-02 5.082044e-02
    3.547168e-03 3.069631e-03 1.274717e-02 1.858226e-03 3.104809e-04
    1.247831e-02 2.073575e-03 3.544110e-04 7.240736e-03 8.452117e-05
    [346] 8.149151e-04 4.942461e-05 1.142303e-03 6.265512e-04 3.666717e-04
    3.244669e-02 7.242018e-03 6.335951e-04 2.329072e-02 3.719716e-03
    2.803425e-02 1.623981e-02 6.387102e-03 8.807679e-03 1.214914e-02
    [361] 6.699341e-03 1.148082e-02 1.329736e-02 1.537364e-03 2.004390e-02
    1.562065e-02 1.655465e-02 9.960172e-02 2.174588e-02 1.209472e-02
    2.328413e-02 2.012760e-04 1.422327e-02 2.194455e-03 2.307362e-02
    [376] 4.315764e-03 3.208576e-02 3.826598e-02 1.828001e-02 3.935978e-03
    5.294211e-04 1.392423e-02 6.588394e-03 1.040147e-03 1.260787e-02
    9.051757e-04 5.353215e-02 6.049058e-02 1.382630e-01 1.064124e-01
    [391] 3.380742e-03 1.798038e-02 1.557048e-01 1.217146e-02 4.140520e-02
    4.707564e-02 2.786042e-02 8.836988e-03 5.542879e-03 1.862664e-02
    8.858770e-03 1.026681e-03 1.692105e-02 8.849238e-03 7.143816e-03
    [406] 1.630118e-02 1.165920e-01 9.471496e-03 4.879998e-02 1.388216e-02
    1.453267e-02 4.845224e-04 1.415190e-03 1.208627e-02 1.372348e-02
    2.573131e-02 1.169595e-02 1.825447e-02 2.574299e-02 5.301360e-02
    [421] 6.961110e-03 7.781891e-03 1.013308e-03 3.160916e-03 1.090344e-02
    1.530841e-02 9.398088e-04 9.143726e-04 1.286683e-02 2.006193e-02
    1.774378e-02 5.681591e-02 9.584676e-03 7.957152e-02 4.485609e-03
    [436] 1.086684e-02 2.930273e-03 6.085481e-03 4.342320e-03 1.319999e-02
    2.120402e-02 4.477545e-02 1.991814e-02 8.893947e-03 7.790133e-03
    1.610199e-02 2.441280e-02 2.781231e-03 1.410080e-02 1.639912e-02
    [451] 1.797498e-02 1.185382e-02 2.775063e-02 3.797315e-02 1.428883e-02
    1.272659e-02 2.390500e-03 7.503478e-03 8.965356e-03 2.139452e-02
    2.028536e-02 6.916416e-02 1.615986e-02 4.837412e-02 1.561731e-02
    [466] 7.130332e-03 9.208406e-05 1.099934e-02 2.003469e-02 1.395857e-02
    9.883482e-03 4.110852e-02 1.202052e-02 2.833039e-02 1.233236e-02
    2.145801e-02 7.900161e-03 4.663819e-02 4.410819e-03 5.115056e-04
    [481] 9.100270e-04 4.013683e-03 1.227139e-02 3.304697e-03 2.919099e-03
    6.112390e-03 1.922229e-02 1.208282e-03 1.164037e-02 2.166888e-02
    4.381615e-02 5.318929e-03 7.226343e-03 2.732819e-02 2.385092e-04
    [496] 4.905250e-02 1.159876e-02 4.068228e-03 3.349013e-02 1.273468e-03


    Thanks for your help,

    Ben
    On Thu, Dec 1, 2011 at 11:55 AM, Peter Dalgaard wrote:
    [snip]

  • Peter Dalgaard at Dec 1, 2011 at 9:24 pm

    On Dec 1, 2011, at 21:32 , Ben quant wrote:

    Thank you for the feedback, but my data looks fine to me. Please tell me if I'm not understanding.
    Hum, then maybe it really is a case of a transition region being short relative to the range of your data. Notice that the warning is just that: a warning. I do notice that the distribution of your x values is rather extreme -- you stated a range of 0--14 and a mean of 0.01. And after all, an odds ratio of 3000 per unit is only a tad over a doubling per 0.1 units.
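
    (Checking that arithmetic against the coefficient reported above:)

    3161.781^0.1   # odds multiplier per 0.1-unit step in x
    # about 2.24 -- just over a doubling, despite the huge per-unit ratio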

    Have a look at range(x[y==0]) and range(x[y==1]).

    [snip]

    --
    Peter Dalgaard, Professor,
    Center for Statistics, Copenhagen Business School
    Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    Phone: (+45)38153501
    Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
  • Ben quant at Dec 1, 2011 at 10:07 pm

    Here you go:

    attach(as.data.frame(l_yx))
    range(x[y==1])
    [1] -22500.0000 746.6666
    range(x[y==0])
    [1] -10076.5303 653.0228

    How do I know what is acceptable?

    Also, here are the screen shots of my data that I tried to send earlier
    (two screen shots, two pages):
    http://scientia.crescat.net/static/ben/warn%20num%200%20or%201.pdf

    Thank you,

    Ben
    On Thu, Dec 1, 2011 at 2:24 PM, Peter Dalgaard wrote:
    [snip]

  • Ben quant at Dec 1, 2011 at 10:25 pm
    Oops! Please ignore my last post. I mistakenly gave you different data I
    was testing with. This is the correct data:

    Here you go:
    attach(as.data.frame(l_yx))
    range(x[y==0])
    [1] 0.00000 14.66518
    range(x[y==1])
    [1] 0.00000 13.49791

    How do I know what is acceptable?
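
    (Since these ranges overlap almost entirely, a perfect split on x is impossible, so the warning must come from a few numerically saturated fitted values rather than from separation. If separation were still a worry, one option -- a sketch, assuming the CRAN package logistf is installed -- is Firth's penalized likelihood, which yields finite estimates even under complete separation:)

    # install.packages("logistf")   # bias-reduced logistic regression;
    # may be slow on 690,303 rows
    library(logistf)
    l_firth <- logistf(y ~ x, data = as.data.frame(l_yx))
    summary(l_firth)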

    Also, here are the screen shots of my data that I tried to send earlier
    (two screen shots, two pages):
    http://scientia.crescat.net/static/ben/warn%20num%200%20or%201.pdf

    Thank you,

    Ben
    On Thu, Dec 1, 2011 at 3:07 PM, Ben quant wrote:
    [snip]

  • Ben quant at Dec 1, 2011 at 10:43 pm
    I'm not proposing this as a permanent solution, just investigating the
    warning. I zeroed out the three outliers and received no warning. Can
    someone tell me why I am getting no warning now?

    I did this 3 times to get rid of the 3 outliers:
    mx_dims = arrayInd(which.max(l_yx), dim(l_yx))
    l_yx[mx_dims] = 0

    Now this does not produce a warning:
    l_logit = glm(y~x, data=as.data.frame(l_yx), family=binomial(link="logit"))

    Can someone tell me why this occurred?

    Also, again, here are the screen shots of my data that I tried to send
    earlier (two screen shots, two pages):
    http://scientia.crescat.net/static/ben/warn%20num%200%20or%201.pdf

    Thank you for your help,

    Ben
    On Thu, Dec 1, 2011 at 3:25 PM, Ben quant wrote:

    Oops! Please ignore my last post. I mistakenly gave you different data I
    was testing with. This is the correct data:

    Here you go:
    attach(as.data.frame(l_yx))
    range(x[y==0])
    [1] 0.00000 14.66518
    range(x[y==1])
    [1] 0.00000 13.49791


    How do I know what is acceptable?

    Also, here are the screen shots of my data that I tried to send earlier
    (two screen shots, two pages):
    http://scientia.crescat.net/static/ben/warn%20num%200%20or%201.pdf

    Thank you,

    Ben
    On Thu, Dec 1, 2011 at 3:07 PM, Ben quant wrote:
    [earlier quoted messages, data dumps, and signatures snipped]




  • Peter dalgaard at Dec 1, 2011 at 11:32 pm

    On Dec 1, 2011, at 23:43 , Ben quant wrote:

    I'm not proposing this as a permanent solution, just investigating the warning. I zeroed out the three outliers and received no warning. Can someone tell me why I am getting no warning now?
    It's easier to explain why you got the warning before. If the OR for a one unit change is 3000, the OR for a 14 unit change is on the order of 10^48 and that causes over/underflow in the conversion to probabilities.
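
    (A rough magnitude check, as a sketch:)
    14 * log(3000)  # linear predictor swing over a 14-unit change: about 112
    plogis(40)      # already exactly 1 in double precision
    plogis(112)     # likewise 1 -- hence "fitted probabilities numerically 0 or 1"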

    I'm still baffled at how you can get that model fitted to your data, though. One thing is that you can have situations where there are fitted probabilities of one corresponding to data that are all one and/or fitted zeros where data are zero, but you seem to have cases where you have both zeros and ones at both ends of the range of x. Fitting a zero to a one or vice versa would make the likelihood zero, so you'd expect that the algorithm would find a better set of parameters rather quickly. Perhaps the extremely large number of observations that you have has something to do with it?

    You'll get the warning if the fitted zeros or ones occur at any point of the iterative procedure. Maybe it isn't actually true for the final model, but that wouldn't seem consistent with the OR that you cited.

    Anyways, your real problem lies with the distribution of the x values. I'd want to try transforming it to something more sane. Taking logarithms is the obvious idea, but you'd need to find out what to do about the zeros -- perhaps log(x + 1e-4) ? Or maybe just cut the outliers down to size with pmin(x,1).
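
    A minimal sketch of those two fixes (assuming l_yx as above; the 1e-4
    offset is just an example value, not a canonical choice):
    d <- as.data.frame(l_yx)
    fit_log  <- glm(y ~ log(x + 1e-4), data = d, family = binomial)  # log transform, offsetting zeros
    fit_pmin <- glm(y ~ pmin(x, 1),    data = d, family = binomial)  # cap the outliers at 1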
    [remainder of quoted thread snipped]
    --
    Peter Dalgaard, Professor,
    Center for Statistics, Copenhagen Business School
    Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    Phone: (+45)38153501
    Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
  • Ben quant at Dec 2, 2011 at 1:00 am
    Thank you so much for your help.

    The data I am using is the last file called l_yx.RData at this link (the
    second file contains the plots from earlier):
    http://scientia.crescat.net/static/ben/

    Seems like the warning went away with pmin(x,1), but now the OR is over
    15k. If I multiply my x's by 1000 I get a much more realistic OR. So I
    guess this brings me to a much different question: aren't ORs comparable
    between factors/data? In this case they don't seem to be. However, with
    different data the ORs only change a very small amount (+8.0e-4) when I
    multiply the x's by 1000. I don't understand.
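
    (The per-unit OR is scale-dependent by construction: replacing x with
    1000*x divides the slope by 1000, so exp(slope) changes even though the
    fitted probabilities are identical. A sketch, assuming d = as.data.frame(l_yx):)
    fit1 <- glm(y ~ x,           data = d, family = binomial)
    fit2 <- glm(y ~ I(1000 * x), data = d, family = binomial)
    coef(fit1)["x"]              # slope b
    coef(fit2)["I(1000 * x)"]    # slope b/1000 -- same fit, different per-unit OR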

    Anyways, here is a run with the raw data and a run with your suggestion
    (pmin(x,1)) that removed the warning:
    l_logit = glm(y~x, data=as.data.frame(l_yx),
    family=binomial(link="logit"))
    l_logit
    Call: glm(formula = y ~ x, family = binomial(link = "logit"), data =
    as.data.frame(l_yx))

    Coefficients:
    (Intercept) x
    -2.293 8.059

    Degrees of Freedom: 690302 Total (i.e. Null); 690301 Residual
    Null Deviance: 448800
    Residual Deviance: 447100 AIC: 447100
    l_exp_coef = exp(l_logit$coefficients)[2]
    l_exp_coef
    x
    3161.781
    dim(l_yx)
    [1] 690303 2
    l_yx = cbind(l_yx[,1],pmin(l_yx[,2],1))
    dim(l_yx)
    [1] 690303 2
    colnames(l_yx) = c('y','x')
    mean(l_yx[,2])
    [1] 0.01117248
    range(l_yx[,2])
    [1] 0 1
    head(l_yx[,2])
    [1] 0.00302316 0.07932130 0.00000000 0.01779657 0.16083735 0.00000000
    unique(l_yx[,1])
    [1] 0 1
    l_logit = glm(y~x, data=as.data.frame(l_yx),
    family=binomial(link="logit"))
    l_logit
    Call: glm(formula = y ~ x, family = binomial(link = "logit"), data =
    as.data.frame(l_yx))

    Coefficients:
    (Intercept) x
    -2.312 9.662

    Degrees of Freedom: 690302 Total (i.e. Null); 690301 Residual
    Null Deviance: 448800
    Residual Deviance: 446800 AIC: 446800
    l_exp_coef = exp(l_logit$coefficients)[2]
    l_exp_coef
    x
    15709.52


    Thanks,

    Ben


    On Thu, Dec 1, 2011 at 4:32 PM, peter dalgaard wrote:
    [quoted thread snipped]

  • Patrick Breheny at Dec 2, 2011 at 2:08 pm

    On 12/01/2011 08:00 PM, Ben quant wrote:
    The data I am using is the last file called l_yx.RData at this link (the
    second file contains the plots from earlier):
    http://scientia.crescat.net/static/ben/
    The logistic regression model you are fitting assumes a linear
    relationship between x and the log odds of y; that does not seem to be
    the case for your data. To illustrate:

    x <- l_yx[,"x"]
    y <- l_yx[,"y"]
    ind1 <- x <= .002
    ind2 <- (x > .002 & x <= .0065)
    ind3 <- (x > .0065 & x <= .13)
    ind4 <- x > .13
    summary(glm(y[ind1]~x[ind1],family=binomial))
    ...
    Coefficients:
    Estimate Std. Error z value Pr(>|z|)
    (Intercept) -2.79174 0.02633 -106.03 <2e-16 ***
    x[ind1] 354.98852 22.78190 15.58 <2e-16 ***
    summary(glm(y[ind2]~x[ind2],family=binomial))
    Coefficients:
    Estimate Std. Error z value Pr(>|z|)
    (Intercept) -2.15805 0.02966 -72.766 <2e-16 ***
    x[ind2] -59.92934 6.51650 -9.197 <2e-16 ***
    summary(glm(y[ind3]~x[ind3],family=binomial))
    ...
    Coefficients:
    Estimate Std. Error z value Pr(>|z|)
    (Intercept) -2.367206 0.007781 -304.22 <2e-16 ***
    x[ind3] 18.104314 0.346562 52.24 <2e-16 ***
    summary(glm(y[ind4]~x[ind4],family=binomial))
    ...
    Coefficients:
    Estimate Std. Error z value Pr(>|z|)
    (Intercept) -1.31511 0.08549 -15.383 <2e-16 ***
    x[ind4] 0.06261 0.08784 0.713 0.476

    To summarize, the relationship between x and the log odds of y appears
    to vary dramatically in both magnitude and direction depending on which
    interval of x's range we're looking at. Trying to summarize this
    complicated pattern with a single line is leading to the fitted
    probabilities near 0 and 1 you are observing (note that only 0.1% of the
    data is in region 4 above, although region 4 accounts for 99.1% of the
    range of x).
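
    One quick way to see this non-linearity directly is to look at the
    empirical log odds of y within quantile bins of x (a sketch, assuming
    l_yx as above; unique() guards against tied breaks at zero):
    d <- as.data.frame(l_yx)
    brk  <- unique(quantile(d$x, probs = seq(0, 1, 0.1)))
    bins <- cut(d$x, breaks = brk, include.lowest = TRUE)
    p    <- tapply(d$y, bins, mean)  # P(y = 1) within each bin
    log(p / (1 - p))                 # empirical log odds per bin (-Inf for all-zero bins)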

    --
    Patrick Breheny
    Assistant Professor
    Department of Biostatistics
    Department of Statistics
    University of Kentucky
