Grokbase Groups R r-devel March 2012
R version 2.14.0, started with --vanilla
table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    2

This came from a local user who wanted to remove one particular response
from some tables, but also wants to have NA always reported for data
checking purposes.
I don't think the above is what anyone would want.

PS.
This is on a background of our local preference, which is to have the
default action of the table command be to report NA, if present. (It's
one of the few commands that we globally override at Mayo.) The user had
added only the exclude=2 argument, and the useNA value is our default.

The above makes this harder to do without rewriting the command
wholesale, which is ok (we've done it before at various times in R and
Splus) but we would avoid it if possible. Please no wars about whether
this is the "right" decision or not; we've done it for 10+ years and
quite firmly believe the extra robustness gained by having NA appear is
worth the maintenance bother, correctness being paramount in medical
research. We're not trying to convert anyone else, just get feedback on
the best way to approach this.

Terry T.
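
The result the user was after (value 2 removed, NA still reported) can be obtained without relying on how 'exclude' and 'useNA' interact. The following is a minimal sketch, not something proposed in the thread; the variable names kept and tab are chosen here for illustration. It drops the unwanted value before tabulating.

```r
x <- c(1, 2, 3, 4, NA)

# Drop the unwanted value first. NA %in% 2 is FALSE, so the NA is kept.
kept <- x[!(x %in% 2)]

# Tabulate what is left, reporting NA if any is present.
tab <- table(kept, useNA = "ifany")
print(tab)

names(tab)       # "1" "3" "4" NA
as.integer(tab)  # 1 1 1 1
```

Filtering first sidesteps the question of what 'exclude' does to non-factor input, at the cost of one extra line.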

  • Prof Brian Ripley at Mar 27, 2012 at 7:05 am

    On 19/03/2012 17:01, Terry Therneau wrote:
    > R version 2.14.0, started with --vanilla
    > table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
    >    1    3    4 <NA>
    >    1    1    1    2
    >
    > This came from a local user who wanted to remove one particular response
    > from some tables, but also wants to have NA always reported for data
    > checking purposes.
    > I don't think the above is what anyone would want.

    You have not told us what you want!

    Try
    table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')
       1    3    4 <NA>
       1    1    1    1

    Note carefully how 'exclude' is defined:

    exclude: levels to remove from all factors in '...'. If set to 'NULL',
    it implies 'useNA="always"'.

    As you did not specify a factor, 'exclude' was used in forming the 'levels'.

    > PS.
    > This is on a background of our local preference, which is to have the
    > default action of the table command be to report NA, if present. (It's
    > one of the few commands that we globally override at Mayo.) The user had
    > added only the exclude=2 argument, and the useNA value is our default.
    >
    > The above makes this harder to do without rewriting the command
    > wholesale, which is ok (we've done it before at various times in R and
    > Splus) but we would avoid it if possible. Please no wars about whether
    > this is the "right" decision or not; we've done it for 10+ years and
    > quite firmly believe the extra robustness gained by having NA appear is
    > worth the maintenance bother, correctness being paramount in medical
    > research. We're not trying to convert anyone else, just get feedback on
    > the best way to approach this.

    Most likely, feed table() a factor with the properties you want.

    > Terry T.
    >
    > ______________________________________________
    > R-devel at r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel

    --
    Brian D. Ripley, ripley at stats.ox.ac.uk
    Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
    University of Oxford, Tel: +44 1865 272861 (self)
    1 South Parks Road, +44 1865 272866 (PA)
    Oxford OX1 3TG, UK Fax: +44 1865 272595
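
The remark that 'exclude' was used in forming the levels can be illustrated with factor() and addNA() directly. This is only a sketch of one plausible mechanism, assuming table() converts a non-factor argument with the supplied 'exclude' and then adds an NA level when useNA is not "no"; it is not copied from the table() source, and the names f, g and counts are ours.

```r
x <- c(1, 2, 3, 4, NA)

# Passing exclude = 2 replaces factor()'s default exclude = NA,
# so "2" is dropped from the levels while NA is kept as a level.
f <- factor(x, exclude = 2)
levels(f)      # "1" "3" "4" NA
as.integer(f)  # 1 NA 2 3 4  (the value 2 now has no level, i.e. it is missing)

# Adding an NA level "if any" values are missing folds that
# missing code into the existing NA level.
g <- addNA(f, ifany = TRUE)
as.integer(g)  # 1 4 2 3 4

# Count per level: the NA level is hit twice, once by the real NA
# and once by the excluded 2.
counts <- tabulate(as.integer(g), nbins = nlevels(g))
counts         # 1 1 1 2
```

Under that assumption the excluded value is not discarded but recoded as missing, which would account for the count of 2 under <NA> in the original report.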
  • Terry Therneau at Mar 27, 2012 at 1:12 pm

    On 03/27/2012 02:05 AM, Prof Brian Ripley wrote:
    > On 19/03/2012 17:01, Terry Therneau wrote:
    >> R version 2.14.0, started with --vanilla
    >> table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
    >>    1    3    4 <NA>
    >>    1    1    1    2
    >>
    >> This came from a local user who wanted to remove one particular response
    >> from some tables, but also wants to have NA always reported for data
    >> checking purposes.
    >> I don't think the above is what anyone would want.
    >
    > You have not told us what you want!

    Want: that the resulting table exclude values of "2" from the printout,
    while still reporting NA. This is what the local user expected, the one
    who came to me with their query.

    There are lots of ways to get the program to do the right thing; the
    simplest is
    table(c(1,2,3,4,NA), exclude=2) # keeping the default for useNA

    You show another below.

    > Try
    > table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')
    >    1    3    4 <NA>
    >    1    1    1    1
    >
    > Note carefully how 'exclude' is defined:
    >
    > exclude: levels to remove from all factors in '...'. If set to 'NULL',
    > it implies 'useNA="always"'.
    >
    > As you did not specify a factor, 'exclude' was used in forming the
    > 'levels'.

    That is almost a "legal loophole" reading of the manual. I would never
    have seen through to that level of subtlety. A primary reason is that a
    simple test shows that exclude works on non-factors.

    I'm not sure what the best course of action is. What I've reported is a
    case where use of the options in a fairly obvious way gives an
    unexpected answer. On the other hand, I have never before seen or
    considered the case where someone wanted to exclude an actual data level
    from table: I myself would always have removed a column from the
    result. If fixing this causes other problems, then perhaps we just
    give up on this rare case.

    As to our local choices, we figured out a way to make display of NA the
    default without causing the above problem. As is often the case, a
    fairly simple solution became obvious to us about 30 minutes after
    submitting a question to the list.

    Terry T.
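
The thread does not say which solution was adopted locally. As a hedged illustration only, one way to make NA display the default is a thin wrapper that changes the default of 'useNA', combined with removing a column from the result as mentioned above. The name table_ifany is hypothetical and is not from the thread.

```r
# Hypothetical wrapper: identical to table() except that useNA
# defaults to "ifany" instead of "no".
table_ifany <- function(..., useNA = "ifany") {
  table(..., useNA = useNA)
}

x <- c(1, 2, 3, 4, NA)

# Tabulate everything, NA included.
full <- table_ifany(x)

# Remove the unwanted column from the result rather than using 'exclude'.
trimmed <- full[!(names(full) %in% "2")]

names(trimmed)       # "1" "3" "4" NA
as.integer(trimmed)  # 1 1 1 1
```

Dropping the column afterwards keeps the wrapper trivial and leaves 'exclude' out of the picture entirely.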
