[R] percentage from density() (r-help, January 2012)
Hi folks,

I know that the density function gives an estimated density for a given
dataset. From that, I want to estimate the percentage of data in a
certain range. For example:
y = density(c(-20,rep(0,98),20))
plot(y, xlim=c(-4,4))
Now suppose I want to know the percentage of data lying in (-20,2).
Basically, it should be the area under the curve from -20 to 2. Does anybody
know a simple function to do it?

Thanks,

D.


  • Rolf Turner at Jan 27, 2012 at 11:09 pm

    You could try:

    foo <- with(y,splinefun(x,y))
    integrate(foo,lower=-20,upper=2)

    Note that

    integrate(foo,lower=min(y$x),upper=max(y$x))

    yields "1.000951 with absolute error < 0.00011", rather than giving
    exactly 1, so there's a bit of slop in the system.

    cheers,

    Rolf Turner
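
    The spline-and-integrate idea above can be sketched end to end as follows. Dividing by the total area is an added step, not part of the original suggestion; it is one possible way to absorb the small excess over 1. The names d, total and part are illustrative, and the subdivisions value is an assumed setting.

    ```r
    v <- c(-20, rep(0, 98), 20)
    d <- density(v)                      # default Gaussian kernel on a 512-point grid
    foo <- with(d, splinefun(x, y))      # interpolating spline through the grid values

    # Area over the whole grid; close to, but not exactly, 1.
    total <- integrate(foo, lower = min(d$x), upper = max(d$x),
                       subdivisions = 1000L)$value

    # Area over the range of interest.
    part <- integrate(foo, lower = -20, upper = 2,
                      subdivisions = 1000L)$value

    part / total                         # normalised estimate for (-20, 2)
    ```

    Note that the lower limit of -20 sits on a data point, so roughly half of that point's kernel mass falls outside the range.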
  • Greg Snow at Jan 29, 2012 at 4:11 am
    If you use logspline estimation (logspline package) instead of kernel density estimation then this is simple as there are cumulative area functions for logspline fits.

    If you need to do this with kernel density estimates then you can just find the area over your region for the kernel centered at each data point and average those values together to get the area under the entire density estimate.
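
    A minimal sketch of the kernel-averaging route described above, assuming the default Gaussian kernel of density() and its default bandwidth rule bw.nrd0(). For a Gaussian kernel the bandwidth is the kernel's standard deviation, so the area for each data point comes from pnorm(). The helper name kde_prob is hypothetical.

    ```r
    v <- c(-20, rep(0, 98), 20)
    bw <- bw.nrd0(v)                     # default bandwidth used by density()

    # Area of each Gaussian kernel over (lower, upper), averaged over the data.
    kde_prob <- function(lower, upper, data, bw) {
      mean(pnorm(upper, mean = data, sd = bw) - pnorm(lower, mean = data, sd = bw))
    }

    p <- kde_prob(-20, 2, v, bw)
    p                                    # roughly 0.96 for this data
    ```

    Because this works from the kernels directly rather than from the gridded output of density(), the areas over the whole real line sum to 1 with no interpolation error.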

    ______________________________________________
    R-help at r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.
  • William Dunlap at Jan 29, 2012 at 9:03 pm
    If v is your original data,
    v <- c(-20, rep(0,98), 20)
    why not use
    mean( -20 < v & v < 2)
    as your estimate of the probability that v is in (-20,2)?

    Estimating a density is like taking the derivative
    of a smooth of the empirical distribution function,
    so why not eliminate the middleman instead of integrating
    the estimated density? Any difference between the two
    methods tells more about the smoothing used than about
    the data involved. (Not that I am any sort of expert
    in this matter.)

    Bill Dunlap
    Spotfire, TIBCO Software
    wdunlap tibco.com
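
    The same empirical estimate can also be read off the empirical distribution function. This is a small sketch using ecdf() from base R; the names p_open, Fn and p_half_open are illustrative. Note that the ecdf difference covers the half-open interval (-20, 2], which matches the open interval here only because no data point equals 2.

    ```r
    v <- c(-20, rep(0, 98), 20)

    # Direct proportion of points strictly inside (-20, 2).
    p_open <- mean(-20 < v & v < 2)      # 0.98

    # Same idea via the empirical distribution function.
    Fn <- ecdf(v)
    p_half_open <- Fn(2) - Fn(-20)       # 0.98, i.e. P(-20 < X <= 2)
    ```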
  • Duke at Jan 30, 2012 at 2:52 pm
    Great suggestions and comments, Bill, Greg and Rolf. You provided me
    some valuable ways to deal with the data I am working with. Thank you
    all so much!

    Bests,

    D.
