Is there a way to select the top 10% of the values from a column?

For example the top 10% best selling items where number of sales is a column.

Thanks.

  • Darren Duncan at Oct 17, 2010 at 11:53 pm

    Tim Uckun wrote:
    > Is there a way to select the top 10% of the values from a column?
    >
    > For example the top 10% best selling items where number of sales is a column.

    The top 10% would be a variable number of records. Is that what you want, or
    would you rather, say, just see the top N items?

    The latter is easy, just something like this:

    SELECT * FROM mytbl ORDER BY num_sales DESC LIMIT $1

    ... where $1 is the number of records you want.

    To actually get 10% of the records, you can replace the $1 with a subquery,
    something like this probably:

    SELECT * FROM mytbl ORDER BY num_sales DESC
    LIMIT (SELECT (count(*) / 10) AS selnum FROM mytbl)

    There are probably other ways to do it also.

    Maybe you want all items whose sales are at or above the 90th percentile,
    or maybe you want whatever generated the most profit, etc.
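
    The percentile idea can be sketched directly. This is only an illustration,
    assuming PostgreSQL 9.4 or later (newer than this thread) for the ordered-set
    aggregate percentile_cont, and reusing the placeholder names mytbl and
    num_sales from the queries above:

    ```sql
    -- Keep every row whose num_sales is at or above the 90th-percentile value.
    -- Unlike LIMIT, rows that tie at the cut-off value are all kept.
    SELECT *
    FROM mytbl
    WHERE num_sales >= (
        SELECT percentile_cont(0.9) WITHIN GROUP (ORDER BY num_sales)
        FROM mytbl
    )
    ORDER BY num_sales DESC;
    ```

    Because ties at the cut-off are included, this can return more than 10% of
    the rows, which may or may not be what is wanted.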

    -- Darren Duncan
  • Tim Uckun at Oct 18, 2010 at 12:03 am

    On Mon, Oct 18, 2010 at 12:53 PM, Darren Duncan wrote:
    > Tim Uckun wrote:
    >> Is there a way to select the top 10% of the values from a column?
    >>
    >> For example the top 10% best selling items where number of sales is a
    >> column.
    >
    > The top 10% would be a variable number of records. Is that what you want,
    > or would you rather, say, just see the top N items?

    Yes, this is what I want. I don't care how many records I pull up (well,
    I might want to put a LIMIT on them). I want the top 10%.
  • Peter Geoghegan at Oct 17, 2010 at 11:56 pm

    On 18 October 2010 00:33, Tim Uckun wrote:
    > Is there a way to select the top 10% of the values from a column?
    >
    > For example the top 10% best selling items where number of sales is a column.

    That is a bit problematic because it necessitates knowing the number
    of rows total, and slow counting is an idiosyncrasy of postgres.

    http://wiki.postgresql.org/wiki/Slow_Counting

    To get the top 10%:

    SELECT * FROM table LIMIT(SELECT (COUNT(*) * 0.1)::integer FROM table)
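
    If an exact count is too slow, the planner's row estimate can stand in for
    it. A rough sketch, assuming the placeholder names mytbl and num_sales used
    earlier in the thread ("table" itself is a reserved word) and reasonably
    fresh statistics, since pg_class.reltuples is only an estimate maintained by
    VACUUM and ANALYZE. An ORDER BY is included so that the rows kept are the
    highest-selling ones rather than an arbitrary subset:

    ```sql
    SELECT *
    FROM mytbl
    ORDER BY num_sales DESC
    LIMIT (
        -- reltuples is an approximate row count, not an exact one.
        -- greatest() guards against a negative value, which some versions
        -- report for a table that has never been vacuumed or analyzed.
        SELECT (greatest(reltuples, 0) * 0.1)::integer
        FROM pg_class
        WHERE oid = 'mytbl'::regclass
    );
    ```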

    --
    Regards,
    Peter Geoghegan
  • Tim Uckun at Oct 18, 2010 at 12:06 am

    > That is a bit problematic because it necessitates knowing the number
    > of rows total, and slow counting is an idiosyncrasy of postgres.
    >
    > http://wiki.postgresql.org/wiki/Slow_Counting
    >
    > To get the top 10%:
    >
    > SELECT * FROM table LIMIT(SELECT (COUNT(*) * 0.1)::integer FROM table)

    I think I wasn't making myself clear. I don't want the top 10% of the
    rows. I want the rows with the top 10% of the values in a column.

    In my case there is a very non-linear set of values. The lowest value
    is 1 and the vast majority of records have a 1 in the column. The
    highest value might be in the tens of thousands. I want to pull up
    the records that have the top ten percent of values.
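
    One possible reading of this, sketched under assumptions: treat "the top
    ten percent values" as values within 10% of the column's maximum. The names
    mytbl and num_sales are placeholders, and the 0.9 factor encodes this
    particular interpretation, which may not be the intended one:

    ```sql
    -- Rows whose num_sales is at least 90% of the largest num_sales.
    -- With a skewed column (mostly 1s, a few values in the tens of
    -- thousands) this selects by value, regardless of how many rows match.
    SELECT *
    FROM mytbl
    WHERE num_sales >= 0.9 * (SELECT max(num_sales) FROM mytbl)
    ORDER BY num_sales DESC;
    ```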
  • Craig Ringer at Oct 18, 2010 at 2:13 am

    On 10/18/2010 08:06 AM, Tim Uckun wrote:
    >> That is a bit problematic because it necessitates knowing the number
    >> of rows total, and slow counting is an idiosyncrasy of postgres.
    >>
    >> http://wiki.postgresql.org/wiki/Slow_Counting
    >>
    >> To get the top 10%:
    >>
    >> SELECT * FROM table LIMIT(SELECT (COUNT(*) * 0.1)::integer FROM table)
    >
    > I think I wasn't making myself clear. I don't want the top 10% of the
    > rows. I want the rows with the top 10% of the values in a column.

    OK, so you want a median-style "sort them in descending order and count
    down until you've selected the first 10% of rows" approach? In other
    words, values in the 90th percentile of the distribution?

    Try this. Given table "x" with single integer column "y", obtain rows of
    x in the 90th percentile of y:

    SELECT ranked.y
    FROM (SELECT percent_rank() OVER (ORDER BY y DESC) AS pc, y FROM x) AS ranked
    WHERE pc <= 0.1;

    or:

    SELECT ranked.y
    FROM (SELECT ntile(10) OVER (ORDER BY y DESC) AS pc, y FROM x) AS ranked
    WHERE pc = 1;

    See:

    http://www.postgresql.org/docs/current/static/functions-window.html


    Both of these seem to produce odd results with small input row counts.
    Test carefully before trusting these expressions, as I'm quite new to
    the use of window functions.
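
    The odd results with small inputs appear to come from how the two functions
    treat ties and row counts: percent_rank() is (rank - 1) / (rows - 1), so
    tied rows share one value, while ntile(10) deals rows into ten buckets by
    position and can split a group of ties across buckets. A small illustration
    with made-up data (the VALUES list is hypothetical):

    ```sql
    WITH x(y) AS (
        VALUES (9), (9), (9), (1), (1), (1), (1), (1), (1), (1)
    )
    SELECT y,
           percent_rank() OVER (ORDER BY y DESC) AS pct_rank,
           ntile(10)      OVER (ORDER BY y DESC) AS bucket
    FROM x
    ORDER BY y DESC, bucket;
    -- The three 9s tie for rank 1, so all three get pct_rank = 0 and pass
    -- "pct_rank <= 0.1" (30% of the rows).
    -- ntile(10) over ten rows puts one row in each bucket, so "bucket = 1"
    -- keeps only one of the three tied 9s.
    ```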

    --
    Craig Ringer
  • Tim Uckun at Oct 18, 2010 at 2:24 am

    > OK, so you want a median-style "sort them in descending order and count down
    > until you've selected the first 10% of rows" approach? In other words,
    > values in the 90th percentile of the distribution?
    >
    > Try this. Given table "x" with single integer column "y", obtain rows of x
    > in the 90th percentile of y:
    >
    > select ranked.y FROM (select percent_rank() over (order by y desc) as pc, y
    > from x) AS ranked WHERE pc <= 0.1;
    >
    > or:
    >
    > select ranked.y from (select ntile(10) over (order by y desc) as pc, y from
    > x) AS ranked WHERE pc = 1;

    Thanks, I will look into the window functions. I haven't used them
    before, so thanks for the tip.
  • Brent Wood at Oct 18, 2010 at 3:23 am
    Something like this should work (untested). It does require the extra subquery, so there may be a more efficient way.

    However, off the top of my head:

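    select a, b, c
    from mytbl   -- placeholder name; "table" itself is a reserved word
    -- an optional where clause goes here
    order by c desc
    limit (select count(*)/10 from mytbl);   -- repeat the same where clause here, if any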


    where c is the no of sales column



    Brent Wood
    DBA/GIS consultant
    NIWA, Wellington
    New Zealand
    Tim Uckun 10/18/10 3:40 PM >>>
    Is there a way to select the top 10% of the values from a column?

    For example the top 10% best selling items where number of sales is a column.

    Thanks.


Discussion Overview
group: pgsql-general
categories: postgresql
posted: Oct 17, '10 at 11:34p
active: Oct 18, '10 at 3:23a
posts: 8
users: 5
website: postgresql.org
irc: #postgresql
