I'm having some trouble working with aggregates. I think I'm missing
the concept of how aggregates are treated. Is there a tutorial page?

Here's my problem, no doubt very simple. It's one table only, nice and
easy...
SELECT DISTINCT pdb_id FROM chain WHERE chain_id = 'A';

But now I want to count how many records are returned and I can't work
it out. I'm sure I'm missing something obvious.
For instance "SELECT COUNT(pdb_id) FROM chain WHERE chain_id = 'A';"
works fine but of course is no longer distinct.


A secondary question (relating to my quality control problem but not
the subject heading) is why I can't do this:
"SELECT DISTINCT chain_id FROM chain WHERE pdb_id IN (SELECT pdb_code
FROM pdb_entry WHERE chain_count=1);"
This looks OK but takes apparently forever (or until I lose patience)
on my Mac G4. There are about 10,000 values in that subquery, so I
didn't expect it to be lightning fast, but it seems ridiculous.


Relevant snippets of table:
pdb_entry - 19,587 entries
---------
pdb_code is KEY, CHAR(4), unique, indexed
chain_count is INTEGER

chain - 40,844 entries
-----
pdb_id is CHAR(4) Foreign key, indexed
chain_id is CHAR(1)



Cath Lawrence, Cath.Lawrence@anu.edu.au
Senior Scientific Programmer, Centre for Bioinformation Science,
John Curtin School of Medical Research (room 4088)
Australian National University, Canberra ACT 0200
ph: (02) 61257959 mobile: 0421-902694 fax: (02) 61252595

Search Discussions

  • Paul at Oct 1, 2003 at 9:10 am
    On 1 Oct 2003 at 17:26, Cath Lawrence wrote:

    Does this help?

    select count(distinct pdb_id),chainid from chain group by chain_id;
    I'm having some trouble working with aggregates. I think I'm missing
    the concept of how aggregates are treated. Is there a tutorial page?

    Here's my problem, no doubt very simple. It's one table only, nice and
    easy...
    SELECT DISTINCT pdb_id FROM chain WHERE chain_id = 'A';

    But now I want to count how many records are returned and I can't work
    it out. I'm sure I'm missing something obvious.
    For instance "SELECT COUNT(pdb_id) FROM chain WHERE chain_id = 'A';"
    works fine but of course is no longer distinct.


    A secondary question (relating to my quality control problem but not
    the subject heading) is why I can't do this:
    "SELECT DISTINCT chain_id FROM chain WHERE pdb_id IN (SELECT pdb_code
    FROM pdb_entry WHERE chain_count=1);"
    This looks OK but takes apparently forever (or until I lose patience)
    on my Mac G4. There are about 10,000 values in that subquery, so I
    didn't expect it to be lightning fast, but it seems ridiculous.


    Relevant snippets of table:
    pdb_entry - 19,587 entries
    ---------
    pdb_code is KEY, CHAR(4), unique, indexed
    chain_count is INTEGER

    chain - 40,844 entries
    -----
    pdb_id is CHAR(4) Foreign key, indexed
    chain_id is CHAR(1)



    Cath Lawrence, Cath.Lawrence@anu.edu.au
    Senior Scientific Programmer, Centre for Bioinformation Science,
    John Curtin School of Medical Research (room 4088)
    Australian National University, Canberra ACT 0200
    ph: (02) 61257959 mobile: 0421-902694 fax: (02) 61252595


    ---------------------------(end of broadcast)---------------------------
    TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
  • Paul at Oct 1, 2003 at 9:28 am
    On 1 Oct 2003 at 9:55, Cath Lawrence <Cath.Lawrence@ wrote:
    A secondary question (relating to my quality control problem but not
    the subject heading) is why I can't do this:
    "SELECT DISTINCT chain_id FROM chain WHERE pdb_id IN (SELECT pdb_code
    FROM pdb_entry WHERE chain_count=1);"
    This looks OK but takes apparently forever (or until I lose patience)
    on my Mac G4. There are about 10,000 values in that subquery, so I
    didn't expect it to be lightning fast, but it seems ridiculous.
    Try indexing the column in the where clause

    Relevant snippets of table:
    pdb_entry - 19,587 entries
    ---------
    pdb_code is KEY, CHAR(4), unique, indexed
    chain_count is INTEGER

    chain - 40,844 entries
    -----
    pdb_id is CHAR(4) Foreign key, indexed
    chain_id is CHAR(1)



    Cath Lawrence, Cath.Lawrence@anu.edu.au
    Senior Scientific Programmer, Centre for Bioinformation Science,
    John Curtin School of Medical Research (room 4088)
    Australian National University, Canberra ACT 0200
    ph: (02) 61257959 mobile: 0421-902694 fax: (02) 61252595


    ---------------------------(end of broadcast)---------------------------
    TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
  • Zhix at Oct 7, 2003 at 12:51 am
    This should give you distinct values counts:

    select pdb_id,count(*)
    from chain
    where chain_id = 'A'
    group by pdb_id;

    zhix.
  • Zhix at Oct 7, 2003 at 3:21 am
    This should do:

    select pdb_id,count(*)
    from chain
    where chain_id = 'A'
    group by pdb_id;

    rgds,
    zhix
  • Louise Cofield at Oct 1, 2003 at 2:54 pm
    http://www.postgresql.org/docs/7.3/static/tutorial-agg.html

    :)

    Louise

    -----Original Message-----
    From: pgsql-novice-owner@postgresql.org
    On Behalf Of Cath Lawrence
    Sent: Wednesday, October 01, 2003 1:27 AM
    To: pgsql-novice@postgresql.org
    Subject: [NOVICE] SELECT syntax question - combining COUNT and DISTINCT

    I'm having some trouble working with aggregates. I think I'm missing
    the concept of how aggregates are treated. Is there a tutorial page?

    Here's my problem, no doubt very simple. It's one table only, nice and
    easy...
    SELECT DISTINCT pdb_id FROM chain WHERE chain_id = 'A';

    But now I want to count how many records are returned and I can't work
    it out. I'm sure I'm missing something obvious.
    For instance "SELECT COUNT(pdb_id) FROM chain WHERE chain_id = 'A';"
    works fine but of course is no longer distinct.


    A secondary question (relating to my quality control problem but not
    the subject heading) is why I can't do this:
    "SELECT DISTINCT chain_id FROM chain WHERE pdb_id IN (SELECT pdb_code
    FROM pdb_entry WHERE chain_count=1);"
    This looks OK but takes apparently forever (or until I lose patience)
    on my Mac G4. There are about 10,000 values in that subquery, so I
    didn't expect it to be lightning fast, but it seems ridiculous.


    Relevant snippets of table:
    pdb_entry - 19,587 entries
    ---------
    pdb_code is KEY, CHAR(4), unique, indexed
    chain_count is INTEGER

    chain - 40,844 entries
    -----
    pdb_id is CHAR(4) Foreign key, indexed
    chain_id is CHAR(1)



    Cath Lawrence, Cath.Lawrence@anu.edu.au
    Senior Scientific Programmer, Centre for Bioinformation Science,
    John Curtin School of Medical Research (room 4088)
    Australian National University, Canberra ACT 0200
    ph: (02) 61257959 mobile: 0421-902694 fax: (02) 61252595


    ---------------------------(end of broadcast)---------------------------
    TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
  • Manfred Koizar at Oct 1, 2003 at 6:06 pm

    On Wed, 1 Oct 2003 17:26:55 +1000, Cath Lawrence wrote:
    "SELECT DISTINCT chain_id FROM chain WHERE pdb_id IN (SELECT pdb_code
    FROM pdb_entry WHERE chain_count=1);"
    This looks OK but takes apparently forever
    SELECT DISTINCT c.chain_id
    FROM chain AS c, pdb_entry AS p
    WHERE c.pdb_id = p.pdb_code
    AND p.chain_count = 1;

    Servus
    Manfred

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-novice @
categoriespostgresql
postedOct 1, '03 at 7:26a
activeOct 7, '03 at 3:21a
posts7
users5
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase