FAQ
Hi everyone,

A bit of help please. This query works for me, and gives me 332,000
rows of records out of my table of 400,000 rows. It takes about a
minute to run.

SELECT vehicle.*
FROM vehicle, incidents
WHERE incidents.key = vehicle.incident_link;

Would I would like to do however, is remove the data from this table
that does not match the above query. So to start with, I thought that
I'd better try and identify this data using the query below. You'll
note I have simply changed the '=' for '!=' instead.

SELECT vehicle.*
FROM vehicle, incidents
WHERE incidents.key != vehicle.incident_link;

However when I run this query, after about 10-15 minutes, I still
don't have a result. I don't really understand why not.

Cheers as always.

James

Search Discussions

  • Merlin Moncure at Jun 8, 2011 at 10:24 pm

    On Wed, Jun 8, 2011 at 4:57 PM, James David Smith wrote:
    Hi everyone,

    A bit of help please. This query works for me, and gives me 332,000
    rows of records out of my table of 400,000 rows. It takes about a
    minute to run.

    SELECT vehicle.*
    FROM vehicle, incidents
    WHERE incidents.key = vehicle.incident_link;

    Would I would like to do however, is remove the data from this table
    that does not match the above query. So to start with, I thought that
    I'd better try and identify this data using the query below. You'll
    note I have simply changed the '=' for '!=' instead.

    SELECT vehicle.*
    FROM vehicle, incidents
    WHERE incidents.key != vehicle.incident_link;

    However when I run this query, after about 10-15 minutes, I still
    don't have a result. I don't really understand why not.
    You're getting a combination of every record from vehicle combined
    with every record incidents except in the very specific case where the
    identifiers match -- not want you want. There are a ton of ways to do
    what you want in sql. Typically the best/fastest is left join/not
    null:

    if you are trying to find vehicles with an incident key that is not in
    the incident table:
    SELECT vehicle.*
    FROM vehicle LEFT JOIN incidents ON incidents.key = vehicle.incident_link
    WHERE incidents.key IS NULL

    Incidents with no record in the vehicle table:
    SELECT incidents .*
    FROM incidents LEFT JOIN vehicle ON incidents.key = vehicle.incident_link
    WHERE vehicle.incident_link IS NULL

    Another way to do it is with 'where not exists' -- which is a lot
    easier to fold into delete syntax:
    SELECT vehicle.*
    FROM vehicle
    WHERE NOT EXISTS
    (
    SELECT 1 FROM incidents WHERE incidents.key = vehicle.incident_link
    )

    which you can turn into:
    DELETE FROM vehicle
    WHERE NOT EXISTS
    (
    SELECT 1 FROM incidents WHERE incidents.key = vehicle.incident_link
    )

    etc (don't jump directly to the delete, test it first!)

    merlin
  • Merlin Moncure at Jun 9, 2011 at 4:27 pm

    On Thu, Jun 9, 2011 at 6:17 AM, James David Smith wrote:
    Hi Merlin,

    Thank you very much for the full and brilliant reply. The last query
    you wrote does exactly what I want it too. I wonder whether it's not
    too much trouble whether you could explain to me a couple of things
    though...?

    1) I understand what you are saying my query does. Like an outer join
    I think? But don't get why. Using '!=' is the opposite of '=' is it
    not?
    It is the opposite -- just not in the way you are thinking. In SQL,
    joins between tables means 'give me every combination of data from
    table A combined with table B given a condition'. If A and B each
    have 100 records with identifiers 1-100, the not equal join would give
    you a join result of 99 records for A=1 (with B 2-100), 99 records for
    A=2 (with B 1, 3-100) etc. for a total of 9900 records.

    Your problem is that you are still associating tables A and B in your
    head in a way that is not expressed in the join. You have to imagine
    both tables as pools of unassociated records with no ordering except
    for what you give in the query.
    2) I understand the 'where not exists' query you suggest, and have
    used that, but I don't see why you use '1' in it. What does the 1 do?
    where not exists means 'return this record if this query does not
    return at least 1 record' -- since we don't care what is in the record
    that is actually returned, I just use 1 as a shorthand because in SQL
    you can't write queries that return 0 fields -- this is a pretty
    common convention. I instead of 1, I could have used *, -999, or
    anything -- it doesn't matter.

    Aside: exists/not exists in relational parlance are called 'semi
    joins'. They can often be faster than regular joins because the
    server can bail early without having to express the full join.

    merlin

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-novice @
categoriespostgresql
postedJun 8, '11 at 9:58p
activeJun 9, '11 at 4:27p
posts3
users2
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase