FAQ
Hi everyone,

Could someone help me with this question please? I have a table with
four columns:

- gps_id (primary key)
- date_time
- crimes_link (foreign key)
- osgb36_geom

The data in the table involves lots of gps tracks of different
journeys. They are grouped into journeys by the 'crimes_link' field.
So for example the first 50 rows have an identical 'crimes_link'
field, then the next 50 rows a different value in 'crimes_link', then
the next 50 rows a different one, etc. What I would like to do is to
select the beginning location of each journey. This query gives me the
date_time of the beginning of the journey:

SELECT crimes_link, MIN(date_time)
FROM camdengps3
GROUP BY crimes_link;

However I need to add the osgb36_geom column into the query and am unable too.

Any ideas how to do this please?

Thank you

James Smith

Search Discussions

  • Michael Wood at Jul 9, 2011 at 5:50 pm

    On 9 July 2011 18:41, James David Smith wrote:
    Hi everyone,

    Could someone help me with this question please? I have a table with
    four columns:

    - gps_id (primary key)
    - date_time
    - crimes_link (foreign key)
    - osgb36_geom

    The data in the table involves lots of gps tracks of different
    journeys. They are grouped into journeys by the 'crimes_link' field.
    So for example the first 50 rows have an identical 'crimes_link'
    field, then the next 50 rows a different value in 'crimes_link', then
    the next 50 rows a different one, etc. What I would like to do is to
    select the beginning location of each journey. This query gives me the
    date_time of the beginning of the journey:

    SELECT crimes_link, MIN(date_time)
    FROM camdengps3
    GROUP BY crimes_link;

    However I need to add the osgb36_geom column into the query and am unable too.

    Any ideas how to do this please?
    Sorry, I don't have time to test this now and maybe someone else has a
    better way, but maybe it will give you and idea:

    SELECT l.date_time, l.crimes_link, l.osgb36_geom
    FROM camdengps3 AS l INNER JOIN (
    SELECT crimes_link, MIN(date_time) AS start
    FROM camdengps3
    GROUP BY crimes_link) AS r
    ON l.date_time = s.start AND l.crimes_link = s.crimes_link;

    --
    Michael Wood <esiotrot@gmail.com>
  • James David Smith at Jul 9, 2011 at 10:31 pm
    Dear Michael, Tom ( et al),

    Thanks for your help. Once I had edited Michael's code slightlyit
    worked perfectly! Thank you! I had just this minute managed to achieve
    a similar result, but by using two temp tables and about 10 more lines
    of code..! Haha.

    Unfortunately I am presented with 3425 rows of starting locations,
    when I know that there are only 2233 journeys in the table. It seems
    that some of the journeys have more than one starting point according
    to the GPS. Bad data. I think what I now need to do are to find the
    journeys that have more than one starting location, and to take some
    sort of average of their GEOM column. This code identifies how many
    starting points there are for each journey:

    ________________________
    DROP TABLE IF EXISTS temp;

    SELECT l.camdencrimes_link, l.osgb36_geom, date_time INTO TEMP TABLE temp
    FROM camdengps3 AS l INNER JOIN (
    SELECT camdencrimes_link, MIN(date_time) AS start
    FROM camdengps3
    GROUP BY camdencrimes_link) AS r
    ON l.date_time = r.start AND l.camdencrimes_link = r.camdencrimes_link;

    SELECT a.*, b
    FROM temp AS a
    INNER JOIN
    ( SELECT camdencrimes_link, COUNT(camdencrimes_link) AS b
    FROM temp
    GROUP BY camdencrimes_link) AS c
    ON a.camdencrimes_link = c.camdencrimes_link
    WHERE b >1;
    ________________________

    But I now need to take an average of the starting points for each
    journey where there are more than one starting point? Any ideas guys?

    Thank you so much for your help...

    James





    On 9 July 2011 18:50, Michael Wood wrote:
    On 9 July 2011 18:41, James David Smith wrote:
    Hi everyone,

    Could someone help me with this question please? I have a table with
    four columns:

    - gps_id (primary key)
    - date_time
    - crimes_link (foreign key)
    - osgb36_geom

    The data in the table involves lots of gps tracks of different
    journeys. They are grouped into journeys by the 'crimes_link' field.
    So for example the first 50 rows have an identical 'crimes_link'
    field, then the next 50 rows a different value in 'crimes_link', then
    the next 50 rows a different one, etc. What I would like to do is to
    select the beginning location of each journey. This query gives me the
    date_time of the beginning of the journey:

    SELECT crimes_link, MIN(date_time)
    FROM camdengps3
    GROUP BY crimes_link;

    However I need to add the osgb36_geom column into the query and am unable too.

    Any ideas how to do this please?
    Sorry, I don't have time to test this now and maybe someone else has a
    better way, but maybe it will give you and idea:

    SELECT l.date_time, l.crimes_link, l.osgb36_geom
    FROM camdengps3 AS l INNER JOIN (
    SELECT crimes_link, MIN(date_time) AS start
    FROM camdengps3
    GROUP BY crimes_link) AS r
    ON l.date_time = s.start AND l.crimes_link = s.crimes_link;

    --
    Michael Wood <esiotrot@gmail.com>
  • Tom Lane at Jul 9, 2011 at 5:53 pm

    James David Smith writes:
    ... What I would like to do is to
    select the beginning location of each journey. This query gives me the
    date_time of the beginning of the journey:
    SELECT crimes_link, MIN(date_time)
    FROM camdengps3
    GROUP BY crimes_link;
    However I need to add the osgb36_geom column into the query and am unable too.
    You could do it with SELECT DISTINCT ON; see the "weather reports"
    example in the SELECT reference page in the PG manual.

    The more SQL-standard way is to use a subselect, viz

    SELECT whatever
    FROM camdengps3 upper
    WHERE date_time = (SELECT MIN(date_time) FROM camdengps3 lower
    WHERE lower.crimes_link = upper.crimes_link);

    However this is generally a lot slower, and it also outputs
    multiple rows if there are multiple rows meeting the MIN date_time
    in any particular group, which might not be what you want.

    regards, tom lane
  • Gavin Flower at Jul 10, 2011 at 1:22 am

    On 10/07/11 04:41, James David Smith wrote:
    - gps_id (primary key)
    - date_time
    - crimes_link (foreign key)
    - osgb36_geom

    The data in the table involves lots of gps tracks of different
    journeys. They are grouped into journeys by the 'crimes_link' field.
    So for example the first 50 rows have an identical 'crimes_link'
    field, then the next 50 rows a different value in 'crimes_link', then
    the next 50 rows a different one, etc. What I would like to do is to
    select the beginning location of each journey. This query gives me the
    date_time of the beginning of the journey:

    SELECT crimes_link, MIN(date_time)
    FROM camdengps3
    GROUP BY crimes_link;

    However I need to add the osgb36_geom column into the query and am unable too.

    Any ideas how to do this please?

    Thank you

    James Smith
    Hi ,

    I would recommend adding an index, as follows:

    CREATE INDEX ON location(crimes_link, date_time);


    I tested the followiunmg query, I and observed that the above index
    improves performance. I used pg 9.1beta2, but this should also work for
    any pg veresion that imoplements the 'WITH' clause on 'SELECT'.


    WITH
    start (crimes_link, date_time) AS
    (
    SELECT crimes_link, min(date_time)
    FROM location
    GROUP BY crimes_link
    )
    SELECT
    l.gps_id,
    l.date_time,
    l.crimes_link,
    l.osgb36_geom
    FROM
    location l,
    start s
    WHERE
    l.crimes_link = s.crimes_link AND
    l.date_time = s.date_time
    ORDER BY
    l.gps_id;


    Cheers,
    Gavin

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-novice @
categoriespostgresql
postedJul 9, '11 at 4:42p
activeJul 10, '11 at 1:22a
posts5
users4
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase