Hi All,

I am a new comer on postgres world and now using it for some serious (at
least for me) projects. I have a need where I am running some analytical +
aggregate functions on data where ordering is done on Date type column.

From my initial read on documentation I believe internally a date type is
represented by integer type of data. This makes me wonder would it make any
good to create additional column of Integer type and update it as data gets
added and use this integer column for all ordering purposes for my sqls - or
should I not hasitate using Date type straight into my sql for ordering?

Better yet, is there anyway I can verify impact of ordering on Date type vs.
Integer type, apart from using \timing and explain plan?


Thanks for sharing your insights.
-DP.

Search Discussions

  • Tom Lane at Apr 27, 2011 at 4:12 pm

    Dhimant Patel writes:
    From my initial read on documentation I believe internally a date type is
    represented by integer type of data. This makes me wonder would it make any
    good to create additional column of Integer type and update it as data gets
    added and use this integer column for all ordering purposes for my sqls - or
    should I not hasitate using Date type straight into my sql for ordering?
    Don't overcomplicate things. Comparison of dates is just about as fast as
    comparison of integers, anyway.

    regards, tom lane
  • Maciek Sakrejda at Apr 27, 2011 at 4:12 pm
    This makes me wonder would it make any good to create additional column of Integer type and update it as data gets added and use this integer column for all ordering purposes for my sqls - or should I not hasitate using Date type straight into my sql for ordering?
    Keep in mind what Michael A. Jackson (among others) had to say on
    this: "The First Rule of Program Optimization: Don't do it. The Second
    Rule of Program Optimization (for experts only!): Don't do it yet."
    For one thing, adding an extra column to your data would mean more
    data you need to cram in the cache as you query, so even if the *raw*
    integer versus date ordering is faster, the "optimization" could still
    be a net loss due to the fatter tuples. If you're willing to live with
    *only* integer-based dates, that could help, but that seems
    exceptionally painful and not worth considering unless you run into
    trouble.
    Better yet, is there anyway I can verify impact of ordering on Date type vs. Integer type, apart from using \timing and explain plan?
    Remember to use explain analyze (and not just explain) when validating
    these sorts of things. Explain is really just a guess. Also remember
    to ensure that stats are up to date before you test this.

    ---
    Maciek Sakrejda | System Architect | Truviso

    1065 E. Hillsdale Blvd., Suite 215
    Foster City, CA 94404
    (650) 242-3500 Main
    www.truviso.com
  • Kevin Grittner at Apr 27, 2011 at 4:18 pm

    Dhimant Patel wrote:

    I am a new comer on postgres world and now using it for some
    serious (at least for me) projects. I have a need where I am
    running some analytical + aggregate functions on data where
    ordering is done on Date type column.

    From my initial read on documentation I believe internally a date
    type is represented by integer type of data. This makes me wonder
    would it make any good to create additional column of Integer type
    and update it as data gets added and use this integer column for
    all ordering purposes for my sqls - or should I not hasitate using
    Date type straight into my sql for ordering?
    I doubt that this will improve performance, particularly if you ever
    want to see your dates formatted as dates.
    Better yet, is there anyway I can verify impact of ordering on
    Date type vs. Integer type, apart from using \timing and explain
    plan?
    You might be better off just writing the code in the most natural
    way, using the date type for dates, and then asking about any
    queries which aren't performing as you hope they would. Premature
    optimization is often counter-productive. If you really want to do
    some benchmarking of relative comparison speeds, though, see the
    generate_series function -- it can be good at generating test tables
    for such things.

    -Kevin
  • Phoenix Kiula at Apr 27, 2011 at 4:46 pm

    On Thu, Apr 28, 2011 at 12:17 AM, Kevin Grittner wrote:

    Dhimant Patel wrote:
    I am a new comer on postgres world and now using it for some
    serious (at least for me)  projects. I have a need where I am
    running some analytical + aggregate functions on data where
    ordering is done on Date type column.

    From my initial read on documentation I believe internally a date
    type is represented by integer type of data. This makes me wonder
    would it make any good to create additional column of Integer type
    and update it as data gets added and use this integer column for
    all ordering purposes for my sqls - or should I not hasitate using
    Date type straight into my sql for ordering?
    I doubt that this will improve performance, particularly if you ever
    want to see your dates formatted as dates.
    Better yet, is there anyway I can verify impact of ordering on
    Date type vs. Integer type, apart from using \timing and explain
    plan?
    You might be better off just writing the code in the most natural
    way, using the date type for dates, and then asking about any
    queries which aren't performing as you hope they would.  Premature
    optimization is often counter-productive.  If you really want to do
    some benchmarking of relative comparison speeds, though, see the
    generate_series function -- it can be good at generating test tables
    for such things.



    There is a lot of really good advice here already. I'll just add one thought.

    If the dates in your tables are static based only on creation (as in
    only a CREATE_DATE, which will never be modified per row like a
    MODIFY_DATE for each record), then your thought might have made sense.
    But in that case you can already use the ID field if you have one?

    In most real world cases however the DATE field will likely be storing
    an update time as well. Which would make your thought about numbering
    with integers pointless.
  • Dhimant Patel at Apr 27, 2011 at 6:34 pm
    Thanks for all valuable insights. I decided to drop the idea of adding
    additional column and
    will just rely on Date column for all ordering.

    Tom - thanks for clear answer on the issue I was concerned about.
    Maciek,Kevin -
    thanks for ideas, hint on generate_series() - I will have to go through cpl
    of times of postgres documentation before I will have better grasp of all
    available tools but this forum is very valuable.


    -DP.

    On Wed, Apr 27, 2011 at 12:46 PM, Phoenix Kiula wrote:

    On Thu, Apr 28, 2011 at 12:17 AM, Kevin Grittner
    wrote:
    Dhimant Patel wrote:
    I am a new comer on postgres world and now using it for some
    serious (at least for me) projects. I have a need where I am
    running some analytical + aggregate functions on data where
    ordering is done on Date type column.

    From my initial read on documentation I believe internally a date
    type is represented by integer type of data. This makes me wonder
    would it make any good to create additional column of Integer type
    and update it as data gets added and use this integer column for
    all ordering purposes for my sqls - or should I not hasitate using
    Date type straight into my sql for ordering?
    I doubt that this will improve performance, particularly if you ever
    want to see your dates formatted as dates.
    Better yet, is there anyway I can verify impact of ordering on
    Date type vs. Integer type, apart from using \timing and explain
    plan?
    You might be better off just writing the code in the most natural
    way, using the date type for dates, and then asking about any
    queries which aren't performing as you hope they would. Premature
    optimization is often counter-productive. If you really want to do
    some benchmarking of relative comparison speeds, though, see the
    generate_series function -- it can be good at generating test tables
    for such things.



    There is a lot of really good advice here already. I'll just add one
    thought.

    If the dates in your tables are static based only on creation (as in
    only a CREATE_DATE, which will never be modified per row like a
    MODIFY_DATE for each record), then your thought might have made sense.
    But in that case you can already use the ID field if you have one?

    In most real world cases however the DATE field will likely be storing
    an update time as well. Which would make your thought about numbering
    with integers pointless.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-performance @
categoriespostgresql
postedApr 27, '11 at 3:28p
activeApr 27, '11 at 6:34p
posts6
users5
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase