Hi,

/delimiter/

The single character that separates columns within each row (line)
of the file. The default is a tab character in text mode, a comma in
CSV mode.


Why limit this to a single character?
For easy of use, should it be lifted, so we can specify such as '|*|'?

Rey

Search Discussions

  • Tom Lane at Sep 25, 2010 at 2:03 pm

    rey writes:
    Why limit this to a single character?
    Performance. Believe it or not, breaking fields at the delimiter is
    a significant factor in COPY speed.

    regards, tom lane
  • Rey at Sep 25, 2010 at 7:13 pm

    On 09/25/2010 10:03 AM, Tom Lane wrote:
    rey<reywang@optonline.net> writes:
    Why limit this to a single character?
    Performance. Believe it or not, breaking fields at the delimiter is
    a significant factor in COPY speed.

    regards, tom lane
    True, but just for 5% to 10% degradation here.
    For RDBMS, correct indexes and good logical design we are talking about
    10 times or more performance gains.

    Who cares about 10% waste here? Is it Oracle and other commercial RDBMS
    no such limitation.
  • Merlin Moncure at Sep 26, 2010 at 1:05 pm

    On Sat, Sep 25, 2010 at 3:12 PM, rey wrote:
    On 09/25/2010 10:03 AM, Tom Lane wrote:

    rey<reywang@optonline.net>  writes:
    Why limit this to a single character?
    Performance.  Believe it or not, breaking fields at the delimiter is
    a significant factor in COPY speed.

    regards, tom lane
    True, but just for 5% to 10% degradation here.
    For RDBMS, correct indexes and good logical design we are talking about 1
    times or more performance gains.

    Who cares about 10% waste here? Is it Oracle and other commercial RDBMS no
    such limitation.
    Believe it or not, data loading performance is one of the most common
    standard metrics people used to benchmark databases. A large class of
    applications need to slam data in the db as quickly as possible, do
    some work, and slam it out/dump it. Copy performance matters.

    merlin
  • Steve Crawford at Sep 29, 2010 at 6:36 pm

    On 09/25/2010 07:03 AM, Tom Lane wrote:
    rey<reywang@optonline.net> writes:
    Why limit this to a single character?
    Performance. Believe it or not, breaking fields at the delimiter is
    a significant factor in COPY speed.

    regards, tom lane
    I agree that that multi-character (or even regex) delimiters would be
    useful. Would it be reasonable for the copy process to differentiate
    between single character delimiters which could be processed in
    "high-speed" mode and multi-character or regex delimiters which would be
    available as needed albeit at the expense of a performance hit?

    Cheers,
    Steve
  • Bruce Momjian at Oct 13, 2010 at 3:29 am

    Steve Crawford wrote:
    On 09/25/2010 07:03 AM, Tom Lane wrote:
    rey<reywang@optonline.net> writes:
    Why limit this to a single character?
    Performance. Believe it or not, breaking fields at the delimiter is
    a significant factor in COPY speed.

    regards, tom lane
    I agree that that multi-character (or even regex) delimiters would be
    useful. Would it be reasonable for the copy process to differentiate
    between single character delimiters which could be processed in
    "high-speed" mode and multi-character or regex delimiters which would be
    available as needed albeit at the expense of a performance hit?
    I am not sure you are aware but Postgres never confuses delimiters from
    data because it uses a backslash before literal data that matches
    delimiters.

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Steve Crawford at Oct 18, 2010 at 8:10 pm

    On 10/12/2010 08:28 PM, Bruce Momjian wrote:
    Steve Crawford wrote:
    On 09/25/2010 07:03 AM, Tom Lane wrote:

    rey<reywang@optonline.net> writes:

    Why limit this to a single character?
    Performance. Believe it or not, breaking fields at the delimiter is
    a significant factor in COPY speed.

    regards, tom lane

    I agree that that multi-character (or even regex) delimiters would be
    useful. Would it be reasonable for the copy process to differentiate
    between single character delimiters which could be processed in
    "high-speed" mode and multi-character or regex delimiters which would be
    available as needed albeit at the expense of a performance hit?
    I am not sure you are aware but Postgres never confuses delimiters from
    data because it uses a backslash before literal data that matches
    delimiters.
    Yes, I am. But the discussion was about using multi-character strings as
    delimiters.

    But while I have encountered files using multiple-character delimiters,
    I'm finding myself leaning toward the camp that says that such cases are
    better processed externally by Perl/Python/sed/awk/Ruby/ETL/etc.
    Especially given the "fun" of defining how to properly escape a
    regex-matching string in a regex delimited file.

    Cheers,
    Steve

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-general @
categoriespostgresql
postedSep 24, '10 at 11:23p
activeOct 18, '10 at 8:10p
posts7
users5
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase