Hi,

I want to split a table to 2 small tables. The 1st one contains 60% records
which are randomly selected from the source table.
How to do it?

Regards,
Felix

Search Discussions

  • Andreas Kretschmer at Oct 17, 2006 at 8:33 am

    Felix Zhang schrieb:

    Hi,

    I want to split a table to 2 small tables. The 1st one contains 60% records
    which are randomly selected from the source table.
    How to do it?
    Why do you want to do this?


    Andreas
    --
    Really, I'm not out to destroy Microsoft. That will just be a completely
    unintentional side effect. (Linus Torvalds)
    "If I was god, I would recompile penguin with --enable-fly." (unknow)
    Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
  • Gregory S. Williamson at Oct 17, 2006 at 8:53 am
    Perhaps something like:

    CREATE TABLE foo2 AS SELECT * FROM foo WHERE (rand() <= 0.60);

    ?

    HTH,

    Greg Williamson
    DBA
    GlobeXplorer LLC


    -----Original Message-----
    From: pgsql-sql-owner@postgresql.org on behalf of Andreas Kretschmer
    Sent: Tue 10/17/2006 1:34 AM
    To: pgsql-general@postgresql.org; pgsql-sql@postgresql.org; pgsql-novice@postgresql.org
    Cc:
    Subject: Re: [SQL] [GENERAL] How to split a table?

    Felix Zhang <felix.zhang.2005@gmail.com> schrieb:
    Hi,

    I want to split a table to 2 small tables. The 1st one contains 60% records
    which are randomly selected from the source table.
    How to do it?
    Why do you want to do this?


    Andreas
    --
    Really, I'm not out to destroy Microsoft. That will just be a completely
    unintentional side effect. (Linus Torvalds)
    "If I was god, I would recompile penguin with --enable-fly." (unknow)
    Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°

    ---------------------------(end of broadcast)---------------------------
    TIP 9: In versions below 8.0, the planner will ignore your desire to
    choose an index scan if your joining column's datatypes do not
    match


    -------------------------------------------------------
    Click link below if it is SPAM gsw@globexplorer.com
    "https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=4534936b271274356172766&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1"
    !DSPAM:4534936b271274356172766!
    -------------------------------------------------------
  • Andreas Kretschmer at Oct 17, 2006 at 9:12 am

    am Tue, dem 17.10.2006, um 1:53:35 -0700 mailte Gregory S. Williamson folgendes:
    Perhaps something like:

    CREATE TABLE foo2 AS SELECT * FROM foo WHERE (rand() <= 0.60);
    Then we have 2 tables: one with 100% data and one with around 60% ;-)
    If the table contains a primary key you can delete simple the copied
    records from the origin table.

    (delete from origin where pk in (select pk from copy);

    -----Original Message-----
    Please, no top-posting with fullquote below.


    Andreas
    --
    Andreas Kretschmer
    Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: -> Header)
    GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
  • Gregory S. Williamson at Oct 17, 2006 at 9:19 am
    A crude approach would be to add a column to the original table; then update that based on the rand() call:

    update foo set i_am_a_60 = 1 where (rand() <= 0.60);
    create table foo_60 as select * from foo where i_am_a_60 = 1;
    create table foo_40 as select * from foo where i_am_a_60 <> 1;

    The CASE condition might be usable as well but I haven't puzzled it out ...

    G


    -----Original Message-----
    From: pgsql-sql-owner@postgresql.org on behalf of A. Kretschmer
    Sent: Tue 10/17/2006 2:12 AM
    To: pgsql-sql@postgresql.org; pgsql-novice@postgresql.org
    Cc:
    Subject: Re: [SQL] [GENERAL] How to split a table?

    am Tue, dem 17.10.2006, um 1:53:35 -0700 mailte Gregory S. Williamson folgendes:
    Perhaps something like:

    CREATE TABLE foo2 AS SELECT * FROM foo WHERE (rand() <= 0.60);
    Then we have 2 tables: one with 100% data and one with around 60% ;-)
    If the table contains a primary key you can delete simple the copied
    records from the origin table.

    (delete from origin where pk in (select pk from copy);

    -----Original Message-----
    Please, no top-posting with fullquote below.


    Andreas
    --
    Andreas Kretschmer
    Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: -> Header)
    GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net

    ---------------------------(end of broadcast)---------------------------
    TIP 5: don't forget to increase your free space map settings


    -------------------------------------------------------
    Click link below if it is SPAM gsw@globexplorer.com
    "https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45349cb0275391789821027&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1"
    !DSPAM:45349cb0275391789821027!
    -------------------------------------------------------
  • Jorge Godoy at Oct 17, 2006 at 10:52 am

    "Gregory S. Williamson" <gsw@globexplorer.com> writes:

    A crude approach would be to add a column to the original table; then update
    that based on the rand() call:

    update foo set i_am_a_60 = 1 where (rand() <= 0.60);
    create table foo_60 as select * from foo where i_am_a_60 = 1;
    create table foo_40 as select * from foo where i_am_a_60 <> 1;

    The CASE condition might be usable as well but I haven't puzzled it out ...
    If he's asking that 60% of the contents of the original table be randomly
    selected this won't work. He'll have to count how many rows the original
    table has, then loop randomly selecting rows until he has reached 60% of that
    total. Otherwise he might end up with something completely different from
    what he wants.

    On the other hand if he wants rows whose randomness factor at the time they
    were looked at was bigger than 0.6 then he can use that rand() trick.

    --
    Jorge Godoy <jgodoy@gmail.com>
  • Felix Zhang at Oct 17, 2006 at 9:07 am
    to do some statistics analysis.

    2006/10/17, Andreas Kretschmer <akretschmer@spamfence.net>:
    Felix Zhang <felix.zhang.2005@gmail.com> schrieb:
    Hi,

    I want to split a table to 2 small tables. The 1st one contains 60% records
    which are randomly selected from the source table.
    How to do it?
    Why do you want to do this?


    Andreas
    --
    Really, I'm not out to destroy Microsoft. That will just be a completely
    unintentional side effect. (Linus Torvalds)
    "If I was god, I would recompile penguin with --enable-fly." (unknow)
    Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°

    ---------------------------(end of broadcast)---------------------------
    TIP 4: Have you searched our list archives?

    http://archives.postgresql.org
  • Christoph Frick at Oct 17, 2006 at 9:47 am

    On Tue, Oct 17, 2006 at 03:39:21PM +0800, Felix Zhang wrote:

    I want to split a table to 2 small tables. The 1st one contains 60%
    records which are randomly selected from the source table. How to do
    it?
    i do my A/B-Group splitting usually by &1 the serial of the table.
    assuming, that there are no irregularities in the process of creating
    your entries, this should give a 50/50 split, which is reuseable for
    future things and there never is a intersection of the two groups.

    --
    cu
  • Obe, Regina at Oct 17, 2006 at 10:55 am
    I would do

    select * into mynewtable from myoldtable ORDER by random() LIMIT 15000

    where 15000 in this case is <your table row count>*.6

    If you want to create another table with 40% of the remaining data then
    something like

    select * into mynewtable2 from myoldtable where myoldtable.primarykey
    NOT IN(select primarykey from mynewtable)
    ]
    In this case primarykey you would replace with the primary key field of
    your table.


    ________________________________

    From: pgsql-novice-owner@postgresql.org
    On Behalf Of Felix Zhang
    Sent: Tuesday, October 17, 2006 3:39 AM
    To: pgsql-sql@postgresql.org; pgsql-general@postgresql.org;
    pgsql-novice@postgresql.org
    Subject: [NOVICE] How to split a table?


    Hi,

    I want to split a table to 2 small tables. The 1st one contains 60%
    records which are randomly selected from the source table.
    How to do it?

    Regards,
    Felix



    -----------------------------------------
    The substance of this message, including any attachments, may be
    confidential, legally
    privileged and/or exempt from disclosure pursuant to Massachusetts
    law. It is intended
    solely for the addressee. If you received this in error, please
    contact the sender and
    delete the material from any computer.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-novice @
categoriespostgresql
postedOct 17, '06 at 7:39a
activeOct 17, '06 at 10:55a
posts9
users7
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase