Dear List,

Please assist me with how to import a csv file into PostgreSQL that
has the following structure - http://pastebin.com/m56bb6cb2.

The challenges are these:
1. import the whole csv into one table - problem is the number of
fields will not be the same for the various rows, the csv has no field
names
2. split this one table into the various tables - a total of 6 tables
3. update the field names with the correct names


--
Sindile Bidla

Search Discussions

  • Michael Wood at Aug 6, 2009 at 9:25 am

    2009/8/6 Intengu Technologies <sindile.bidla@gmail.com>:
    Dear List,
    Hey, a fellow South African :)
    Please assist me with how to import a csv file into PostgreSQL that
    has the following structure - http://pastebin.com/m56bb6cb2.

    The challenges are these:
    1. import the whole csv into one table - problem is the number of
    fields will not be the same for the various rows, the csv has no field
    names
    2. split this one table into the various tables - a total of 6 tables
    3. update the field names with the correct names
    I think it makes the most sense to split the CSV into separate files
    for the different tables and deal with them separately instead of
    trying to import the whole lot into one table and then splitting it up
    afterwards.

    One way to do that would be to import the data into a spreadsheet and
    then sort by the first column. The cut and paste the data for the
    different tables into different spreadsheets and export them to
    separate CSV files.

    Or you could write a script to separate it for you. e.g. the
    following Python script should split into different files, assuming
    the original file is called original.csv the new files will be called
    table1.csv, table2.csv etc.

    ---[ cut here ]---
    #!/usr/bin/env python

    import csv

    def get_csv_writer(filename):
    return csv.writer(open(filename, "w"), quoting=csv.QUOTE_ALL)

    infile = open("original.csv")
    table = {"1": get_csv_writer("table1.csv"),
    "2": get_csv_writer("table2.csv"),
    "3": get_csv_writer("table3.csv"),
    "4": get_csv_writer("table4.csv"),
    "5": get_csv_writer("table5.csv"),
    "6": get_csv_writer("table6.csv")}

    reader = csv.reader(infile)
    for line in reader:
    table[line[0]].writerow(line[1:])
    ---[ cut here ]---

    You could also just insert the data from the script instead of
    outputting new CSV files.

    P.S. I've done a quick test of the above script using your sample
    data, but there's no error checking in the script, so you will have to
    see if it outputs sensible data based on the real CSV file.

    --
    Michael Wood <esiotrot@gmail.com>
  • Intengu Technologies at Aug 6, 2009 at 11:47 am
    Fellow South African indeed ;)

    I run windows and have installed Python - forgive me I am not a
    programmer and my data is on an external HD.
    I then modified your script to be like this
    import csv
    def get_csv_writer(filename):
    return csv.writer(open(filename, "w"), quoting=csv.QUOTE_ALL)

    infile = open("F:\data\myfile.csv")
    table = {"1": get_csv_writer("F:\data\myfiletable1.csv"),
    "2": get_csv_writer("F:\data\myfiletable2.csv"),
    "3": get_csv_writer("F:\data\myfiletable3.csv"),
    "4": get_csv_writer("F:\data\myfiletable4.csv"),
    "5": get_csv_writer("F:\data\myfiletable5.csv"),
    "6": get_csv_writer("F:\data\myfiletable6.csv")}

    reader = csv.reader(infile)
    for line in reader:
    table[line[0]].writerow(line[1:])
    I then saved this as importcsv.py

    Opened it in python IDE, but I am getting syntax errors.

    On 06/08/2009, Michael Wood wrote:
    2009/8/6 Intengu Technologies <sindile.bidla@gmail.com>:
    Dear List,
    Hey, a fellow South African :)
    Please assist me with how to import a csv file into PostgreSQL that
    has the following structure - http://pastebin.com/m56bb6cb2.

    The challenges are these:
    1. import the whole csv into one table - problem is the number of
    fields will not be the same for the various rows, the csv has no field
    names
    2. split this one table into the various tables - a total of 6 tables
    3. update the field names with the correct names
    I think it makes the most sense to split the CSV into separate files
    for the different tables and deal with them separately instead of
    trying to import the whole lot into one table and then splitting it up
    afterwards.

    One way to do that would be to import the data into a spreadsheet and
    then sort by the first column. The cut and paste the data for the
    different tables into different spreadsheets and export them to
    separate CSV files.

    Or you could write a script to separate it for you. e.g. the
    following Python script should split into different files, assuming
    the original file is called original.csv the new files will be called
    table1.csv, table2.csv etc.

    ---[ cut here ]---
    #!/usr/bin/env python

    import csv

    def get_csv_writer(filename):
    return csv.writer(open(filename, "w"), quoting=csv.QUOTE_ALL)

    infile = open("original.csv")
    table = {"1": get_csv_writer("table1.csv"),
    "2": get_csv_writer("table2.csv"),
    "3": get_csv_writer("table3.csv"),
    "4": get_csv_writer("table4.csv"),
    "5": get_csv_writer("table5.csv"),
    "6": get_csv_writer("table6.csv")}

    reader = csv.reader(infile)
    for line in reader:
    table[line[0]].writerow(line[1:])
    ---[ cut here ]---

    You could also just insert the data from the script instead of
    outputting new CSV files.

    P.S. I've done a quick test of the above script using your sample
    data, but there's no error checking in the script, so you will have to
    see if it outputs sensible data based on the real CSV file.

    --
    Michael Wood <esiotrot@gmail.com>

    --
    Sindile Bidla
  • Michael Wood at Aug 6, 2009 at 12:36 pm

    2009/8/6 Intengu Technologies <sindile.bidla@gmail.com>:
    Fellow South African indeed ;)

    I run windows and have installed Python - forgive me I am not a
    programmer and my data is on an external HD.
    I then modified your script to be like this
    You will need double backslashes if you specify the full path.

    e.g.:
    infile = open("F:\\data\\myfile.csv")

    This is because the backslash is a quoting character in Python (and
    many other languages).

    If this still doesn't work, you should be able to change back to just
    filenames instead of the full path, put the script and myfile.csv in
    the same directory and then double-click on the script.
    import csv
    def get_csv_writer(filename):
    return csv.writer(open(filename, "w"), quoting=csv.QUOTE_ALL)

    infile = open("F:\data\myfile.csv")
    table = {"1": get_csv_writer("F:\data\myfiletable1.csv"),
    "2": get_csv_writer("F:\data\myfiletable2.csv"),
    "3": get_csv_writer("F:\data\myfiletable3.csv"),
    "4": get_csv_writer("F:\data\myfiletable4.csv"),
    "5": get_csv_writer("F:\data\myfiletable5.csv"),
    "6": get_csv_writer("F:\data\myfiletable6.csv")}

    reader = csv.reader(infile)
    for line in reader:
    table[line[0]].writerow(line[1:])
    I then saved this as importcsv.py

    Opened it in python IDE, but I am getting syntax errors.
    If the advice above does not help, send the syntax errors. This is of
    course not a Python support list, but I hope nobody minds too much :)

    --
    Michael Wood <esiotrot@gmail.com>
  • Frank Bax at Aug 6, 2009 at 10:43 am

    Intengu Technologies wrote:
    Please assist me with how to import a csv file into PostgreSQL that
    has the following structure - http://pastebin.com/m56bb6cb2.

    The challenges are these:
    1. import the whole csv into one table - problem is the number of
    fields will not be the same for the various rows, the csv has no field
    names
    2. split this one table into the various tables - a total of 6 tables
    3. update the field names with the correct names

    create table intengu1 (...)
    create table intengu2 (...)
    create table intengu3 (...)
    create table intengu4 (...)
    create table intengu5 (...)
    create table intengu6 (...)

    grep ^.1. /tmp/intengu.txt | psql -c "copy intengu1 from STDIN CSV;"
    grep ^.2. /tmp/intengu.txt | psql -c "copy intengu2 from STDIN CSV;"
    grep ^.3. /tmp/intengu.txt | psql -c "copy intengu3 from STDIN CSV;"
    grep ^.4. /tmp/intengu.txt | psql -c "copy intengu4 from STDIN CSV;"
    grep ^.5. /tmp/intengu.txt | psql -c "copy intengu5 from STDIN CSV;"
    grep ^.6. /tmp/intengu.txt | psql -c "copy intengu6 from STDIN CSV;"
  • Michael Wood at Aug 6, 2009 at 12:28 pm

    2009/8/6 Frank Bax <fbax@sympatico.ca>:
    Intengu Technologies wrote:
    Please assist me with how to import a csv file into PostgreSQL that
    has the following structure - http://pastebin.com/m56bb6cb2.

    The challenges are these:
    1. import the whole csv into one table - problem is the number of
    fields will not be the same for the various rows, the csv has no field
    names
    2. split this one table into the various tables - a total of 6 tables
    3. update the field names with the correct names
    create table intengu1 (...)
    create table intengu2 (...)
    create table intengu3 (...)
    create table intengu4 (...)
    create table intengu5 (...)
    create table intengu6 (...)

    grep ^.1. /tmp/intengu.txt | psql -c "copy intengu1 from STDIN CSV;"
    grep ^.2. /tmp/intengu.txt | psql -c "copy intengu2 from STDIN CSV;"
    grep ^.3. /tmp/intengu.txt | psql -c "copy intengu3 from STDIN CSV;"
    grep ^.4. /tmp/intengu.txt | psql -c "copy intengu4 from STDIN CSV;"
    grep ^.5. /tmp/intengu.txt | psql -c "copy intengu5 from STDIN CSV;"
    grep ^.6. /tmp/intengu.txt | psql -c "copy intengu6 from STDIN CSV;"
    heh :)

    Good point. I probably need some sleep.

    --
    Michael Wood <esiotrot@gmail.com>

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-novice @
categoriespostgresql
postedAug 6, '09 at 8:37a
activeAug 6, '09 at 12:36p
posts6
users3
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase