Grokbase Groups Pig user May 2013
Hello,

In a Pig script I want to store the results in 2 different MySQL tables
(using DBStorage) and a file on HDFS. This means 3 different STORE
statements. Right now when I do that, it logs a success message but
saves nothing. What am I missing? Is it even possible?

I know I can use MultiStorage from Piggybank, but I think it is only for 2
HDFS files. I want to store in 2 different MySQL tables, or in a MySQL table
and an HDFS file.

Any recommendations? Thanks.

Regards,
Shahab


  • Ruslan Al-Fakikh at May 8, 2013 at 4:27 pm
    Hi,

    It is possible to have multiple STORE statements, but I can't tell why you
    have nothing in the result.
    I recommend splitting the task across the appropriate tools: store
    everything in HDFS and then run Sqoop to upload the data to an RDBMS.

    Ruslan
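
The Sqoop route suggested above could be sketched as follows, assuming the Pig job has already written tab-separated output to HDFS. The JDBC URL, user name, table name, and export directory are placeholders, not values from the thread. The snippet builds a `sqoop export` command and only runs it if `sqoop` is on the PATH; otherwise it prints the command it would have run.

```shell
#!/bin/sh
# Placeholders: none of these values come from the thread.
JDBC_URL="jdbc:mysql://dbhost/mydb"
DB_USER="dbuser"
TABLE="high_scores"
EXPORT_DIR="/data/output/high"

# Build the export command. -P makes Sqoop prompt for the password
# instead of taking it on the command line.
set -- sqoop export \
    --connect "$JDBC_URL" \
    --username "$DB_USER" -P \
    --table "$TABLE" \
    --export-dir "$EXPORT_DIR" \
    --input-fields-terminated-by '\t'

if command -v sqoop >/dev/null 2>&1; then
    "$@"
else
    echo "sqoop not found; would run: $*"
fi
```

One export like this would be needed per target table, each pointing at the HDFS directory that holds that table's rows.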

  • Shahab Yunus at May 8, 2013 at 5:48 pm
    @Peter thanks for the tip. The -no_multiquery flag works, but I guess it
    will be a performance hit. It is weird because without the flag no error
    is generated. In fact, a success message is explicitly logged even though
    nothing is saved.

    @Ruslan, yeah I was thinking of that too, but if we indeed have to split
    the processing and first generate multiple HDFS files and then use Sqoop
    to load the RDBMS, then why not write a few more short Pig scripts to load
    those HDFS files into the RDBMS?

    Regards,
    Shahab
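
The workaround mentioned above, running the script with multi-query execution turned off, might be invoked like this. The script name is a placeholder. `-no_multiquery` (short form `-M`) is Pig's command-line switch for disabling the optimization. The snippet runs `pig` only if it is on the PATH and otherwise prints the command.

```shell
#!/bin/sh
# SCRIPT is a placeholder for the Pig script containing the STORE statements.
SCRIPT="${1:-multi_store.pig}"

# With multi-query execution off, Pig falls back to executing each STORE
# on its own rather than combining all of them into a single plan.
set -- pig -no_multiquery "$SCRIPT"

if command -v pig >/dev/null 2>&1; then
    "$@"
else
    echo "pig not found; would run: $*"
fi
```

The likely cost is the performance hit noted in the post: work shared between the STORE statements, such as loading and filtering the input, may be repeated for each one.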

  • Ruslan Al-Fakikh at May 10, 2013 at 12:56 am
    I was just thinking that using Sqoop is a more reliable and robust
    solution. I guess it may work in cases where other tools don't. And
    probably it can provide some more useful functionality.

  • Russell Jurney at May 10, 2013 at 1:07 am
    Are you running the script from the command line by filename, or
    pasting it into an interactive grunt session? If you're pasting, it
    only runs the first STORE statement. Just a guess.

    Russell Jurney http://datasyndrome.com
  • Shahab Yunus at May 10, 2013 at 1:21 am
    I am running it as a script through the exec command with the filename.
    Using -no_multiquery works though.

    Regards,
    Shahab

