Grokbase Groups Pig user June 2010
Are there any reporting tools that run on top of Pig or use Pig?
Or does everyone load TSV results into something like Excel?

I will need to create reports with labels and sequential Pig queries,
plus any fancy display stuff I can send out by email.


elein
elein@varlena.com


  • Russell Jurney at Jun 15, 2010 at 12:01 am
    When I need reports, I do:

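    # Header row; printf expands \t and \n in every shell, plain echo may not
    printf 'My\tcolumn\tnames\n' > report.tsv
    # Append the job output; the quotes let Hadoop, not the local shell, expand *
    hadoop fs -cat 'my/pig/output/*' >> report.tsv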

    If you need something more elaborate, you could use something like
    http://search.cpan.org/dist/Spreadsheet-WriteExcel/ or simply load your TSV
    into a database with a script after your pig job finishes, and use any of
    the database reporting tools.

    MySQL (with the Infobright engine if you have bigger data output) and
    something like Pentaho would work: http://www.pentaho.com/
    Tableau is really nice, and can load smaller TSV directly, but is Windows
    only and a bit pricey. http://www.tableausoftware.com/

    Russ
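
The "load your TSV into a database with a script" step could look roughly like the sketch below. It only prints a MySQL LOAD DATA statement for a TSV file that starts with a header row; the function name load_tsv_sql, the table name pig_report and the database name reports are made up for illustration, and the target table is assumed to exist already with matching columns.

```shell
# Print a MySQL LOAD DATA statement for a tab-separated file with a header row.
# Usage: load_tsv_sql <tsv_file> <table_name>
# Sketch only: file and table names are assumed to be trusted (no escaping).
load_tsv_sql() {
  local file="$1" table="$2"
  printf "LOAD DATA LOCAL INFILE '%s'\n" "$file"
  printf "INTO TABLE %s\n" "$table"
  # %s keeps the backslash literal, so MySQL (not the shell) interprets '\t'.
  printf "FIELDS TERMINATED BY '%s'\n" '\t'
  # Skip the header row written at the top of the report.
  printf "IGNORE 1 LINES;\n"
}
```

One way to use it, assuming a mysql client that is allowed to load local files: load_tsv_sql report.tsv pig_report | mysql --local-infile=1 reports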

  • Dmitriy Ryaboy at Jun 15, 2010 at 1:55 am
    Hey Russ, you should check out the SchemaAwarePigLoader, or whatever it was
    I wound up calling it.
    It dumps a header file next to the data so you can just cat everything
    together. It should be in the 0.7 piggybank.
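
A minimal sketch of the "cat everything together" step, run against a local copy of the output directory. The header file name .pig_header is an assumption (the post does not name the file, and the loader may be piggybank's PigStorageSchema, which is also a guess), so it is left as an overridable argument; cat_with_header is a made-up name.

```shell
# Concatenate a header file and the part files of a locally copied
# Pig output directory into one TSV report.
# Usage: cat_with_header <local_output_dir> <report_file> [header_file_name]
cat_with_header() {
  local dir="$1" report="$2"
  local header="${3:-.pig_header}"   # assumed header file name

  # awk 1 re-emits every line and adds a final newline if the header lacks one.
  awk 1 "$dir/$header" > "$report" || return 1

  # Append the data files in name order (part-00000, part-00001, ...).
  local part
  for part in "$dir"/part-*; do
    if [ -f "$part" ]; then
      cat "$part" >> "$report" || return 1
    fi
  done
}
```

For example, after copying the job output locally with something like hadoop fs -get my/pig/output ./output, running cat_with_header ./output report.tsv would produce the headered report.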

  • Russell Jurney at Jun 15, 2010 at 2:43 am
    Cool, thanks!

Discussion Overview
group: user @
categories: pig, hadoop
posted: Jun 14, '10 at 11:25p
active: Jun 15, '10 at 2:43a
posts: 4
users: 3
website: pig.apache.org
