Hi,

I have added an example of using Parquet with MapReduce (m/r), in particular
with JCascalog, at:
https://github.com/mykidong/jcascalog-parquet-example

Parquet is a column-major data format like Trevni (as I mentioned in my last
thread, a Trevni example with JCascalog can also be found at
https://github.com/mykidong/jcascalog-trevni-example).

The example shows Parquet data handling with both JCascalog and a raw m/r
job.

Thanks,

- Kidong.

--
You received this message because you are subscribed to the Google Groups "cascalog-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascalog-user+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


  • Jeroen van Dijk at Jul 30, 2013 at 6:18 pm
    Hi Kidong,

    Thanks for sharing!

    Can you say something about your experience with Parquet? Does it perform
    better than your previous code? Is it more compact? Any pitfalls? I'm also
    considering it as an alternative to normal sequence files.

    Thanks,
    Jeroen

  • Kidong Lee at Jul 31, 2013 at 2:09 am
    I am familiar with Avro. Parquet and Trevni can both be handled easily with
    an Avro schema, as you can see in my Parquet and Trevni examples.
    In my case, I just design an Avro schema for the data model, and the nested
    data format for that model can then be selected easily: row-based Avro, or
    column-based Trevni or Parquet.

    However, I have not yet done a performance test on a large cluster.

    - kidong.
  • Jeroen van Dijk at Jul 31, 2013 at 7:41 pm
    Thanks for the information. I hope to try out Parquet soon as well.

    Jeroen
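
Kidong's point about choosing between row-based Avro and column-based Trevni or Parquet for the same data model can be illustrated with a minimal sketch in plain Java. It does not use the Avro, Trevni, or Parquet APIs, and it is not taken from the linked repositories; the `Event` type, its fields, and the method names are hypothetical and only show how one set of records can be laid out row-major or column-major.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LayoutSketch {
    // Hypothetical record type standing in for a data model described by a schema.
    static final class Event {
        final String user;
        final int clicks;

        Event(String user, int clicks) {
            this.user = user;
            this.clicks = clicks;
        }
    }

    // Row-major layout: the fields of each record are stored next to each other.
    static List<Object> toRowMajor(List<Event> events) {
        List<Object> out = new ArrayList<>();
        for (Event e : events) {
            out.add(e.user);
            out.add(e.clicks);
        }
        return out;
    }

    // Column-major layout: all values of one field are stored together.
    static Map<String, List<Object>> toColumnMajor(List<Event> events) {
        Map<String, List<Object>> cols = new LinkedHashMap<>();
        cols.put("user", new ArrayList<>());
        cols.put("clicks", new ArrayList<>());
        for (Event e : events) {
            cols.get("user").add(e.user);
            cols.get("clicks").add(e.clicks);
        }
        return cols;
    }

    // Aggregating one field reads only that field's column in the column-major layout.
    static int totalClicks(Map<String, List<Object>> cols) {
        int sum = 0;
        for (Object v : cols.get("clicks")) {
            sum += (Integer) v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("a", 1), new Event("b", 2), new Event("c", 3));
        System.out.println(toRowMajor(events));
        System.out.println(toColumnMajor(events));
        System.out.println(totalClicks(toColumnMajor(events)));
    }
}
```

In the column-major form, a query that needs only `clicks` never touches the `user` values, which is the usual motivation for columnar formats. Whether that pays off for a given workload still needs measuring, as noted in the thread.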

Discussion Overview
group: cascalog-user
categories: clojure, hadoop
posted: Jul 30, '13 at 7:56a
active: Jul 31, '13 at 7:41p
posts: 4
users: 2
website: clojure.org
irc: #clojure
