Paul Lam, Mar 10, 2013 at 8:27 am
Hi Kang,
hfs-delimited is uncompressed by default. For a general solution, say you
have one hfs-seqfile that is compressed and another that is uncompressed or
uses a different compression method: you can use cascalog-checkpoint and
give each sourcing step its own (with-job-conf) to set the compression
properties for that step.
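A minimal sketch of the per-step idea, assuming Cascalog 1.x and Hadoop 1.x property names. The paths, the `gzip-output` and `plain-output` maps, and the `copy-step!` helper are hypothetical names made up for illustration. Each call runs under its own (with-job-conf), so the compression settings apply only to that step. The two calls could be wrapped as separate steps in a cascalog-checkpoint workflow, which is not shown here.

```clojure
(ns example.per-step-conf
  (:use cascalog.api))

;; Assumed Hadoop 1.x output-compression properties; adjust for your cluster.
(def gzip-output
  {"mapred.output.compress"          "true"
   "mapred.output.compression.type"  "BLOCK"
   "mapred.output.compression.codec" "org.apache.hadoop.io.compress.GzipCodec"})

(def plain-output
  {"mapred.output.compress" "false"})

(defn copy-step!
  "Copies two-field tuples from src to sink, with conf scoped to this job only."
  [conf src sink]
  (with-job-conf conf
    (?<- sink [?id ?val] (src ?id ?val))))

(defn run-steps! []
  ;; Step 1: output written with gzip block compression.
  (copy-step! gzip-output
              (hfs-seqfile "/data/in-a")
              (hfs-seqfile "/tmp/stage-a"))
  ;; Step 2: output written uncompressed.
  (copy-step! plain-output
              (hfs-seqfile "/data/in-b")
              (hfs-seqfile "/tmp/stage-b")))
```

Because (with-job-conf) only affects the jobs launched inside its body, the settings of one step do not leak into the other.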
Paul
On Thursday, 7 March 2013 22:47:47 UTC, Kang Tu wrote:
Hi Dave,
Thanks for replying. What I am not sure about is:
Is hfs-seqfile compressed by default?
If it is not, how can I set a "compressed" option for one tap and a
"non-compressed" option for another? I know there might be an option in
with-job-conf, but it looks like a global setting that cannot be applied to
an individual tap.
Thanks
Kang
On Thursday, March 7, 2013 2:01:15 PM UTC-8, David Kincaid wrote:
I think you answered your own question. Create two taps using hfs-seqfile
for the compressed file and hfs-delimited for the tab delimited file. Then
create a query that uses the two taps and does your join.
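As a rough sketch of that approach, assuming Cascalog 1.x with the cascalog-more-taps library on the classpath (it provides hfs-delimited in the `cascalog.more-taps` namespace): the paths, the `joined-query` name, and the two-column layout of each file are assumptions for illustration, not details from this thread.

```clojure
(ns example.join
  (:use cascalog.api
        [cascalog.more-taps :only (hfs-delimited)]))

(defn joined-query
  "Joins a sequence file and a tab-delimited file on their first field.
   Assumes each source yields exactly two fields per tuple."
  [seq-path tsv-path]
  (let [seq-tap (hfs-seqfile seq-path)
        tsv-tap (hfs-delimited tsv-path :delimiter "\t")]
    ;; Using ?id in both predicates makes Cascalog join on it.
    ;; The join only matches if both sides carry the same type for ?id;
    ;; text sources yield strings unless told otherwise.
    (<- [?id ?seq-val ?tsv-val]
        (seq-tap ?id ?seq-val)
        (tsv-tap ?id ?tsv-val))))

(defn run-join! []
  ;; Hypothetical paths; prints the joined tuples to stdout.
  (?- (stdout) (joined-query "/data/a-seq" "/data/b.tsv")))
```

Note that hfs-seqfile reads sequence files of Cascading tuples, so this assumes the compressed file was written in that form.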
Dave
On Thursday, March 7, 2013 3:49:31 PM UTC-6, Kang Tu wrote:
Hi,
I need to join two files. One is a compressed sequence file (maybe I
should use the hfs-seqfile tap) and the other is an uncompressed,
tab-delimited file (maybe I should use hfs-delimited).
Can I do this in Cascalog?
Thanks in advance
Kang