Unfortunately, there's no easy way to do this. The recommended approach is
to put the file in HDFS first, and then use Hive or Impala to do the
conversion.
Parquet requires that the whole file stay within a single HDFS block, and
uploading a Parquet file written outside the cluster with "hadoop fs -put"
might violate this condition.
Thanks,
Alan
On Mon, Mar 10, 2014 at 7:14 PM, wrote:
Hi,
The customer can only provide CSV files, split by time period. I plan to use
Impala:
1: Convert the CSV file to a Parquet file directly, outside the Hadoop cluster;
2: Then put the file into a specific HDFS directory (see the sketch below);
3: Alter the table to add new partitions (by time period);
4: Done.
Now, how can I convert a CSV file to a Parquet file directly?
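
For steps 2 and 3, I mean roughly the following (the table name, partition
column, file name, and path are hypothetical):

  -- Step 2, upload done beforehand with:
  --   hadoop fs -put events_201403.csv /data/events/period=201403/
  -- Step 3, register the new time-period partition:
  ALTER TABLE events ADD PARTITION (period='201403')
  LOCATION '/data/events/period=201403/';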