I am new to Hive and Hadoop in general. I have a table in Oracle that has
millions of rows and I'd like to export it into HDFS so that I can run some
Hive queries. My first question: is it better to export the entire
table as a single file (possibly 5 GB), or as several smaller files (say, 10
files of 500 MB each)? Also, does it matter if I put the files in different
sub-directories before loading the data into Hive, or does everything have to
be in the same folder?
P.S. I am sorry if this post was submitted twice.