Aug 12, 2011 at 2:50 am
A Hive table is just a directory in HDFS, so you can recursively set the replication factor on it to whatever you like, up to the number of datanodes you have. If you have 100 nodes, run this after you create your table:
hadoop fs -setrep -R -w 100 /path/to/hive/warehouse/small_table_to_be_distributed
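If you want to script this, here is a rough sketch rather than a tested recipe. The helper name build_setrep_cmd and the NODES and TABLE_DIR variables are made up for illustration; the path is the same placeholder as above, and 100 is just the example node count. It only touches the cluster if the hadoop CLI is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the setrep command for a table directory.
build_setrep_cmd() {
    nodes=$1
    table_dir=$2
    # -R recurses into the table directory; -w waits until replication completes.
    echo "hadoop fs -setrep -R -w $nodes $table_dir"
}

# Placeholder values (assumptions): substitute your datanode count and table path.
NODES=100
TABLE_DIR=/path/to/hive/warehouse/small_table_to_be_distributed

CMD=$(build_setrep_cmd "$NODES" "$TABLE_DIR")
echo "$CMD"

# Only run against the cluster when the hadoop CLI is actually available.
if command -v hadoop >/dev/null 2>&1; then
    $CMD
    # The second column of the listing is each file's replication factor.
    hadoop fs -ls "$TABLE_DIR"
fi
```

Keep in mind that -w blocks until every block reaches the target replication, so on a large cluster it can take a while even for a small table.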
On Aug 11, 2011, at 7:43 PM, Daniel,Wu wrote:
If we have a very small table to be joined, we can use a map-side join, which needs the small table to be available locally to each map task. Is it possible to replicate the small table to ALL nodes when creating it, to cut the time needed to distribute it?