Hi JJ,
You can add the files to your job jar. That guarantees that the files are available to every task. If the files are job-dependent, you should use an Ant script that adds them to the jar each time you start a job.
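For illustration, here is a minimal sketch of how a task could read a file that was bundled into the job jar. It assumes the file sits at the root of the jar, so that it can be looked up by its plain name on the task's classpath. The class name JarResourceReader is hypothetical and not part of Hadoop; abc.dat is just the file name from your mail.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hadoop class): reads a text file that was
// packaged into the job jar and is therefore visible on the classpath.
public class JarResourceReader {

    // Looks up resourceName (e.g. "abc.dat", assumed to be at the jar root)
    // on the classpath and returns its lines.
    public static List<String> readLines(String resourceName) throws IOException {
        InputStream in = JarResourceReader.class.getClassLoader()
                .getResourceAsStream(resourceName);
        if (in == null) {
            // getResourceAsStream returns null when the resource is missing
            throw new FileNotFoundException("Resource not on classpath: " + resourceName);
        }
        return readLines(in);
    }

    // Reads all lines from the stream as UTF-8 and closes it afterwards.
    public static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

You would call something like JarResourceReader.readLines("abc.dat") once from the setup code of your mapper or reducer and keep the result in a field, rather than re-reading the file for every record.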
Regards,
Christian
________________________________
From: Mapred Learn
Sent: Friday, 10 June 2011 02:36
To: mapreduce-user@hadoop.apache.org; cdh-user@cloudera.org
Subject: How to send files to task trackers in a map-red job
Hi,
I have 2 files that I want to send to all tasktrackers during job execution.
I tried something like:
hadoop jar abc.jar <main class> -conf <conf file> -cacheFile 'hdfs://<namenode>:port/user/jj/dummy/abc.dat#abc' -cacheFile 'hdfs://<namenode>:port/user/jj/dummy/abc.txt#abc1'
But it looks like the second file does not reach the task trackers, and all task trackers fail with the exception "file not found: /user/jj/dummy/abc.txt".
Could somebody please point me to the right way to get files to all the data nodes in a map-red job?
Thanks,
JJ