Serega Sheypak
at Apr 16, 2013 at 1:33 pm
Your files are rather big: the common size is about 292519093 bytes =>
292 MB.
Your replication factor is 1 and you have 4 TaskTrackers and DataNodes;
this can lead to poor data locality. A mapper may run with no local
split (are your files splittable?), so the TaskTracker has to fetch the
split (or the whole file, if it is not splittable) from a remote DataNode.
> so most of my data is actually really on my master node, being node1
I don't understand what you mean here. Your data should be evenly
distributed (HDFS splits each file into blocks and spreads them across
the DataNodes).
My guess is poor locality. I would appreciate it if a more experienced
group member could validate this suggestion.
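One way to check the locality theory directly is to look at where the
blocks actually live. A sketch, assuming a CDH4-style `hdfs` CLI is on
the path; `/your/hive/input/path` is a placeholder for the real input
directory:

```shell
# Show every block of the input and which DataNodes actually hold it:
hdfs fsck /your/hive/input/path -files -blocks -locations

# Summarise usage per DataNode; a heavily skewed "DFS Used%" on one
# node would confirm the data is not evenly distributed:
hdfs dfsadmin -report
```

If most blocks turn out to live on one node, running `hdfs balancer`
should spread them out.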
2013/4/16 Nicolas Maillard <nicolas.maillard@fifty-five.com>
Hi Serega
Thanks for the help
I have a replication factor of 1, so currently most of my data is
actually on my master node, node1.
I have 3 TaskTrackers, one per node.
I have 3 DataNodes, one per node.
So DataNodes and TaskTrackers are co-located.
The main node, node1, is in addition also the JobTracker, the NameNode
and the Secondary NameNode. As this was my first cluster, I have since
read that this is not ideal. My master node performs well, however.
I have included the output of hadoop fs -ls on my Hive input path.
As per your suggestion I tried decommissioning node2, the one that was
slowing down, to see if it was a disk issue. This had the effect of
moving the slowdown to node3. I guess this probably points more to one
mapper being sent too big a split.
I am not sure why a mapper would get too big a split. How could I
investigate this issue?
Could it be that the HDFS files are not evenly distributed?
Are there ways to make sure a mapper does not get too big a split?
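For reference, Hive does let you cap the split size per query. A
hypothetical sketch using the MR1-era property names (as on CDH4); the
values are illustrative only, not recommendations:

```shell
# Cap each split at 256 MB and avoid very small splits
# (tune these to your actual HDFS block size).
hive -e "
  SET mapred.max.split.size=268435456;
  SET mapred.min.split.size=134217728;
  -- ...your query here...
"
```

With Hive's default CombineHiveInputFormat, mapred.max.split.size also
bounds how many small files get combined into a single split.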
Thanks again for your help.
On Tue, Apr 16, 2013 at 9:29 AM, Serega Sheypak wrote:
Ok, I've read these logs. I didn't find anything critical.
My theory is that one of your mappers gets too big a split, and that
causes skew. That is why I asked you to provide:
1. hadoop fs -ls -R /your/hive/input/path
2. the replication factor
3. the number of TaskTrackers and NameNodes; are they co-located, I
mean, are they installed together on each node?
The other theory is that you have a disk problem.
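The first three checks above can be run from any cluster node. A
sketch, assuming the MR1 `hadoop job` subcommand and the Hadoop 2
`hdfs getconf` tool are available (the path is a placeholder):

```shell
# 1. List the input files and their sizes:
hadoop fs -ls -R /your/hive/input/path

# 2. Print the configured default replication factor:
hdfs getconf -confKey dfs.replication

# 3. List the TaskTrackers the JobTracker currently knows about:
hadoop job -list-active-trackers
```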
2013/4/15 Nicolas Maillard <nicolas.maillard@fifty-five.com>
Hi everyone
I have installed CDH 4.2 with CM 4.5 on a 3-node cluster.
Playing around with Hive and MapReduce I have a weird behaviour I can't
seem to fix.
One of my nodes takes an unusually long time to finish most of its map
tasks.
A normal running time for a map task is about 35 seconds, but my node
number 2 takes more along the lines of 3 minutes.
For another reason I needed to reboot my machines yesterday; the first
job after the reboot went fine, with all map tasks around 35 seconds,
and then the jobs after that restarted with this weird behaviour.
This node degrades my cluster performance as a whole, making my jobs
wait a couple of extra minutes at the end.
All three nodes are on Ubuntu 12.04 with 24 GB RAM, 8 cores and a
gigabit Ethernet connection.
I have changed some settings:
- MapReduce child Java maximum heap size: 4 GB
- JVM reuse: -1
- JobTracker heap size: 2 GB
- TaskTracker heap size: 2 GB
I have monitored the same Hive query to try to get a feel for what is
wrong.
CPU and RAM on all machines look OK: used, but not full.
The map logs show no sign of errors or waits.
In the map logs I see long pauses between operations. I see that node2
processes data from HDFS on node1, so there must be some network time;
however, node3 also retrieves data from node1 to process and never
spends a full minute on it.
2013-04-10 09:42:18,884 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 100000 rows
2013-04-10 09:42:18,884 INFO ExecMapper: ExecMapper: processing 100000 rows: used memory = 31260136
2013-04-10 09:43:17,820 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://node1:8020/user/hive/warehouse/oscaro_stack/year=2012/month=10/000024_0
2013-04-10 09:44:15,441 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 finished. closing...
2013-04-10 09:44:15,441 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarded 722344 rows
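Those pauses can be pulled out of a task log mechanically by diffing
consecutive timestamps. A minimal sketch in POSIX awk, assuming the
standard log4j `yyyy-MM-dd HH:MM:SS,mmm` prefix and that all lines
share one date; the here-document reuses two of the lines above, and in
practice you would pipe in the whole task attempt log instead:

```shell
# Flag gaps longer than 30 s between consecutive log lines.
awk '{
  split($2, t, /[:,]/)                 # t[1]=HH, t[2]=MM, t[3]=SS
  ts = t[1]*3600 + t[2]*60 + t[3]      # seconds since midnight
  if (NR > 1 && ts - prev > 30)
    printf "%d s gap before: %s\n", ts - prev, $0
  prev = ts
}' <<'EOF'
2013-04-10 09:42:18,884 INFO ExecMapper: ExecMapper: processing 100000 rows: used memory = 31260136
2013-04-10 09:43:17,820 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://node1:8020/user/hive/warehouse/oscaro_stack/year=2012/month=10/000024_0
EOF
```

On this sample it reports a 59 s gap before the
HiveContextAwareRecordReader line, matching the pause visible in the
log excerpt above.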
If anyone has insight on this issue, or tests I could run to understand
how to solve it, I would appreciate it.
Thanks in advance