Hi. Suddenly I've started to get errors while trying to view logs through
the Hue admin UI:
http://node11.lol.ru:8888/jobbrowser/jobs/job_201303201339_0025/single_logs

I get the following log output and stack trace:

[01/Apr/2013 07:05:40 +0000] access INFO 10.66.49.134 hdfs - "GET
/debug/check_config_ajax HTTP/1.0"
[01/Apr/2013 07:05:43 +0000] access INFO 10.66.49.134 hdfs - "GET
/jobbrowser/jobs/job_201303201339_0025/tasks/task_201303201339_0025_r_000164
HTTP/1.0"
[01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
jobTrackerID=u'201303201339', jobID=25)), kwargs={})
[01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 53ms:
ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0025/job.xml',
queueName='default', user='hdfs',
name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000021-130320135309911-oozie-oozi-W',
jobID=ThriftJobID(asString='job_201303201339_0025',
jobTrackerID='201303201339', jobID=25)),
status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
jobID=ThriftJobID(asString='job_201303201339_0025',
jobTrackerID='201303201339', jobID=25), priority=2, user='hdfs',
startTime=1364825038848, setupProgress=1.0, mapProgress=1.0,
schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=205,
tasks=[ThriftTaskInProgress(runningAttempts=[],
taskStatuses={'attempt_201303201339_0025_m_000035_0':
ThriftTaskStatus(finishTime=1364825088030, stateString='cleanup',
startTime=1364825085984, sortFinishTime=0,
taskTracker='tracker_prod-node014.lol.ru:localhost/127....
[01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getTask(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftTaskID(asString=None, taskType=1, taskID=164,
jobID=ThriftJobID(asString=None, jobTrackerID=u'201303201339', jobID=25))),
kwargs={})
[01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getTask returned in 6ms:
ThriftTaskInProgress(runningAttempts=[],
taskStatuses={'attempt_201303201339_0025_r_000164_0':
ThriftTaskStatus(finishTime=1364825078834, stateString='reduce > reduce',
startTime=1364825069066, sortFinishTime=1364825077527,
taskTracker='tracker_prod-node029.lol.ru:localhost/127.0.0.1:43079',
state=1, shuffleFinishTime=1364825076674, mapFinishTime=0,
taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0025_r_000164_0',
attemptID=0,
taskID=ThriftTaskID(asString='task_201303201339_0025_r_000164', taskType=1,
taskID=164, jobID=ThriftJobID(asString='job_201303201339_0025',
jobTrackerID='201303201339', jobID=25))), diagnosticInfo='', phase=4,
progress=1.0, outputSize=-1,
counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
Number of bytes read', name='FILE_BYTES_READ', value=20), 'HDFS: Number of
write operations': Thr...
[01/Apr/2013 07:05:44 +0000] access INFO 10.66.49.134 hdfs - "GET
/debug/check_config_ajax HTTP/1.0"
[01/Apr/2013 07:05:54 +0000] middleware DEBUG No desktop_app known for
request.
[01/Apr/2013 07:05:54 +0000] access INFO 10.66.49.134 hdfs - "GET
/jobbrowser/ HTTP/1.0"
[01/Apr/2013 07:05:54 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getAllJobs(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}),), kwargs={})
[01/Apr/2013 07:05:54 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getAllJobs returned in 6ms:
ThriftJobList(jobs=[ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/devops/.staging/job_201303201339_0002/job.xml',
queueName='default', user='devops',
name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000006-130320135309911-oozie-oozi-W',
jobID=ThriftJobID(asString='job_201303201339_0002',
jobTrackerID='201303201339', jobID=2)),
status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=3,
jobID=ThriftJobID(asString='job_201303201339_0002',
jobTrackerID='201303201339', jobID=2), priority=2, user='devops',
startTime=1364819345925, setupProgress=1.0, mapProgress=1.0,
schedulingInfo='NA'), tasks=None, desiredMaps=18, desiredReduces=168,
finishedMaps=0, finishedReduces=0,
jobID=ThriftJobID(asString='job_201303201339_0002',
jobTrackerID='201303201339', jobID=2), priority=2,
launchTime=1364819346297, startTime=1364819345925,
finishTime=1364819398372), ThriftJobInProgress(profile=ThriftJo...
[01/Apr/2013 07:05:55 +0000] access INFO 10.66.49.134 hdfs - "GET
/debug/check_config_ajax HTTP/1.0"
[01/Apr/2013 07:05:55 +0000] access DEBUG 10.66.49.134 hdfs - "GET
/static/art/datatables/sort_desc.png HTTP/1.0"
[01/Apr/2013 07:06:06 +0000] access INFO 10.66.49.134 hdfs - "GET
/jobbrowser/jobs/job_201303201339_0022 HTTP/1.0"
[01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0022',
jobTrackerID=u'201303201339', jobID=22)), kwargs={})
[01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 32ms:
ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0022/job.xml',
queueName='default', user='hdfs',
name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000020-130320135309911-oozie-oozi-W',
jobID=ThriftJobID(asString='job_201303201339_0022',
jobTrackerID='201303201339', jobID=22)),
status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
jobID=ThriftJobID(asString='job_201303201339_0022',
jobTrackerID='201303201339', jobID=22), priority=2, user='hdfs',
startTime=1364824738956, setupProgress=1.0, mapProgress=1.0,
schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=188,
tasks=[ThriftTaskInProgress(runningAttempts=[],
taskStatuses={'attempt_201303201339_0022_m_000018_0':
ThriftTaskStatus(finishTime=1364824794989, stateString='cleanup',
startTime=1364824793165, sortFinishTime=0,
taskTracker='tracker_prod-node034.lol.ru:localhost/127....
[01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString='job_201303201339_0022',
jobTrackerID='201303201339', jobID=22)), kwargs={})
[01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML returned in 4ms:
'<?xml version="1.0" encoding="UTF-8"
standalone="no"?><configuration>\n<property><name>mapred.job.restart.recover</name><value>true</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>job.end.retry.interval</name><value>30000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>mapred.queue.default.acl-administer-jobs</name><value>*</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>dfs.image.transfer.bandwidthPerSec</name><value>0</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201...
[01/Apr/2013 07:06:06 +0000] http_client DEBUG GET
http://prod-node015.lol.ru:50070/webhdfs/v1/staging/landing/source/protei/http/2013/03/27/01?op=GETFILESTATUS&user.name=hue&doas=hdfs
[01/Apr/2013 07:06:06 +0000] resource DEBUG GET Got response:
{"FileStatus":{"accessTime":0,"b...
[01/Apr/2013 07:06:06 +0000] http_client DEBUG GET
http://prod-node015.lol.ru:50070/webhdfs/v1/masterdata/source/protei/http/archive/2013/03/27/01?op=GETFILESTATUS&user.name=hue&doas=hdfs
[01/Apr/2013 07:06:06 +0000] resource DEBUG GET Got response:
{"FileStatus":{"accessTime":0,"b...
[01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString='job_201303201339_0022',
jobTrackerID='201303201339', jobID=22)), kwargs={})
[01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups returned in
18ms:
ThriftJobCounterRollups(reduceCounters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
Number of bytes read', name='FILE_BYTES_READ', value=3360), 'HDFS: Number
of write operations': ThriftCounter(displayName='HDFS: Number of write
operations', name='HDFS_WRITE_OPS', value=168), 'FILE: Number of read
operations': ThriftCounter(displayName='FILE: Number of read operations',
name='FILE_READ_OPS', value=0), 'HDFS: Number of bytes read':
ThriftCounter(displayName='HDFS: Number of bytes read',
name='HDFS_BYTES_READ', value=0), 'HDFS: Number of read operations':
ThriftCounter(displayName='HDFS: Number of read operations',
name='HDFS_READ_OPS', value=21), 'FILE: Number of bytes written':
ThriftCounter(displayName='FILE: Number of bytes written',
name='FILE_BYTES_WRITTEN', value=29347957), 'HDFS: Number of large read
operations': ThriftCo...
[01/Apr/2013 07:06:07 +0000] access INFO 10.66.49.134 hdfs - "GET
/debug/check_config_ajax HTTP/1.0"
[01/Apr/2013 07:06:16 +0000] access INFO 10.66.49.134 hdfs - "GET
/jobbrowser/jobs/job_201303201339_0022/tasks/task_201303201339_0022_r_000167/attempts/attempt_201303201339_0022_r_000167_0/logs
HTTP/1.0"
[01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0022',
jobTrackerID=u'201303201339', jobID=22)), kwargs={})
[01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 53ms:
ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0022/job.xml',
queueName='default', user='hdfs',
name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000020-130320135309911-oozie-oozi-W',
jobID=ThriftJobID(asString='job_201303201339_0022',
jobTrackerID='201303201339', jobID=22)),
status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
jobID=ThriftJobID(asString='job_201303201339_0022',
jobTrackerID='201303201339', jobID=22), priority=2, user='hdfs',
startTime=1364824738956, setupProgress=1.0, mapProgress=1.0,
schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=188,
tasks=[ThriftTaskInProgress(runningAttempts=[],
taskStatuses={'attempt_201303201339_0022_m_000018_0':
ThriftTaskStatus(finishTime=1364824794989, stateString='cleanup',
startTime=1364824793165, sortFinishTime=0,
taskTracker='tracker_prod-node034.lol.ru:localhost/127....
[01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getTask(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftTaskID(asString=None, taskType=1, taskID=167,
jobID=ThriftJobID(asString=None, jobTrackerID=u'201303201339', jobID=22))),
kwargs={})
[01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getTask returned in 6ms:
ThriftTaskInProgress(runningAttempts=[],
taskStatuses={'attempt_201303201339_0022_r_000167_0':
ThriftTaskStatus(finishTime=1364824779167, stateString='reduce > reduce',
startTime=1364824766546, sortFinishTime=1364824777715,
taskTracker='tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833',
state=1, shuffleFinishTime=1364824777120, mapFinishTime=0,
taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0022_r_000167_0',
attemptID=0,
taskID=ThriftTaskID(asString='task_201303201339_0022_r_000167', taskType=1,
taskID=167, jobID=ThriftJobID(asString='job_201303201339_0022',
jobTrackerID='201303201339', jobID=22))), diagnosticInfo='', phase=4,
progress=1.0, outputSize=-1,
counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
Number of bytes read', name='FILE_BYTES_READ', value=20), 'HDFS: Number of
write operations': Thr...
[01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), 'tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833'),
kwargs={})
[01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker returned in 1ms:
ThriftTaskTrackerStatus(taskReports=None, availableSpace=2041132032000,
totalVirtualMemory=139483734016, failureCount=32, httpPort=50060,
host='prod-node014.lol.ru', totalPhysicalMemory=135290486784,
reduceCount=0, lastSeen=1364825174380,
trackerName='tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833',
mapCount=0, maxReduceTasks=16, maxMapTasks=32)
[01/Apr/2013 07:06:16 +0000] models INFO Retrieving
http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0
[01/Apr/2013 07:06:16 +0000] middleware INFO Processing exception:
Unexpected end tag : td, line 7, column 12: Traceback (most recent call
last):
File
"/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py",
line 100, in get_response
response = callback(request, *callback_args, **callback_kwargs)
File
"/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
line 62, in decorate
return view_func(request, *args, **kwargs)
File
"/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
line 290, in single_task_attempt_logs
logs += [ section.strip() for section in attempt.get_task_log() ]
File
"/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/models.py",
line 451, in get_task_log
et = lxml.html.parse(data)
File
"/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/lxml-2.2.2-py2.6-linux-x86_64.egg/lxml/html/__init__.py",
line 661, in parse
return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
File "lxml.etree.pyx", line 2698, in lxml.etree.parse
(src/lxml/lxml.etree.c:49590)
File "parser.pxi", line 1513, in lxml.etree._parseDocument
(src/lxml/lxml.etree.c:71423)
File "parser.pxi", line 1543, in lxml.etree._parseFilelikeDocument
(src/lxml/lxml.etree.c:71733)
File "parser.pxi", line 1426, in lxml.etree._parseDocFromFilelike
(src/lxml/lxml.etree.c:70648)
File "parser.pxi", line 997, in
lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:67944)
File "parser.pxi", line 539, in
lxml.etree._ParserContext._handleParseResultDoc
(src/lxml/lxml.etree.c:63820)
File "parser.pxi", line 625, in lxml.etree._handleParseResult
(src/lxml/lxml.etree.c:64741)
File "parser.pxi", line 565, in lxml.etree._raiseParseError
(src/lxml/lxml.etree.c:64084)
XMLSyntaxError: Unexpected end tag : td, line 7, column 12

[01/Apr/2013 07:06:16 +0000] access INFO 10.66.49.134 hdfs - "GET
/debug/check_config_ajax HTTP/1.0"
[01/Apr/2013 07:06:20 +0000] access WARNING 10.66.49.134 hdfs - "GET
/logs HTTP/1.0"
[01/Apr/2013 07:06:31 +0000] access INFO 10.66.49.134 hdfs - "GET
/debug/check_config_ajax HTTP/1.0"
[01/Apr/2013 07:14:18 +0000] access INFO 10.66.49.134 hdfs - "GET
/jobbrowser/jobs/job_201303201339_0025 HTTP/1.0"
[01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
jobTrackerID=u'201303201339', jobID=25)), kwargs={})
[01/Apr/2013 07:14:18 +0000] thrift_util INFO Thrift exception;
retrying: None
[01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
jobTrackerID=u'201303201339', jobID=25)), kwargs={})
[01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 57ms:
ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0025/job.xml',
queueName='default', user='hdfs',
name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000021-130320135309911-oozie-oozi-W',
jobID=ThriftJobID(asString='job_201303201339_0025',
jobTrackerID='201303201339', jobID=25)),
status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
jobID=ThriftJobID(asString='job_201303201339_0025',
jobTrackerID='201303201339', jobID=25), priority=2, user='hdfs',
startTime=1364825038848, setupProgress=1.0, mapProgress=1.0,
schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=205,
tasks=[ThriftTaskInProgress(runningAttempts=[],
taskStatuses={'attempt_201303201339_0025_m_000035_0':
ThriftTaskStatus(finishTime=1364825088030, stateString='cleanup',
startTime=1364825085984, sortFinishTime=0,
taskTracker='tracker_prod-node014.lol.ru:localhost/127....
[01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
jobTrackerID='201303201339', jobID=25)), kwargs={})
[01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML returned in 4ms:
'<?xml version="1.0" encoding="UTF-8"
standalone="no"?><configuration>\n<property><name>mapred.job.restart.recover</name><value>true</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>job.end.retry.interval</name><value>30000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.queue.default.acl-administer-jobs</name><value>*</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>dfs.image.transfer.bandwidthPerSec</name><value>0</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201...
[01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
http://prod-node015.lol.ru:50070/webhdfs/v1/staging/landing/source/protei/http/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
[01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got response:
{"FileStatus":{"accessTime":0,"b...
[01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
http://prod-node015.lol.ru:50070/webhdfs/v1/masterdata/source/protei/http/archive/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
[01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got response:
{"FileStatus":{"accessTime":0,"b...
[01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call: <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups(args=(RequestContext(confOptions={'effective_user':
u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
jobTrackerID='201303201339', jobID=25)), kwargs={})
[01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call <class
'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups returned in
19ms:
ThriftJobCounterRollups(reduceCounters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
Number of bytes read', name='FILE_BYTES_READ', value=3360), 'HDFS: Number
of write operations': ThriftCounter(displayName='HDFS: Number of write
operations', name='HDFS_WRITE_OPS', value=168), 'FILE: Number of read
operations': ThriftCounter(displayName='FILE: Number of read operations',
name='FILE_READ_OPS', value=0), 'HDFS: Number of bytes read':
ThriftCounter(displayName='HDFS: Number of bytes read',
name='HDFS_BYTES_READ', value=0), 'HDFS: Number of read operations':
ThriftCounter(displayName='HDFS: Number of read operations',
name='HDFS_READ_OPS', value=103), 'FILE: Number of bytes written':
ThriftCounter(displayName='FILE: Number of bytes written',
name='FILE_BYTES_WRITTEN', value=29347909), 'HDFS: Number of large read
operations': ThriftC...
[01/Apr/2013 07:14:19 +0000] access INFO 10.66.49.134 hdfs - "GET
/debug/check_config_ajax HTTP/1.0"
[01/Apr/2013 07:15:25 +0000] access WARNING 10.66.49.134 hdfs - "GET
/logs HTTP/1.0"
[01/Apr/2013 07:15:33 +0000] access WARNING 10.66.49.134 hdfs - "GET
/download_logs HTTP/1.0"


What does this mean?
A few hours ago I could still see MapReduce logs through the Hue interface.
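For context on the XMLSyntaxError above: it means lxml's parser hit malformed HTML coming back from the TaskTracker's tasklog page. A minimal Python 3 stdlib sketch of the difference between a strict parser and a tolerant one (the HTML fragment and the parser choice are illustrative, not Hue's actual code):

```python
from html.parser import HTMLParser
from xml.etree import ElementTree

# Hypothetical fragment resembling a broken tasklog page: the </td>
# has no matching opening <td>, like the "Unexpected end tag : td" above.
broken = "<html><body><table><tr>log text</td></tr></table></body></html>"

# A strict XML parser rejects the mismatched tag outright.
try:
    ElementTree.fromstring(broken)
    strict_ok = True
except ElementTree.ParseError as exc:
    strict_ok = False
    print("strict parser failed:", exc)

# A tolerant HTML parser streams past the bad markup and keeps the text.
class TextCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

collector = TextCollector()
collector.feed(broken)
print("recovered text:", "".join(collector.chunks))
```

Hue 2.2 fed the tasklog page to `lxml.html.parse` (see `jobbrowser/models.py` line 451 in the traceback), and in this setup that call still raised `XMLSyntaxError` on the broken markup; presumably the fix mentioned in the reply made that parsing step more tolerant.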

  • Serega Sheypak at Apr 4, 2013 at 2:05 pm
    Hi, hue version is: Hue 2.2.0

    [devops@pnode034 ~]$ hadoop version
    Hadoop 2.0.0-cdh4.2.0
    Subversion
    file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hadoop-2.0.0-cdh4.2.0/src/hadoop-common-project/hadoop-common
    -r 8bce4bd28a464e0a92950c50ba01a9deb1d85686
    Compiled by jenkins on Fri Feb 15 11:13:32 PST 2013
    From source with checksum 3eefc211a14ac7b6e764d6ded2eeeb26

    I didn't understand: which page do you mean by "the content of the
    above page"?


    2013/4/1 Romain Rigaux <romain@cloudera.com>
    The page
    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0
    probably produces HTML with some missing parts, so Hue can't parse it.
    It should not happen with other jobs. This handling was fixed for next
    month's release.

    Could you share your Hadoop and Hue version and the content of the above
    page so that we can reproduce/test it?

    Romain

    On Mon, Apr 1, 2013 at 7:17 AM, Serega Sheypak wrote:

    Hi, Suddenly, I've started to get errors while trying to view logs
    through HUE admin:

    http://node11.lol.ru:8888/jobbrowser/jobs/job_201303201339_0025/single_logs
    "GET
    /jobbrowser/jobs/job_201303201339_0022/tasks/task_201303201339_0022_r_000167/attempts/attempt_201303201339_0022_r_000167_0/logs
    HTTP/1.0"
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call: <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0022',
    jobTrackerID=u'201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 53ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0022/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000020-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22), priority=2, user='hdfs',
    startTime=1364824738956, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=188,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_m_000018_0':
    ThriftTaskStatus(finishTime=1364824794989, stateString='cleanup',
    startTime=1364824793165, sortFinishTime=0,
    taskTracker='tracker_prod-node034.lol.ru:localhost/127....
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call: <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTask(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftTaskID(asString=None, taskType=1, taskID=167,
    jobID=ThriftJobID(asString=None, jobTrackerID=u'201303201339', jobID=22))),
    kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTask returned in 6ms:
    ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_r_000167_0':
    ThriftTaskStatus(finishTime=1364824779167, stateString='reduce > reduce',
    startTime=1364824766546, sortFinishTime=1364824777715,
    taskTracker='tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833',
    state=1, shuffleFinishTime=1364824777120, mapFinishTime=0,
    taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0022_r_000167_0',
    attemptID=0,
    taskID=ThriftTaskID(asString='task_201303201339_0022_r_000167', taskType=1,
    taskID=167, jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22))), diagnosticInfo='', phase=4,
    progress=1.0, outputSize=-1,
    counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=20), 'HDFS: Number of
    write operations': Thr...
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call: <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), 'tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833'),
    kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker returned in 1ms:
    ThriftTaskTrackerStatus(taskReports=None, availableSpace=2041132032000,
    totalVirtualMemory=139483734016, failureCount=32, httpPort=50060,
    host='prod-node014.lol.ru', totalPhysicalMemory=135290486784, reduceCount=0,
    lastSeen=1364825174380,
    trackerName='tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833',
    mapCount=0, maxReduceTasks=16, maxMapTasks=32)
    [01/Apr/2013 07:06:16 +0000] models INFO Retrieving
    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0
    [01/Apr/2013 07:06:16 +0000] middleware INFO Processing exception:
    Unexpected end tag : td, line 7, column 12: Traceback (most recent call
    last):
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py",
    line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
    line 62, in decorate
    return view_func(request, *args, **kwargs)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
    line 290, in single_task_attempt_logs
    logs += [ section.strip() for section in attempt.get_task_log() ]
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/models.py",
    line 451, in get_task_log
    et = lxml.html.parse(data)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/lxml-2.2.2-py2.6-linux-x86_64.egg/lxml/html/__init__.py",
    line 661, in parse
    return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
    File "lxml.etree.pyx", line 2698, in lxml.etree.parse
    (src/lxml/lxml.etree.c:49590)
    File "parser.pxi", line 1513, in lxml.etree._parseDocument
    (src/lxml/lxml.etree.c:71423)
    File "parser.pxi", line 1543, in lxml.etree._parseFilelikeDocument
    (src/lxml/lxml.etree.c:71733)
    File "parser.pxi", line 1426, in lxml.etree._parseDocFromFilelike
    (src/lxml/lxml.etree.c:70648)
    File "parser.pxi", line 997, in
    lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:67944)
    File "parser.pxi", line 539, in
    lxml.etree._ParserContext._handleParseResultDoc
    (src/lxml/lxml.etree.c:63820)
    File "parser.pxi", line 625, in lxml.etree._handleParseResult
    (src/lxml/lxml.etree.c:64741)
    File "parser.pxi", line 565, in lxml.etree._raiseParseError
    (src/lxml/lxml.etree.c:64084)
    XMLSyntaxError: Unexpected end tag : td, line 7, column 12

    [01/Apr/2013 07:06:16 +0000] access INFO 10.66.49.134 hdfs -
    "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:06:20 +0000] access WARNING 10.66.49.134 hdfs -
    "GET /logs HTTP/1.0"
    [01/Apr/2013 07:06:31 +0000] access INFO 10.66.49.134 hdfs -
    "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:14:18 +0000] access INFO 10.66.49.134 hdfs -
    "GET /jobbrowser/jobs/job_201303201339_0025 HTTP/1.0"
    [01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call: <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:18 +0000] thrift_util INFO Thrift exception;
    retrying: None
    [01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call: <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 57ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0025/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000021-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25), priority=2, user='hdfs',
    startTime=1364825038848, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=205,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0025_m_000035_0':
    ThriftTaskStatus(finishTime=1364825088030, stateString='cleanup',
    startTime=1364825085984, sortFinishTime=0,
    taskTracker='tracker_prod-node014.lol.ru:localhost/127....
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call: <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML returned in 4ms:
    '<?xml version="1.0" encoding="UTF-8"
    standalone="no"?><configuration>\n<property><name>mapred.job.restart.recover</name><value>true</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>job.end.retry.interval</name><value>30000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.queue.default.acl-administer-jobs</name><value>*</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>dfs.image.transfer.bandwidthPerSec</name><value>0</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201...
    [01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/staging/landing/source/protei/http/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got response:
    {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/masterdata/source/protei/http/archive/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got response:
    {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call: <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups returned in
    19ms:
    ThriftJobCounterRollups(reduceCounters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=3360), 'HDFS: Number
    of write operations': ThriftCounter(displayName='HDFS: Number of write
    operations', name='HDFS_WRITE_OPS', value=168), 'FILE: Number of read
    operations': ThriftCounter(displayName='FILE: Number of read operations',
    name='FILE_READ_OPS', value=0), 'HDFS: Number of bytes read':
    ThriftCounter(displayName='HDFS: Number of bytes read',
    name='HDFS_BYTES_READ', value=0), 'HDFS: Number of read operations':
    ThriftCounter(displayName='HDFS: Number of read operations',
    name='HDFS_READ_OPS', value=103), 'FILE: Number of bytes written':
    ThriftCounter(displayName='FILE: Number of bytes written',
    name='FILE_BYTES_WRITTEN', value=29347909), 'HDFS: Number of large read
    operations': ThriftC...
    [01/Apr/2013 07:14:19 +0000] access INFO 10.66.49.134 hdfs -
    "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:15:25 +0000] access WARNING 10.66.49.134 hdfs -
    "GET /logs HTTP/1.0"
    [01/Apr/2013 07:15:33 +0000] access WARNING 10.66.49.134 hdfs -
    "GET /download_logs HTTP/1.0"


    What does this mean?
    A few hours ago I could still view MapReduce logs through the Hue interface.
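
    The traceback above ends in lxml's XMLSyntaxError: the task log page
    returned markup that the parser rejected. As a minimal sketch (not Hue's
    actual code, and using made-up sample markup), this is the difference
    between strict XML parsing, which raises XMLSyntaxError on mismatched
    tags, and lxml's recovering HTML parser, which tolerates them:

    ```python
    from lxml import etree, html

    # Sample markup with a missing </td></tr>, loosely resembling a
    # truncated tasklog page (hypothetical example, not the real page).
    broken = "<table><tr><td>log line</table>"

    # Strict XML parsing fails on the mismatched tags:
    try:
        etree.fromstring(broken)
        strict_ok = True
    except etree.XMLSyntaxError as exc:
        strict_ok = False
        print("strict parse failed:", exc)

    # lxml's HTML parser recovers from the same markup:
    doc = html.fromstring(broken)
    print("recovered text:", doc.text_content())
    ```

    Even the recovering HTML parser can bail out with a fatal error when the
    page also contains invalid bytes, which is consistent with the corrupted
    mapper output discussed later in this thread.
    
    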
  • Serega Sheypak at Apr 5, 2013 at 2:16 pm
    Hi. Is there any workaround? We are behind a firewall that hides the
    production cluster, so it's impossible to open a hole to every task
    node's UI just to view the logs...


    2013/4/4 Serega Sheypak <serega.sheypak@gmail.com>
    I can view these logs through the JobTracker interface without any
    problem. How can we fix this so we can see the logs through the Hue web
    UI? It's much more convenient there.


    2013/4/4 Serega Sheypak <serega.sheypak@gmail.com>
    Oops, the same thing happens with Hive action logs.
    I can see the Hive action step in the Hue Oozie dashboard, and I see the error text:

    addHivePartitionhive
    org/apache/hadoop/hive/cli/CliDriver

    but when I try to view the logs, I get the same error:


    http://prod-beeswax.lol.ru:8888/jobbrowser/jobs/job_201303201339_0047/single_logs

    [04/Apr/2013 08:44:28 +0000] thrift_util DEBUG Thrift call: <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user': u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0047', jobTrackerID=u'201303201339', jobID=47)), kwargs={})

    [04/Apr/2013 08:44:28 +0000] access INFO 10.100.231.188 hdfs - "GET /jobbrowser/jobs/job_201303201339_0047/single_logs HTTP/1.0"

    [04/Apr/2013 08:42:49 +0000] access WARNING 10.100.231.188 hdfs - "GET /logs HTTP/1.0"

    [04/Apr/2013 08:42:45 +0000] access INFO 10.100.231.188 hdfs - "GET /debug/check_config_ajax HTTP/1.0"

    [04/Apr/2013 08:42:44 +0000] middleware INFO Processing exception: Unexpected end tag : td, line 748, column 12: Traceback (most recent call last):
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 62, in decorate
    return view_func(request, *args, **kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 193, in job_single_logs
    return single_task_attempt_logs(request, **{'job': job.jobId, 'taskid': task.taskId, 'attemptid': task.taskAttemptIds[-1]})
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 62, in decorate
    return view_func(request, *args, **kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 290, in single_task_attempt_logs
    logs += [ section.strip() for section in attempt.get_task_log() ]
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/models.py", line 451, in get_task_log
    et = lxml.html.parse(data)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/lxml-2.2.2-py2.6-linux-x86_64.egg/lxml/html/__init__.py", line 661, in parse
    return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
    File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
    File "parser.pxi", line 1513, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71423)
    File "parser.pxi", line 1543, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:71733)
    File "parser.pxi", line 1426, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:70648)
    File "parser.pxi", line 997, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:67944)
    File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
    File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
    File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64084)
    XMLSyntaxError: Unexpected end tag : td, line 748, column 12

    [04/Apr/2013 08:42:44 +0000] models INFO Retrieving http://prod-node032.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0047_m_000000_0

    [04/Apr/2013 08:42:44 +0000] thrift_util DEBUG Thrift call <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker returned in 1ms: ThriftTaskTrackerStatus(taskReports=None, availableSpace=2026570104832, totalVirtualMemory=139483734016, failureCount=31, httpPort=50060, host='prod-node032.lol.ru', totalPhysicalMemory=135290486784, reduceCount=0, lastSeen=1365090162808, trackerName='tracker_prod-node032.lol.ru:localhost/127.0.0.1:35365', mapCount=0, maxReduceTasks=16, maxMapTasks=32)





    2013/4/4 Serega Sheypak <serega.sheypak@gmail.com>
    OK, I've got it. I visited the

    /var/log/hadoop-0.20-mapreduce/userlogs/job_201303201339_0031/attempt_201303201339_0031_m_000000_0

    directory on a task node and inspected the syslog file.

    There are unprintable characters coming from my mapper (its logging
    produces them):

    User-Agent: Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X)
    AppleWebKit/536.26 (KHTML, q��c
    ������jw�0�d ���RŬ�P �\�ħ�ˢ���{� mo��/p�� �ishi%20GaQ�]

    The second problem is that this file is HUGE, and 99% of the data is
    corrupted. I've removed the verbose logging and am now rerunning the job.
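
    Unprintable control bytes like the ones shown above are exactly what
    breaks downstream XML/HTML parsing. A minimal sketch of a sanitizer
    (a hypothetical helper, not part of Hue or Hadoop) that strips C0
    control characters before text is logged or parsed:

    ```python
    import re

    # Matches C0 control characters except tab (\x09), newline (\x0a),
    # and carriage return (\x0d), which are legitimate in log text.
    _CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

    def strip_control_chars(text):
        """Remove unprintable control characters that break XML/HTML parsers."""
        return _CONTROL_CHARS.sub("", text)

    # Example: corrupted user-agent string similar to the one above.
    print(strip_control_chars("User-Agent: Mozilla/5.0\x00\x07 (iPad)"))
    ```

    Applying something like this in the mapper's logging path would keep the
    syslog file parseable even if corrupted input slips through.
    
    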



    2013/4/4 Brian Burton <brian@cloudera.com>
    Hi Serega,

    The page Romain linked:


    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0

    That page should only be reachable internally on your network, so we
    need you to provide its contents.

    *Brian Burton*
    *Customer Operations Engineer*
    <http://www.cloudera.com>


    On Thu, Apr 4, 2013 at 10:05 AM, Serega Sheypak <
    serega.sheypak@gmail.com> wrote:
    Hi, hue version is: Hue 2.2.0

    [devops@pnode034 ~]$ hadoop version
    Hadoop 2.0.0-cdh4.2.0
    Subversion
    file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hadoop-2.0.0-cdh4.2.0/src/hadoop-common-project/hadoop-common
    -r 8bce4bd28a464e0a92950c50ba01a9deb1d85686
    Compiled by jenkins on Fri Feb 15 11:13:32 PST 2013
    From source with checksum 3eefc211a14ac7b6e764d6ded2eeeb26

    I didn't understand: "content of the above page". Which page do you mean?


    2013/4/1 Romain Rigaux <romain@cloudera.com>
    The page

    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0

    probably produces HTML with missing parts, so Hue can't parse it. It
    should not happen with other jobs. This handling is fixed in next
    month's release.

    Could you share your Hadoop and Hue version and the content of the
    above page so that we can reproduce/test it?

    Romain


    On Mon, Apr 1, 2013 at 7:17 AM, Serega Sheypak <
    serega.sheypak@gmail.com> wrote:
    Hi. Suddenly I've started to get errors while trying to view logs
    through the Hue admin:

    http://node11.lol.ru:8888/jobbrowser/jobs/job_201303201339_0025/single_logs

    I get this stack trace:

    [01/Apr/2013 07:05:40 +0000] access INFO 10.66.49.134 hdfs
    - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:05:43 +0000] access INFO 10.66.49.134 hdfs
    - "GET
    /jobbrowser/jobs/job_201303201339_0025/tasks/task_201303201339_0025_r_000164
    HTTP/1.0"
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 53ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0025/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000021-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25), priority=2, user='hdfs',
    startTime=1364825038848, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=205,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0025_m_000035_0':
    ThriftTaskStatus(finishTime=1364825088030, stateString='cleanup',
    startTime=1364825085984, sortFinishTime=0,
    taskTracker='tracker_prod-node014.lol.ru:localhost/127....
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTask(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftTaskID(asString=None, taskType=1, taskID=164,
    jobID=ThriftJobID(asString=None, jobTrackerID=u'201303201339', jobID=25))),
    kwargs={})
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTask returned in 6ms:
    ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0025_r_000164_0':
    ThriftTaskStatus(finishTime=1364825078834, stateString='reduce > reduce',
    startTime=1364825069066, sortFinishTime=1364825077527,
    taskTracker='tracker_prod-node029.lol.ru:localhost/127.0.0.1:43079',
    state=1, shuffleFinishTime=1364825076674, mapFinishTime=0,
    taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0025_r_000164_0',
    attemptID=0,
    taskID=ThriftTaskID(asString='task_201303201339_0025_r_000164', taskType=1,
    taskID=164, jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25))), diagnosticInfo='', phase=4,
    progress=1.0, outputSize=-1,
    counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=20), 'HDFS: Number of
    write operations': Thr...
    [01/Apr/2013 07:05:44 +0000] access INFO 10.66.49.134 hdfs
    - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:05:54 +0000] middleware DEBUG No desktop_app
    known for request.
    [01/Apr/2013 07:05:54 +0000] access INFO 10.66.49.134 hdfs
    - "GET /jobbrowser/ HTTP/1.0"
    [01/Apr/2013 07:05:54 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getAllJobs(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}),), kwargs={})
    [01/Apr/2013 07:05:54 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getAllJobs returned in
    6ms:
    ThriftJobList(jobs=[ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/devops/.staging/job_201303201339_0002/job.xml',
    queueName='default', user='devops',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000006-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0002',
    jobTrackerID='201303201339', jobID=2)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=3,
    jobID=ThriftJobID(asString='job_201303201339_0002',
    jobTrackerID='201303201339', jobID=2), priority=2, user='devops',
    startTime=1364819345925, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=None, desiredMaps=18, desiredReduces=168,
    finishedMaps=0, finishedReduces=0,
    jobID=ThriftJobID(asString='job_201303201339_0002',
    jobTrackerID='201303201339', jobID=2), priority=2,
    launchTime=1364819346297, startTime=1364819345925,
    finishTime=1364819398372), ThriftJobInProgress(profile=ThriftJo...
    [01/Apr/2013 07:05:55 +0000] access INFO 10.66.49.134 hdfs
    - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:05:55 +0000] access DEBUG 10.66.49.134 hdfs
    - "GET /static/art/datatables/sort_desc.png HTTP/1.0"
    [01/Apr/2013 07:06:06 +0000] access INFO 10.66.49.134 hdfs
    - "GET /jobbrowser/jobs/job_201303201339_0022 HTTP/1.0"
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0022',
    jobTrackerID=u'201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 32ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0022/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000020-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22), priority=2, user='hdfs',
    startTime=1364824738956, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=188,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_m_000018_0':
    ThriftTaskStatus(finishTime=1364824794989, stateString='cleanup',
    startTime=1364824793165, sortFinishTime=0,
    taskTracker='tracker_prod-node034.lol.ru:localhost/127....
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML returned in
    4ms: '<?xml version="1.0" encoding="UTF-8"
    standalone="no"?><configuration>\n<property><name>mapred.job.restart.recover</name><value>true</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>job.end.retry.interval</name><value>30000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>mapred.queue.default.acl-administer-jobs</name><value>*</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>dfs.image.transfer.bandwidthPerSec</name><value>0</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201...
    [01/Apr/2013 07:06:06 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/staging/landing/source/protei/http/2013/03/27/01?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:06:06 +0000] resource DEBUG GET Got response:
    {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:06:06 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/masterdata/source/protei/http/archive/2013/03/27/01?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:06:06 +0000] resource DEBUG GET Got response:
    {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups
    returned in 18ms:
    ThriftJobCounterRollups(reduceCounters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=3360), 'HDFS: Number
    of write operations': ThriftCounter(displayName='HDFS: Number of write
    operations', name='HDFS_WRITE_OPS', value=168), 'FILE: Number of read
    operations': ThriftCounter(displayName='FILE: Number of read operations',
    name='FILE_READ_OPS', value=0), 'HDFS: Number of bytes read':
    ThriftCounter(displayName='HDFS: Number of bytes read',
    name='HDFS_BYTES_READ', value=0), 'HDFS: Number of read operations':
    ThriftCounter(displayName='HDFS: Number of read operations',
    name='HDFS_READ_OPS', value=21), 'FILE: Number of bytes written':
    ThriftCounter(displayName='FILE: Number of bytes written',
    name='FILE_BYTES_WRITTEN', value=29347957), 'HDFS: Number of large read
    operations': ThriftCo...
    [01/Apr/2013 07:06:07 +0000] access INFO 10.66.49.134 hdfs
    - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:06:16 +0000] access INFO 10.66.49.134 hdfs
    - "GET
    /jobbrowser/jobs/job_201303201339_0022/tasks/task_201303201339_0022_r_000167/attempts/attempt_201303201339_0022_r_000167_0/logs
    HTTP/1.0"
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0022',
    jobTrackerID=u'201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 53ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0022/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000020-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22), priority=2, user='hdfs',
    startTime=1364824738956, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=188,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_m_000018_0':
    ThriftTaskStatus(finishTime=1364824794989, stateString='cleanup',
    startTime=1364824793165, sortFinishTime=0,
    taskTracker='tracker_prod-node034.lol.ru:localhost/127....
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTask(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftTaskID(asString=None, taskType=1, taskID=167,
    jobID=ThriftJobID(asString=None, jobTrackerID=u'201303201339', jobID=22))),
    kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTask returned in 6ms:
    ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_r_000167_0':
    ThriftTaskStatus(finishTime=1364824779167, stateString='reduce > reduce',
    startTime=1364824766546, sortFinishTime=1364824777715,
    taskTracker='tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833',
    state=1, shuffleFinishTime=1364824777120, mapFinishTime=0,
    taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0022_r_000167_0',
    attemptID=0,
    taskID=ThriftTaskID(asString='task_201303201339_0022_r_000167', taskType=1,
    taskID=167, jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22))), diagnosticInfo='', phase=4,
    progress=1.0, outputSize=-1,
    counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=20), 'HDFS: Number of
    write operations': Thr...
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), 'tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833'),
    kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker returned in
    1ms: ThriftTaskTrackerStatus(taskReports=None,
    availableSpace=2041132032000, totalVirtualMemory=139483734016,
    failureCount=32, httpPort=50060, host='prod-node014.lol.ru',
    totalPhysicalMemory=135290486784, reduceCount=0, lastSeen=1364825174380,
    trackerName='tracker_prod-node014.lol.ru:localhost/127.0.0.1:47833',
    mapCount=0, maxReduceTasks=16, maxMapTasks=32)
    [01/Apr/2013 07:06:16 +0000] models INFO Retrieving
    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0
    [01/Apr/2013 07:06:16 +0000] middleware INFO Processing
    exception: Unexpected end tag : td, line 7, column 12: Traceback (most
    recent call last):
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py",
    line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
    line 62, in decorate
    return view_func(request, *args, **kwargs)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
    line 290, in single_task_attempt_logs
    logs += [ section.strip() for section in attempt.get_task_log() ]
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/models.py",
    line 451, in get_task_log
    et = lxml.html.parse(data)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/lxml-2.2.2-py2.6-linux-x86_64.egg/lxml/html/__init__.py",
    line 661, in parse
    return etree.parse(filename_or_url, parser, base_url=base_url,
    **kw)
    File "lxml.etree.pyx", line 2698, in lxml.etree.parse
    (src/lxml/lxml.etree.c:49590)
    File "parser.pxi", line 1513, in lxml.etree._parseDocument
    (src/lxml/lxml.etree.c:71423)
    File "parser.pxi", line 1543, in lxml.etree._parseFilelikeDocument
    (src/lxml/lxml.etree.c:71733)
    File "parser.pxi", line 1426, in lxml.etree._parseDocFromFilelike
    (src/lxml/lxml.etree.c:70648)
    File "parser.pxi", line 997, in
    lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:67944)
    File "parser.pxi", line 539, in
    lxml.etree._ParserContext._handleParseResultDoc
    (src/lxml/lxml.etree.c:63820)
    File "parser.pxi", line 625, in lxml.etree._handleParseResult
    (src/lxml/lxml.etree.c:64741)
    File "parser.pxi", line 565, in lxml.etree._raiseParseError
    (src/lxml/lxml.etree.c:64084)
    XMLSyntaxError: Unexpected end tag : td, line 7, column 12

    [01/Apr/2013 07:06:16 +0000] access INFO 10.66.49.134 hdfs
    - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:06:20 +0000] access WARNING 10.66.49.134 hdfs
    - "GET /logs HTTP/1.0"
    [01/Apr/2013 07:06:31 +0000] access INFO 10.66.49.134 hdfs
    - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:14:18 +0000] access INFO 10.66.49.134 hdfs
    - "GET /jobbrowser/jobs/job_201303201339_0025 HTTP/1.0"
    [01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:18 +0000] thrift_util INFO Thrift exception;
    retrying: None
    [01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 57ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0025/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000021-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25), priority=2, user='hdfs',
    startTime=1364825038848, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=205,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0025_m_000035_0':
    ThriftTaskStatus(finishTime=1364825088030, stateString='cleanup',
    startTime=1364825085984, sortFinishTime=0,
    taskTracker='tracker_prod-node014.lol.ru:localhost/127....
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML returned in
    4ms: '<?xml version="1.0" encoding="UTF-8"
    standalone="no"?><configuration>\n<property><name>mapred.job.restart.recover</name><value>true</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>job.end.retry.interval</name><value>30000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.queue.default.acl-administer-jobs</name><value>*</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>dfs.image.transfer.bandwidthPerSec</name><value>0</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201...
    [01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/staging/landing/source/protei/http/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got response:
    {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/masterdata/source/protei/http/archive/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got response:
    {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups
    returned in 19ms:
    ThriftJobCounterRollups(reduceCounters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=3360), 'HDFS: Number
    of write operations': ThriftCounter(displayName='HDFS: Number of write
    operations', name='HDFS_WRITE_OPS', value=168), 'FILE: Number of read
    operations': ThriftCounter(displayName='FILE: Number of read operations',
    name='FILE_READ_OPS', value=0), 'HDFS: Number of bytes read':
    ThriftCounter(displayName='HDFS: Number of bytes read',
    name='HDFS_BYTES_READ', value=0), 'HDFS: Number of read operations':
    ThriftCounter(displayName='HDFS: Number of read operations',
    name='HDFS_READ_OPS', value=103), 'FILE: Number of bytes written':
    ThriftCounter(displayName='FILE: Number of bytes written',
    name='FILE_BYTES_WRITTEN', value=29347909), 'HDFS: Number of large read
    operations': ThriftC...
    [01/Apr/2013 07:14:19 +0000] access INFO 10.66.49.134 hdfs
    - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:15:25 +0000] access WARNING 10.66.49.134 hdfs
    - "GET /logs HTTP/1.0"
    [01/Apr/2013 07:15:33 +0000] access WARNING 10.66.49.134 hdfs
    - "GET /download_logs HTTP/1.0"


    What does this mean?
    A few hours ago I could still see the MapReduce logs through the Hue interface.
  • Romain Rigaux at Apr 8, 2013 at 7:26 am
    Hue is doing the equivalent of a 'wget' on this URL (there is no API for
    this in MR1), so yes, the 'hue' user needs access to the page.
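    For reference, the 'wget' equivalent can be sketched in a few lines: build the TaskTracker's /tasklog URL for the attempt and fetch it over plain HTTP. This is an illustrative sketch only; the helper names are not Hue's actual code.

```python
# Sketch of the 'wget'-equivalent Hue performs for MR1 task logs: there is
# no JobTracker API for log contents, so Hue HTTP-GETs the TaskTracker's
# /tasklog page directly. Helper names are illustrative, not Hue's.
from urllib.parse import urlencode
from urllib.request import urlopen

def build_tasklog_url(host, http_port, attempt_id):
    """Build the TaskTracker tasklog URL for one task attempt."""
    return "http://%s:%d/tasklog?%s" % (
        host, http_port, urlencode({"attemptid": attempt_id}))

def fetch_task_log(host, http_port, attempt_id, timeout=10):
    """Plain HTTP GET of the log page; this is why the 'hue' side needs
    network access to every TaskTracker's web port (50060 here)."""
    with urlopen(build_tasklog_url(host, http_port, attempt_id),
                 timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

    Anything that blocks that GET (firewall, permissions) surfaces as the "Cannot retrieve logs from TaskTracker" error seen later in this thread.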

    Romain
    On Sun, Apr 7, 2013 at 11:13 PM, Serega Sheypak wrote:

    OK, so the hue user should be granted access to the tasknode log
    directories?
    On 08.04.2013 10:09, "Romain Rigaux" <romain@cloudera.com>
    wrote:
    How did you retrieve the log file? (from which machine and user)


    The content looks fine, and the error seems to be simply that Hue is not
    authorized to access:


    http://prod-node013.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0103_m_000000_0

    Do you have the file that produced this error (the original one in
    the thread):


    et = lxml.html.parse(data)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/lxml-2.2.2-py2.6-linux-x86_64.egg/lxml/html/__init__.py", line 661, in parse
    return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
    File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
    File "parser.pxi", line 1513, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71423)
    File "parser.pxi", line 1543, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:71733)
    File "parser.pxi", line 1426, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:70648)
    File "parser.pxi", line 997, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:67944)
    File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
    File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
    File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64084)
    XMLSyntaxError: Unexpected end tag : td, line 748, column 12



    Romain
    On Sat, Apr 6, 2013 at 5:47 AM, Serega Sheypak wrote:

    Here it is:


    [06/Apr/2013 05:37:26 +0000] access WARNING 10.66.49.134 hdfs - "GET /logs HTTP/1.0"


    [06/Apr/2013 05:37:25 +0000] access INFO 10.66.49.134 hdfs - "GET /debug/check_config_ajax HTTP/1.0"


    [06/Apr/2013 05:37:25 +0000] middleware INFO Processing exception: <urlopen error Cannot retrieve logs from TaskTracker tracker_prod-node013.lol.ru:localhost/127.0.0.1:39567.>: Traceback (most recent call last):
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 62, in decorate
    return view_func(request, *args, **kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 193, in job_single_logs
    return single_task_attempt_logs(request, **{'job': job.jobId, 'taskid': task.taskId, 'attemptid': task.taskAttemptIds[-1]})
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 62, in decorate
    return view_func(request, *args, **kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 290, in single_task_attempt_logs
    logs += [ section.strip() for section in attempt.get_task_log() ]
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/models.py", line 449, in get_task_log
    raise urllib2.URLError(_("Cannot retrieve logs from TaskTracker %(id)s.") % {'id': self.taskTrackerId})
    URLError: <urlopen error Cannot retrieve logs from TaskTracker tracker_prod-node013.lol.ru:localhost/127.0.0.1:39567.>

    [06/Apr/2013 05:37:25 +0000] models INFO *Retrieving http://prod-node013.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0103_m_000000_0*


    [06/Apr/2013 05:37:25 +0000] thrift_util DEBUG Thrift call <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker returned in 2ms: ThriftTaskTrackerStatus(taskReports=[ThriftTaskStatus(finishTime=0, stateString='initializing', startTime=1365251839119, sortFinishTime=0, taskTracker='tracker_prod-node013.lol.ru:localhost/127.0.0.1:39567', state=0, shuffleFinishTime=0, mapFinishTime=0, taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0103_m_000000_0', attemptID=0, taskID=ThriftTaskID(asString='task_201303201339_0103_m_000000', taskType=0, taskID=0, jobID=ThriftJobID(asString='job_201303201339_0103', jobTrackerID='201303201339', jobID=103))), diagnosticInfo='', phase=1, progress=0.0, outputSize=-1, counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='Map-Reduce Framework', name='org.apache.hadoop.mapreduce.TaskCounter', counters={'Spilled Records': ThriftCounter(displayName='Spilled Records', name='SPILLED_RECORDS', value=0)})]))], availableSpace=2023627849728, totalVirtualMemory=139483734016, failureCount=32, httpPort=50060, host='prod-node013.lol.ru', totalPhysicalMemory=13529048...


    [06/Apr/2013 05:37:25 +0000] thrift_util DEBUG Thrift call: <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker(args=(RequestContext(confOptions={'effective_user': u'hdfs'}), 'tracker_prod-node013.lol.ru:localhost/127.0.0.1:39567'), kwargs={})


    Is it the thing you need?



    2013/4/6 Romain Rigaux <romain@cloudera.com>
    Could you send us a zip of the log page (from the JobTracker). That way
    we could investigate and provide a fix.

    Romain


    On Fri, Apr 5, 2013 at 7:16 AM, Serega Sheypak <
    serega.sheypak@gmail.com> wrote:
    Hi. Is there any workaround? We are behind a firewall that hides the
    production cluster. It's impossible to open a hole to each tasknode UI just
    to see the logs.


    2013/4/4 Serega Sheypak <serega.sheypak@gmail.com>
    I can see these logs through the JobTracker interface without any
    problem. How can we fix this and see the logs through the Hue web UI? It's
    very convenient to do it there.


    2013/4/4 Serega Sheypak <serega.sheypak@gmail.com>
    Oops, the same thing happens for Hive action logs.
    I can see the Hive action step in the Hue Oozie dashboard, and I see the error text:

    addHivePartitionhive
    org/apache/hadoop/hive/cli/CliDriver

    but when I try to view the logs, I get the same error:


    http://prod-beeswax.lol.ru:8888/jobbrowser/jobs/job_201303201339_0047/single_logs


    04/Apr/2013 08:44:28 +0000] thrift_util DEBUG Thrift call: <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user': u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0047', jobTrackerID=u'201303201339', jobID=47)), kwargs={})


    [04/Apr/2013 08:44:28 +0000] access INFO 10.100.231.188 hdfs - "GET /jobbrowser/jobs/job_201303201339_0047/single_logs HTTP/1.0"


    [04/Apr/2013 08:42:49 +0000] access WARNING 10.100.231.188 hdfs - "GET /logs HTTP/1.0"


    [04/Apr/2013 08:42:45 +0000] access INFO 10.100.231.188 hdfs - "GET /debug/check_config_ajax HTTP/1.0"


    [04/Apr/2013 08:42:44 +0000] middleware INFO Processing exception: Unexpected end tag : td, line 748, column 12: Traceback (most recent call last):
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 62, in decorate
    return view_func(request, *args, **kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 193, in job_single_logs
    return single_task_attempt_logs(request, **{'job': job.jobId, 'taskid': task.taskId, 'attemptid': task.taskAttemptIds[-1]})
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 62, in decorate
    return view_func(request, *args, **kwargs)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py", line 290, in single_task_attempt_logs
    logs += [ section.strip() for section in attempt.get_task_log() ]
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/models.py", line 451, in get_task_log
    et = lxml.html.parse(data)
    File "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/lxml-2.2.2-py2.6-linux-x86_64.egg/lxml/html/__init__.py", line 661, in parse
    return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
    File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
    File "parser.pxi", line 1513, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71423)
    File "parser.pxi", line 1543, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:71733)
    File "parser.pxi", line 1426, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:70648)
    File "parser.pxi", line 997, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:67944)
    File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
    File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
    File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64084)
    XMLSyntaxError: Unexpected end tag : td, line 748, column 12


    [04/Apr/2013 08:42:44 +0000] models INFO Retrieving http://prod-node032.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0047_m_000000_0


    [04/Apr/2013 08:42:44 +0000] thrift_util DEBUG Thrift call <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker returned in 1ms: ThriftTaskTrackerStatus(taskReports=None, availableSpace=2026570104832, totalVirtualMemory=139483734016, failureCount=31, httpPort=50060, host='prod-node032.lol.ru', totalPhysicalMemory=135290486784, reduceCount=0, lastSeen=1365090162808, trackerName='tracker_prod-node032.lol.ru:localhost/127.0.0.1:35365', mapCount=0, maxReduceTasks=16, maxMapTasks=32)





    2013/4/4 Serega Sheypak <serega.sheypak@gmail.com>
    OK, I've got it. I visited the

    /var/log/hadoop-0.20-mapreduce/userlogs/job_201303201339_0031/attempt_201303201339_0031_m_000000_0
    directory on the tasknode and inspected the syslog file.

    There are some unprintable characters coming from my mapper (its logging
    produces them):

    User-Agent: Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X)
    AppleWebKit/536.26 (KHTML, q��c
    ������jw�0�d ���RŬ�P �\�ħ�ˢ���{� mo��/p�� �ishi%20GaQ�]

    The second problem is that this file is HUGE; 99% of the data is
    corrupted. I've removed the verbose logging and am now trying to rerun the job.
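    One way to guard against this in the future is to sanitize strings before logging them, so binary junk like the User-Agent bytes above never reaches the task log. A minimal sketch, in Python regardless of the mapper's actual language; the helper is hypothetical, not part of Hadoop or Hue:

```python
# Hypothetical helper: replace non-printable characters before logging.
# Control bytes (anything outside printable ASCII, tab, and newline) are
# what typically corrupt the TaskTracker log page for downstream parsers.
import re

_UNPRINTABLE = re.compile(r"[^\x20-\x7e\t\n]")

def sanitize_for_log(text, replacement="?"):
    """Return text with unprintable characters replaced, keeping the
    log file valid for HTML rendering and parsing."""
    return _UNPRINTABLE.sub(replacement, text)
```

    Logging `sanitize_for_log(user_agent)` instead of the raw value would keep the syslog file parseable even when the input contains garbage.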



    2013/4/4 Brian Burton <brian@cloudera.com>
    Hi Serega,

    The page Romain linked:


    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0

    That page should only be reachable internally for you, so we need you
    to provide its contents.

    *Brian Burton*
    *Customer Operations Engineer*
    <http://www.cloudera.com>


    On Thu, Apr 4, 2013 at 10:05 AM, Serega Sheypak <
    serega.sheypak@gmail.com> wrote:
    Hi, the Hue version is 2.2.0.

    [devops@pnode034 ~]$ hadoop version
    Hadoop 2.0.0-cdh4.2.0
    Subversion
    file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hadoop-2.0.0-cdh4.2.0/src/hadoop-common-project/hadoop-common
    -r 8bce4bd28a464e0a92950c50ba01a9deb1d85686
    Compiled by jenkins on Fri Feb 15 11:13:32 PST 2013
    From source with checksum 3eefc211a14ac7b6e764d6ded2eeeb26

    I didn't understand "content of the above page". Which page do you mean?


    2013/4/1 Romain Rigaux <romain@cloudera.com>
    The page
    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0
    probably produces some HTML with missing parts, so Hue can't parse it. It
    should not happen with other jobs. This handling will be fixed in next
    month's release.
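    The fix presumably amounts to parsing the page leniently instead of strictly. As an illustration only (stdlib, not Hue's actual patch), a parser that tolerates a stray </td> and simply collects the text could look like:

```python
# Sketch of lenient log-page parsing: Python's stdlib html.parser ignores
# malformed or stray tags instead of raising, unlike a strict XML parse
# that fails on "Unexpected end tag : td". Illustrative only.
from html.parser import HTMLParser

class LogTextExtractor(HTMLParser):
    """Collects text content, silently skipping stray/unmatched tags."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def extract_log_text(html):
    parser = LogTextExtractor()
    parser.feed(html)  # a stray </td> is ignored, not an error
    return "".join(parser.chunks)
```

    With this approach, corrupted bytes in the log would come through as garbage text rather than aborting the whole page render.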

    Could you share your Hadoop and Hue version and the content of
    the above page so that we can reproduce/test it?

    Romain


    On Mon, Apr 1, 2013 at 7:17 AM, Serega Sheypak <
    serega.sheypak@gmail.com> wrote:
    Hi. I've suddenly started to get errors while trying to view
    logs through the Hue admin:

    http://node11.lol.ru:8888/jobbrowser/jobs/job_201303201339_0025/single_logs

    I get the following stack trace:

    [01/Apr/2013 07:05:40 +0000] access INFO 10.66.49.134
    hdfs - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:05:43 +0000] access INFO 10.66.49.134
    hdfs - "GET
    /jobbrowser/jobs/job_201303201339_0025/tasks/task_201303201339_0025_r_000164
    HTTP/1.0"
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 53ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0025/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000021-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25), priority=2, user='hdfs',
    startTime=1364825038848, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=205,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0025_m_000035_0':
    ThriftTaskStatus(finishTime=1364825088030, stateString='cleanup',
    startTime=1364825085984, sortFinishTime=0,
    taskTracker='tracker_prod-node014.lol.ru:localhost/127....
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTask(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftTaskID(asString=None, taskType=1, taskID=164,
    jobID=ThriftJobID(asString=None, jobTrackerID=u'201303201339', jobID=25))),
    kwargs={})
    [01/Apr/2013 07:05:43 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTask returned in 6ms:
    ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0025_r_000164_0':
    ThriftTaskStatus(finishTime=1364825078834, stateString='reduce > reduce',
    startTime=1364825069066, sortFinishTime=1364825077527,
    taskTracker='tracker_prod-node029.lol.ru:localhost/127.0.0.1:43079',
    state=1, shuffleFinishTime=1364825076674,
    mapFinishTime=0,
    taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0025_r_000164_0',
    attemptID=0,
    taskID=ThriftTaskID(asString='task_201303201339_0025_r_000164', taskType=1,
    taskID=164, jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25))), diagnosticInfo='', phase=4,
    progress=1.0, outputSize=-1,
    counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=20), 'HDFS: Number of
    write operations': Thr...
    [01/Apr/2013 07:05:44 +0000] access INFO 10.66.49.134
    hdfs - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:05:54 +0000] middleware DEBUG No
    desktop_app known for request.
    [01/Apr/2013 07:05:54 +0000] access INFO 10.66.49.134
    hdfs - "GET /jobbrowser/ HTTP/1.0"
    [01/Apr/2013 07:05:54 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getAllJobs(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}),), kwargs={})
    [01/Apr/2013 07:05:54 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getAllJobs returned in
    6ms:
    ThriftJobList(jobs=[ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/devops/.staging/job_201303201339_0002/job.xml',
    queueName='default', user='devops',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000006-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0002',
    jobTrackerID='201303201339', jobID=2)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=3,
    jobID=ThriftJobID(asString='job_201303201339_0002',
    jobTrackerID='201303201339', jobID=2), priority=2, user='devops',
    startTime=1364819345925, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=None, desiredMaps=18, desiredReduces=168,
    finishedMaps=0, finishedReduces=0,
    jobID=ThriftJobID(asString='job_201303201339_0002',
    jobTrackerID='201303201339', jobID=2), priority=2,
    launchTime=1364819346297, startTime=1364819345925,
    finishTime=1364819398372), ThriftJobInProgress(profile=ThriftJo...
    [01/Apr/2013 07:05:55 +0000] access INFO 10.66.49.134
    hdfs - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:05:55 +0000] access DEBUG 10.66.49.134
    hdfs - "GET /static/art/datatables/sort_desc.png HTTP/1.0"
    [01/Apr/2013 07:06:06 +0000] access INFO 10.66.49.134
    hdfs - "GET /jobbrowser/jobs/job_201303201339_0022 HTTP/1.0"
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0022',
    jobTrackerID=u'201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 32ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0022/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000020-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22), priority=2, user='hdfs',
    startTime=1364824738956, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=188,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_m_000018_0':
    ThriftTaskStatus(finishTime=1364824794989, stateString='cleanup',
    startTime=1364824793165, sortFinishTime=0,
    taskTracker='tracker_prod-node034.lol.ru:localhost/127....
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML returned in
    4ms: '<?xml version="1.0" encoding="UTF-8"
    standalone="no"?><configuration>\n<property><name>mapred.job.restart.recover</name><value>true</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>job.end.retry.interval</name><value>30000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>mapred.queue.default.acl-administer-jobs</name><value>*</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0022.xml</source></property>\n<property><name>dfs.image.transfer.bandwidthPerSec</name><value>0</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201...
    [01/Apr/2013 07:06:06 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/staging/landing/source/protei/http/2013/03/27/01?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:06:06 +0000] resource DEBUG GET Got
    response: {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:06:06 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/masterdata/source/protei/http/archive/2013/03/27/01?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:06:06 +0000] resource DEBUG GET Got
    response: {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:06 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups
    returned in 18ms:
    ThriftJobCounterRollups(reduceCounters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=3360), 'HDFS: Number
    of write operations': ThriftCounter(displayName='HDFS: Number of write
    operations', name='HDFS_WRITE_OPS', value=168), 'FILE: Number of read
    operations': ThriftCounter(displayName='FILE: Number of read operations',
    name='FILE_READ_OPS', value=0), 'HDFS: Number of bytes read':
    ThriftCounter(displayName='HDFS: Number of bytes read',
    name='HDFS_BYTES_READ', value=0), 'HDFS: Number of read operations':
    ThriftCounter(displayName='HDFS: Number of read operations',
    name='HDFS_READ_OPS', value=21), 'FILE: Number of bytes written':
    ThriftCounter(displayName='FILE: Number of bytes written',
    name='FILE_BYTES_WRITTEN', value=29347957), 'HDFS: Number of large read
    operations': ThriftCo...
    [01/Apr/2013 07:06:07 +0000] access INFO 10.66.49.134
    hdfs - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:06:16 +0000] access INFO 10.66.49.134
    hdfs - "GET
    /jobbrowser/jobs/job_201303201339_0022/tasks/task_201303201339_0022_r_000167/attempts/attempt_201303201339_0022_r_000167_0/logs
    HTTP/1.0"
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0022',
    jobTrackerID=u'201303201339', jobID=22)), kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 53ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://
    prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0022/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000020-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22), priority=2, user='hdfs',
    startTime=1364824738956, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=188,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_m_000018_0':
    ThriftTaskStatus(finishTime=1364824794989, stateString='cleanup',
    startTime=1364824793165, sortFinishTime=0,
    taskTracker='tracker_prod-node034.lol.ru:localhost/127....
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTask(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftTaskID(asString=None, taskType=1, taskID=167,
    jobID=ThriftJobID(asString=None, jobTrackerID=u'201303201339', jobID=22))),
    kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTask returned in 6ms:
    ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0022_r_000167_0':
    ThriftTaskStatus(finishTime=1364824779167, stateString='reduce > reduce',
    startTime=1364824766546, sortFinishTime=1364824777715,
    taskTracker='tracker_prod-node014.lol.ru:localhost/
    127.0.0.1:47833', state=1, shuffleFinishTime=1364824777120,
    mapFinishTime=0,
    taskID=ThriftTaskAttemptID(asString='attempt_201303201339_0022_r_000167_0',
    attemptID=0,
    taskID=ThriftTaskID(asString='task_201303201339_0022_r_000167', taskType=1,
    taskID=167, jobID=ThriftJobID(asString='job_201303201339_0022',
    jobTrackerID='201303201339', jobID=22))), diagnosticInfo='', phase=4,
    progress=1.0, outputSize=-1,
    counters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=20), 'HDFS: Number of
    write operations': Thr...
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), 'tracker_prod-node014.lol.ru:
    localhost/127.0.0.1:47833'), kwargs={})
    [01/Apr/2013 07:06:16 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getTracker returned in
    1ms: ThriftTaskTrackerStatus(taskReports=None,
    availableSpace=2041132032000, totalVirtualMemory=139483734016,
    failureCount=32, httpPort=50060, host='prod-node014.lol.ru',
    totalPhysicalMemory=135290486784, reduceCount=0, lastSeen=1364825174380,
    trackerName='tracker_prod-node014.lol.ru:localhost/
    127.0.0.1:47833', mapCount=0, maxReduceTasks=16,
    maxMapTasks=32)
    [01/Apr/2013 07:06:16 +0000] models INFO Retrieving
    http://prod-node014.lol.ru:50060/tasklog?attemptid=attempt_201303201339_0022_r_000167_0
    [01/Apr/2013 07:06:16 +0000] middleware INFO Processing
    exception: Unexpected end tag : td, line 7, column 12: Traceback (most
    recent call last):
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py",
    line 100, in get_response
    response = callback(request, *callback_args,
    **callback_kwargs)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
    line 62, in decorate
    return view_func(request, *args, **kwargs)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/views.py",
    line 290, in single_task_attempt_logs
    logs += [ section.strip() for section in
    attempt.get_task_log() ]
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/apps/jobbrowser/src/jobbrowser/models.py",
    line 451, in get_task_log
    et = lxml.html.parse(data)
    File
    "/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/share/hue/build/env/lib/python2.6/site-packages/lxml-2.2.2-py2.6-linux-x86_64.egg/lxml/html/__init__.py",
    line 661, in parse
    return etree.parse(filename_or_url, parser,
    base_url=base_url, **kw)
    File "lxml.etree.pyx", line 2698, in lxml.etree.parse
    (src/lxml/lxml.etree.c:49590)
    File "parser.pxi", line 1513, in lxml.etree._parseDocument
    (src/lxml/lxml.etree.c:71423)
    File "parser.pxi", line 1543, in
    lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:71733)
    File "parser.pxi", line 1426, in
    lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:70648)
    File "parser.pxi", line 997, in
    lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:67944)
    File "parser.pxi", line 539, in
    lxml.etree._ParserContext._handleParseResultDoc
    (src/lxml/lxml.etree.c:63820)
    File "parser.pxi", line 625, in lxml.etree._handleParseResult
    (src/lxml/lxml.etree.c:64741)
    File "parser.pxi", line 565, in lxml.etree._raiseParseError
    (src/lxml/lxml.etree.c:64084)
    XMLSyntaxError: Unexpected end tag : td, line 7, column 12

    [01/Apr/2013 07:06:16 +0000] access INFO 10.66.49.134
    hdfs - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:06:20 +0000] access WARNING 10.66.49.134
    hdfs - "GET /logs HTTP/1.0"
    [01/Apr/2013 07:06:31 +0000] access INFO 10.66.49.134
    hdfs - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:14:18 +0000] access INFO 10.66.49.134
    hdfs - "GET /jobbrowser/jobs/job_201303201339_0025 HTTP/1.0"
    [01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:18 +0000] thrift_util INFO Thrift
    exception; retrying: None
    [01/Apr/2013 07:14:18 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJob(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString=u'job_201303201339_0025',
    jobTrackerID=u'201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJob returned in 57ms:
    ThriftJobInProgress(profile=ThriftJobProfile(jobFile='hdfs://
    prod-node015.lol.ru:8020/user/hdfs/.staging/job_201303201339_0025/job.xml',
    queueName='default', user='hdfs',
    name='oozie:action:T=map-reduce:W=Url-rating-subworkflow:A=Url-rating-subworkflow-run:ID=0000021-130320135309911-oozie-oozi-W',
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)),
    status=ThriftJobStatus(cleanupProgress=1.0, reduceProgress=1.0, runState=2,
    jobID=ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25), priority=2, user='hdfs',
    startTime=1364825038848, setupProgress=1.0, mapProgress=1.0,
    schedulingInfo='NA'), tasks=ThriftTaskInProgressList(numTotalTasks=205,
    tasks=[ThriftTaskInProgress(runningAttempts=[],
    taskStatuses={'attempt_201303201339_0025_m_000035_0':
    ThriftTaskStatus(finishTime=1364825088030, stateString='cleanup',
    startTime=1364825085984, sortFinishTime=0,
    taskTracker='tracker_prod-node014.lol.ru:localhost/127....
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobConfXML returned in
    4ms: '<?xml version="1.0" encoding="UTF-8"
    standalone="no"?><configuration>\n<property><name>mapred.job.restart.recover</name><value>true</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>job.end.retry.interval</name><value>30000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>mapred.queue.default.acl-administer-jobs</name><value>*</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201303201339_0025.xml</source></property>\n<property><name>dfs.image.transfer.bandwidthPerSec</name><value>0</value><source>programatically</source><source>/data/disk0/mapred/jt/jobTracker/job_201...
    [01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/staging/landing/source/protei/http/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got
    response: {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:14:19 +0000] http_client DEBUG GET
    http://prod-node015.lol.ru:50070/webhdfs/v1/masterdata/source/protei/http/archive/2013/03/27/02?op=GETFILESTATUS&user.name=hue&doas=hdfs
    [01/Apr/2013 07:14:19 +0000] resource DEBUG GET Got
    response: {"FileStatus":{"accessTime":0,"b...
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call:
    <class
    'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups(args=(RequestContext(confOptions={'effective_user':
    u'hdfs'}), ThriftJobID(asString='job_201303201339_0025',
    jobTrackerID='201303201339', jobID=25)), kwargs={})
    [01/Apr/2013 07:14:19 +0000] thrift_util DEBUG Thrift call
    <class 'hadoop.api.jobtracker.Jobtracker.Client'>.getJobCounterRollups
    returned in 19ms:
    ThriftJobCounterRollups(reduceCounters=ThriftGroupList(groups=[ThriftCounterGroup(displayName='File
    System Counters', name='org.apache.hadoop.mapreduce.FileSystemCounter',
    counters={'FILE: Number of bytes read': ThriftCounter(displayName='FILE:
    Number of bytes read', name='FILE_BYTES_READ', value=3360), 'HDFS: Number
    of write operations': ThriftCounter(displayName='HDFS: Number of write
    operations', name='HDFS_WRITE_OPS', value=168), 'FILE: Number of read
    operations': ThriftCounter(displayName='FILE: Number of read operations',
    name='FILE_READ_OPS', value=0), 'HDFS: Number of bytes read':
    ThriftCounter(displayName='HDFS: Number of bytes read',
    name='HDFS_BYTES_READ', value=0), 'HDFS: Number of read operations':
    ThriftCounter(displayName='HDFS: Number of read operations',
    name='HDFS_READ_OPS', value=103), 'FILE: Number of bytes written':
    ThriftCounter(displayName='FILE: Number of bytes written',
    name='FILE_BYTES_WRITTEN', value=29347909), 'HDFS: Number of large read
    operations': ThriftC...
    [01/Apr/2013 07:14:19 +0000] access INFO 10.66.49.134
    hdfs - "GET /debug/check_config_ajax HTTP/1.0"
    [01/Apr/2013 07:15:25 +0000] access WARNING 10.66.49.134
    hdfs - "GET /logs HTTP/1.0"
    [01/Apr/2013 07:15:33 +0000] access WARNING 10.66.49.134
    hdfs - "GET /download_logs HTTP/1.0"


    What does this error mean?
    Just a few hours ago I could still view MapReduce logs through the Hue interface.
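    For context, the `XMLSyntaxError` in the traceback is raised while parsing the
    HTML that the TaskTracker's `/tasklog` page returned: the page apparently
    contains a stray `</td>` that a strict parser rejects. As a rough illustration
    (using only the Python standard library rather than lxml, and a hypothetical
    markup fragment), a strict XML parser fails hard on such markup while a lenient
    HTML parser recovers and keeps the text:

    ```python
    import xml.etree.ElementTree as ET
    from html.parser import HTMLParser

    # Hypothetical fragment of the kind of invalid markup /tasklog can emit:
    # a <tr> closed by a stray </td>, which is not well-formed XML.
    bad_markup = "<html><body><table><tr>text</td></tr></table></body></html>"

    # A strict XML parser (like lxml.etree in the traceback) rejects it outright.
    try:
        ET.fromstring(bad_markup)
        strict_ok = True
    except ET.ParseError as e:
        strict_ok = False
        print("strict parse failed:", e)

    # A lenient HTML parser tolerates the mismatched tag and still yields the text.
    class TextCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.chunks = []

        def handle_data(self, data):
            self.chunks.append(data)

    p = TextCollector()
    p.feed(bad_markup)
    print("lenient parse recovered text:", "".join(p.chunks))
    ```

    A recovering parser (for example lxml's `etree.HTMLParser(recover=True)`) would
    tolerate the same markup; whether that is the right fix here, or whether the
    TaskTracker page itself should be emitting valid HTML, is a separate question.
    This sketch only shows the strict-vs-lenient distinction behind the traceback.
    
    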

Discussion Overview
group: scm-users
category: hadoop
posted: Apr 1, '13 at 2:17p
active: Apr 8, '13 at 7:26a
posts: 4
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion
Serega Sheypak: 3 posts, Romain Rigaux: 1 post
