FAQ
Hi Matti,

The tarball version of CDH does not support short-circuit (direct) reads because
libhadoop.so is not shipped in the tarball. You must install from
an .rpm, .deb, or parcel in order to use short-circuit local reads.

See
https://ccp.cloudera.com/display/CDH4DOC/Tips+and+Guidelines#TipsandGuidelines-ImprovePerformanceforLocalReads

Can you re-install CDH using packages?
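As a quick way to confirm this (a minimal sketch; the paths below are typical package-install locations and are assumptions, not guaranteed for every layout), you can check whether libhadoop.so is actually on disk:

```shell
# check_native DIR: prints "present" if libhadoop.so exists under DIR,
# "missing" otherwise. A minimal sketch; real installs may keep native
# libs under a platform subdirectory of lib/native.
check_native() {
  if [ -e "$1/libhadoop.so" ]; then
    echo present
  else
    echo missing
  fi
}

# Typical spots to try: package installs usually have it, tarballs don't.
check_native /usr/lib/hadoop/lib/native
check_native "${HADOOP_HOME:-/opt/hadoop}/lib/native"
```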

Thanks,
Alan


On Tue, Mar 19, 2013 at 12:37 AM, Matti Niemenmaa wrote:

Hi,

I have recently begun using Impala 0.6 with CDH 4.2.0 and have managed
to get a tarball installation of it working, with the exception of
block location metadata. I've followed the latest instructions at

https://ccp.cloudera.com/display/IMPALA10BETADOC/Configuring+Impala+for+Performance
to configure Hadoop appropriately, and as far as I can tell
short-circuit reads and native checksumming are both functioning properly.
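For reference, the settings that guide asks for live in hdfs-site.xml and look roughly like this (property names are from the CDH4-era docs; treat the exact set as an assumption and verify against the linked page for your version):

```xml
<!-- hdfs-site.xml: CDH4-era settings for Impala local reads.
     Verify names and values against the linked performance guide. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- gates the getHdfsBlockLocations RPC that Impala uses for disk ids -->
  <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
  <value>true</value>
</property>
```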

I've replicated the problems with a simple two-machine test cluster with
the namenode, jobtracker, and statestored on one of the two computers
and a datanode, tasktracker, and impalad on the other.

When I run a simple "select * from rc limit 10", Impala's relevant
output is:

13/03/18 18:09:11 INFO planner.HdfsScanNode: collecting partitions for
table rc
13/03/18 18:09:11 INFO service.Frontend: get scan range locations
13/03/18 18:09:12 INFO catalog.HdfsTable: loaded partiton
PartitionBlockMetadata{#blocks=405, #filenames=203, totalStringLen=9966}
13/03/18 18:09:12 INFO hdfs.BlockStorageLocationUtil: Failed to connect
to datanode 10.10.253.222:49697
13/03/18 18:09:12 INFO catalog.HdfsTable: loaded disk ids for
PartitionBlockMetadata{#blocks=405, #filenames=203, totalStringLen=9966}
13/03/18 18:09:12 INFO catalog.HdfsTable: block metadata cache:
CacheStats{hitCount=0, missCount=1, loadSuccessCount=1,
loadExceptionCount=0, totalLoadTime=878930596, evictionCount=0}

Impala's logs additionally have this warning:

W0318 18:09:12.878093 1555 hdfs-scan-node.cc:184] Unknown disk id.
This will negatively affect performance. Check your hdfs settings to
enable block location metadata.
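The "Unknown disk id" warning means impalad got no volume ids back from the getHdfsBlockLocations call. As a rough sanity check (a sketch with a hypothetical helper; it assumes the name/value pair sits on adjacent lines, as Hadoop's stock hdfs-site.xml formats them), one can grep the datanode's hdfs-site.xml for the flag that gates that RPC:

```shell
# has_flag FILE: prints "enabled" if FILE sets
# dfs.datanode.hdfs-blocks-metadata.enabled to true, else "disabled".
# Hypothetical helper; assumes <name> and <value> are on adjacent lines.
has_flag() {
  if grep -q 'dfs.datanode.hdfs-blocks-metadata.enabled' "$1" &&
     grep -A1 'dfs.datanode.hdfs-blocks-metadata.enabled' "$1" |
       grep -q '<value>true</value>'; then
    echo enabled
  else
    echo disabled
  fi
}

# Example: has_flag /etc/hadoop/conf/hdfs-site.xml
```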

And Hadoop's logs from the same time state this:

2013-03-18 18:09:12,711 WARN org.apache.hadoop.ipc.Server: IPC Server
Responder, call

org.apache.hadoop.hdfs.protocol.ClientDatanodeProtocol.getHdfsBlockLocations
from 10.10.253.222:52125: outp
2013-03-18 18:09:12,713 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 49697 caught an exception
java.nio.channels.ClosedChannelException
at
sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2134)
at org.apache.hadoop.ipc.Server.access$2000(Server.java:108)
at
org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:931)
at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:997)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1741)

It seems clear that the ClosedChannelException is the cause of Impala's
troubles, but I can't figure out what could be causing the IPC issue.
Any help would be appreciated.
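One reading of the trace above: the IPC responder on port 49697 is writing a reply to a socket the client has already closed, which is what a client-side timeout on the block-storage-locations call would produce. A CDH4-era knob for that timeout is sketched below; the property name and value are assumptions taken from period Impala documentation, so verify them locally before relying on this:

```xml
<!-- hdfs-site.xml on the client (impalad) side. Hypothetical sketch:
     property name and value are assumptions from CDH4-era Impala docs. -->
<property>
  <name>dfs.client.file-block-storage-locations.timeout</name>
  <value>3000</value>
</property>
```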

Discussion Overview
group: impala-user @ hadoop
posted: Mar 19, '13 at 6:35p
active: Mar 19, '13 at 6:35p
posts: 1
users: 1 (Alan Choi: 1 post)
website: cloudera.com
irc: #hadoop
