Even the simplest Hive queries seem to consume 100% CPU. This is
on a small 4-node cluster. The machines are pretty beefy (16 cores
per machine, tons of RAM, 16 M+R maximum tasks configured, 1GB RAM for
mapred.child.java.opts, etc). One example is a simple query like
"select count(1) from events", where the events table has daily
partitions of log files in gzipped format. While this is probably too
generic a question and there is a bunch of investigation we still need
to do, are there any specific areas I should look at? Has anyone seen
anything like this before? Also, are there any tools or easy options to
profile Hive query execution?

Thanks in advance,

group: user @
categories: hive, hadoop
posted: Feb 3, '11 at 8:49p
active: Feb 3, '11 at 11:50p

2 users in discussion

Vijay: 2 posts Viral Bajaria: 1 post