Grokbase Groups Hive user July 2011
hey guys

i'm seeing the "Error: Java heap space" msg quite often when running hive
queries. i know one way to fix it is to raise the jvm memory using this
property: mapred.child.java.opts (currently our cluster uses this value:
-Xmx512M -XX:+UseCompressedOops).
i'm hesitant to increase this value because we're running services
other than hadoop on the same boxes and i wouldn't want to risk running out
of memory for those services. i also wonder if it's because my query is
inefficient.
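
for reference, a minimal sketch of raising the heap for a single hive session instead of cluster-wide, assuming the cluster config does not mark mapred.child.java.opts as final. the 1024M value is only an illustrative guess, not a tuned recommendation, and the task JVMs for this query would still take that extra memory on the shared boxes while it runs:

```sql
-- assumption: 1024M is an illustrative value, not a tuned one
SET mapred.child.java.opts=-Xmx1024M -XX:+UseCompressedOops;

-- run the INSERT OVERWRITE query in this same session;
-- other sessions and jobs keep the cluster default (-Xmx512M)
```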

so my query looks like this:
INSERT OVERWRITE TABLE tmp_metrics_filtered
SELECT metrics.*
FROM metrics LEFT OUTER JOIN internal_users
  ON (metrics.uid = internal_users.uid)
WHERE internal_users.uid IS NULL AND date_str = '2011-07-13'

it's filtering out internal user requests from our metrics logs and then
storing the result into a temporary table. there are only 40 entries in
the internal_users table. we have about 10GB of metrics logs for that day.
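
since internal_users is so small, one variant that might be worth trying is a map-side join, sketched here with hive's MAPJOIN hint. this is an untested assumption about the cause, not a confirmed fix: the idea is that the 40-row table gets loaded into memory by each mapper, so the 10GB of metrics rows would not need to go through a reduce-side join.

```sql
-- sketch only: same query as above, with a hint asking hive to
-- hold internal_users (the small, right-hand table) in memory
-- and stream metrics through the mappers
INSERT OVERWRITE TABLE tmp_metrics_filtered
SELECT /*+ MAPJOIN(internal_users) */ metrics.*
FROM metrics LEFT OUTER JOIN internal_users
  ON (metrics.uid = internal_users.uid)
WHERE internal_users.uid IS NULL AND date_str = '2011-07-13';
```

the hint only applies here because the small table is on the right side of the left outer join. whether it actually avoids the heap error would need to be checked against the attached log.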

i've also attached a log file for more details.
thx!

Discussion Overview
group: user @
categories: hive, hadoop
posted: Jul 15, '11 at 2:57a
active: Jul 15, '11 at 2:57a
posts: 1
users: 1
website: hive.apache.org

1 user in discussion

Shouguo Li: 1 post