Thanks Gopal. I filed an issue to cover JDBC+setMaxRows:

As for your offer to have me test a patch: unfortunately we tend to run our
production software on customers' Hadoop clusters, so we can't easily patch
their Hive instances. But I'll still take you up on that if I find some
time to try it.

On Tue, Jul 21, 2015 at 11:14 PM, Gopal Vijayaraghavan wrote:

> Just want to make sure I understand the behavior once that bug is
> fixed... a 'select *' with no limit will run without a M/R job and
> instead stream. Is that correct?

Yes, that's the intended behaviour. I can help you get a fix in, if you
have some time to test out my WIP patches.

> That may incidentally solve another bug I'm seeing: when you use JDBC
> templates to set the limit (setMaxRows in Spring in my setup), it does
> not avoid the M/R job (and no limit clause appears in the hive-server2
> log). Instead, the M/R job gets launched... I'm not sure whether the
> JDBC framework would subsequently apply a limit once the job finishes.
> I haven't spotted this issue in JIRA; I'd be happy to file it if
> that's useful to you.

File a JIRA; that would be very useful for me.
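
For readers following along, here is a minimal sketch of the workaround implied by the report above: put the limit in the SQL text itself, so the server sees it, rather than relying only on the JDBC-level `setMaxRows` cap. The class and method names (`LimitPushdown`, `withLimit`, `firstColumn`) are hypothetical and not from this thread. `Statement.setMaxRows` is the standard `java.sql` call that Spring's `JdbcTemplate.setMaxRows` forwards to. The helper is deliberately naive and assumes the query does not already end in a LIMIT clause. Whether Hive then skips the M/R job depends on the server version and configuration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class LimitPushdown {

    // Hypothetical helper: append an explicit LIMIT so the limit is part of
    // the query text the server receives. Assumes no existing LIMIT clause.
    static String withLimit(String sql, int maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive");
        }
        String trimmed = sql.trim();
        // Drop a trailing semicolon, if any, before appending the clause.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return trimmed + " LIMIT " + maxRows;
    }

    // Runs the limited query and returns the first column of each row.
    // The connection is supplied by the caller (e.g. a HiveServer2 JDBC
    // connection); nothing here is Hive-specific.
    static List<String> firstColumn(Connection conn, String sql, int maxRows)
            throws SQLException {
        List<String> rows = new ArrayList<>();
        try (Statement stmt = conn.createStatement()) {
            // JDBC-level cap, kept as a safety net alongside the explicit LIMIT.
            stmt.setMaxRows(maxRows);
            try (ResultSet rs = stmt.executeQuery(withLimit(sql, maxRows))) {
                while (rs.next()) {
                    rows.add(rs.getString(1));
                }
            }
        }
        return rows;
    }
}
```

With this sketch, `LimitPushdown.withLimit("SELECT * FROM t", 10)` yields `SELECT * FROM t LIMIT 10`, so the limit would show up in the hive-server2 log, unlike the `setMaxRows`-only case described above.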

There's a lot of low-hanging fruit in the JDBC + Prepared Statement
codepath, so going over the issues & filing your findings would help me
pick up and knock them off one by one when I'm back.

Prasanth's GitHub has some automated benchmarking tools for JDBC, which I
use heavily -

There are some known issues which have a 2-3x perf degradation for the
simple query patterns you're running, like -


Discussion Overview
group: user @
categories: hive, hadoop
posted: Jul 22, '15 at 1:37a
active: Jul 22, '15 at 5:31p