Grokbase Groups Hive user May 2011
This is great info. Thanks a lot for sharing :)

From: Paul Ingles <[email protected]>
To: [email protected]
Sent: Wed, May 4, 2011 4:48:20 AM
Subject: Re: HIVE Server multiple instances

For future reference I've posted a little more about our setup

On Tue, May 3, 2011 at 8:01 PM, Paul Ingles wrote:

Nothing specifically about our Hive setup although some of us at Forward have
blogged bits and pieces about Hive + Hadoop and have a few Hadoop/Hive related
libs on our GitHub account.
I've blogged a few bits, as has one of my colleagues.

Another colleague also presented a little about our setup during a Hadoop meetup
last summer. The numbers Andy mentioned will be a little out of date, but the
presentation does include some screenshots of a few of the surrounding apps we
built that connect to Hive and Hadoop (including a web-based Hive query tool +
work queue).

I had a quick search through the mailing lists when we had connection problems
but I think most of it was discussed/resolved during a chat I had with Shevek
from Karmasphere at a London pub following a Hadoop meetup :)

If you're interested, I've posted a gist that contains our HAProxy config;
clients connect to port 10000 and are balanced across ports 10001 to 10005 on
each of 2 servers (so actually 10 backend servers).
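
To make the arithmetic above concrete, here is a minimal Python sketch that enumerates the 10 backends and prints a HAProxy-style `listen` block for them. It is not the config from the gist: the host names, the section name `hive`, the `leastconn` balance algorithm and the `check` option are all assumptions chosen for illustration. Only the front-end port (10000), the back-end port range (10001 to 10005) and the number of servers (2) come from the description above.

```python
# Hypothetical host names; the thread only says "2 servers".
HOSTS = ["hive-a.example.internal", "hive-b.example.internal"]
FRONTEND_PORT = 10000                # port the clients connect to
BACKEND_PORTS = range(10001, 10006)  # 10001..10005 inclusive


def backends(hosts=HOSTS, ports=BACKEND_PORTS):
    """Return every (host, port) pair a HiveServer instance listens on."""
    return [(host, port) for host in hosts for port in ports]


def haproxy_listen_block(name="hive"):
    """Render a TCP-mode listen section covering all backends.

    The balance algorithm and health-check option are assumptions,
    not taken from the original config.
    """
    lines = [
        f"listen {name}",
        f"    bind *:{FRONTEND_PORT}",
        "    mode tcp",
        "    balance leastconn",
    ]
    for index, (host, port) in enumerate(backends(), start=1):
        lines.append(f"    server hive{index} {host}:{port} check")
    return "\n".join(lines)


if __name__ == "__main__":
    print(haproxy_listen_block())
```

Each HiveServer instance then only has to handle the connections HAProxy hands it, while clients keep using the single well-known port.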

Be happy to talk more about our experience; feel free to ping me an email off
list if you'd like.

On 3 May 2011, at 19:18, Matthew Rathbone wrote:

Hey Paul,

I'd be very interested in reading about your hadoop/hive setup, do you have a
blog post or anything describing this setup, or some of the issues you've had
with hive?

Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
[email protected] | @rathboma | 4sq

On Tuesday, May 3, 2011 at 2:15 PM, Paul Ingles wrote:
HiveServer does seem to support multiple connections but I think it still has
thread-safety problems. We've certainly had instability problems with the
Thrift server in the past and now run 5 or so instances behind the HAProxy
load-balancer. Since we did that it's been significantly better.

I think the JDBC server still operates using thrift to connect to the
HiveServer so I would expect it to have similar problems (but I may have got
that wrong :)

On 3 May 2011, at 18:59, Matthew Rathbone wrote:

Even if it is single threaded it certainly seems to support multiple
connections.

We run 5 workers all connected at the same time, each executing a different
query (with a different connection per worker).
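
A rough Python sketch of that pattern, for anyone who wants to try it: a fixed pool of worker threads drains a queue of queries, and each worker opens and owns its own connection. Everything named here is hypothetical. `run_workers`, the `connect` factory and `FakeConnection` are stand-ins, not a real Hive client API, and the real workers described above may be built quite differently.

```python
import queue
import threading


def run_workers(queries, connect, num_workers=5):
    """Drain `queries` using num_workers threads, one connection per thread.

    `connect` is a hypothetical factory returning an object with
    execute(query) and close() methods.
    """
    pending = queue.Queue()
    for query in queries:
        pending.put(query)

    results = []
    results_lock = threading.Lock()

    def worker():
        conn = connect()  # each worker owns its own connection
        try:
            while True:
                try:
                    query = pending.get_nowait()
                except queue.Empty:
                    return  # nothing left to do
                outcome = conn.execute(query)
                with results_lock:
                    results.append((query, outcome))
        finally:
            conn.close()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


class FakeConnection:
    """Stand-in for a real client connection, used only for illustration."""

    opened = 0
    _count_lock = threading.Lock()

    def __init__(self):
        with FakeConnection._count_lock:
            FakeConnection.opened += 1

    def execute(self, query):
        return f"ran: {query}"

    def close(self):
        pass


if __name__ == "__main__":
    done = run_workers([f"query {i}" for i in range(12)], FakeConnection)
    print(len(done), "queries finished on", FakeConnection.opened, "connections")
```

This version drains a fixed list and then stops; a long-running service fed by a queue that keeps growing would block on `pending.get()` instead of returning when the queue is empty.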

Hope that helps

On Tuesday, May 3, 2011 at 1:40 PM, V.Senthil Kumar wrote:
Thanks Matthew. The wiki page says
it's single threaded. I have a queue of queries which gets added to dynamically
all the time. By the time I run 1 query using 1 JDBC connection, the queue
has had more queries added and builds up a backlog. So that's why I was wondering
whether I can run two or more instances to avoid having a big backlog in the queue.

----- Original Message ----
From: Matthew Rathbone <[email protected]>
To: [email protected]
Sent: Tue, May 3, 2011 7:46:49 AM
Subject: Re: HIVE Server multiple instances

Why would you want to run two? I think it is multithreaded, so you can query it
from two different connections

Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
[email protected] | @rathboma | 4sq

On Monday, May 2, 2011 at 6:41 PM, V.Senthil Kumar wrote:
I have one instance of HIVE JDBC server running on port 10000. Can I run
another instance on a different port? Would it cause a concurrency issue on the
underlying data warehouse files? Please clarify.

V.Senthil Kumar
