Grokbase Groups Hive user August 2011


I am able to create tables in Hive, but I have a problem integrating
Hive with HBase.

I am following this doc.
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

My versions are: Hadoop 0.20.2, Hive 0.7.1, HBase 0.20.6

hive> CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
console:

java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.<init>(Lorg/apache/hadoop/conf/Configuration;)V
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:74)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:158)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:344)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:470)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3146)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:213)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.DDLTask

Any idea on how to proceed further or thoughts about the cause of the issue?
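
The descriptor (Lorg/apache/hadoop/conf/Configuration;)V in the error appears to refer to an HBaseAdmin constructor that takes an org.apache.hadoop.conf.Configuration. A NoSuchMethodError generally means the class found at runtime does not have a method that the calling code was compiled against, so one way to narrow this down is to check which constructors the HBaseAdmin on the classpath actually offers. The following is only a diagnostic sketch: the class name ConstructorCheck and the helper hasConstructor are made up for illustration, and it assumes the same HBase and Hadoop jars that Hive uses are on the classpath when it is run.

```java
// Hypothetical diagnostic helper, not part of Hive or HBase.
class ConstructorCheck {

    /**
     * Returns true if the named class can be loaded and has a public
     * constructor with exactly the given parameter types.
     */
    static boolean hasConstructor(String className, String... paramTypeNames) {
        ClassLoader loader = ConstructorCheck.class.getClassLoader();
        try {
            // Load without running static initializers; we only inspect signatures.
            Class<?> target = Class.forName(className, false, loader);
            Class<?>[] paramTypes = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < paramTypeNames.length; i++) {
                paramTypes[i] = Class.forName(paramTypeNames[i], false, loader);
            }
            target.getConstructor(paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, constructor missing, or a dependency failed to link.
            return false;
        }
    }

    public static void main(String[] args) {
        // The constructor named in the stack trace above.
        boolean found = hasConstructor(
                "org.apache.hadoop.hbase.client.HBaseAdmin",
                "org.apache.hadoop.conf.Configuration");
        System.out.println("HBaseAdmin(Configuration) available: " + found);
    }
}
```

If this reports that the constructor is not available, the HBase jar on the classpath is probably not the version that the Hive HBase storage handler was built against.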

On Thu, Aug 25, 2011 at 6:00 PM, Ashutosh Chauhan wrote:
Christian,
Looks like it's not possible to do the setup that you are looking for.
The problem arises because HiveServer extends HMSHandler directly instead of
accessing the metastore through HiveMetaStoreClient, and because of this the
metastore Thrift interface is bypassed entirely. HiveServer will contact MySQL
directly and won't go through the external metastore service as you have in your
diagram. If you consider this a blocker, please open a JIRA for more
discussion.
Hope it helps,
Ashutosh
On Wed, Aug 24, 2011 at 23:21, Christian Kurz wrote:

Thanks, Edward and Ashutosh

Ashutosh,
yes, I do not understand why the service "hiveserver" still uses a Derby
instance even though it should be talking to the service "metastore". Btw,
if I run the hiveserver without having started the metastore service, the
hiveserver complains when I try to let it execute a HiveQL command through
JDBC:

...
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Could not connect to meta store using any of the URIs provided)
at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:919)
...
(full stacktrace at the end of this post)

which is exactly what I expect and which makes me somewhat confident that
I have configured things correctly.

The entire issue came up, because the hiveserver service did not work,
when started from the same directory, from which the metastore service had
been started. It turned out that this was because both services were trying
to set up a Derby instance in the current dir and therefore ran into a file
locking situation. I have worked around this by starting the two services
from different directories, but I am worried that I'd be missing an
important point in my setup.
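
The locking conflict described above is what one would expect if the embedded Derby database is named by a relative path such as metastore_db (the directory name that shows up in the open-file listing in this thread) and that path is resolved against the directory each JVM was started from. The sketch below only illustrates that path resolution; the class DerbyLocation, the helper databaseDir, and the second directory /home/hadoop/hiveserver_run are hypothetical, and it does not start Derby.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of relative-path resolution; does not touch Derby.
class DerbyLocation {

    /** Resolves a relative database directory name against a working directory. */
    static Path databaseDir(String workingDir, String databaseName) {
        return Paths.get(workingDir).resolve(databaseName).normalize();
    }

    public static void main(String[] args) {
        String db = "metastore_db";

        // Both services started from the same directory: same database directory.
        Path metastore = databaseDir("/home/hadoop/hive_admin", db);
        Path hiveserver = databaseDir("/home/hadoop/hive_admin", db);
        System.out.println("same dir, same database: " + metastore.equals(hiveserver));

        // Workaround from the thread: start the second service elsewhere.
        Path moved = databaseDir("/home/hadoop/hiveserver_run", db);
        System.out.println("different dir, same database: " + metastore.equals(moved));
    }
}
```

Starting the services from different directories avoids the lock because each JVM then creates its own database directory, which also means the two services are not sharing one Derby database.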

When I run "pfiles <pid of hiveserver>" it lists these files for the
hiveserver service (which should not need a Derby instance, as far as I
understood):
...tons of jars...
/home/hadoop/hive_admin/derby.log
/home/hadoop/hive_admin/metastore_db/log/log1.dat
/home/hadoop/hive_admin/metastore_db/dbex.lck
/home/hadoop/hive_admin/metastore_db/seg0/c191.dat
/home/hadoop/hive_admin/metastore_db/seg0/c1a1.dat
...
/home/hadoop/hive_admin/metastore_db/seg0/c431.dat
/home/hadoop/hive_admin/metastore_db/seg0/c451.dat

Any pointers appreciated. If anybody thinks this is a bug, I can file one.

Thanks,
Christian


full stacktrace:

Hive history file=/tmp/hadoop/hive_job_log_hadoop_201108242305_155100916.txt
FAILED: Error in semantic analysis: Table not found weblog
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Could not connect to meta store using any of the URIs provided)
at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:919)
at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:904)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:7074)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6573)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:736)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:116)
at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:699)
at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:677)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:183)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:151)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:1855)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:1865)
at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:917)
... 13 more
FAILED: Error in metadata: MetaException(message:Could not connect to meta store using any of the URIs provided)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask



On 25.08.2011 01:29, Ashutosh Chauhan wrote:

Edward,
Apart from recommended best practices, what Christian is asking is why
HiveServer is still trying to interact with a local DB instance even after
setting the config variables. AFAIK it should not. Christian, you found that
out by looking at the files opened by the HiveServer JVM. Can you provide more
info there, such as how you found that out and which files these are?
Ashutosh

On Wed, Aug 24, 2011 at 14:20, Edward Capriolo <edlinuxguru@gmail.com>
wrote:
On Wed, Aug 24, 2011 at 3:02 PM, Christian Kurz wrote:

Thanks for the quick reply, Edward

I am not sure I got you: My HiveService has been started
with hive.metastore.local=false. So shouldn't it use thrift instead of its
own local Derby instance?
Thanks,
Christian
On 24.08.2011 at 19:33, Edward Capriolo <edlinuxguru@gmail.com> wrote:


On Wed, Aug 24, 2011 at 10:53 AM, Christian Kurz wrote:

Greetings,

could somebody confirm/correct my understanding of a fully distributed
Hive setup, please?

My setup is as follows

Java application using Hive JDBC driver connects to
hive --service hiveserver, which connects to
hive --service metastore, which uses an embedded Derby database for
metadata storage

Please find more details in the image attached.

The thing I find confusing is that JVM2 (Hive Server) starts up a Derby
database instance. I can see that from the files the JVM has opened.

Does anybody know why the Hive Server needs a Derby instance even
though hive-site.xml says hive.metastore.local=false?

Any hints are much appreciated.

Thanks,
Christian

btw,
I have not been able to access the picture on the wiki. ("Not
permitted"; even though I have registered on the wiki)
hive.metastore.local is really misnamed.

local=true means communicate using DataNucleus/JPOX, talking directly
to the metastore.
local=false means use Thrift, which is essentially a level of
indirection.
Talking about HiveService can confuse things because HiveService is a
different Thrift interface.
You could be set up like this:
HiveServiceClient->HiveService->metastore.local=true->derby
or
HiveServiceClient->HiveService->metastore.local=false->thrift->hive_metastore
Most people are set up like this:
HiveServiceClient->HiveService->metastore.local=true->mysql
cli->metastore.local=true->mysql

Group: user @
Categories: hive, hadoop
Posted: Aug 25, '11 at 11:26p
Active: Aug 25, '11 at 11:26p
Posts: 1
Users: 1
Website: hive.apache.org

User in discussion: Karthik kottapalli (1 post)