I am new to Cloudera. I have installed Cloudera Manager 4.5, and now I am
trying to install Hive, Impala, and Beeswax, but I am running into problems.
With Hive, I am able to run my HQL from the CLI.
But Impala is showing a bad condition.
Through the Beeswax console I am not able to browse my HDFS.
These are the steps I followed for the installation:
Installing Impala with Cloudera Manager Free Edition<https://ccp.cloudera.com/display/FREE45BETADOC/Installing+Impala+with+Cloudera+Manager+Free+Edition>
Step 1: Install CDH and Hive
To use Cloudera Impala, you must install CDH, Hive, and Impala. Install
CDH, Hive and Impala on the nodes that will run Impala as described in Automated
Installation of Cloudera Manager and CDH<https://ccp.cloudera.com/display/FREE45BETADOC/Automated+Installation+of+Cloudera+Manager+and+CDH>.
You must perform your installation using packages; installation of Impala
is not currently available using parcels.
Impala only supports RHEL/CentOS 6.2.
Step 2: Install a Database for the Hive Metastore
In order to run Impala, you must have the Hive metastore configured to use
either a MySQL or a PostgreSQL database. The default Hive metastore Derby
database is not supported with Impala.
The following instructions describe how to install a MySQL database to use
for the Hive metastore. Install this database on a single machine in your
*To install and configure a MySQL database:*
1. Install the MySQL server.
$ sudo yum install mysql-server
After issuing the command to install MySQL, you may need to respond to
prompts to confirm that you do want to complete the installation.
2. After installation completes, start the mysql daemon.
$ sudo service mysqld start
In the following step, your current root password is blank. Press the
Enter key when you're prompted for the root password.
3. Set the MySQL root password:
$ sudo /usr/bin/mysql_secure_installation
Enter current password for root (enter for none):
OK, successfully used password, moving on...
Set root password? [Y/n] y
Re-enter new password:
Remove anonymous users? [Y/n] Y
Disallow root login remotely? [Y/n] N
Remove test database and access to it [Y/n] Y
Reload privilege tables now? [Y/n] Y
4. Configure MySQL server to start at boot:
$ sudo /sbin/chkconfig mysqld on
$ sudo /sbin/chkconfig --list mysqld
mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
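The boot configuration can also be checked from a script rather than by eye. The following is a minimal sketch, assuming `chkconfig --list mysqld` prints a line in the format shown above; `enabled_in_runlevels` is a hypothetical helper name, not part of any Cloudera or MySQL tooling.

```shell
#!/bin/sh
# Sketch: succeed only if a "chkconfig --list" line shows the service as "on"
# in every runlevel passed as an argument.
# enabled_in_runlevels is a hypothetical helper name.
enabled_in_runlevels() {
    # Collapse tabs and repeated spaces so real chkconfig output matches too.
    norm=$(printf '%s' "$1" | tr -s '[:space:]' ' ')
    shift
    for level in "$@"; do
        case " $norm " in
            *" $level:on "*) ;;      # this runlevel is enabled
            *) return 1 ;;           # runlevel is off or missing
        esac
    done
    return 0
}

# Sample line copied from the output above. On a real host you could use:
#   sample=$(/sbin/chkconfig --list mysqld)
sample='mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off'

if enabled_in_runlevels "$sample" 3 5; then
    echo "mysqld starts at boot"
else
    echo "mysqld is NOT enabled at boot"
fi
```

Runlevels 3 and 5 are checked here because they are the usual multi-user runlevels on RHEL/CentOS 6; adjust the list if your hosts boot into a different one.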
Step 3: Configure a Remote Database as the Hive Metastore
The recommended production environment for Hive is to use a database on one
or more remote servers as the metastore, and MySQL is the most popular
database to use. To set this up:
Impala does not support Derby.
Configuring the Remote MySQL Database
Before you can run the Hive metastore with a remote MySQL database, you
must configure a connector to the remote MySQL database, set up the initial
database schema, and configure the MySQL user account for the Hive user.
*Install the MySQL JDBC Connector<http://www.mysql.com/downloads/connector/j> in the Hive lib directory:*
$ curl -L
tar xz
$ sudo cp mysql-connector-java-5.1.22/mysql-connector-java-5.1.22-bin.jar
The MySQL administrator should create the initial database schema using the
hive-schema-0.9.0.mysql.sql file located in the
$ mysql -u root -p
mysql> CREATE DATABASE hivemetastoredb;
mysql> USE hivemetastoredb;
mysql> CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
mysql> GRANT ALL PRIVILEGES ON hivemetastoredb.* TO 'hive'@'%' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;
Take note of the settings you applied to the remote MySQL database. You
will need these values, such as the database name and database user name,
when you configure Impala.
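One way to keep these values together is a small shell snippet that records them and prints the matching connection string. This is only a sketch: the `METASTORE_*` variable names, the `metastore_jdbc_url` helper, and the host `db.example.com` are placeholders introduced for illustration. The URL follows the usual MySQL Connector/J form `jdbc:mysql://host:port/database`.

```shell
#!/bin/sh
# Sketch: record the metastore settings from the steps above in one place.
# The METASTORE_* names and db.example.com are placeholders.
METASTORE_HOST=${METASTORE_HOST:-db.example.com}
METASTORE_PORT=${METASTORE_PORT:-3306}     # default MySQL port
METASTORE_DB=hivemetastoredb               # from CREATE DATABASE above
METASTORE_USER=hive                        # from CREATE USER above

# Build a Connector/J style URL: jdbc:mysql://host:port/database
metastore_jdbc_url() {
    printf 'jdbc:mysql://%s:%s/%s\n' "$1" "$2" "$3"
}

metastore_jdbc_url "$METASTORE_HOST" "$METASTORE_PORT" "$METASTORE_DB"

# To confirm that the hive account can log in from another host, a command
# like this could be run by hand (it prompts for the password):
#   mysql -h "$METASTORE_HOST" -P "$METASTORE_PORT" -u "$METASTORE_USER" -p "$METASTORE_DB"
```

If the manual `mysql` login fails from the host that runs the Hive metastore, the Hive and Impala services are unlikely to reach the database either, so it is worth checking before moving on.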
Step 4: Add the Impala Service
As you configure Impala, you will need to modify HDFS and Hive settings.
Configurations that are recommended for many environments are as follows:
HDFS settings:

Property                                                                     Value
DataNode Local Path Access Users (dfs.block.local-path-access.user)         impala
DataNode Data Directory Permissions (dfs.datanode.data.dir.perm)            755
Enable HDFS Block Metadata API (dfs.datanode.hdfs-blocks-metadata.enabled)  true
Enable HDFS Short Circuit Read (dfs.client.read.shortcircuit)               true

Hive settings:

Property                                                              Value
Hive Metastore Database Type                                          mysql
Hive Metastore Database Name                                          hivemetastoredb
Hive Metastore Database Host                                          <db hostname>
Hive Metastore Database Port                                          3306
Hive Metastore Database User (javax.jdo.option.ConnectionUserName)    hive
Hive Metastore Database
*To add and configure the Impala service*
1. Connect to the Cloudera Manager Admin Console using a browser.
The server URL takes the following form:
<Server host> is the fully-qualified domain name or IP address of the
host machine where the Cloudera Manager Server is installed.
<port> is the port configured for the Cloudera Manager Server. The
default port is 7180.
For example, use a URL such as the following:
2. Click the *Services* tab, then choose *All Services*.
3. Click the *Add a Service* button.
4. Select *Impala* and click *Continue*.
5. Select the dependencies for your service and click *Continue*.
Usually this will be an HDFS, HBase, and Hive service, indicating that the
Impala service depends on these services.
6. Confirm the host assignment for the Impala services and click *
7. Review the configuration changes to be made during the Impala
installation and click *Accept*.
8. The results of the installation process are displayed. Click *Continue*
to return to the Cloudera Manager Admin Console.
9. Use the HDFS service page to configure the HDFS service for
Impala. Click the instance of HDFS that supports Impala.
10. Pull down the *Configuration* tab at the top of the window and
1. Search for dfs.block.local-path-access.user, which is located under
the *Service-Wide > Performance* category in the left hand column.
Add impala to this list, where impala is the System user configured
in the Impala service.
2. Click *Save Changes*.
3. Search for dfs.datanode.data.dir.perm, which is under the *Security*
category under each DataNode role configuration group. The search
function should display all occurrences of this property if you have
multiple DataNode role configuration groups. Set the *DataNode Data
Directory Permissions* property to 755. This is recommended for
4. Click *Save Changes*.
5. Click the *Configuration* tab and click *Service Wide*. Enable the
following HDFS properties:
- Search for dfs.datanode.hdfs-blocks-metadata.enabled and verify
that it is enabled.
- Search for dfs.client.read.shortcircuit and verify that it is
enabled.
Both of these are normally enabled by default.
6. Click *Save Changes*.
11. Restart the HDFS service.
1. Click *Services* and click *All Services*.
2. For the HDFS service you modified, click *Actions* and click *
12. Set the Hive metastore
1. Go to the Hive service page (by selecting the Hive service from the
*Services* menu or from the *All Services* page).
2. Pull down the *Configuration* tab at the top of the window and
3. Under *Service-Wide* select *Hive Metastore* in the left hand
4. Update the properties as appropriate to your metastore
configuration, then *Save Changes*.
13. (Optional) Once you have configured and saved your Hive metastore
settings, it is recommended that you validate the Hive metastore by
executing a Hive query. You can validate the metastore using Hue's Beeswax
14. Click *Services* and click the new Impala service. Click *Actions* and click
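If Impala still reports a problem after these steps, one thing worth checking from a cluster host is whether the HDFS properties from step 10 are actually in effect. The sketch below uses `hdfs getconf -confKey`, which reads the client configuration on the host where it runs, so the result may differ from what the DataNodes themselves use. `report_setting` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: compare the HDFS settings Impala relies on with the expected values.
# report_setting is a hypothetical helper: $1=key, $2=expected, $3=actual.
report_setting() {
    if [ "$3" = "$2" ]; then
        echo "OK   $1=$3"
    else
        echo "WARN $1=$3 (expected $2)"
    fi
}

# Only query HDFS when the hdfs command is present on this host.
if command -v hdfs >/dev/null 2>&1; then
    for pair in \
        dfs.datanode.hdfs-blocks-metadata.enabled=true \
        dfs.client.read.shortcircuit=true
    do
        key=${pair%%=*}         # text before the first "="
        expected=${pair#*=}     # text after the first "="
        actual=$(hdfs getconf -confKey "$key" 2>/dev/null)
        report_setting "$key" "$expected" "$actual"
    done
else
    echo "hdfs command not found; run this on a cluster host"
fi
```

A `WARN` line does not prove the setting is wrong in Cloudera Manager. It only shows that the configuration visible to the `hdfs` client on that host does not match, which is a hint to re-check the HDFS service configuration and restart.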
Configuring Hue Beeswax to Connect with Impala (Optional)
You may want to connect with Impala and execute queries from the Hue
Beeswax interface. By default, Hue Beeswax is not configured to use RPC,
which is required for querying Impala. To enable RPC connections, you must
add configuration settings in the *Hue Service Configuration Safety Valve* so that they are added to the
hue.ini file. Impala requires Hue 2.1 or later in CDH 4.1 or later.
*To configure Hue to connect with Impala:*
1. In the Cloudera Manager Admin Console, go to the *Hue service >
2. In the *Service-Wide > Advanced* section, set the following values
for the *Hue Service Configuration Safety Valve*:
beeswax_server_host=<Impala daemon hostname>
beeswax_server_port=<Impala daemon port>
where: <Impala daemon hostname> refers to any host where any Impala
daemon is running. Hue must connect with only one Impala daemon, so any of
them will work. <Impala daemon port> refers to the port on that host to
use to connect to the Impala daemon. The default is 21000.
If Hue and the Impala daemon are installed on the same host, and you are
using CDH4.1, then you must add the beeswax_meta_server_only=9004 configuration value to the
*Hue Service Configuration Safety Valve* as shown below to avoid a port
conflict in Hue:
beeswax_server_host=<Impala daemon hostname>
beeswax_server_port=<Impala daemon port>
beeswax_meta_server_only=9004
3. Click *Save*.
4. Click *Services* and click the Hue service. Click *Actions* and click
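To avoid typing mistakes in the Safety Valve, the lines from step 2 can be generated by a small script. This is a sketch only: `hue_beeswax_snippet` and `impalad01.example.com` are hypothetical. The `[beeswax]` section header is an assumption based on where these keys normally live in hue.ini; the steps above list only the key lines.

```shell
#!/bin/sh
# Sketch: print the Safety Valve lines for one Impala daemon.
# hue_beeswax_snippet is a hypothetical helper name.
#   $1 = Impala daemon hostname
#   $2 = Impala daemon port (defaults to 21000)
#   $3 = "same-host" to add the CDH4.1 port-conflict setting
hue_beeswax_snippet() {
    host=$1
    port=${2:-21000}
    echo "[beeswax]"                       # assumed hue.ini section header
    echo "beeswax_server_host=$host"
    echo "beeswax_server_port=$port"
    if [ "${3:-}" = "same-host" ]; then
        echo "beeswax_meta_server_only=9004"
    fi
}

# Example: Hue and the Impala daemon share a host on CDH4.1.
hue_beeswax_snippet impalad01.example.com 21000 same-host
```

Paste the printed lines into the *Hue Service Configuration Safety Valve* and drop the third argument when Hue and the Impala daemon run on different hosts.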
For information about using Beeswax for queries, see Beeswax<https://ccp.cloudera.com/display/CDH4DOC/Beeswax>.