I've recently converted a metastore from derby to MySQL. It wasn't pain
free, but we made it through without major issues. Here is what I recommend
you try; though I make no promises it'll work for anyone else:
1. Backup your metastore files
2. Install razorsql (http://www.razorsql.com/)
3. Make a copy of your metastore files to work with in razorsql (not the
backup, and not the actual running copy either)
4. From razorsql do the following:
Connections->Add Connection Profile
Profile Name: Hive Metastore
Database Directory: Point to the directory with the metastore content copy
DB Tools-> Export Data
multiple tables, schema: APP, next
SQL Statements, Generate SQL INSERT statements, Generate DDL statements, next
Export to single file, \ for escape single quotes, <SEMI-COLON> sql
statement separator, next
Select a filename (metastore.sql), save
5. Run the resulting metastore.sql file through the attached
metastoreprocess perl script (e.g. cat metastore.sql | ./metastoreprocess >
6. Eyeball the metastore-processed.sql file for anything glaringly wrong
7. Load the file into mysql (e.g. cat metastore-processed.sql | mysql)
8. Reconfigure your configs to point at the new metastore location
Worth noting is that I actually did steps 1-7 to test the import. Then I
stopped everything that used Hive and repeated the entire process again,
followed by replacing the configs to point at the new metastore and starting
the hiveserver. We did notice one issue in that the regular expressions we
had in some of the tables got mangled due to escaping during the transfers
and conversions. We manually fixed those later and everything started
working again without issue.
I hope this helps. If you find errors or omissions and tweak the process for
improvement -- please reply back so others can bask in the knowledge.
We've been running hive in the default derby single user mode for a while.
Now we've got more users interested in Hive and so would like to change the
metastore to run off of mysql. I was able to find a pretty easy tutorial
for doing this at this blog:http://blog.milford.io/2010/06/installing-apache-hive-with-a-mysql-metastore-in-centos/
Before we pull the trigger...we have a couple concerns that are not clear
from the documentation
1) Since we already have a lot of tables in hive, if we switch to server
mode using mysql, will hive automatically rebuild the metadata?
2) If not, how do we rebuild the metadata from derby in mysql? Is there a
script we can use?
3) Any other gotchas we should be aware about?
I believe we are running 0.5.