Hi Hive users,
I just released a tool to import data from SQL databases into files
stored on HDFS ("Sqoop"; see
http://issues.apache.org/jira/browse/hadoop-5815). An extension that
I'm now working on is to use the Hive JDBC client to connect to a Hive
metastore attached to the same HDFS instance and carry the table
forward from HDFS into Hive by executing auto-generated "CREATE TABLE"
and "LOAD DATA" statements.
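
As a rough sketch of what that statement generation might look like: the class, method, table, and path names below are hypothetical and are not taken from Sqoop, and the comma field delimiter is just an assumption about how the imported files are laid out. The resulting strings would then be handed to the Hive JDBC connection through the standard java.sql.Statement.execute(String) call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not Sqoop's actual code.
class HiveStatementSketch {

    // Builds a CREATE TABLE statement from an ordered map of
    // column name -> Hive type name.
    static String createTable(String table, Map<String, String> columns) {
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, String> e : columns.entrySet()) {
            cols.add(e.getKey() + " " + e.getValue());
        }
        // Assumes the imported files are comma-delimited text.
        return "CREATE TABLE " + table + " " + cols
            + " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
            + " STORED AS TEXTFILE";
    }

    // Builds a LOAD DATA statement that moves files already on HDFS
    // into the Hive table.
    static String loadData(String hdfsPath, String table) {
        return "LOAD DATA INPATH '" + hdfsPath + "' INTO TABLE " + table;
    }

    public static void main(String[] args) {
        Map<String, String> cols = new LinkedHashMap<>();
        cols.put("id", "INT");
        cols.put("name", "STRING");
        // Table name and HDFS path are made up for the example.
        System.out.println(createTable("employees", cols));
        System.out.println(loadData("/user/example/employees", "employees"));
    }
}
```
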
Most of the JDBC SQL types translate straightforwardly into Hive
types. However, the SQL DECIMAL type (a.k.a. NUMERIC) is a
fixed-precision numeric type. This is certainly not an INT/BIGINT in
Hive parlance, but I'm hesitant to use DOUBLE because it is imprecise.
I'd like some suggestions from the Hive community on where I should go
with this:
Option A) Use DOUBLE and lose the precise nature of the type
Option B) Convert to STRING and lose the numeric nature of the type
(e.g., for sorting purposes)
Option C) Is someone with more Hive-fu than I willing to implement
fixed-point arithmetic types? :)
Option D) Something I'm overlooking?
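
To make the trade-off between Options A and B concrete, here is a minimal sketch of the type mapping with a switchable policy for DECIMAL/NUMERIC. The class, enum, and method names are hypothetical, and the mapping covers only a handful of java.sql.Types constants for illustration. The two checks in main show the cost of each option: a DOUBLE cannot hold a value such as 0.1 exactly, and STRING values compare lexicographically rather than numerically.

```java
import java.math.BigDecimal;
import java.sql.Types;

// Hypothetical sketch for illustration only; not Sqoop's actual mapping.
class HiveTypeSketch {

    // How to represent SQL DECIMAL/NUMERIC columns in Hive.
    enum DecimalPolicy { AS_DOUBLE, AS_STRING }

    // Maps a java.sql.Types constant to a Hive type name, or returns
    // null when this sketch has no mapping for it.
    static String toHiveType(int sqlType, DecimalPolicy policy) {
        switch (sqlType) {
            case Types.INTEGER:
                return "INT";
            case Types.BIGINT:
                return "BIGINT";
            case Types.DOUBLE:
                return "DOUBLE";
            case Types.CHAR:
            case Types.VARCHAR:
                return "STRING";
            case Types.DECIMAL:
            case Types.NUMERIC:
                // Option A or Option B, depending on the chosen policy.
                return policy == DecimalPolicy.AS_DOUBLE ? "DOUBLE" : "STRING";
            default:
                return null;
        }
    }

    // True if the value is unchanged after a round trip through double.
    static boolean survivesDouble(BigDecimal value) {
        return new BigDecimal(value.doubleValue()).compareTo(value) == 0;
    }

    public static void main(String[] args) {
        System.out.println(toHiveType(Types.DECIMAL, DecimalPolicy.AS_DOUBLE));
        System.out.println(toHiveType(Types.DECIMAL, DecimalPolicy.AS_STRING));

        // Option A: 0.1 has no exact binary floating-point representation.
        System.out.println(survivesDouble(new BigDecimal("0.1")));

        // Option B: as strings, "9" sorts after "10"; as numbers, 9 < 10.
        System.out.println("9".compareTo("10") > 0);
        System.out.println(new BigDecimal("9").compareTo(new BigDecimal("10")) < 0);
    }
}
```
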
Cheers,
- Aaron Kimball