FAQ
Hello,

I'm getting only NULLs when trying to read data from an external table for which the data was staged. I'm using Cloudera's hive-0.4.0+14.tar.gz with hadoop-0.20.1+152.tar.gz on a Centos machine.

My steps were:

# CREATE TABLE

CREATE EXTERNAL TABLE mailog_stg(month STRING, day STRING, time STRING, host STRING, logline STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "^([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) (.+)$",
"output.format.string" = "%1$s %2$s %3$s %4$s %5$s"
)
STORED AS TEXTFILE
LOCATION '/user/data/staging/mailog';

# COPY DATA FILE

hadoop dfs -put /ken/sf45-cc2-maillogs/mail.20091102 /user/data/staging/mailog

# DO A SELECT

hive> select * from mailog_stg;
OK
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL

I verified my data is out there:

[root@jims-desktop ~]# hadoop dfs -cat /user/data/staging/mailog/m*|wc
4 77 1016

And in fact when I used LOAD DATA INPATH with a table defined the same way but without making it external, it worked fine.

Any suggestions how to debug this?

Thanks
Ken

Search Discussions

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupuser @
categorieshive, hadoop
postedDec 1, '09 at 1:07a
activeDec 1, '09 at 1:07a
posts1
users1
websitehive.apache.org

1 user in discussion

Ken Barclay: 1 post

People

Translate

site design / logo © 2021 Grokbase