I'm new to Cloudera and working on a POC. I am using a Python script to parse my
scanner logs. Once a log is parsed, I copy the resulting file to HDFS, create Hive
tables on it, and write queries. My question is: what is the best way to
parse the file? Should I parse it before I bring it into HDFS, or after it is already in HDFS?
Also, can I combine my Python parsing, the Hive CREATE TABLE, and the Hive query into a single script?
Please help.
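
To make the "combine everything" part concrete, here is a minimal sketch of what I mean by a single script. Everything in it is an assumption for illustration only: the log line format, the HDFS directory, the table name `scanner_logs`, the column names, and the helper functions are all hypothetical and not my real setup. It parses locally (before HDFS) and only builds the `hdfs dfs` and `hive -e` command lines rather than running them.

```python
import re

# Hypothetical log line format: "<date> <time> <scanner_id> <status> <message>"
LINE_RE = re.compile(r"^(\S+) (\S+) (\S+) (\S+) (.*)$")

# Hypothetical HDFS directory that the external table points at.
HDFS_DIR = "/user/poc/scanner_logs"

# HiveQL for a tab-delimited external table (table and column names are made up).
CREATE_TABLE = (
    "CREATE EXTERNAL TABLE IF NOT EXISTS scanner_logs ("
    "log_date STRING, log_time STRING, scanner_id STRING, "
    "status STRING, message STRING) "
    "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
    "STORED AS TEXTFILE "
    "LOCATION '" + HDFS_DIR + "'"
)

# Example query: count rows per status.
QUERY = "SELECT status, COUNT(*) FROM scanner_logs GROUP BY status"


def parse_line(line):
    """Return the five fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return list(match.groups()) if match else None


def parse_log(lines):
    """Convert raw log lines to tab-separated text, skipping unparseable lines."""
    rows = []
    for line in lines:
        fields = parse_line(line)
        if fields is not None:
            # Replace stray tabs so they cannot break the column layout.
            rows.append("\t".join(f.replace("\t", " ") for f in fields))
    return "".join(row + "\n" for row in rows)


def build_commands(local_path):
    """Build the shell commands for upload, table creation, and query."""
    return [
        ["hdfs", "dfs", "-mkdir", "-p", HDFS_DIR],
        ["hdfs", "dfs", "-put", "-f", local_path, HDFS_DIR],
        ["hive", "-e", CREATE_TABLE],
        ["hive", "-e", QUERY],
    ]
```

On a node with the HDFS and Hive clients installed, I would write the output of `parse_log` to a local file and then run each command from `build_commands` with `subprocess.run(cmd, check=True)`. Is this a reasonable approach, or is it better to load the raw log into HDFS first and do the parsing there?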