
ishwar ramani wrote:
Hi,

I have a setup where logs are periodically bundled up and dumped into
Hadoop DFS as large sequence files.

It works fine for all my map reduce jobs.

Now I need to handle ad hoc queries for pulling out logs based on user
and time range.

I really don't need a full indexer (like Lucene) for this purpose.

My first thought is to run a periodic MapReduce job to generate a large
text file sorted by user id.

The text file will have (sequence file name, offset) entries to retrieve the logs ....
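
A minimal sketch of how such an index could be queried, assuming each line of the text file is tab-separated as user id, timestamp, sequence file name, offset. The timestamp column, the field order, and the names IndexEntry, parse_index and lookup are all assumptions made for illustration; the post only mentions (sequence file name, offset) sorted by user id.

```python
import bisect
from typing import Iterable, List, NamedTuple, Tuple


class IndexEntry(NamedTuple):
    # Hypothetical index layout; only seq_file and offset come from the post.
    user_id: str
    timestamp: int   # assumed: epoch seconds of the log record
    seq_file: str    # name of the sequence file holding the record
    offset: int      # byte offset of the record within that file


def parse_index(lines: Iterable[str]) -> List[IndexEntry]:
    """Parse tab-separated index lines and sort them by (user_id, timestamp)."""
    entries = []
    for line in lines:
        user_id, ts, seq_file, offset = line.rstrip("\n").split("\t")
        entries.append(IndexEntry(user_id, int(ts), seq_file, int(offset)))
    entries.sort()  # tuple ordering: user_id first, then timestamp
    return entries


def lookup(entries: List[IndexEntry], user_id: str,
           start_ts: int, end_ts: int) -> List[Tuple[str, int]]:
    """Return (seq_file, offset) pairs for one user within [start_ts, end_ts]."""
    # Binary search for the first entry at or after (user_id, start_ts).
    lo = bisect.bisect_left(entries, (user_id, start_ts))
    # First entry strictly after end_ts for the same user (integer timestamps).
    hi = bisect.bisect_left(entries, (user_id, end_ts + 1))
    return [(e.seq_file, e.offset) for e in entries[lo:hi]]


if __name__ == "__main__":
    sample = [
        "bob\t150\tlogs-002.seq\t4096",
        "alice\t100\tlogs-001.seq\t0",
        "alice\t200\tlogs-001.seq\t512",
        "alice\t300\tlogs-002.seq\t128",
    ]
    index = parse_index(sample)
    print(lookup(index, "alice", 150, 300))
```

Sorting on user id and then timestamp lets both filters be answered with two binary searches instead of a scan. This sketch keeps the whole index in memory, which would not hold for a very large index file; reading the actual records back from the sequence files is outside its scope.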


I am guessing many of you have run into similar requirements... Any
suggestions on doing this better?

ishwar
Have you looked into Hive? It's perfect for ad hoc queries.

M

Discussion Overview
group: common-user @
categories: hadoop
posted: Oct 1, '09 at 5:49p
active: Oct 5, '09 at 9:32p
posts: 10
users: 6
website: hadoop.apache.org...
irc: #hadoop
