Hi, all,

I am working on a distributed search system. At the moment I have only one
server. It has to crawl pages from the Web, build indexes locally, and respond
to users' queries. I think that is too much load for it to run smoothly.

I plan to use at least two servers. One of them will crawl pages and build
the indexes. The newly available indexes should then be transmitted to the
other one, which is responsible for answering users' queries. From the users'
point of view, the system must be fast. However, I don't know how to obtain
just the additional (incremental) indexes so that I can transmit them. After
transmission, how do I append them to the old indexes? Does appending block
searching?
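
To make the question concrete, here is a minimal sketch of the behaviour I am
after, under stated assumptions. It is a toy inverted index in plain Python,
not Lucene's or Hadoop's API; the names `SearchableIndex`, `append_delta` and
`build_delta` are hypothetical. The indexing server builds a delta from newly
crawled documents only, and the query server merges it into a fresh snapshot
that is swapped in with one reference assignment, so searches never wait on
the merge.

```python
import threading
from collections import defaultdict


class SearchableIndex:
    """Toy inverted index: term -> set of doc ids (hypothetical, not Lucene)."""

    def __init__(self):
        self._snapshot = {}                  # treated as immutable once published
        self._write_lock = threading.Lock()  # serializes writers only, not readers

    def search(self, term):
        # Readers take the current snapshot reference without any lock.
        snapshot = self._snapshot
        return sorted(snapshot.get(term, frozenset()))

    def append_delta(self, delta):
        """Merge a delta index (term -> iterable of doc ids) into a new snapshot."""
        with self._write_lock:
            merged = dict(self._snapshot)
            for term, doc_ids in delta.items():
                merged[term] = merged.get(term, frozenset()) | frozenset(doc_ids)
            # One reference assignment: a search sees either the old or the
            # new snapshot, never a half-merged one.
            self._snapshot = merged


def build_delta(documents):
    """Indexing-server side: index only the newly crawled documents."""
    delta = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            delta[term].add(doc_id)
    return dict(delta)


# Usage: the first batch, then an incremental batch arriving later.
index = SearchableIndex()
index.append_delta(build_delta({1: "hadoop cluster", 2: "lucene index"}))
index.append_delta(build_delta({3: "lucene hadoop"}))
print(index.search("lucene"))  # [2, 3]
```

Copying the whole mapping on every merge is only acceptable for a toy; a real
index would presumably share unchanged parts between snapshots.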

I use Lucene to generate the indexes. However, I cannot see which parts were
updated, so I cannot send only those. I understand that Hadoop does this kind
of thing internally. How can it be integrated with Lucene?

Thanks so much for your help!

Bing Li

Discussion Overview
group: common-user @
category: hadoop
posted: Nov 19, '10 at 4:26p
active: Nov 19, '10 at 6:01p
posts: 3
users: 2
website: hadoop.apache.org...
irc: #hadoop

2 users in discussion

Bing Li: 2 posts, Marc Sturlese: 1 post
