I have a web application that uses Lucene for search. Search requests are
served by web services running on 2 application servers (IIS 7). The 2
application servers are load balanced using "netscaler".

Both servers run a nightly batch job that updates the search index on the
respective server.

I need to synchronize the search indexes on these 2 servers so that both have
up-to-date indexes at any point in time. What would be the best
architecture/design strategy for this, given that either of the 2 application
servers could be serving a search request depending on its availability?

Any input, please?

Thanks for reading!

--
View this message in context: http://www.nabble.com/Synchronizing-Lucene-indexes-across-2-application-servers-tp24105223p24105223.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Search Discussions

  • Joel Halbert at Jun 19, 2009 at 8:24 am
    Do they have to be kept in sync in real time?
    Does each server handle writes to its own index, which then need to be
    propagated to the other server's index?

    For simplicity, and to minimise the amount of self-consistency checking
    that needs to happen, I would suggest having a third, master index to
    which all writes happen. As writes are applied to the master, they are
    propagated to the 2 servers. You then just need to keep track of the
    latest document written to each of the two "slave" servers, and in case
    of failure/recovery on either, you just request all deltas since the
    last known record on each.

    --
    Joel Halbert
    020 3051 8637
    075 2501 0825
    [email protected]
    www.su3analytics.com
    www.storequery.com
    SU3 Analytics Ltd, The Print House, 18 Ashwin St, London E8 3DL.
  • Ian Lea at Jun 19, 2009 at 8:49 am
    Or have a third master index, as Joel suggests, and apply all updates to
    that index only. Then, at the end of each batch index update run, use rsync
    or equivalent to push the master index out to the 2 search servers and
    tell them to reopen their indexes.


    --
    Ian.
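
Ian's push-then-reopen approach might look something like the sketch below. Several things here are assumptions rather than part of his suggestion: a local `shutil.copytree` stands in for rsync, each push goes into a new versioned directory, and a hypothetical `CURRENT` pointer file tells the search server which directory to open. The function names and directory layout are illustrative only.

```python
import os
import shutil


def publish_snapshot(master_dir, server_root, version):
    """Copy the master index into a new versioned directory, then flip the pointer."""
    target = os.path.join(server_root, f"index-{version}")
    shutil.copytree(master_dir, target)  # stand-in for rsync to the search server

    # Write the pointer to a temp file and rename it over the old one,
    # so a reader never sees a half-written pointer.
    pointer = os.path.join(server_root, "CURRENT")
    tmp = pointer + ".tmp"
    with open(tmp, "w") as f:
        f.write(f"index-{version}")
    os.replace(tmp, pointer)
    return target


def current_index_dir(server_root):
    """What a search server would read when told to reopen its index."""
    with open(os.path.join(server_root, "CURRENT")) as f:
        return os.path.join(server_root, f.read().strip())
```

Copying into a fresh directory means the index a server currently has open is never modified underneath it; the old directory can be deleted once the server has reopened on the new one.
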
  • Otis Gospodnetic at Jun 20, 2009 at 4:29 am
    Hello,

    You may want to look at Lucene's younger brother named Solr: http://lucene.apache.org/solr/

    Otis
    --
    Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
  • Ken Krugler at Jun 20, 2009 at 6:02 pm
    You could use Katta for this, as another option - it's an open source
    distributed Lucene search system.

    Under the hood Katta uses ZooKeeper to handle distribution of data to
    multiple servers. Once Katta has added an index to both systems, then
    you can switch to it (and eventually remove the old index).

    The fact that you'd need two Katta "masters" makes things a bit more
    interesting, as you'd have to coordinate when they both decide to
    switch to using the new index(es).

    -- Ken
    --
    Ken Krugler
    +1 530-210-6378
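
The coordination problem Ken mentions (both servers deciding when to switch to the new index) can be illustrated with a small sketch. This is not Katta's API and does not use ZooKeeper; `IndexSwitchCoordinator` and `report_deployed` are hypothetical names, and versions are assumed to be increasing integers. The idea shown is only that the active version changes once every server has reported the new index as deployed.

```python
class IndexSwitchCoordinator:
    """Switch the active index version only after every node has deployed it."""

    def __init__(self, nodes, active_version):
        self.nodes = set(nodes)
        self.active_version = active_version
        self.deployed = {}  # version -> set of nodes that reported it ready

    def report_deployed(self, node, version):
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.deployed.setdefault(version, set()).add(node)
        # Switch only when all nodes are ready, and never go back to an older version.
        if self.deployed[version] == self.nodes and version > self.active_version:
            self.active_version = version
        return self.active_version
```

Until the last server reports in, both servers keep answering from the old index, so the load balancer never mixes results from two index versions.
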

Discussion Overview
group: java-user @
categories: lucene
posted: Jun 19, '09 at 4:11a
active: Jun 20, '09 at 6:02p
posts: 5
users: 5
website: lucene.apache.org