We've discussed realtime search before, and it looks like after the next
release we can get some sort of realtime search working. I was going to open
a new issue, but decided it might be best to discuss realtime search on the
dev list.

Lucene can implement realtime search as the ability to add, update, or
delete documents with latency in the sub 5 millisecond range. A couple of
different options are available.

1) Expose a rolling set of realtime readers over the in-memory index used by
IndexWriter. This requires incrementally updating field caches and filters,
and it is somewhat unclear how IndexReader versioning would work (for example,
versions of the term dictionary).
2) Implement realtime search by incrementally creating and merging readers
in memory. The system would use MemoryIndex or InstantiatedIndex to create
indexes from added documents quickly (more quickly than RAMDirectory). The
in-memory indexes would be periodically merged in the background and, depending
on the RAM used, written to disk. Each update would generate a new
IndexReader or MultiSearcher that includes the new updates. Field caches
and filters could be cached per IndexReader according to how Lucene works
today. The downside of this approach is that indexing will not be as fast as
#1 because of the in-memory merging, which is similar to how Lucene merged
in-memory segments using RAMDirectory before 2.3 (a rough sketch of this
approach follows below).
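
A minimal sketch of what option #2 could look like against a roughly Lucene
2.4-era API; the class name, constructor arguments, and the background-merge
step are illustrative assumptions, not part of any existing patch:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

// Hypothetical sketch: buffer adds in a small RAM index and publish a
// combined reader over the RAM and disk indexes after every update.
public class RealtimeSketch {
  private final Directory diskDir;
  private final RAMDirectory ramDir = new RAMDirectory();
  private final IndexWriter ramWriter;
  private volatile IndexReader current;   // searches run against this snapshot

  public RealtimeSketch(Directory diskDir) throws Exception {
    this.diskDir = diskDir;
    this.ramWriter = new IndexWriter(ramDir, new StandardAnalyzer(), true,
                                     IndexWriter.MaxFieldLength.UNLIMITED);
    this.current = IndexReader.open(diskDir);
  }

  // Add a document and publish a new reader that can already see it.
  public synchronized void addDocument(Document doc) throws Exception {
    ramWriter.addDocument(doc);
    ramWriter.commit();                         // make it visible to a reader
    IndexReader ramReader = IndexReader.open(ramDir);
    IndexReader diskReader = IndexReader.open(diskDir); // simplified: reuse/reopen in real code
    current = new MultiReader(new IndexReader[] { ramReader, diskReader });
  }

  // A background thread would periodically merge ramDir into diskDir
  // (e.g. via IndexWriter.addIndexesNoOptimize) and clear the RAM index.
  public IndexSearcher searcher() {
    return new IndexSearcher(current);
  }
}
```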

Are there other implementation options?

A new patch would focus on providing in-memory indexing as part of the core
of Lucene. The work of LUCENE-1483 and LUCENE-1314 would be used. I am not
sure whether option #2 can become part of core if it relies on a contrib
module. It makes sense to provide a new realtime-oriented merge policy that
merges segments based on the number of deletes rather than a merge factor.
The realtime merge policy would keep segments within a minimum and maximum
size in kilobytes to limit the time consumed by merging, which is assumed to
occur frequently.

LUCENE-1313, which includes a transaction log with rollback and was designed
with distributed search in mind, may be retired or have its components split out.


  • Marvin Humphrey at Dec 24, 2008 at 2:23 am

    On Tue, Dec 23, 2008 at 05:51:43PM -0800, Jason Rutherglen wrote:

    Are there other implementation options?
    Here's the plan for Lucy/KS:

    1) Design index formats that can be memory mapped rather than slurped,
    bringing the cost of opening/reopening an IndexReader down to a
    negligible level.
    2) Enable segment-centric sorted search. (LUCENE-1483)
    3) Implement tombstone-based deletions, so that the cost of deleting
    documents scales with the number of deletions rather than the size of the
    index.
    4) Allow 2 concurrent writers: one for small, fast updates, and one for
    big background merges.

    Marvin Humphrey


  • Robert engels at Dec 24, 2008 at 2:37 am
    Is there something that I am missing? I see lots of references to
    using "memory mapped" files to "dramatically" improve performance.

    I don't think this is the case at all. At the lowest levels, it is
    somewhat more efficient from a CPU standpoint, but with a decent OS
    cache the IO performance difference is going to be negligible.

    The primary benefit of memory mapped files is simplicity in code
    (although in Java there is another layer needed - think C), and the
    file can be treated as a randomly accessible memory array.

    From my OS design experience, the page at
    http://en.wikipedia.org/wiki/Memory-mapped_file is incorrect.

    Even if the memory mapped file is mapped into the virtual memory
    space, unless you have specialized memory controllers and disk systems,
    when a page fault occurs the OS loads the page just as it would any other.

    The difference with direct IO is that there is first a simple
    translation from position to disk page, and the OS disk page cache is
    checked. Almost exactly the same thing occurs with a memory mapped file.

    The memory address is accessed; if it is not in memory, a page fault
    occurs, and the page is loaded from the file (it may be loaded from
    the OS disk cache in this process).

    The point being, if the page is not in the cache (which is probably
    the case with a large index), the time to load the page is far
    greater than the difference between the IO address translation and
    the memory address lookup.

    If all of the pages of the index can fit in memory, a properly
    configured system is going to have them in the page cache anyway....
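
    To make the comparison concrete, here is a sketch of the two Java read
    paths; on a cache miss both end up waiting for the same disk page, the
    mapped path simply takes a page fault instead of issuing an explicit
    read() (error handling simplified, and the single map() call assumes a
    file under 2 GB):

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Two ways to read 8 bytes at an offset; both are served from the OS page
// cache when the page is resident, and both wait on disk when it is not.
public class ReadPaths {
  public static long viaMmap(File f, long offset) throws Exception {
    RandomAccessFile raf = new RandomAccessFile(f, "r");
    try {
      FileChannel ch = raf.getChannel();
      MappedByteBuffer map =
          ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()); // assumes size < 2 GB
      return map.getLong((int) offset);   // page fault if not resident
    } finally {
      raf.close();
    }
  }

  public static long viaRead(File f, long offset) throws Exception {
    RandomAccessFile raf = new RandomAccessFile(f, "r");
    try {
      FileChannel ch = raf.getChannel();
      ByteBuffer buf = ByteBuffer.allocate(8);
      ch.read(buf, offset);               // explicit read, same page cache underneath
      buf.flip();
      return buf.getLong();
    } finally {
      raf.close();
    }
  }
}
```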


  • Marvin Humphrey at Dec 24, 2008 at 3:21 am

    On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote:
    Is there something that I am missing? Yes.
    I see lots of references to using "memory mapped" files to "dramatically"
    improve performance.
    There have been substantial discussions about this design in JIRA,
    notably LUCENE-1458.

    The "dramatic" improvement is WRT to opening/reopening an IndexReader.
    Presently in both KS and Lucene, certain data structures have to be read at
    IndexReader startup and unpacked into process memory -- in particular, the
    term dictionary index and sort caches. If those data structures can be
    represented by a memory mapped file rather than built up from scratch, we save
    big.
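
    To make the idea concrete, here is a hypothetical sketch (not the actual
    Lucy/KS or Lucene file format) of "opening" a term dictionary index that
    is stored as an array of fixed-width file pointers: the open is a single
    map call, with no per-entry unpacking:

```java
import java.io.RandomAccessFile;
import java.nio.LongBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical term-index "open": instead of reading and unpacking every
// entry into objects, just map the file of fixed-width file pointers.
// Opening costs one mmap call regardless of how many terms there are.
public class MappedTermIndex {
  private final LongBuffer offsets;   // offsets.get(ord) -> file pointer

  public MappedTermIndex(String ixFile) throws Exception {
    RandomAccessFile raf = new RandomAccessFile(ixFile, "r");
    FileChannel ch = raf.getChannel();
    MappedByteBuffer bytes =
        ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()); // assumes size < 2 GB
    offsets = bytes.asLongBuffer();   // no per-entry parsing at open time
  }

  public long filePointer(int ord) {
    return offsets.get(ord);          // resolved lazily via page faults
  }
}
```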

    Marvin Humphrey


  • Robert engels at Dec 24, 2008 at 5:03 am
    Seems doubtful you will be able to do this without increasing the
    index size dramatically. Since it will need to be stored
    "unpacked" (in order to have random access), yet the terms are
    variable length - leading to using a maximum=minimum size for every
    term.

    In the end I highly doubt it will make much difference in speed -
    here's several reasons why...

    1. with "fixed" size terms, the additional IO (larger pages) probably
    offsets a lot of the random access benefit. This is why "compressed"
    disks on a fast machine (CPU) are often faster than "uncompressed" -
    more data is read during every IO access.

    2. with a reopen, only new segments are "read", and since it is a new
    segment, it is most likely already in the disk cache (from the
    write), so the reopen penalty is negligible (especially if the term
    index "skip to" is written last).

    3. If the reopen is after an optimize - when the OS cache has
    probably been obliterated, then the warm up time is going to be
    similar in most cases anyway, since the "index" pages will also not
    be in core (in the case of memory mapped files). Again, writing the
    "skip to" last can help with this.

    Just because a file is memory mapped does not mean its pages will
    have a greater likelihood of being in the cache. The locality of
    reference is going to control this, just as the most/often access
    controls it in the OS disk cache. Also, most OSs will take real
    memory from the virtual address space and add it to the disk cache if
    the process is doing lots of IO.

    If you have a memory mapped "term index", you are still going to need
    to perform a binary search to find the correct term "page", and after
    an optimize the visited pages will not be in the cache (or in core).
  • Robert engels at Dec 24, 2008 at 4:51 pm
    Thinking about this some more, you could use fixed length pages for
    the term index, with a page header containing a count of entries, and
    use key compression (to avoid the constant entry size).

    The problem with this is that you still have to decode the entries
    (slowing the processing - since a simple binary search on the page is
    not possible).

    But, if you also add a 'least term and greatest term' to the page
    header (you can avoid the duplicate storage of these entries as
    well), you can perform a binary search of the term index much faster.
    You only need to decode the index page containing (maybe) the desired
    entry.
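
    A small sketch of that page-level binary search, assuming the
    least/greatest terms of each page have already been loaded (or mapped)
    into arrays; the decoding of the chosen page is left out:

```java
// Sketch of the idea above: keep each page's least and greatest term in a
// header so a lookup can binary-search page boundaries and decode at most
// one compressed page.
public class PagedTermIndex {
  private final String[] leastTerm;     // leastTerm[p] = first term on page p
  private final String[] greatestTerm;  // greatestTerm[p] = last term on page p

  public PagedTermIndex(String[] leastTerm, String[] greatestTerm) {
    this.leastTerm = leastTerm;
    this.greatestTerm = greatestTerm;
  }

  // Returns the only page that could contain the term, or -1 if the term
  // falls in a gap between pages (and therefore cannot exist).
  public int candidatePage(String term) {
    int lo = 0, hi = leastTerm.length - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (term.compareTo(leastTerm[mid]) < 0) {
        hi = mid - 1;                  // term sorts before this page
      } else if (term.compareTo(greatestTerm[mid]) > 0) {
        lo = mid + 1;                  // term sorts after this page
      } else {
        return mid;                    // leastTerm[mid] <= term <= greatestTerm[mid]
      }
    }
    return -1;
  }
}
```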

    If you were doing a prefix/range search, you will still end up
    decoding lots of pages...

    This is why a database has their own page cache, and usually caches
    the decoded form (for index pages) for faster processing - at the
    expense of higher memory usage. Usually data pages are not cached in
    the decoded/uncompressed form. In most cases the database vendor will
    recommend removing the OS page cache on the database server, and
    allocating all of the memory to the database process.

    You may be able to avoid some of the warm-up of an index using memory
    mapped files, but with proper ordering of the writing of the index,
    it probably isn't necessary. Beyond that, processing the term index
    directly using NIO does not appear that it will be faster than using
    an in-process cache of the term index (similar to the skip-to memory
    index now).

    The BEST approach is probably to have the index writer build the
    memory "skip to" structure as it writes the segment, and then include
    this in the segment during the reopen - no warming required. As
    long as the reader and writer are in the same process, it will be a
    winner!
  • Paul Elschot at Dec 24, 2008 at 5:34 pm

    On Wednesday 24 December 2008 17:51:04, robert engels wrote:
    Thinking about this some more, you could use fixed length pages for
    the term index, with a page header containing a count of entries, and
    use key compression (to avoid the constant entry size).

    The problem with this is that you still have to decode the entries
    (slowing the processing - since a simple binary search on the page is
    not possible).
    The cache between the pages and the cpu is also a bottleneck nowadays.
    See here:

    Super-Scalar RAM-CPU Cache Compression
    M Zukowski, S Heman, N Nes, P Boncz - cwi.nl

    currently available from this link:

    http://www.cwi.nl/htbin/ins1/publications?request=pdfgz&key=ZuHeNeBo:ICDE:06

    Also, some preliminary results on lucene indexes
    are available at LUCENE-1410.
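
    For illustration only, a minimal frame-of-reference decode loop in the
    spirit of that paper; this is not PFOR itself (it has no exception/outlier
    handling), just an example of the kind of tight, branch-poor loop over
    packed words that stays friendly to the CPU cache:

```java
// Decode n values, each stored in bitWidth bits as a delta from base,
// packed LSB-first into 64-bit words.
public final class ForDecoder {
  public static void decode(long[] packed, int n, int bitWidth,
                            long base, long[] out) {
    long mask = (bitWidth == 64) ? -1L : (1L << bitWidth) - 1;
    int word = 0, bit = 0;
    for (int i = 0; i < n; i++) {
      long v = (packed[word] >>> bit) & mask;
      int avail = 64 - bit;
      if (avail < bitWidth) {                    // value spans two words
        v |= (packed[word + 1] << avail) & mask;
      }
      bit += bitWidth;
      if (bit >= 64) { bit -= 64; word++; }
      out[i] = base + v;
    }
  }
}
```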

    Regards,
    Paul Elschot

  • Robert engels at Dec 24, 2008 at 6:03 pm
    As I pointed out in another email, I understand the benefits of
    compression (compressed disks vs. uncompressed, etc.). PFOR is
    definitely a winner!

    As I understood this discussion though, it was an attempt to remove
    the in memory 'skip to' index, to avoid the reading of this during
    index open/reopen.

    I was attempting to point out that this in-memory index is still
    needed, but there are ways to improve the current process.

    I don't think a mapped file for the term index is going to work for a
    variety of reasons. Mapped files are designed as a programming
    simplification - mainly for older systems that use line delimited
    files - rather than having to create "page/section" caches when
    processing very large files (when only a small portion is used at any
    given time - ie. the data visible on the screen). When you end up
    visiting a large portion of the file anyway (to do a full
    repagination), an in-process intelligent cache is going to be far
    superior.

    My review of the Java Buffer related classes does not give me the
    impression it is going to be faster - in fact it will be slower - than
    a single copy into user space, processing/decompressing there. The
    Buffer system is suitable when you perform little inspection and then
    copy directly to another buffer (think reading from a file and sending
    out on a socket). If you end up inspecting the buffer, it is going to
    be very slow.
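
    A sketch of the trade-off being described, with illustrative names:
    reading a file pointer straight out of a (possibly mapped) ByteBuffer on
    every lookup, versus parsing the pointers once into a plain long[] and
    doing array reads afterwards:

```java
import java.nio.ByteBuffer;

// Per-lookup decoding out of the buffer vs. parse-once into an array.
public class SkipIndexAccess {
  // Decoded on every call: byte combining / bounds checks each time.
  static long pointerFromBuffer(ByteBuffer termIndex, int ord) {
    return termIndex.getLong(ord * 8);
  }

  // Pay the decoding cost once; later lookups are plain array reads.
  static long[] parseOnce(ByteBuffer termIndex, int count) {
    long[] pointers = new long[count];
    for (int ord = 0; ord < count; ord++) {
      pointers[ord] = termIndex.getLong(ord * 8);
    }
    return pointers;   // afterwards: pointers[ord]
  }
}
```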
  • Marvin Humphrey at Dec 24, 2008 at 7:32 pm

    On Wed, Dec 24, 2008 at 12:02:24PM -0600, robert engels wrote:
    As I understood this discussion though, it was an attempt to remove
    the in memory 'skip to' index, to avoid the reading of this during
    index open/reopen.
    No. That idea was entertained briefly and quickly discarded. There seems to
    be an awful lot of irrelevant noise in the current thread arising due to lack
    of familiarity with the ongoing discussions in JIRA.

    Marvin Humphrey

  • Marvin Humphrey at Dec 24, 2008 at 7:03 pm

    On Tue, Dec 23, 2008 at 11:02:56PM -0600, robert engels wrote:
    Seems doubtful you will be able to do this without increasing the
    index size dramatically. Since it will need to be stored
    "unpacked" (in order to have random access), yet the terms are
    variable length - leading to using a maximum=minimum size for every
    term.
    Wow. That's a spectacularly awful design. Its worst case -- one outlier
    term, say, 1000 characters in length, in a field where the average term length
    is in the single digits -- would explode the index size and incur wasteful IO
    overhead, just as you say.

    Good thing we've never considered it. :)

    I'm hoping we can improve on this, but for now, we've ended up at a two-file
    design for the term dictionary index.

    1) Stacked 64-bit file pointers.
    2) Variable length character and term info data, interpreted using a
    pluggable codec.

    In the index at least, each entry would contain the full term text, encoded as
    UTF-8. Probably the primary term dictionary would continue to use string
    diffs.

    That design offers no significant benefits other than those that flow from
    compatibility with mmap: faster IndexReader open/reopen, lower RAM usage
    under multiple processes by way of buffer sharing. IO bandwidth requirements
    and speed are probably a little better, but lookups on the term dictionary
    index are not a significant search-time bottleneck.
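
    The following is a hypothetical sketch of a lookup against such a
    two-file index; the single-byte length prefix and the floorOrd contract
    are assumptions made for illustration, not the actual Lucy/KS format.
    Both buffers would come from FileChannel.map():

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// File 1: an array of 64-bit pointers, one per indexed term.
// File 2: length-prefixed UTF-8 term text that those pointers point into.
public class TwoFileTermIndex {
  private static final Charset UTF8 = Charset.forName("UTF-8");
  private final ByteBuffer pointers;  // 8 bytes per entry
  private final ByteBuffer termData;  // [1-byte len][utf-8 bytes] per entry (assumed layout)

  public TwoFileTermIndex(ByteBuffer pointers, ByteBuffer termData) {
    this.pointers = pointers;
    this.termData = termData;
  }

  private String term(int ord) {
    int pos = (int) pointers.getLong(ord * 8);
    int len = termData.get(pos) & 0xFF;
    byte[] bytes = new byte[len];
    for (int i = 0; i < len; i++) {
      bytes[i] = termData.get(pos + 1 + i);
    }
    return new String(bytes, UTF8);
  }

  // Largest ord whose term is <= target, or -1; a caller would then scan
  // the primary term dictionary forward from that point.
  public int floorOrd(String target) {
    int lo = 0, hi = pointers.capacity() / 8 - 1, result = -1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (term(mid).compareTo(target) <= 0) {
        result = mid;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return result;
  }
}
```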

    Additionally, sort caches would be written at index time in three files, and
    memory mapped as laid out in
    <https://issues.apache.org/jira/browse/LUCENE-831?focusedCommentId=12656150#action_12656150>.

    1) Stacked 64-bit file pointers.
    2) Character data.
    3) Doc num to ord mapping.

    Marvin Humphrey


  • Robert engels at Dec 24, 2008 at 5:18 am
    Also, if you are thinking that accessing the "buffer" directly will
    be faster than "parsing" the packed structure, I'm not so sure.

    You can review the source for the various buffers, and since there is
    no "struct" support in Java, you end up combining bytes to make
    longs, etc. Also, a lot of the accesses go through Unsafe, which is
    slower than the indirection on a Java object to access a field.

    My review of these classes makes me think that parsing the "skip to"
    index once into java objects for later use is going to be a lot
    faster overall than accessing the entire mapped file directly on
    every invocation.
  • Michael McCandless at Dec 26, 2008 at 11:23 am

    Marvin Humphrey wrote:

    4) Allow 2 concurrent writers: one for small, fast updates, and one for
    big background merges.
    Marvin, can you describe this in more detail? It sounds like this is your
    solution for "decoupling" segment changes due to merges from changes
    from docs being indexed, from a reader's standpoint?

    Since you are using mmap to achieve near zero brand-new IndexReader
    creation, whereas in Lucene we are moving towards achieving real-time
    by always reopening a current IndexReader (not a brand new one), it
    seems like you should not actually have to worry about the case of
    reopening a reader after a large merge has finished?

    We need to deal with this case (background the warming) because
    creating that new SegmentReader (on the newly merged segment) can take
    a non-trivial amount of time.

    Mike
  • Marvin Humphrey at Dec 26, 2008 at 6:54 pm

    On Fri, Dec 26, 2008 at 06:22:23AM -0500, Michael McCandless wrote:
    4) Allow 2 concurrent writers: one for small, fast updates, and one for
    big background merges.
    Marvin, can you describe this in more detail?
    The goal is to improve worst-case write performance.

    Currently, writes are quick most of the time, but occasionally you'll trigger
    a big merge and get stuck. To solve this problem, we can assign a merge
    policy to our primary writer which tells it to merge no more than
    mergeThreshold documents. The value of mergeThreshold will need tuning
    depending on document size, change rate, and so on, but the idea is that we
    want this writer to do as much merging as it can while still keeping
    worst-case write performance down to an acceptable number.

    Doing only small merges just puts off the day of reckoning, of course. By
    avoiding big consolidations, we are slowly accumulating small-to-medium sized
    segments and causing a gradual degradation of search-time performance.

    What we'd like is a separate write process, operating (mostly) in the
    background, dedicated solely to merging segments which contain at least
    mergeThreshold docs.

    If all we have to do is add documents to the index, adding that second write
    process isn't a big deal. We have to worry about competition for segment,
    snapshot, and temp file names, but that's about it.

    Deletions make matters more complicated, but with a tombstone-based deletions
    mechanism, the problems are solvable.

    When the background merge writer starts up, it will see a particular view of
    the index in time, including deletions. It will perform nearly all of its
    operations based on this view of the index, mapping around documents which
    were marked as deleted at init time.

    In between the time when the background merge writer starts up and the time it
    finishes consolidating segment data, we assume that the primary writer will
    have modified the index.

    * New docs have been added in new segments.
    * Tombstones have been added which suppress documents in segments which
    didn't even exist when the background merge writer started up.
    * Tombstones have been added which suppress documents in segments which
    existed when the background merge writer started up, but were not merged.
    * Tombstones have been added which suppress documents in segments which have
    just been merged.

    Only the last category of deletions matters.

    At this point, the background merge writer acquires an exclusive write lock on
    the index. It examines recently added tombstones, translates the document
    numbers and writes a tombstone file against itself. Then it writes the
    snapshot file to commit its changes and releases the write lock.

    Worst case update performance for the system is now the sum of the time it
    takes the background merge writer to consolidate tombstones and the worst-case
    performance of the primary writer.
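
    A sketch of that final translation step, using invented names (Tombstone,
    docMaps); it assumes the merger recorded, for each segment it merged, an
    old-to-new doc number map that accounts for documents it suppressed
    during the merge:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Translate tombstones written against pre-merge segments into doc numbers
// of the newly merged segment, dropping those that target unmerged segments
// or documents the merge already removed.
public class TombstoneTranslator {
  public static class Tombstone {
    public final String segmentName;
    public final int docNum;
    public Tombstone(String segmentName, int docNum) {
      this.segmentName = segmentName;
      this.docNum = docNum;
    }
  }

  // docMaps.get(segName)[oldDocNum] = doc num in the merged segment, or -1.
  public static List<Integer> translate(List<Tombstone> recentTombstones,
                                        Map<String, int[]> docMaps) {
    List<Integer> mergedSegmentTombstones = new ArrayList<Integer>();
    for (Tombstone t : recentTombstones) {
      int[] docMap = docMaps.get(t.segmentName);
      if (docMap == null) {
        continue;               // segment wasn't part of this merge: ignore
      }
      int newDoc = docMap[t.docNum];
      if (newDoc != -1) {       // doc survived the merge; carry the delete over
        mergedSegmentTombstones.add(newDoc);
      }
    }
    return mergedSegmentTombstones;  // written as a tombstone file against the merged segment
  }
}
```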
    It sounds like this is your solution for "decoupling" segments changes due
    to merges from changes from docs being indexed, from a reader's standpoint?
    It's true that we are decoupling the process of making logical changes to the
    index from the process of internal consolidation. I probably wouldn't
    describe that as being done from the reader's standpoint, though.

    With mmap and data structures optimized for it, we basically solve the
    read-time responsiveness cost problem. From the client perspective, the delay
    between firing off a change order and seeing that change made live is now
    dominated by the time it takes to actually update the index. The time between
    the commit and having an IndexReader which can see that commit is negligible
    in comparison.
    Since you are using mmap to achieve near zero brand-new IndexReader
    creation, whereas in Lucene we are moving towards achieving real-time
    by always reopening a current IndexReader (not a brand new one), it
    seems like you should not actually have to worry about the case of
    reopening a reader after a large merge has finished?
    Even though we can rely on mmap rather than slurping, there are potentially a
    lot of files to open and a lot of JSON-encoded metadata to parse, so I'm not
    certain that Lucy/KS will never have to worry about the time it takes to open
    a new IndexReader. Fortunately, we can implement reopen() if we need to.
    We need to deal with this case (background the warming) because
    creating that new SegmentReader (on the newly merged segment) can take
    a non-trivial amount of time.
    Yes. Without mmap or some other solution, I think improvements to worst-case
    update performance in Lucene will continue to be constrained by post-commit
    IndexReader opening costs.

    Marvin Humphrey


  • Doug Cutting at Dec 24, 2008 at 5:53 pm

    Jason Rutherglen wrote:
    2) Implement realtime search by incrementally creating and merging
    readers in memory. The system would use MemoryIndex or
    InstantiatedIndex to quickly (more quickly than RAMDirectory) create
    indexes from added documents.
    As a baseline, how fast is it to simply use RAMDirectory? If one, e.g.,
    flushes changes every 10ms or so, and has a background thread that uses
    IndexReader.reopen() to keep a fresh version for reading?

    Also, what are the requirements? Must a document be visible to search
    within 10ms of being added? Or must it be visible to search from the
    time that the call to add it returns? In the latter case one might
    still use an approach like the above. Writing a small new segment to a
    RAMDirectory and then, with no merging, calling IndexReader.reopen(),
    should be quite fast. All merging could be done in the background, as
    should post-merge reopens() that involve large segments.
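
    For example, the background refresh could be as small as the following
    sketch; reader lifecycle management is deliberately simplified (real
    code must not close a reader that in-flight searches are still using):

```java
import org.apache.lucene.index.IndexReader;

// Keep a current reader and swap in a reopened one every ~10 ms;
// reopen() only loads segments that changed, so small flushes stay cheap.
public class ReaderFreshener implements Runnable {
  private volatile IndexReader current;

  public ReaderFreshener(IndexReader initial) {
    this.current = initial;
  }

  public IndexReader current() {
    return current;
  }

  public void run() {
    try {
      while (!Thread.interrupted()) {
        IndexReader latest = current.reopen();  // same instance if nothing changed
        if (latest != current) {
          IndexReader old = current;
          current = latest;
          old.close();        // simplified: should wait for in-flight searches
        }
        Thread.sleep(10);
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }
}
```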

    In short, I wonder if new reader and writer implementations are in fact
    required or whether, perhaps with a few optimizations, the existing
    implementations might meet this need.

    Doug

  • Jason Rutherglen at Dec 24, 2008 at 6:24 pm
    Also, what are the requirements? Must a document be visible to search
    within 10ms of being added?

    0-5ms. Otherwise it's not realtime, it's batch indexing. The realtime
    system can support small batches by encoding them into RAMDirectories if
    they are of sufficient size.
    Or must it be visible to search from the time that the call to add it
    returns?

    Most people probably expect the update latency offered by SQL databases.
    As a baseline, how fast is it to simply use RAMDirectory?
    It depends on how fast searches over the realtime index need to be. The
    detriment to speed occurs with having many small segments that are
    continuously decoded (terms, postings, etc). The advantage of MemoryIndex
    and InstantiatedIndex is an actual increase in search speed compared with
    RAMDirectory (see the Performance Notes at
    http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/memory/MemoryIndex.html)
    and no need to continuously decode segments that are short lived.

    Anecdotal tests indicated the merging overhead of using RAMDirectory as
    compared with MI or II is significant enough to make it only useful for
    doing batches in the 1000s which does not seem to be what people expect from
    realtime search.
  • Robert engels at Dec 24, 2008 at 6:38 pm

    On Dec 24, 2008, at 12:23 PM, Jason Rutherglen wrote:

    Also, what are the requirements? Must a document be visible to
    search within 10ms of being added?

    0-5ms. Otherwise it's not realtime, it's batch indexing. The
    realtime system can support small batches by encoding them into
    RAMDirectories if they are of sufficient size.
    Or must it be visible to search from the time that the call to
    add it returns?

    Most people probably expect the update latency offered by SQL
    databases.
    This is the problem spot. In an SQL database, when an update/add
    occurs, the same connection/transaction will see the changes when
    requested IMMEDIATELY - there is 0 latency.

    In order to do this you MUST have the concept of transactions and/or
    connections.

    OR you must make it so that every update/add is immediately available
    - this is probably simpler.

    You just need to always search the ram and the disk index. The
    deletions must be mapped to the disk index, and the "latest" version
    of the document must be obtained from the ram index (if it is there).

    You just need to merge the ram and disk indexes in the background... and
    continually create new/merged ram indexes.

    The memory requirements are going to go up, but you can always add a
    "block" so that if the background merger gets too far behind, the
    system blocks any current requests (to avoid the system running out
    of memory).
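
    A sketch of that update rule, assuming a unique-key field named "id":
    the delete is recorded against the disk index and the newest copy lives
    only in the RAM index, so a MultiReader over both sees exactly one (the
    latest) version of each document:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;

// Updates go to the RAM index; the superseded disk copy is suppressed by
// a delete recorded on the disk reader (which takes the disk write lock).
public class RamPlusDiskUpdater {
  private final IndexWriter ramWriter;   // writes to a RAMDirectory
  private final IndexReader diskReader;  // opened read-write so it can delete

  public RamPlusDiskUpdater(IndexWriter ramWriter, IndexReader diskReader) {
    this.ramWriter = ramWriter;
    this.diskReader = diskReader;
  }

  public synchronized void update(String id, Document doc) throws Exception {
    diskReader.deleteDocuments(new Term("id", id));    // map the delete to the disk index
    ramWriter.updateDocument(new Term("id", id), doc); // latest copy lives in RAM
  }

  public IndexReader searchView(IndexReader ramReader) throws Exception {
    return new MultiReader(new IndexReader[] { ramReader, diskReader });
  }
}
```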
  • Michael McCandless at Dec 25, 2008 at 7:24 pm
    I think the necessary low-level changes to Lucene for real-time are
    actually already well underway...

    The biggest barrier is how we now ask for FieldCache values at the
    Multi*Reader level. This makes reopen cost catastrophic for a large
    index.

    Once we succeed in making FieldCache usage within Lucene
    segment-centric (LUCENE-1483 = sorting becomes segment-centric;
    LUCENE-831 = deprecate the old FieldCache API in favor of a segment-centric
    or iteration API), we are most of the way there. LUCENE-1231 (column
    stride fields) should make initing the per-segment FieldCache much
    faster, though I think that's a "nice to have" for real-time search
    (because either 1) warming will happen in the BG, or 2) the segment is
    tiny).

    So then I think we should start with approach #2 (build real-time on
    top of the Lucene core) and iterate from there. Newly added docs go
    into tiny segments, which IndexReader.reopen pulls in. Replaced or
    deleted docs record the delete against the right SegmentReader (and
    LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).

    I would take the simple approach first: use ordinary SegmentReader on
    a RAMDirectory for the tiny segments. If that proves too slow, swap
    in Memory/InstantiatedIndex for the tiny segments. If that proves too
    slow, build a reader impl that reads from DocumentsWriter RAM buffer.

    One challenge is reopening after a big merge finishes... we'd need a
    way to 1) allow the merge to be committed, then 2) start warming a new
    reader in the BG, but 3) allow newly flushed segments to use the old
    SegmentReaders reading the segments that were merged (because they are
    still warm), and 4) once new reader is warm, we decref old segments
    and use the new reader going forwards.

    Alternatively, and maybe simpler, a merge is not allowed to commit
    until a new SegmentReader has been warmed against the newly merged
    segment.
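
    A sketch of that warm-then-publish alternative; the warm() body is
    illustrative (which caches need priming depends on the application and
    the assumed sort field), and swapping/closing readers is simplified:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;

// After a big merge, warm a reader on the merged state (populating
// FieldCache off the search path) before letting searches see it.
public class WarmingPublisher {
  private volatile IndexReader live;

  public WarmingPublisher(IndexReader initial) {
    this.live = initial;
  }

  public IndexReader live() {
    return live;
  }

  public void publish(IndexReader merged, String sortField) throws Exception {
    warm(merged, sortField);   // pay the warm-up cost before publishing
    IndexReader old = live;
    live = merged;             // searches now see the merged segment
    old.close();               // simplified: should wait for in-flight searches
  }

  private void warm(IndexReader reader, String sortField) throws Exception {
    IndexSearcher searcher = new IndexSearcher(reader);
    searcher.search(new MatchAllDocsQuery(), null, 1,
                    new Sort(new SortField(sortField, SortField.STRING)));
  }
}
```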

    I'm not sure how best to do this... we may need more info in
    SegmentInfo[s] to track the genealogy of each segment, or something.
    We may need to have IndexWriter give more info when it's modifying
    SegmentInfos, eg we'd need the reader to access newly flushed segments
    (IndexWriter does not write a new segments_N until commit). Maybe
    IndexWriter needs to warm readers... maybe IndexReader.open/reopen
    needs to be given an IndexWriter and then access its un-flushed
    in-memory SegmentInfos... not sure. We'd need to fix
    SegmentReader.get to provide single instance for a given segment.

    I agree we'd want a specialized merge policy. EG it should merge RAM
    segments w/ higher priority, and probably not merge mixed RAM & disk
    segments.

    Mike
  • Doug Cutting at Dec 26, 2008 at 6:21 pm

    Michael McCandless wrote:
    So then I think we should start with approach #2 (build real-time on
    top of the Lucene core) and iterate from there. Newly added docs go
    into tiny segments, which IndexReader.reopen pulls in. Replaced or
    deleted docs record the delete against the right SegmentReader (and
    LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).

    I would take the simple approach first: use ordinary SegmentReader on
    a RAMDirectory for the tiny segments. If that proves too slow, swap
    in Memory/InstantiatedIndex for the tiny segments. If that proves too
    slow, build a reader impl that reads from DocumentsWriter RAM buffer.
    +1 This sounds like a good approach to me. I don't see any fundamental
    reasons why we need different representations, and fewer implementations
    of IndexWriter and IndexReader is generally better, unless they get way
    too hairy. Mostly it seems that real-time can be done with our existing
    toolbox of data structures, but with some slightly different control
    structures. Once we have the control structure in place, we should
    look at optimizing data structures as needed.

    Doug

  • J. Delgado at Dec 26, 2008 at 6:48 pm
    The addition of docs into tiny segments using the current data structures
    seems the right way to go. Sometime back one of my engineers implemented
    pseudo real-time using MultiSearcher by having an in-memory (RAM based)
    "short-term" index that auto-merged into a disk-based "long term" index that
    eventually get merged into "archive" indexes. Index optimization would take
    place during these merges. The search we required was very time-sensitive
    (searching last-minute breaking news wires). The advantage of having an
    archive index is that very old documents in our applications were not
    usually searched on unless archives were explicitely selected.

    -- Joaquin
  • J. Delgado at Dec 26, 2008 at 6:55 pm
    One thing that I forgot to mention is that in our implementation the
    real-time indexing took place with many "folder-based" listeners writing to
    many tiny in-memory indexes partitioned by "sub-source", with fewer
    long-term and archive indexes per box. Overall distributed search across
    the various Lucene-based search services was done using a federator
    component, very much like shard-based searches are done today (I believe).

    -- Joaquin.

    On Fri, Dec 26, 2008 at 10:48 AM, J. Delgado wrote:

    The addition of docs into tiny segments using the current data structures
    seems the right way to go. Some time back one of my engineers implemented
    pseudo real-time using MultiSearcher by having an in-memory (RAM-based)
    "short-term" index that auto-merged into a disk-based "long-term" index,
    which eventually got merged into "archive" indexes. Index optimization would
    take place during these merges. The search we required was very
    time-sensitive (searching last-minute breaking news wires). The advantage of
    having an archive index is that very old documents in our applications were
    not usually searched unless archives were explicitly selected.

    -- Joaquin

    On Fri, Dec 26, 2008 at 10:20 AM, Doug Cutting wrote:

    Michael McCandless wrote:
    So then I think we should start with approach #2 (build real-time on
    top of the Lucene core) and iterate from there. Newly added docs go
    into tiny segments, which IndexReader.reopen pulls in. Replaced or
    deleted docs record the delete against the right SegmentReader (and
    LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).

    I would take the simple approach first: use ordinary SegmentReader on
    a RAMDirectory for the tiny segments. If that proves too slow, swap
    in Memory/InstantiatedIndex for the tiny segments. If that proves too
    slow, build a reader impl that reads from DocumentsWriter RAM buffer.
    +1 This sounds like a good approach to me. I don't see any fundamental
    reasons why we need different representations, and fewer implementations of
    IndexWriter and IndexReader is generally better, unless they get way too
    hairy. Mostly it seems that real-time can be done with our existing toolbox
    of datastructures, but with some slightly different control structures.
    Once we have the control structure in place then we should look at
    optimizing data structures as needed.

    Doug


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Robert Engels at Dec 26, 2008 at 5:31 pm
    That could very well be, but I was referencing your statement:

    "1) Design index formats that can be memory mapped rather than slurped,
    bringing the cost of opening/reopening an IndexReader down to a
    negligible level."

    The only reason to do this (or have it happen) is if you perform a binary search on the term index.

    Using a 2 file system is going to be WAY slower - I'll bet lunch. It might be workable if the files were on a striped drive, or put each file on a different drive/controller, but requiring such specially configured hardware is not a good idea. In the common case (single drive), you are going to be seeking all over the place.

    Saving the memory structure from the write of the segment is going to offer far superior performance - you can binary seek on the memory structure, not the mmap file. The only problem with this is that there is going to be a minimum memory requirement.

    Also, the mmap is only suitable for 64 bit platforms, since there is no way in Java to unmap, you are going to run out of address space as segments are rewritten.
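    For reference, "binary search directly against the mapped file" looks
    roughly like the Java fragment below. The fixed-width entry layout (24-byte
    padded term plus an 8-byte pointer) is invented purely to show the
    mechanism, not a format anyone has proposed; and as noted above, Java offers
    no supported unmap, which is exactly the address-space concern on 32-bit
    JVMs.

        import java.io.RandomAccessFile;
        import java.nio.MappedByteBuffer;
        import java.nio.channels.FileChannel;

        public class MmapTermIndex {
          // Hypothetical layout: 24-byte zero-padded term + 8-byte pointer.
          static final int ENTRY = 32;

          public static void main(String[] args) throws Exception {
            RandomAccessFile raf = new RandomAccessFile("terms.idx", "r");
            MappedByteBuffer buf = raf.getChannel()
                .map(FileChannel.MapMode.READ_ONLY, 0, raf.length());
            System.out.println(seek(buf, "lucene"));
            // Note: the mapping stays alive until GC; there is no explicit unmap.
          }

          // Greatest entry <= target, searched directly on the mapped pages;
          // nothing is slurped into process memory up front.
          static long seek(MappedByteBuffer buf, String target) {
            int lo = 0, hi = buf.capacity() / ENTRY - 1;
            long pointer = -1;
            while (lo <= hi) {
              int mid = (lo + hi) >>> 1;
              byte[] raw = new byte[24];
              buf.position(mid * ENTRY);
              buf.get(raw);
              String term = new String(raw).trim();
              if (term.compareTo(target) <= 0) {
                pointer = buf.getLong(mid * ENTRY + 24);
                lo = mid + 1;
              } else {
                hi = mid - 1;
              }
            }
            return pointer;   // -1 if target sorts before every indexed term
          }
        }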








    -----Original Message-----
    From: Marvin Humphrey <marvin@rectangular.com>
    Sent: Dec 24, 2008 1:31 PM
    To: java-dev@lucene.apache.org
    Subject: Re: Realtime Search
    On Wed, Dec 24, 2008 at 12:02:24PM -0600, robert engels wrote:
    As I understood this discussion though, it was an attempt to remove
    the in memory 'skip to' index, to avoid the reading of this during
    index open/reopen.
    No. That idea was entertained briefly and quickly discarded. There seems to
    be an awful lot of irrelevant noise in the current thread arising due to lack
    of familiarity with the ongoing discussions in JIRA.

    Marvin Humphrey

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Marvin Humphrey at Dec 26, 2008 at 9:53 pm
    Robert,

    Three exchanges ago in this thread, you made the incorrect assumption that the
    motivation behind using mmap was read speed, and that memory mapping was being
    waved around as some sort of magic wand:

    Is there something that I am missing? I see lots of references to
    using "memory mapped" files to "dramatically" improve performance.

    I don't think this is the case at all. At the lowest levels, it is
    somewhat more efficient from a CPU standpoint, but with a decent OS
    cache the IO performance difference is going to negligible.

    In response, I indicated that the mmap design had been discussed in JIRA, and
    pointed you at a particular issue.

    There have been substantial discussions about this design in JIRA,
    notably LUCENE-1458.

    The "dramatic" improvement is WRT to opening/reopening an IndexReader.

    Apparently, you did not go back to read that JIRA thread, because you
    subsequently offered a critique of a purely invented design you assumed we
    must have arrived at, and continued to argue with a straw man about read
    speed:

    1. with "fixed" size terms, the additional IO (larger pages) probably
    offsets a lot of the random access benefit. This is why "compressed"
    disks on a fast machine (CPU) are often faster than "uncompressed" -
    more data is read during every IO access.

    While my reply did not specifically point back to LUCENE-1458 again, I hoped
    that having your foolish assumption exposed would motivate you to go back and
    read it, so that you could offer an informed critique of the *actual* design.
    I also linked to a specific comment in LUCENE-831 which explained how mmap
    applied to sort caches.

    Additionally, sort caches would be written at index time in three files, and
    memory mapped as laid out in
    <https://issues.apache.org/jira/browse/LUCENE-831?focusedCommentId=12656150#action_12656150>.

    Apparently you still didn't go back and read up, because you subsequently made
    a third incorrect assumption, this time about plans to do away with the term
    dictionary index. In response I griped about JIRA again, using slightly
    stronger but still intentionally indirect language.

    No. That idea was entertained briefly and quickly discarded. There seems
    to be an awful lot of irrelevant noise in the current thread arising due
    to lack of familiarity with the ongoing discussions in JIRA.

    Unfortunately, this must not have worked either, because you have now offered a
    fourth message based on incorrect assumptions which would have been remedied by
    bringing yourself up to date with the relevant JIRA threads.
    That could very well be, but I was referencing your statement:

    "1) Design index formats that can be memory mapped rather than slurped,
    bringing the cost of opening/reopening an IndexReader down to a
    negligible level."

    The only reason to do this (or have it happen) is if you perform a binary
    search on the term index.
    No. As discussed in LUCENE-1458, LUCENE-1483, the specific link I pointed you
    towards in LUCENE-831, the message where I provided you with that link, and
    elsewhere in this thread... loading the term dictionary index is important, but
    the cost pales in comparison to the cost of loading sort caches.
    Using a 2 file system is going to be WAY slower - I'll bet lunch. It might be
    workable if the files were on a striped drive, or put each file on a different
    drive/controller, but requiring such specially configured hardware is not a
    good idea. In the common case (single drive), you are going to be seeking all
    over the place.
    Mike McCandless and I had an extensive debate about the pros and cons of
    depending on the OS cache to hold the term dictionary index under LUCENE-1458.
    The concerns you express here were fully addressed, and even resolved under an
    "agree to disagree" design.
    Also, the mmap is only suitable for 64 bit platforms, since there is no way
    in Java to unmap, you are going to run out of address space as segments are
    rewritten.
    The discussion of how the mmap design translates from Lucy to Lucene is an
    important one, but I despair of having it if we have to rehash all of
    LUCENE-1458, LUCENE-831, and possibly LUCENE-1476 and LUCENE-1483 because you
    cannot be troubled to bring yourself up to speed before commenting.

    You are obviously knowledgeable on the subject of low-level memory issues. Me
    and Mike McCandless ain't exactly chopped liver, though, and neither are a lot
    of other people around here who *are* bothering to keep up with the threads in
    JIRA. I request that you show the rest of us more respect. Our time is
    valuable, too.

    Marvin Humphrey


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Robert Engels at Dec 26, 2008 at 6:01 pm
    Also, if you are really set on the mmap strategy, why not use a single file with fixed-length pages, using the header I proposed (and key compression)? You don't need any fancy partial-page handling; just waste a small amount of space at the end of each page.

    I think this is going to be far faster than a file of fixed-length offsets (I assume you would also put the entry data length in file #1), plus a file of data (file #2), mainly because the final page(s) can be searched more efficiently, and since you can use compression (since you have pages), the files are going to be significantly smaller (improving the write time and the cache efficiency).
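    As a concrete illustration of the "fixed-length pages, waste the tail" idea,
    the fragment below packs length-prefixed term/pointer entries into 1024-byte
    pages and zero-fills the leftover bytes. The record layout is invented for
    the example and assumes terms shorter than one page:

        import java.io.DataOutputStream;
        import java.io.FileOutputStream;

        public class PageWriter {
          static final int PAGE = 1024;

          public static void write(String[] terms, long[] pointers, String path)
              throws Exception {
            DataOutputStream out = new DataOutputStream(new FileOutputStream(path));
            int used = 0;
            for (int i = 0; i < terms.length; i++) {
              byte[] t = terms[i].getBytes("UTF-8");
              int need = 2 + t.length + 8;          // short length + term + pointer
              if (used + need > PAGE) {             // entry won't fit: pad the page
                for (; used < PAGE; used++) out.writeByte(0);
                used = 0;
              }
              out.writeShort(t.length);
              out.write(t);
              out.writeLong(pointers[i]);
              used += need;
            }
            for (; used > 0 && used < PAGE; used++) out.writeByte(0);  // pad last page
            out.close();
          }
        }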

    -----Original Message-----
    From: Robert Engels <rengels@ix.netcom.com>
    Sent: Dec 26, 2008 11:30 AM
    To: java-dev@lucene.apache.org, java-dev@lucene.apache.org
    Subject: Re: Realtime Search

    That could very well be, but I was referencing your statement:

    "1) Design index formats that can be memory mapped rather than slurped,
    bringing the cost of opening/reopening an IndexReader down to a
    negligible level."

    The only reason to do this (or have it happen) is if you perform a binary search on the term index.

    Using a 2 file system is going to be WAY slower - I'll bet lunch. It might be workable if the files were on a striped drive, or put each file on a different drive/controller, but requiring such specially configured hardware is not a good idea. In the common case (single drive), you are going to be seeking all over the place.

    Saving the memory structure from the write of the segment is going to offer far superior performance - you can binary seek on the memory structure, not the mmap file. The only problem with this is that there is going to be a minimum memory requirement.

    Also, the mmap is only suitable for 64 bit platforms, since there is no way in Java to unmap, you are going to run out of address space as segments are rewritten.








    -----Original Message-----
    From: Marvin Humphrey <marvin@rectangular.com>
    Sent: Dec 24, 2008 1:31 PM
    To: java-dev@lucene.apache.org
    Subject: Re: Realtime Search
    On Wed, Dec 24, 2008 at 12:02:24PM -0600, robert engels wrote:
    As I understood this discussion though, it was an attempt to remove
    the in memory 'skip to' index, to avoid the reading of this during
    index open/reopen.
    No. That idea was entertained briefly and quickly discarded. There seems to
    be an awful lot of irrelevant noise in the current thread arising due to lack
    of familiarity with the ongoing discussions in JIRA.

    Marvin Humphrey

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Robert Engels at Dec 26, 2008 at 6:56 pm
    This is what we mostly do, but we serialize the documents to a log file first, so if the server crashes before the background merge of the RAM segments into the disk segments completes, we can replay the operations on server restart. Since the serialization is a sequential write to an already open file, it is very fast.
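    The log itself can be as simple as length-prefixed records appended to an
    always-open file and fsync'd, then replayed on startup; a rough sketch (the
    record format and class names below are made up for illustration, not what
    we actually run):

        import java.io.DataInputStream;
        import java.io.DataOutputStream;
        import java.io.EOFException;
        import java.io.File;
        import java.io.FileInputStream;
        import java.io.FileOutputStream;

        public class DocLog {
          private final FileOutputStream fos;
          private final DataOutputStream out;

          public DocLog(File file) throws Exception {
            fos = new FileOutputStream(file, true);   // append mode, kept open
            out = new DataOutputStream(fos);
          }

          public void append(String serializedDoc) throws Exception {
            byte[] bytes = serializedDoc.getBytes("UTF-8");
            out.writeInt(bytes.length);
            out.write(bytes);
            out.flush();
            fos.getFD().sync();                       // survive a crash
          }

          // On restart, replay anything the RAM segments lost in the crash.
          public static void replay(File file, DocHandler handler) throws Exception {
            DataInputStream in = new DataInputStream(new FileInputStream(file));
            try {
              while (true) {
                byte[] bytes = new byte[in.readInt()];
                in.readFully(bytes);
                handler.handle(new String(bytes, "UTF-8"));
              }
            } catch (EOFException done) {
              // end of log
            } finally {
              in.close();
            }
          }

          public interface DocHandler {
            void handle(String serializedDoc) throws Exception;
          }
        }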

    I realize that many users do not wrap Lucene in a server process, so it doesn't seem that writing only to the RAM segments will work? How will the other processes/servers see them? Doesn't seem it would be real-time for them.

    Maybe restrict the real-time search to "server" Lucene installations? If you are concerned about performance in the first place, seems a requirement anyway.

    On this note, maybe to allow greater advancement of Lucene, Lucene should move to a design approach similar to many databases: an embedded version, designed for a single process with multiple threads, and a server version which wraps the embedded version and allows multiple clients. It seems to be a far simpler architecture. I know I have brought this up in the past, but maybe it's time to revisit? It was the core of the Unix design (no file locks needed), and works well for many DBs (e.g. Derby).







    -----Original Message-----
    From: Doug Cutting <cutting@apache.org>
    Sent: Dec 26, 2008 12:20 PM
    To: java-dev@lucene.apache.org
    Subject: Re: Realtime Search

    Michael McCandless wrote:
    So then I think we should start with approach #2 (build real-time on
    top of the Lucene core) and iterate from there. Newly added docs go
    into tiny segments, which IndexReader.reopen pulls in. Replaced or
    deleted docs record the delete against the right SegmentReader (and
    LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).

    I would take the simple approach first: use ordinary SegmentReader on
    a RAMDirectory for the tiny segments. If that proves too slow, swap
    in Memory/InstantiatedIndex for the tiny segments. If that proves too
    slow, build a reader impl that reads from DocumentsWriter RAM buffer.
    +1 This sounds like a good approach to me. I don't see any fundamental
    reasons why we need different representations, and fewer implementations
    of IndexWriter and IndexReader is generally better, unless they get way
    too hairy. Mostly it seems that real-time can be done with our existing
    toolbox of datastructures, but with some slightly different control
    structures. Once we have the control structure in place then we should
    look at optimizing data structures as needed.

    Doug

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Robert Engels at Dec 26, 2008 at 8:35 pm
    If you move to the "either embedded, or server model", the post reopen is trivial, as the structures can be created as the segment is written.

    It is the networked shared access model that causes a lot of these optimizations to be far more complex than needed.

    Would it maybe be simpler to move to the "embedded or server" model, and add a network shared file (e.g. NFS) access model as a layer? The latter is going to perform far worse anyway.

    I guess I don't understand why Lucene continues to try and support this model. NO ONE does it any more. This is the way MS Access worked, and everyone that wanted performance needed to move to SQL server for the server model.

    -----Original Message-----
    From: Marvin Humphrey <marvin@rectangular.com>
    Sent: Dec 26, 2008 12:53 PM
    To: java-dev@lucene.apache.org
    Subject: Re: Realtime Search
    On Fri, Dec 26, 2008 at 06:22:23AM -0500, Michael McCandless wrote:
    4) Allow 2 concurrent writers: one for small, fast updates, and one for
    big background merges.
    Marvin can you describe more detail here?
    The goal is to improve worst-case write performance.

    Currently, writes are quick most of the time, but occasionally you'll trigger
    a big merge and get stuck. To solve this problem, we can assign a merge
    policy to our primary writer which tells it to merge no more than
    mergeThreshold documents. The value of mergeThreshold will need tuning
    depending on document size, change rate, and so on, but the idea is that we
    want this writer to do as much merging as it can while still keeping
    worst-case write performance down to an acceptable number.

    Doing only small merges just puts off the day of reckoning, of course. By
    avoiding big consolidations, we are slowly accumulating small-to-medium sized
    segments and causing a gradual degradation of search-time performance.

    What we'd like is a separate write process, operating (mostly) in the
    background, dedicated solely to merging segments which contain at least
    mergeThreshold docs.

    If all we have to do is add documents to the index, adding that second write
    process isn't a big deal. We have to worry about competition for segment,
    snapshot, and temp file names, but that's about it.

    Deletions make matters more complicated, but with a tombstone-based deletions
    mechanism, the problems are solvable.

    When the background merge writer starts up, it will see a particular view of
    the index in time, including deletions. It will perform nearly all of its
    operations based on this view of the index, mapping around documents which
    were marked as deleted at init time.

    In between the time when the background merge writer starts up and the time it
    finishes consolidating segment data, we assume that the primary writer will
    have modified the index.

    * New docs have been added in new segments.
    * Tombstones have been added which suppress documents in segments which
    didn't even exist when the background merge writer started up.
    * Tombstones have been added which suppress documents in segments which
    existed when the background merge writer started up, but were not merged.
    * Tombstones have been added which suppress documents in segments which have
    just been merged.

    Only the last category of deletions matters.

    At this point, the background merge writer acquires an exclusive write lock on
    the index. It examines recently added tombstones, translates the document
    numbers and writes a tombstone file against itself. Then it writes the
    snapshot file to commit its changes and releases the write lock.

    Worst-case update performance for the system is now the sum of the time it
    takes the background merge writer to consolidate tombstones and the worst-case
    performance of the primary writer.
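    To make the moving parts concrete, here is a very rough sketch of the
    two-writer shape. Every type in it is hypothetical (neither Lucene nor Lucy
    has these today); the closest existing Lucene knob for the primary writer is
    IndexWriter.setMaxMergeDocs(), which caps how large a foreground merge can
    get.

        import java.util.concurrent.Executors;
        import java.util.concurrent.ScheduledExecutorService;
        import java.util.concurrent.TimeUnit;

        public class TwoWriterSketch {
          // Fast path: small commits, merges capped at mergeThreshold docs.
          interface PrimaryWriter { void addOrDelete(Object change); }

          // Slow path: big consolidations, run mostly without the write lock.
          interface BackgroundMerger {
            void snapshotView();        // fix a point-in-time view, incl. deletes
            void mergeBigSegments();    // long-running, no write lock held
            void lockTranslateCommit(); // grab lock, remap fresh tombstones, commit
          }

          static void start(final BackgroundMerger merger) {
            ScheduledExecutorService pool =
                Executors.newSingleThreadScheduledExecutor();
            pool.scheduleWithFixedDelay(new Runnable() {
              public void run() {
                merger.snapshotView();
                merger.mergeBigSegments();     // primary keeps committing meanwhile
                merger.lockTranslateCommit();  // only this brief step blocks it
              }
            }, 0, 60, TimeUnit.SECONDS);
          }
        }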
    It sounds like this is your solution for "decoupling" segments changes due
    to merges from changes from docs being indexed, from a reader's standpoint?
    It's true that we are decoupling the process of making logical changes to the
    index from the process of internal consolidation. I probably wouldn't
    describe that as being done from the reader's standpoint, though.

    With mmap and data structures optimized for it, we basically solve the
    read-time responsiveness cost problem. From the client perspective, the delay
    between firing off a change order and seeing that change made live is now
    dominated by the time it takes to actually update the index. The time between
    the commit and having an IndexReader which can see that commit is negligible
    in comparison.
    Since you are using mmap to achieve near zero brand-new IndexReader
    creation, whereas in Lucene we are moving towards achieving real-time
    by always reopening a current IndexReader (not a brand new one), it
    seems like you should not actually have to worry about the case of
    reopening a reader after a large merge has finished?
    Even though we can rely on mmap rather than slurping, there are potentially a
    lot of files to open and a lot of JSON-encoded metadata to parse, so I'm not
    certain that Lucy/KS will never have to worry about the time it takes to open
    a new IndexReader. Fortunately, we can implement reopen() if we need to.
    We need to deal with this case (background the warming) because
    creating that new SegmentReader (on the newly merged segment) can take
    a non-trivial amount of time.
    Yes. Without mmap or some other solution, I think improvements to worst-case
    update performance in Lucene will continue to be constrained by post-commit
    IndexReader opening costs.

    Marvin Humphrey


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Robert Engels at Dec 26, 2008 at 8:40 pm
    There is also the distributed model - but in that case each node is running some sort of server anyway (as in Hadoop).

    It seems that the distributed model would be easier to develop using Hadoop over the embedded model.
    -----Original Message-----
    From: Robert Engels <rengels@ix.netcom.com>
    Sent: Dec 26, 2008 2:34 PM
    To: java-dev@lucene.apache.org
    Subject: Re: Realtime Search

    If you move to the "either embedded, or server model", the post reopen is trivial, as the structures can be created as the segment is written.

    It is the networked shared access model that causes a lot of these optimizations to be far more complex than needed.

    Would it maybe be simpler to move to the "embedded or server" model, and add a network shared file (e.g. NFS) access model as a layer? The latter is going to perform far worse anyway.

    I guess I don't understand why Lucene continues to try and support this model. NO ONE does it any more. This is the way MS Access worked, and everyone that wanted performance needed to move to SQL server for the server model.

    -----Original Message-----
    From: Marvin Humphrey <marvin@rectangular.com>
    Sent: Dec 26, 2008 12:53 PM
    To: java-dev@lucene.apache.org
    Subject: Re: Realtime Search
    On Fri, Dec 26, 2008 at 06:22:23AM -0500, Michael McCandless wrote:
    4) Allow 2 concurrent writers: one for small, fast updates, and one for
    big background merges.
    Marvin can you describe more detail here?
    The goal is to improve worst-case write performance.

    Currently, writes are quick most of the time, but occasionally you'll trigger
    a big merge and get stuck. To solve this problem, we can assign a merge
    policy to our primary writer which tells it to merge no more than
    mergeThreshold documents. The value of mergeThreshold will need tuning
    depending on document size, change rate, and so on, but the idea is that we
    want this writer to do as much merging as it can while still keeping
    worst-case write performance down to an acceptable number.

    Doing only small merges just puts off the day of reckoning, of course. By
    avoiding big consolidations, we are slowly accumulating small-to-medium sized
    segments and causing a gradual degradation of search-time performance.

    What we'd like is a separate write process, operating (mostly) in the
    background, dedicated solely to merging segments which contain at least
    mergeThreshold docs.

    If all we have to do is add documents to the index, adding that second write
    process isn't a big deal. We have to worry about competition for segment,
    snapshot, and temp file names, but that's about it.

    Deletions make matters more complicated, but with a tombstone-based deletions
    mechanism, the problems are solvable.

    When the background merge writer starts up, it will see a particular view of
    the index in time, including deletions. It will perform nearly all of its
    operations based on this view of the index, mapping around documents which
    were marked as deleted at init time.

    In between the time when the background merge writer starts up and the time it
    finishes consolidating segment data, we assume that the primary writer will
    have modified the index.

    * New docs have been added in new segments.
    * Tombstones have been added which suppress documents in segments which
    didn't even exist when the background merge writer started up.
    * Tombstones have been added which suppress documents in segments which
    existed when the background merge writer started up, but were not merged.
    * Tombstones have been added which suppress documents in segments which have
    just been merged.

    Only the last category of deletions matters.

    At this point, the background merge writer acquires an exclusive write lock on
    the index. It examines recently added tombstones, translates the document
    numbers and writes a tombstone file against itself. Then it writes the
    snapshot file to commit its changes and releases the write lock.

    Worst-case update performance for the system is now the sum of the time it
    takes the background merge writer to consolidate tombstones and the worst-case
    performance of the primary writer.
    It sounds like this is your solution for "decoupling" segments changes due
    to merges from changes from docs being indexed, from a reader's standpoint?
    It's true that we are decoupling the process of making logical changes to the
    index from the process of internal consolidation. I probably wouldn't
    describe that as being done from the reader's standpoint, though.

    With mmap and data structures optimized for it, we basically solve the
    read-time responsiveness cost problem. From the client perspective, the delay
    between firing off a change order and seeing that change made live is now
    dominated by the time it takes to actually update the index. The time between
    the commit and having an IndexReader which can see that commit is negligible
    in comparison.
    Since you are using mmap to achieve near zero brand-new IndexReader
    creation, whereas in Lucene we are moving towards achieving real-time
    by always reopening a current IndexReader (not a brand new one), it
    seems like you should not actually have to worry about the case of
    reopening a reader after a large merge has finished?
    Even though we can rely on mmap rather than slurping, there are potentially a
    lot of files to open and a lot of JSON-encoded metadata to parse, so I'm not
    certain that Lucy/KS will never have to worry about the time it takes to open
    a new IndexReader. Fortunately, we can implement reopen() if we need to.
    We need to deal with this case (background the warming) because
    creating that new SegmentReader (on the newly merged segment) can take
    a non-trivial amount of time.
    Yes. Without mmap or some other solution, I think improvements to worst-case
    update performance in Lucene will continue to be constrained by post-commit
    IndexReader opening costs.

    Marvin Humphrey


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Robert Engels at Dec 26, 2008 at 10:44 pm
    You are full of crap. From your own comments in LUCENE-1458:

    "The work on streamlining the term dictionary is excellent, but perhaps we can do better still. Can we design a format that allows us rely upon the operating system's virtual memory and avoid caching in process memory altogether?

    Say that we break up the index file into fixed-width blocks of 1024 bytes. Most blocks would start with a complete term/pointer pairing, though at the top of each block, we'd need a status byte indicating whether the block contains a continuation from the previous block in order to handle cases where term length exceeds the block size.

    For Lucy/KinoSearch our plan would be to mmap() on the file, but accessing it as a stream would work, too. Seeking around the index term dictionary would involve seeking the stream to multiples of the block size and performing binary search, rather than performing binary search on an array of cached terms. There would be increased processor overhead; my guess is that since the second stage of a term dictionary seek – scanning through the primary term dictionary – involves comparatively more processor power than this, the increased costs would be acceptable."

    and then you state farther down

    "Killing off the term dictionary index yields a nice improvement in code and file specification simplicity, and there's no performance penalty for our primary optimization target use case.
    We could also explore something in-between, eg it'd be nice to
    genericize MultiLevelSkipListWriter so that it could index arbitrary
    files, then we could use that to index the terms dict. You could
    choose to spend dedicated process RAM on the higher levels of the skip
    tree, and then tentatively trust IO cache for the lower levels.
    That doesn't meet the design goals of bringing the cost of opening/warming an IndexReader down to near-zero and sharing backing buffers among multiple forks. It's also very complicated, which of course bothers me more than it bothers you. So I imagine we'll choose different paths."

    The thing I find funny is that many are approaching these issues as if new ground is being broken. These are ALL standard, long-known issues that any database engineer has already worked with, and there are accepted designs given applicable constraints.

    This is why I've tried to point folks towards alternative designs that open the door much wider to increased performance/reliability/robustness.

    Do what you like. You obviously will. This is the problem with the Lucene managers - the problems are only the ones they see - same with the solutions. If the solution (or questions) put them outside their comfort zone, they are ignored or dismissed in a tone that is designed to limit any further questions (especially those that might question their ability and/or understanding).


    -----Original Message-----
    From: Marvin Humphrey <marvin@rectangular.com>
    Sent: Dec 26, 2008 3:53 PM
    To: java-dev@lucene.apache.org, Robert Engels <rengels@ix.netcom.com>
    Subject: Re: Realtime Search

    Robert,

    Three exchanges ago in this thread, you made the incorrect assumption that the
    motivation behind using mmap was read speed, and that memory mapping was being
    waved around as some sort of magic wand:

    Is there something that I am missing? I see lots of references to
    using "memory mapped" files to "dramatically" improve performance.

    I don't think this is the case at all. At the lowest levels, it is
    somewhat more efficient from a CPU standpoint, but with a decent OS
    cache the IO performance difference is going to negligible.

    In response, I indicated that the mmap design had been discussed in JIRA, and
    pointed you at a particular issue.

    There have been substantial discussions about this design in JIRA,
    notably LUCENE-1458.

    The "dramatic" improvement is WRT to opening/reopening an IndexReader.

    Apparently, you did not go back to read that JIRA thread, because you
    subsequently offered a critique of a purely invented design you assumed we
    must have arrived at, and continued to argue with a straw man about read
    speed:

    1. with "fixed" size terms, the additional IO (larger pages) probably
    offsets a lot of the random access benefit. This is why "compressed"
    disks on a fast machine (CPU) are often faster than "uncompressed" -
    more data is read during every IO access.

    While my reply did not specifically point back to LUCENE-1458 again, I hoped
    that having your foolish assumption exposed would motivate you to go back and
    read it, so that you could offer an informed critique of the *actual* design.
    I also linked to a specific comment in LUCENE-831 which explained how mmap
    applied to sort caches.

    Additionally, sort caches would be written at index time in three files, and
    memory mapped as laid out in
    <https://issues.apache.org/jira/browse/LUCENE-831?focusedCommentId=12656150#action_12656150>.

    Apparently you still didn't go back and read up, because you subsequently made
    a third incorrect assumption, this time about plans to do away with the term
    dictionary index. In response I griped about JIRA again, using slightly
    stronger but still intentionally indirect language.

    No. That idea was entertained briefly and quickly discarded. There seems
    to be an awful lot of irrelevant noise in the current thread arising due
    to lack of familiarity with the ongoing discussions in JIRA.

    Unfortunately, this must not have worked either, because you have now offered a
    fourth message based on incorrect assumptions which would have been remedied by
    bringing yourself up to date with the relevant JIRA threads.
    That could very well be, but I was referencing your statement:

    "1) Design index formats that can be memory mapped rather than slurped,
    bringing the cost of opening/reopening an IndexReader down to a
    negligible level."

    The only reason to do this (or have it happen) is if you perform a binary
    search on the term index.
    No. As discussed in LUCENE-1458, LUCENE-1483, the specific link I pointed you
    towards in LUCENE-831, the message where I provided you with that link, and
    elsewhere in this thread... loading the term dictionary index is important, but
    the cost pales in comparison to the cost of loading sort caches.
    Using a 2 file system is going to be WAY slower - I'll bet lunch. It might be
    workable if the files were on a striped drive, or put each file on a different
    drive/controller, but requiring such specially configured hardware is not a
    good idea. In the common case (single drive), you are going to be seeking all
    over the place.
    Mike McCandless and I had an extensive debate about the pros and cons of
    depending on the OS cache to hold the term dictionary index under LUCENE-1458.
    The concerns you express here were fully addressed, and even resolved under an
    "agree to disagree" design.
    Also, the mmap is only suitable for 64 bit platforms, since there is no way
    in Java to unmap, you are going to run out of address space as segments are
    rewritten.
    The discussion of how the mmap design translates from Lucy to Lucene is an
    important one, but I despair of having it if we have to rehash all of
    LUCENE-1458, LUCENE-831, and possibly LUCENE-1476 and LUCENE-1483 because you
    cannot be troubled to bring yourself up to speed before commenting.

    You are obviously knowledgeable on the subject of low-level memory issues. Me
    and Mike McCandless ain't exactly chopped liver, though, and neither are a lot
    of other people around here who *are* bothering to keep up with the threads in
    JIRA. I request that you show the rest of us more respect. Our time is
    valuable, too.

    Marvin Humphrey


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
  • Andrzej Bialecki at Dec 26, 2008 at 10:51 pm

    Robert Engels wrote:
    You are full of **beep** *beep* ...
    No matter whether you are right or wrong, please keep a civil tone on
    this public forum. We are professionals here, so let's discuss and
    disagree if must be - but in a professional and grown-up way. Thank you.


    --
    Best regards,
    Andrzej Bialecki <><
    ___. ___ ___ ___ _ _ __________________________________
    [__ || __|__/|__||\/| Information Retrieval, Semantic Web
    ___|||__|| \| || | Embedded Unix, System Integration
    http://www.sigram.com Contact: info at sigram dot com


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-dev-help@lucene.apache.org
