On Wed, Jan 07, 2009 at 10:36:01PM -0600, robert engels wrote:
> Yes, and I don't think the "worst-case" is correct.
>
> When you go to write that segment, you determine that it is a "large"
> segment, but has few deletions (one in this case), it will be written
> compressed in probably less than 10 bytes (1 byte header, vlong
> start, vint length - you only write the ones...)...

If a segment slowly accumulates deletions over time during different indexing
sessions, at some point you will cross the threshold where the deletions file
needs to be written out as an uncompressed bit vector. From then on, adding a
single additional deletion to the segment during any subsequent indexing
session triggers what I'm calling "worst-case" behavior: the whole bit vector
file needs to be rewritten for the sake of one deletion.
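
To put rough numbers on the two cases, here is a minimal sketch. It is not
Lucene's actual deletions format: the sparse layout (one header byte, then
each run of deleted docs written as a vint start delta plus a vint length)
is an assumption loosely modeled on the encoding described in the quoted
message, and all function names are hypothetical.

```python
def vint_len(n: int) -> int:
    """Bytes needed for n as a variable-length int (7 payload bits per byte)."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


def runs(deleted_docs):
    """Group doc ids into [start, length] runs of consecutive ids."""
    result = []
    for doc in sorted(set(deleted_docs)):
        if result and doc == result[-1][0] + result[-1][1]:
            result[-1][1] += 1
        else:
            result.append([doc, 1])
    return result


def sparse_size(deleted_docs) -> int:
    """Assumed sparse format: 1 header byte, then (start delta, length) per run."""
    size = 1
    prev_end = 0
    for start, length in runs(deleted_docs):
        size += vint_len(start - prev_end) + vint_len(length)
        prev_end = start + length
    return size


def bit_vector_size(max_doc: int) -> int:
    """Uncompressed bit vector: one bit per document in the segment."""
    return (max_doc + 7) // 8


def deletions_file_size(deleted_docs, max_doc: int) -> int:
    """Bytes written per session: whichever representation is smaller."""
    return min(sparse_size(deleted_docs), bit_vector_size(max_doc))


if __name__ == "__main__":
    max_doc = 1_000_000
    # One deletion in a large segment: the sparse form is a handful of bytes.
    print(deletions_file_size([500_000], max_doc))
    # Many scattered deletions: the bit vector wins, and from then on every
    # additional deletion costs a full rewrite of max_doc / 8 bytes.
    scattered = list(range(0, max_doc, 8))
    print(deletions_file_size(scattered, max_doc))
    print(deletions_file_size(scattered + [3], max_doc))
```

Under these assumptions, the cost of adding one deletion stays tiny while the
segment is sparse and jumps to the full bit vector size once the threshold
has been crossed, which is the behavior described above.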

Marvin Humphrey

group: java-dev
posted: Jan 8, '09 at 12:01a