FAQ
Given an existing index, is there any way to update the value of a particular field across all documents, without deleting/re-indexing documents? For instance, if I have a date field, and I need to offset all dates based on change to stored time zone (subtract 12 hours from each value). The sort order of field values would not change, and the postings should not need to change, only values of fields. I do not believe there is any API to do it, but is there some lower level way to do it (modifying files manually)? I only ask because I have a large index and I don't want to re-index all documents.

Search Discussions

  • Andrzej Bialecki at Aug 7, 2008 at 8:52 pm

    Robert Stewart wrote:
    Given an existing index, is there any way to update the value of a
    particular field across all documents, without deleting/re-indexing
    documents? For instance, if I have a date field, and I need to
    offset all dates based on change to stored time zone (subtract 12
    hours from each value). The sort order of field values would not
    change, and the postings should not need to change, only values of
    fields. I do not believe there is any API to do it, but is there
    some lower level way to do it (modifying files manually)? I only ask
    because I have a large index and I don't want to re-index all
    documents.

    Perhaps you could do this using FilterIndexReader implementation, i.e.
    wrap your original index within this filter and then invoke
    IndexWriter.addIndexes(IndexReader[]). If you override
    FilterIndexReader.document(int, FieldSelector), which retrieves the
    stored fields of a document, then you can change the values of stored
    fields only, just before they are added to the new index. Note that this
    works ONLY for non-indexed fields that are stored (i.e. Store.YES,
    Index.NO) - if a field is both stored and indexed then you need to
    modify the postings too, which requires overriding other methods.

    --
    Best regards,
    Andrzej Bialecki <><
    ___. ___ ___ ___ _ _ __________________________________
    [__ || __|__/|__||\/| Information Retrieval, Semantic Web
    ___|||__|| \| || | Embedded Unix, System Integration
    http://www.sigram.com Contact: info at sigram dot com


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: [email protected]
    For additional commands, e-mail: [email protected]
  • Michael McCandless at Aug 7, 2008 at 10:27 pm
    It's also possible to create your own code that uses FieldsReader and
    FieldsWriter (both in org.apache.lucene.index) directly to create new
    fdx/fdt files, with the date fixed on each document. You'd have to
    enumerate over all segments in your index, create a FieldsReader
    instance on each, create a FieldsWriter writing to a new segment name,
    iterate through the docs fixing your date, calling
    FieldsWriter.writeField per field then flushDocument to finish the
    document.

    I think that should work, but I've never tried it!

    Your code would have to be in org.apache.lucene.index since these
    classes are package private.

    Mike

    Andrzej Bialecki wrote:
    Robert Stewart wrote:
    Given an existing index, is there any way to update the value of a
    particular field across all documents, without deleting/re-indexing
    documents? For instance, if I have a date field, and I need to
    offset all dates based on change to stored time zone (subtract 12
    hours from each value). The sort order of field values would not
    change, and the postings should not need to change, only values of
    fields. I do not believe there is any API to do it, but is there
    some lower level way to do it (modifying files manually)? I only ask
    because I have a large index and I don't want to re-index all
    documents.

    Perhaps you could do this using FilterIndexReader implementation,
    i.e. wrap your original index within this filter and then invoke
    IndexWriter.addIndexes(IndexReader[]). If you override
    FilterIndexReader.document(int, FieldSelector), which retrieves the
    stored fields of a document, then you can change the values of
    stored fields only, just before they are added to the new index.
    Note that this works ONLY for non-indexed fields that are stored
    (i.e. Store.YES, Index.NO) - if a field is both stored and indexed
    then you need to modify the postings too, which requires overriding
    other methods.

    --
    Best regards,
    Andrzej Bialecki <><
    ___. ___ ___ ___ _ _ __________________________________
    [__ || __|__/|__||\/| Information Retrieval, Semantic Web
    ___|||__|| \| || | Embedded Unix, System Integration
    http://www.sigram.com Contact: info at sigram dot com


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: [email protected]
    For additional commands, e-mail: [email protected]

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: [email protected]
    For additional commands, e-mail: [email protected]

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedAug 7, '08 at 7:49p
activeAug 7, '08 at 10:27p
posts3
users3
websitelucene.apache.org

People

Translate

site design / logo © 2023 Grokbase