Hi, I'm using Lucene to compute the score of some documents.
For several reasons I also need to know the documents that don't match the
input query, for example by returning them with score 0.
I don't know Lucene's internals and I was wondering how difficult this
change would be.
Thanks.

--
Paolo Valleri


  • Toke Eskildsen at Jun 25, 2008 at 8:38 am

    On Wed, 2008-06-25 at 09:29 +0200, Paolo Valleri wrote:
    > For several reasons I also need to know the documents that don't match the
    > input query. For example with score 0.

    Make a list of the docids of all the non-deleted documents in the index.
    Collect the docids from the search result. Subtract the second list from
    the first.

    You can get the non-deleted docids by iterating from 0 to maxDoc()-1
    (from IndexReader) and calling IndexReader's isDeleted(docid) on each.
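
The subtraction described above can be sketched with java.util.BitSet alone. This is a minimal illustration rather than Lucene code: the class and method names are hypothetical, and the `deleted` and `matching` sets are assumed to have been filled from the index and from the search result beforehand.

```java
import java.util.BitSet;

final class NonMatchingDocs {
    /**
     * Returns the docids in [0, maxDoc) that are neither deleted nor matched.
     * maxDoc is one greater than the largest docid, as with IndexReader.maxDoc().
     */
    static BitSet nonMatching(int maxDoc, BitSet deleted, BitSet matching) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc);    // start with every docid in the index
        result.andNot(deleted);   // drop the deleted documents
        result.andNot(matching);  // subtract the search hits
        return result;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(2);
        BitSet matching = new BitSet();
        matching.set(0);
        matching.set(4);
        // Six documents, docid 2 deleted, docids 0 and 4 matched.
        System.out.println(nonMatching(6, deleted, matching)); // prints {1, 3, 5}
    }
}
```

Neither input set is modified, so the `deleted` set can be reused across searches.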


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Paolo Valleri at Jun 25, 2008 at 7:48 pm
    Thanks for the answer.
    To get the docids of all the documents in the index, do I need to write
    a class that implements IndexReader, or is there another way?

    paolo


  • Yonik Seeley at Jun 25, 2008 at 8:14 pm

    On Wed, Jun 25, 2008 at 3:47 PM, Paolo Valleri wrote:
    > To get the docids of all the documents in the index, do I need to write
    > a class that implements IndexReader, or is there another way?

    MatchAllDocsQuery does it.

    -Yonik

  • Toke Eskildsen at Jun 26, 2008 at 9:37 am

    On Wed, 2008-06-25 at 21:47 +0200, Paolo Valleri wrote:
    > To get the docids of all the documents in the index, do I need to write
    > a class that implements IndexReader, or is there another way?

    Something along the following lines should work and be quite fast.
    However, it might be overly complex.

    import java.util.BitSet;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.QueryWrapperFilter;

    // Do this every time the index is updated
    IndexReader reader = IndexReader.open(indexLocation);
    BitSet deleted = new BitSet(reader.maxDoc());
    for (int i = 0 ; i < reader.maxDoc() ; i++) {
        if (reader.isDeleted(i)) {
            deleted.set(i);
        }
    }
    QueryParser parser =
        new QueryParser("freetext", new StandardAnalyzer());

    // Do this for every search
    Query query = parser.parse("java");
    QueryWrapperFilter filter = new QueryWrapperFilter(query);
    BitSet workset = filter.bits(reader);
    workset.or(deleted);
    // workset now marks all docids that are either matching or deleted
    System.out.print("Non-matching documents: ");
    for (int i = 0 ; i < reader.maxDoc() ; i++) {
        if (!workset.get(i)) {
            System.out.print(i + " ");
        }
    }
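
The final loop above tests every docid in turn. A possible alternative is java.util.BitSet.nextClearBit, which jumps straight to the next unset bit. The following is a minimal sketch in plain Java, assuming `workset` has already been filled as above; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class ClearBitScan {
    /** Collects every index in [0, maxDoc) whose bit is not set in workset. */
    static List<Integer> clearBits(BitSet workset, int maxDoc) {
        List<Integer> ids = new ArrayList<>();
        // nextClearBit never returns -1, so the loop ends once i reaches maxDoc.
        for (int i = workset.nextClearBit(0); i < maxDoc; i = workset.nextClearBit(i + 1)) {
            ids.add(i);
        }
        return ids;
    }

    public static void main(String[] args) {
        BitSet workset = new BitSet();
        workset.set(0);
        workset.set(2);
        workset.set(3);
        // Bits 0, 2 and 3 stand for documents that are matching or deleted.
        System.out.println(clearBits(workset, 6)); // prints [1, 4, 5]
    }
}
```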




Discussion Overview
group: java-user @
categories: lucene
posted: Jun 25, '08 at 7:31a
active: Jun 26, '08 at 9:37a
posts: 5
users: 3
website: lucene.apache.org
