FAQ
Our index contains documents with a unique ID (a long number) corresponding
to a record in a SQL database. I want to run a Lucene search within a list
of IDs returned from a SQL result set.

IDataReader reader = cmd.Execute("select ID from ...");
long[] ids = GetIds(reader);
string search = Join(ids); // results in (ID:1 OR ID:2 OR ID:3)
search += " AND (" + <searchstring entered by user> + ")";
IndexSearcher searcher = new IndexSearcher(_Reader);
Query query = new Lucene.Net.QueryParsers.QueryParser(,,,, search);
TopDocs docs = searcher.Search(query, 1000000);

Is this the way to go?

Thanks

  • Digy digy at May 27, 2011 at 7:19 am
    Creating a SimpleFacetedSearch is slow, so it should only be created
    when a new reader is opened (or reopened).
    DIGY
  • Digy digy at May 27, 2011 at 7:36 am
    As the number of terms in a query increases, search time increases too.
    Therefore, if possible, use precomputed BitSets and AND them with the
    user's query. Another costly operation is fetching data from the index, so
    limiting the result count to a smaller number may help too.
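
The bit-set idea above can be sketched in plain .NET, without any Lucene.Net calls. This is only an illustration under assumptions: `AllowedDocs`, `Precompute`, `Restrict` and the `idOfDoc` array (the ID stored for each document position) are hypothetical names, and the bits matched by the user's query are assumed to be available as a `BitArray` of the same length.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class AllowedDocs
{
    // Build a bit set with one bit per document; bit i is set when the
    // document at position i carries an ID from the allowed list.
    public static BitArray Precompute(long[] idOfDoc, ICollection<long> allowedIds)
    {
        var allowed = new HashSet<long>(allowedIds);
        var bits = new BitArray(idOfDoc.Length);
        for (int doc = 0; doc < idOfDoc.Length; doc++)
        {
            bits[doc] = allowed.Contains(idOfDoc[doc]);
        }
        return bits;
    }

    // AND the precomputed bits with the bits matched by the user's query.
    // BitArray.And modifies the instance it is called on, so work on a copy
    // to keep the precomputed set reusable. Both sets must have equal length.
    public static BitArray Restrict(BitArray allowedBits, BitArray queryMatches)
    {
        return new BitArray(allowedBits).And(queryMatches);
    }
}
```

The precomputed set is built once per ID list and can then be reused for every query, which is where the saving over a long OR clause would come from.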

    DIGY
  • Moray McConnachie at May 27, 2011 at 7:46 am
    If the list of IDs could be very long, then the search string could
    become horrendously long, and you might also have to look at the maximum
    number of clauses Lucene permits in a query (BooleanQuery allows 1024 by
    default).

    If the user's search is reasonably restrictive, you might do better running
    the user's search and then filtering the returned results against
    your list of IDs. There are numerous ways of achieving this; I
    personally like the custom collector because you can add more
    complicated logic later if you want.
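
A minimal sketch of the post-filtering step, assuming the hits of the user's search are already available in ranked order. It deliberately avoids the Lucene.Net API: `Hit`, `PostFilter` and `KeepAllowed` are hypothetical names, and in a real custom collector the same membership check would run as each hit is collected.

```csharp
using System.Collections.Generic;

// A hit as a search might return it: the stored ID plus the relevance score.
public struct Hit
{
    public long Id;
    public float Score;
    public Hit(long id, float score) { Id = id; Score = score; }
}

public static class PostFilter
{
    // Keep only hits whose ID is in the allowed set, stopping after maxResults.
    // Hits are assumed to arrive in ranked order, so the relative order of
    // the surviving hits is unchanged.
    public static List<Hit> KeepAllowed(IEnumerable<Hit> rankedHits, HashSet<long> allowedIds, int maxResults)
    {
        var kept = new List<Hit>();
        foreach (Hit hit in rankedHits)
        {
            if (kept.Count >= maxResults) break;
            if (allowedIds.Contains(hit.Id)) kept.Add(hit);
        }
        return kept;
    }
}
```

Because the ID list is held in a `HashSet<long>`, each membership check is a constant-time lookup, so the cost grows with the number of hits rather than with the length of the ID list.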

    Yours,
    Moray


    -------------------------------------
    Moray McConnachie
    Director of IT +44 1865 261 600
    Oxford Analytica http://www.oxan.com

  • Marco Dissel at May 27, 2011 at 8:22 am
    But won't the document scores be different if I filter the results
    afterwards with my list of unique IDs?

  • Moray McConnachie at May 27, 2011 at 9:00 am
    It might be different as an actual number, but the relative ranking
    should not change - since each ID matches only one document, only the
    user's part of the search can affect the ranking.

    Someone else would need to speak as to whether the relative value of the
    scores will also change, but I can't see this being a problem.

    The custom collector route also means that paging is handled as normal
    and doesn't present a problem.

    If you follow this route, and if the user's valid list of IDs remains
    constant per user, you will certainly want to cache and index the list
    of valid IDs.
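
One way the caching suggestion might look, as a sketch only: `AllowedIdCache` and the `loadIds` delegate are hypothetical, with `loadIds` standing in for the SQL query that returns a user's valid IDs. Holding each list as a `HashSet<long>` is what provides the fast lookup ("index") over the cached IDs.

```csharp
using System;
using System.Collections.Generic;

public class AllowedIdCache
{
    private readonly Dictionary<string, HashSet<long>> _byUser =
        new Dictionary<string, HashSet<long>>();
    private readonly Func<string, IEnumerable<long>> _loadIds;
    private readonly object _lock = new object();

    // loadIds stands in for the SQL query that returns a user's valid IDs.
    public AllowedIdCache(Func<string, IEnumerable<long>> loadIds)
    {
        _loadIds = loadIds;
    }

    // Return the cached set for the user, loading it on first use only.
    public HashSet<long> Get(string user)
    {
        lock (_lock)
        {
            HashSet<long> ids;
            if (!_byUser.TryGetValue(user, out ids))
            {
                ids = new HashSet<long>(_loadIds(user));
                _byUser[user] = ids;
            }
            return ids;
        }
    }

    // Drop a user's entry when their list of valid IDs changes.
    public void Invalidate(string user)
    {
        lock (_lock) { _byUser.Remove(user); }
    }
}
```

This only pays off under the condition stated above, that a user's list stays constant between searches; `Invalidate` covers the case where it does change.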

    Yours,
    Moray
  • Granroth, Neal V. at May 27, 2011 at 2:41 pm
    Yes, I think that the relative values of the scores could change significantly depending on the number of matches to documents which are then filtered out.

    I found CachingWrapperFilter useful to do this type of selection in the past where I needed to limit the set of documents that would be searched before the query is run.


    - Neal

  • Marco Dissel at May 27, 2011 at 7:47 am
    I think you're answering the wrong thread... This is a new question ;-)

Discussion Overview
group: lucene-net-user
categories: lucene
posted: May 27, '11 at 7:15a
active: May 27, '11 at 2:41p
posts: 8
users: 4
website: lucene.apache.org
