FAQ
Hi,

Has anyone built their own filter query before, in order to search
within search results? That is, after the first search the result is
cached, and the second search runs only against the results returned by the
first search rather than against the whole index again.

I am just wondering whether the result above can be achieved using Hits.

I tried using a boolean search, but the result wasn't what I
expected.

Thanks

regards,
Wooi Meng
--
View this message in context: http://www.nabble.com/Filter-query-method-tf2586547.html#a7211921
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Chris Hostetter at Nov 7, 2006 at 4:54 am
    : Is there anyone built your own filter query before, in order to perform
    : search within search results. Meaning after the first search, the result is
    : cached and the second search searches the result that return from the first
    : searched, and is not searching the whole index again.
    :
    : Just wondering by using Hits, can it achieved the result above?

    1) it's typically not necessary to worry about the caching of the results
    too much for the simple cases that Hits can be used in.

    2) you have to decide what semantics you are interested in ... is it a
    "search A within a result set B" where the score is only based on query A,
    or is it a "search on A and B" where A and B both make up the score?

    ...in the first case, you can use the search method that takes a Filter
    and some caching is possible, in the second case BooleanQuery is what you
    really want.
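
    To make the two semantics concrete, here is a minimal sketch in plain
    Java. It is a toy model and not Lucene code: the class name, the tiny
    in-memory "index" and the use of raw term frequency as a score are all
    assumptions made for illustration. The "java" query matches 3 documents
    and only one of those also contains "tomcat"; the two methods return the
    same document but score it differently.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model, not Lucene: each "document" is just an array of words.
class SearchWithinModel {
    static final String[][] DOCS = {
        {"java", "java", "tomcat"},   // doc 0
        {"java", "servlet"},          // doc 1
        {"tomcat", "tomcat"},         // doc 2
        {"java", "java", "java"}      // doc 3
    };

    // Term frequency of a word in one document; stands in for a real score.
    static int tf(int doc, String term) {
        int n = 0;
        for (String w : DOCS[doc]) {
            if (w.equals(term)) n++;
        }
        return n;
    }

    // One bit per document that contains the term (what a filter boils down to).
    static BitSet matching(String term) {
        BitSet bits = new BitSet(DOCS.length);
        for (int d = 0; d < DOCS.length; d++) {
            if (tf(d, term) > 0) bits.set(d);
        }
        return bits;
    }

    // "Search A within result set B": B only restricts, the score comes from A alone.
    static Map<Integer, Integer> filtered(String a, String b) {
        BitSet allowed = matching(b);
        Map<Integer, Integer> scores = new LinkedHashMap<>();
        for (int d = allowed.nextSetBit(0); d >= 0; d = allowed.nextSetBit(d + 1)) {
            if (tf(d, a) > 0) scores.put(d, tf(d, a));
        }
        return scores;
    }

    // "Search on A and B": both must match and both contribute to the score.
    static Map<Integer, Integer> both(String a, String b) {
        Map<Integer, Integer> scores = new LinkedHashMap<>();
        for (int d = 0; d < DOCS.length; d++) {
            if (tf(d, a) > 0 && tf(d, b) > 0) scores.put(d, tf(d, a) + tf(d, b));
        }
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(filtered("tomcat", "java")); // prints {0=1}
        System.out.println(both("tomcat", "java"));     // prints {0=3}
    }
}
```

    In both cases the hit is the same document; only the score differs,
    which is the decision that has to be made up front.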


    -Hoss


  • Spinergywmy at Nov 7, 2006 at 5:44 am
    Hi Chris,

    My scenario is:

    I key in the first search value in the text box, and the first
    search result is returned. Next, I clear the first search value and
    key in the second search value in the same text box. The second search
    value should search the first result. For instance, if my first search
    found 3 records, the second search should return 1 record.
    That is, the first search runs against the index folder, while the second
    search should not search the index directory again.

    I hope the scenario above makes sense to you.

    Below is the code that I have written; I hope you can point out which part
    I have done wrong. Thanks

    reader = IndexReader.open(DsConstant.indexDir);
    Searcher searcher = new IndexSearcher(reader);
    Analyzer analyzer = new StandardAnalyzer();
    QueryParser parser = new QueryParser(DsConstant.idxFileContent, analyzer);
    Query query = parser.parse(searchString);

    BooleanQuery bq = new BooleanQuery();
    query = new FilteredQuery(new MatchAllDocsQuery(), new SingleDocTestFilter(0));
    bq.add(query, BooleanClause.Occur.MUST);
    query = new FilteredQuery(new MatchAllDocsQuery(), new SingleDocTestFilter(0));
    bq.add(query, BooleanClause.Occur.MUST);

    searchHits = searcher.search(query);

    Unfortunately, the code above doesn't return any results. I hope to hear
    from you again.


    regards,
    Wooi Meng
  • Spinergywmy at Nov 7, 2006 at 9:06 am
    Hi Chris,

    Another thing I would like to emphasize:

    For example, my first query is "Java" and it returns 3 records.
    My second query is "Tomcat", and within the first search result only one
    record contains the word "Tomcat". That one record is what I intend to get back.

    Is it possible for Lucene to do this?

    Thanks

    regards,
    Wooi Meng
  • Chris Hostetter at Nov 7, 2006 at 7:31 pm
    : for example, if my first query is "Java" and it returned 3 records.
    : For my 2nd query is "Tomcat", and within my first search, there is only one
    : record contain the word "Tomcat", this will be my intention to do so.
    :
    : Is it possible for lucene to do so?

    anything is possible .. but as i said, you have to first decide whether you
    want the scores of your final query to only be based on the tf/idf of
    the word "Tomcat" (in which case you want to build a Filter out of the
    query on "Java") or if you want the scores to be based on the tf/idf of
    both words (in which case you should build a big boolean query)

    I also have no idea what SingleDocTestFilter is in the code you posted, so
    i really can't guess why your code doesn't work for you... what is it
    supposed to do?




    -Hoss


  • Spinergywmy at Nov 8, 2006 at 12:56 am
    Hi Chris,

    In my case, I want the scores of the final query to be based on the second
    query, "Tomcat". Is there any example that I can refer to?

    Thanks.

    regards,
    Wooi Meng
  • Spinergywmy at Nov 8, 2006 at 1:05 am
    Hi Chris,

    I have made some modifications to the code that I posted yesterday.

    reader = IndexReader.open(DsConstant.indexDir);
    Searcher searcher = new IndexSearcher(reader);
    Analyzer analyzer = new StandardAnalyzer();
    QueryParser parser = new QueryParser(DsConstant.idxFileContent, analyzer);

    Query query1 = parser.parse(searchString1);
    Query query2 = parser.parse(searchString2);

    filter = new Filter() {
        public BitSet bits(IndexReader reader) {
            System.out.println("inside bit set");

            BitSet bits = new BitSet(reader.maxDoc());
            System.out.println("bits size is ::: " + bits.size());
            for (int i = start; (i < end && i < bits.size()); i++) {
                bits.set(i);
            }

            return bits;
        }
    };

    BooleanQuery bq = new BooleanQuery();
    query1 = new FilteredQuery(new MatchAllDocsQuery(), filter);
    bq.add(query1, BooleanClause.Occur.MUST);
    bq.add(query2, BooleanClause.Occur.MUST);

    searchHits = searcher.search(bq);

    Can you please point out which part I have done wrong? Thanks.


    regards,
    Wooi Meng
  • Doron Cohen at Nov 8, 2006 at 7:18 am
    Hi Wooi Meng,

    I don't understand this code.
    In particular, searchString1 has no effect, and it is not clear what start
    and end are.

    spinergywmy <spinergywmy@gmail.com> wrote on 07/11/2006 17:04:45:

    : Query query1 = parser.parse(searchString1);
    : Query query2 = parser.parse(searchString2);
    :
    : filter = new Filter() {
    :     public BitSet bits(IndexReader reader) {
    :         BitSet bits = new BitSet(reader.maxDoc());
    :         for (int i = start; (i < end && i < bits.size()); i++)
    :             bits.set(i);
    :         return bits;
    :     }
    : };
    :
    : BooleanQuery bq = new BooleanQuery();
    : query1 = new FilteredQuery(new MatchAllDocsQuery(), filter);
    : bq.add(query1, BooleanClause.Occur.MUST);
    : bq.add(query2, BooleanClause.Occur.MUST);
    :
    : searchHits = searcher.search(bq);

    Anyhow, if you do not cache the results of the first search, two simple
    options are:

    (1) score by both str1 and str2:

    Query query1 = parser.parse(searchString1);
    Query query2 = parser.parse(searchString2);
    BooleanQuery q = new BooleanQuery();
    q.add(query1, BooleanClause.Occur.MUST);
    q.add(query2, BooleanClause.Occur.MUST);
    Hits searchHits = searcher.search(q);

    (2) score only by str2, filter by str1:

    Query query1 = parser.parse(searchString1);
    Filter f1 = new QueryFilter(query1);
    Query q2 = parser.parse(searchString2);
    Hits searchHits = searcher.search(q2,f1);

    I would start with this, and go for caching only if performance issues
    justify that.
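
    If profiling later shows that recomputing the first query is a real
    cost, one possible shape for such a cache is sketched below in plain
    Java. Everything here is an assumption for illustration:
    FilterBitsCache is a hypothetical helper and not a Lucene class, and the
    function passed to it stands in for whatever actually runs the first
    query and records the matching document ids.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not a Lucene class: remembers the doc-id bits of earlier queries.
class FilterBitsCache {
    private final Map<String, BitSet> cache = new HashMap<>();
    private final Function<String, BitSet> compute;
    private int computations = 0;

    // 'compute' stands in for running the query against the index.
    FilterBitsCache(Function<String, BitSet> compute) {
        this.compute = compute;
    }

    // Returns a copy so callers can modify it without corrupting the cached bits.
    BitSet bitsFor(String query) {
        BitSet cached = cache.get(query);
        if (cached == null) {
            cached = compute.apply(query);
            computations++;
            cache.put(query, cached);
        }
        return (BitSet) cached.clone();
    }

    // Call when the index changes, since old doc ids may no longer line up.
    void clear() {
        cache.clear();
    }

    // How many times the underlying query was actually run.
    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        FilterBitsCache cache = new FilterBitsCache(q -> {
            BitSet bits = new BitSet();
            bits.set(0);
            bits.set(1);
            bits.set(3);
            return bits;
        });
        cache.bitsFor("java");
        cache.bitsFor("java");
        System.out.println(cache.computations()); // prints 1
    }
}
```

    The cached bits are only valid for the index state they were computed
    against, which is why the sketch has an explicit clear().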

    Doron


  • Spinergywmy at Nov 8, 2006 at 9:56 am
    Hi Doron,

    Thanks for the suggestion.

    However, the solution that you suggested only comes close to what I really want.

    The scenario goes like this:

    When I key in the first search query, for instance "Java", the
    result has 3 records. Next, I perform the second search, and the query
    this time is "Tomcat".

    Within my first search result there is only one record that contains
    both "Java" and "Tomcat", so only one record should be returned
    for the second search. And the highlight should now move from "Java" to "Tomcat".

    Basically, my second search query should search within the records of the
    first result rather than go back to the index folder and search the
    contents again.

    I hope I have explained what I need to do. Is there any solution for this?
    Hope to hear from you soon.

    regards,
    Wooi Meng
  • Doron Cohen at Nov 8, 2006 at 7:03 pm

    spinergywmy wrote on 08/11/2006 01:56:00:

    : within my first search result, there is only one record that contains
    : "Java" and "Tomcat" words, therefore, there should be only one record return
    : for 2nd search. And the highlight is now move from "Java" to "Tomcat".

    To my understanding the 2nd suggestion does exactly this, as long as the
    highlighter QueryScorer is based on q2:

    : (2) score only by str2, filter by str1:
    : Query query1 = parser.parse(searchString1);
    : Filter f1 = new QueryFilter(query1);
    : Query q2 = parser.parse(searchString2);
    : Hits searchHits = searcher.search(q2,f1);

    : basically, my second search query should be searching within the first
    : result records rather than go back to the index folder to redo the search
    : contents again.

    Going back to the server (for q1) might take more time, but it is simple.
    Task-wise, it does not block you from programming any behavior required
    (e.g. the various options that Chris mentioned). You would be able to get
    the same results - returned docs, their scores, highlighting - as when not
    going back to the server (for q1). The only disadvantage might be
    performance. Did you get that working but see performance problems?



  • Spinergywmy at Nov 9, 2006 at 12:49 am
    Hi,

    Thanks, I will try it again.

    regards,
    Wooi Meng
  • Spinergywmy at Nov 10, 2006 at 9:48 am
    Hi Doron,

    I am not sure I have implemented your suggestion correctly.

    I have 2 separate methods, controlled by a check box.
    The first time, I use the basic search method, which looks up the
    index from the directory. After I get the result, I tick the checkbox,
    and that leads to the second method, which again opens a reader on the index
    directory and searches with a searcher based on that reader. In that method
    I only implemented the code that you suggested. So I am not sure whether I
    am doing the right thing.

    I have attached the work that I have done below:

    public String search(String searchString) throws IOException, Exception
    {
        StringBuffer buff = new StringBuffer();
        String field = "";

        try
        {
            reader = IndexReader.open(DsConstant.indexDir);
            Searcher searcher = new IndexSearcher(reader);
            Analyzer analyzer = new StandardAnalyzer();
            QueryParser parser = new QueryParser(DsConstant.idxFileContent, analyzer);
            //Query query = parser.parse(searchString);

            String[] metaFields = field.split(":");
            metaFields = new String[]{"contents", "companyId", "ownerId",
                "createdBy", "docName", "docDesc", "keywords", "docId", "docPropName"};

            for(int i = 0; i < metaFields.length; i++)
            {
                buff.append(metaFields[i] + ":" + searchString);
                if(i != (metaFields.length-1))
                {
                    buff.append(" OR ");
                }
            }

            Query query = parser.parse(buff.toString());
            query = query.rewrite(reader);

            System.out.println("query ::: " + query);

            searchHits = searcher.search(query);

            System.out.println("search hits is ::: " +searchHits.length());

            if(searchHits.length() > 0)
            {
                QueryScorer scorer = new QueryScorer(query);
                Highlighter highlighter = new Highlighter(
                    new SimpleHTMLFormatter("<span style='background-color:yellow; font-weight:bold;'>", "</span>"),
                    scorer);

                for(int i = 0; i < searchHits.length(); i++)
                {
                    Document doc = searchHits.doc(i);
                    String text = doc.get(DsConstant.idxFileContent);
                    TokenStream tokenstream =
                        analyzer.tokenStream(DsConstant.idxFileContent, new StringReader(text));
                    //buff.append("<p> '" + DsConstant.userDir
                    buff.append("!");
                    buff.append("<p " + searchHits.doc(i).get(DsConstant.idxPath) + " "
                        + searchHits.doc(i).get("docName") + " <br>");
                    buff.append("score: " + searchHits.score(i) + "<br>");
                    buff.append(highlighter.getBestFragments(tokenstream, text, 3, "...") + "</p>");
                }

                //System.out.println("Folder path is ::: " +DsConstant.folderPath);

                searcher.close();
            }

            System.out.println("Found "+searchHits.length()+" searchHits with query = "+query);
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }

        return buff.toString();
        //return resultList;
    }

    public String refindSearchResult(String searchString1, String searchString2)
        throws IOException, Exception
    {
        System.out.println("inside search util - refind search result");

        IndexReader reader = null;
        StringBuffer buff = new StringBuffer();

        try
        {
            reader = IndexReader.open(DsConstant.indexDir);
            Searcher searcher = new IndexSearcher(reader);
            Analyzer analyzer = new StandardAnalyzer();
            QueryParser parser = new QueryParser(DsConstant.idxFileContent, analyzer);

            Query query1 = parser.parse(searchString1);
            filter = new QueryFilter(query1);
            Query query2 = parser.parse(searchString2);

            searchHits = searcher.search(query2, filter);

            System.out.println("search hits is ::: " +searchHits.length());

            if(searchHits.length() > 0)
            {
                QueryScorer scorer = new QueryScorer(query2);
                Highlighter highlighter = new Highlighter(
                    new SimpleHTMLFormatter("<span style='background-color:yellow; font-weight:bold;'>", "</span>"),
                    scorer);

                for(int i = 0; i < searchHits.length(); i++)
                {
                    Document doc = searchHits.doc(i);
                    String text = doc.get(DsConstant.idxFileContent);
                    TokenStream tokenstream =
                        analyzer.tokenStream(DsConstant.idxFileContent, new StringReader(text));
                    //buff.append("<p> '" + DsConstant.userDir
                    buff.append("<p " + searchHits.doc(i).get(DsConstant.idxPath) + " "
                        + searchHits.doc(i).get("docName") + " <br>");
                    buff.append("score: " + searchHits.score(i) + "<br>");
                    buff.append(highlighter.getBestFragments(tokenstream, text, 3, "...") + "</p>");
                    buff.append("!").substring(i);
                }

                searcher.close();
            }

            System.out.println("Found "+searchHits.length()+" searchHits with query = "+query2);
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }

        return buff.toString();
    }

    Please point out where I have gone wrong and, hopefully, the right solution
    that I can use.

    Thank you.

    regards,
    Wooi Meng
  • Doron Cohen at Nov 10, 2006 at 9:03 pm
    You did not specify what's wrong - in what way is the code below not
    working as you expect?

    Two things to check:

    (1) search() and refindSearchResult() process the text of the first query
    differently. In search() the text is added to multiple fields
    ("metaField"). The way it is done btw would not work well if searchString
    is multi-word. You should take a look at the printout for the result query
    to see if this is what you expect. In refindSearchResult() the metaFields
    logic is not done for q1. You should print q1 and print the filter to see
    they match what you expect, and also compare to the query as printed in
    search().

    (2) reuse searcher for better performance - currently in every call to
    search() and to refindSearchResult() a new searcher is opened. This would
    hurt performance. Better to reuse a searcher between queries, and, from
    time to time, reopen it to refresh with recent index changes. Many
    discussions on this in the mailing list.
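
    On point (1), the multi-word problem comes from Lucene's query syntax: in
    a string like contents:java tomcat the field prefix applies only to the
    first term, and the remaining terms go to the parser's default field.
    Grouping the terms as contents:(java tomcat) keeps them in the named
    field. The sketch below shows only the string building, in plain Java;
    the class and method names are made up for illustration, and the user
    text is assumed to contain no query-syntax characters that would need
    escaping.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration; only builds the query text, does not parse it.
class MultiFieldQueryText {

    // Wraps the user text in parentheses per field so every word is bound to that field.
    static String build(String[] fields, String userText) {
        StringJoiner joined = new StringJoiner(" OR ");
        for (String field : fields) {
            joined.add(field + ":(" + userText + ")");
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"contents", "docName"};
        // prints contents:(java tomcat) OR docName:(java tomcat)
        System.out.println(build(fields, "java tomcat"));
    }
}
```

    Using the same helper for the first query in both search() and
    refindSearchResult() would also keep the query and the filter consistent
    with each other.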

    Hope this helps,
    Doron

  • Spinergywmy at Nov 13, 2006 at 1:59 am
    Hi Doron,

    How do I reuse the searcher? Is there an example of how to do this?

    Thanks

    regards,
    Wooi Meng
  • Erick Erickson at Nov 13, 2006 at 3:08 am
    Just don't close it. Think of this as a singleton pattern.....
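
    A minimal sketch of that idea in plain Java, with loud caveats: the
    holder class below is hypothetical and generic, so it compiles without
    Lucene on the classpath. In the code from this thread, T would be the
    IndexSearcher and the opener would be whatever creates it from the index
    directory. Closing the old instance after a reopen is left to the caller.

```java
import java.util.function.Supplier;

// Hypothetical generic holder; with Lucene, T would be the shared IndexSearcher.
class SharedSearcherHolder<T> {
    private final Supplier<T> opener;
    private T current;

    // 'opener' is whatever creates a fresh searcher over the index.
    SharedSearcherHolder(Supplier<T> opener) {
        this.opener = opener;
    }

    // Opens on first use, then hands the same instance to every caller.
    synchronized T get() {
        if (current == null) {
            current = opener.get();
        }
        return current;
    }

    // Swaps in a fresh instance, e.g. after the index has changed.
    // Returns the previous one so the caller can close it once it is no longer in use.
    synchronized T reopen() {
        T old = current;
        current = opener.get();
        return old;
    }

    public static void main(String[] args) {
        SharedSearcherHolder<Object> holder = new SharedSearcherHolder<>(Object::new);
        System.out.println(holder.get() == holder.get()); // prints true
    }
}
```

    search() and refindSearchResult() would then both call get() instead of
    opening and closing their own searcher on every request.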

    Erick

Discussion Overview
group: java-user @
categories: lucene
posted: Nov 7, '06 at 3:17a
active: Nov 13, '06 at 3:08a
posts: 15
users: 4
website: lucene.apache.org
