I am coding a *local pdf search engine* in Java.(If someone did it before,
could you please give some tips?) So I need query parse.
Assume I want to search for "hello user" in the document.
*Q1*. I have 4 kinds of queries in my program. They are:
1. Match Exact words or phases. e.g. "hello user", "hello users", etc. can
be the search results. Note: the words must be in right order.
2. Match Exact words or phases, but whole words only. e.g. only "hello
user" can be the search result. Note: the words must be in right order.
3. Match Any words. e.g. "hello", "user", "users", etc. can be the search
4. Match Any words or phases, but whole words only. e.g. only "hello",
"user", "hello user" can be the search results.
My question: what the query expressions are for these 4 kind that i can use
it to parse?
2nd expression is: "hello user".
4th expression is: "hello" OR "user".
Correct? And what about the other 2?
*Q2*. Then I use the following code to get context of the search result.
Note: the result must be ordered by relevance.
Query q = new QueryParser(Version.LUCENE_30,"field",
QueryScorer scorer = new QueryScorer(query);
Highlighter highlighter = new Highlighter(scorer);
highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer, 500));
And there are something wrong in line 2 for the 4 kinds of queries. How can
I modify the code to get the snippet of the content?