I am implementing a search function for address by hibernate search which is based on lucene. The class definition as following:
public class Address implements Cloneable
private int id;
private String addrCountry;
private String addrDesc;
private String addrLineOne;
private String addrLineTwo;
private String addrCity;
As you see, addrCountry, addrLineone and addrCity are fields for search. I am using default analyzer in index & search. So I think country name like United States would be indexed as two terms United, and states.
In addition, during search, a search keyword like United states, or Salt lake city would be tokenized as two or three single words.
As result, any address fields contain united, city would be returned. like United Kingdom, but actually I want to get a result of united states.
My expected result as following:
if someone searches for "united" it should return "united states" and "united kingdom".
if someone searches for "united states" it should return "united states", and not "united kingdom".
I hope the analyzer can generate term with multiple words. say, united states to united states. I think standardanalyzer would analyze united states to united and states?
A different example: if search keyword is parking lot in Salt Lake City, the generated terms to search need to be: parking lot and Salt Lake City, not parking,lot,salt,lake and city.
I wonder if any analyzer can help me to implement my requirement. It would be better to use dictionary based solution, then I can manage some search terms that could have multiple words.