Document category identification in query
Dec 16, 2009 at 1:48 am
Can anybody help me or maybe point me to relevant resources I could learn
: Hi, I am trying to expand user queries to figure out potential document categories implied in the query. I would like to know the best way to find the document category that is most relevant to the query. Let me explain further: I have created categories that are applied to the documents I want to index. Some example categories are: Hotel, Restaurant, Fast Food, Chinese Restaurant, Church, Bank, Gas station. I am also trying to create category aliases such as Chinese food can also be named
: I think you can do this with algorithms similar to search suggestion. First, you should categorize the search log, e.g. Thai Restaurant, Chinese Restaurant, or KFC should each be assigned categories including Restaurant. While the user is typing, find the keyword in the search log that is nearest to the input and take that keyword's categories as the categories of the user input. BTW, I do not understand why you need to know the category of the user input. -- Weiwei Wang Alex Wang 王巍巍 Room 403, Mengmin Wei Building
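The nearest-keyword lookup described above can be sketched roughly as follows. This is only an illustration, not the poster's implementation: the `SEARCH_LOG` contents, the `categories_for` helper, and the 0.6 similarity cutoff are all assumptions, and string similarity from Python's `difflib` stands in for whatever "nearest" measure a real system would use.

```python
from difflib import get_close_matches

# Hypothetical categorized search log: past query -> assigned categories.
SEARCH_LOG = {
    "thai restaurant": {"Restaurant"},
    "chinese restaurant": {"Restaurant", "Chinese Restaurant"},
    "kfc": {"Restaurant", "Fast Food"},
    "gas station": {"Gas station"},
}


def categories_for(user_input, cutoff=0.6):
    """Return the categories of the logged query closest to user_input.

    Returns an empty set when no logged query is similar enough.
    """
    normalized = user_input.lower().strip()
    # get_close_matches ranks candidates by difflib's similarity ratio.
    matches = get_close_matches(normalized, list(SEARCH_LOG), n=1, cutoff=cutoff)
    if not matches:
        return set()
    return SEARCH_LOG[matches[0]]


if __name__ == "__main__":
    print(categories_for("chinese restaurnt"))
    print(categories_for("KFC"))
```

A production setup would more likely do this lookup inside the search index itself (for example with n-gram or prefix matching) rather than scanning the log in memory.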
: Query classification is an interesting question and many papers have discussed it. For more information, you could refer to these papers: "A taxonomy of web search", "Understanding user goals in web search", "Our winning solution to query classification in KDDCUP 2005". For your question, I think you can do this in two steps. First, use the query to retrieve documents, then use the category information of the retrieved documents to classify the query with an algorithm such as KNN. Second, use the query
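The first step suggested above (classify the query from the categories of the documents it retrieves) could look something like this minimal sketch. It assumes the search engine has already returned ranked hits as `(score, categories)` pairs; the `classify_query` name, the score-weighted vote, and the sample hits are illustrative assumptions rather than anything specified in the thread.

```python
from collections import Counter


def classify_query(retrieved, k=5):
    """Rank categories by a score-weighted vote over the top-k hits.

    retrieved: list of (score, categories) pairs, best hit first.
    Returns a list of (category, total_score), highest total first.
    """
    votes = Counter()
    for score, categories in retrieved[:k]:
        for category in categories:
            # Each neighbour votes for its categories, weighted by its score.
            votes[category] += score
    return votes.most_common()


if __name__ == "__main__":
    # Hypothetical hits for some query, as a search engine might return them.
    hits = [
        (2.0, ["Restaurant", "Chinese Restaurant"]),
        (1.5, ["Restaurant"]),
        (0.5, ["Hotel"]),
    ]
    print(classify_query(hits, k=3))
```

Weighting votes by retrieval score is one design choice among several; a plain count of categories among the top-k hits is the simpler KNN variant.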
: Hi! Many thanks to both of you for your suggestions and answers! What Weiwei Wang suggests is part of the solution I intend to implement. I will definitely use the suggest-as-you-type approach in the query form, as it will allow for pre-emptive disambiguation and, I believe, will give very satisfying results. However, search users are wild beasts and I can't count on them to always use the given suggestions. All I can count on is very erratic, sparse and ambiguous queries :) So I need an
Dec 15, '09 at 4:24a
Dec 21, '09 at 1:30p
3 users in discussion
Fei liu (2)
Weiwei Wang (2)