FAQ
Hi everyone,

My question : i have medias with a "title" field and a "caption"
field and a "keywords" field.

I want to be able to search in those 3 fields at the same time. For
example, if i search "black car" the boolean query looks like this
combination of termqueries:

(title=black or keywords=black or caption=black) and (title=car or
keywords= car or caption= car).

So if "black" is in caption and "car" is in title I must find the media.

I'm afraid that those boolean queries will be slow when there are a
lot of terms in the query.

I can, at index time add a "fulltext" field to each media that will
contains title, caption and keywords concatenated.

the query becomes : (fulltext=black and fulltext=car), much simpler.

But i must still be able to search only in title or only in caption
or only in keywords, so I must still add the other fields, doubling
the indexed terms.

Has someone done a similar thing? Is it worth it, or will the First
boolean query remain fast enough?

Thx,

Antoine




--
Antoine Baudoux
Development Manager
ab@taktik.be
Tél.: +32 2 333 58 44
GSM: +32 499 534 538
Fax.: +32 2 648 16 53

Search Discussions

  • Erick Erickson at Aug 21, 2007 at 8:48 pm
    A three field boolean query isn't very complex, so I don't
    think that's a problem. Although it does depend a bit upon
    how many terms you allow.

    But I'd try the simplest thing first, which would be to put
    all the terms in a fulltext field as well as in individual terms.

    Then get some performance measurements and some
    idea of what the total size of the index will be and make
    some decisions at that point.

    It's actually remarkably easy to switch from one
    of those solutions to the other if performance isn't what
    you need. In general, I've had better luck not worrying
    about space and going for simple code, *then* changing
    things around if there's a problem.

    Best
    Erick
    On 8/21/07, Antoine Baudoux wrote:

    Hi everyone,

    My question : i have medias with a "title" field and a "caption"
    field and a "keywords" field.

    I want to be able to search in those 3 fields at the same time. For
    example, if i search "black car" the boolean query looks like this
    combination of termqueries:

    (title=black or keywords=black or caption=black) and (title=car or
    keywords= car or caption= car).

    So if "black" is in caption and "car" is in title I must find the media.

    I'm afraid that those boolean queries will be slow when there are a
    lot of terms in the query.

    I can, at index time add a "fulltext" field to each media that will
    contains title, caption and keywords concatenated.

    the query becomes : (fulltext=black and fulltext=car), much simpler.

    But i must still be able to search only in title or only in caption
    or only in keywords, so I must still add the other fields, doubling
    the indexed terms.

    Has someone done a similar thing? Is it worth it, or will the First
    boolean query remain fast enough?

    Thx,

    Antoine




    --
    Antoine Baudoux
    Development Manager
    ab@taktik.be
    Tél.: +32 2 333 58 44
    GSM: +32 499 534 538
    Fax.: +32 2 648 16 53

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedAug 21, '07 at 8:39p
activeAug 21, '07 at 8:48p
posts2
users2
websitelucene.apache.org

2 users in discussion

Erick Erickson: 1 post Antoine Baudoux: 1 post

People

Translate

site design / logo © 2022 Grokbase