Hello,

I'm playing around with Xapian and I'm wondering whether it's possible to
retrieve the estimated number of documents returned by each database that is
part of a query.

For example--suppose that I have two databases, one that stores news items
and another one that stores articles. When a search is performed, by default
I want to return a set that contains matches from both dbs. However, I also
want to give the user an idea of how many of the matches come from each
database.

1. Is this possible without running the query again against either db?
2. As a side question, is there a significant performance hit in combining
multiple databases as opposed to using a single db? In that case, how could
I separate the different types of data to achieve the result I described
above?

Thanks much!


Marco

  • Olly Betts at Jun 13, 2005 at 5:55 pm

    On Sun, Jun 12, 2005 at 10:55:27PM -0400, Marco Tabini wrote:
    > I'm playing around with Xapian and I'm wondering whether it's possible to
    > retrieve the estimated number of documents returned by each database that is
    > part of a query.

    No. Currently statistics for each term are merged, then the estimates
    calculated.

    This is likely to change though. I'm planning to change to storing the
    first and last document id which each term indexes and use the query's
    structure to apply intersections, unions, etc to these ranges. This
    should improve the estimate statistics, but it is probably best done per
    database, and then summed.

    It would be pretty easy to make per-database statistics available then.
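
The range idea described above can be illustrated with a toy model. This is a sketch only, not Xapian's API or its implementation: the dictionaries, the tuple-based query format, and the names `bound` and `per_database_bounds` are all hypothetical. It computes an upper bound on the number of matches (rather than a true estimate) from each term's first and last document id, per database, and then sums the results.

```python
# Toy model, NOT Xapian code: each "database" maps
# term -> (first_docid, last_docid, termfreq). All names are hypothetical.

EMPTY = (1, 0, 0)  # first > last marks an empty docid range


def bound(db, query):
    """Return (first, last, max_matches) for a query in one database.

    query is ("TERM", t), ("AND", q1, q2) or ("OR", q1, q2).
    """
    op = query[0]
    if op == "TERM":
        return db.get(query[1], EMPTY)
    a = bound(db, query[1])
    b = bound(db, query[2])
    if op == "AND":
        # Matches must lie in the intersection of both docid ranges.
        first, last = max(a[0], b[0]), min(a[1], b[1])
        if first > last or a[2] == 0 or b[2] == 0:
            return EMPTY
        return (first, last, min(a[2], b[2], last - first + 1))
    if op == "OR":
        if a[2] == 0:
            return b
        if b[2] == 0:
            return a
        # Matches lie somewhere in the span covering both ranges.
        first, last = min(a[0], b[0]), max(a[1], b[1])
        return (first, last, min(a[2] + b[2], last - first + 1))
    raise ValueError(f"unknown operator: {op!r}")


def per_database_bounds(dbs, query):
    """Map each database name to its upper bound on matches."""
    return {name: bound(db, query)[2] for name, db in dbs.items()}


# Made-up statistics for two databases.
dbs = {
    "news": {"xapian": (1, 40, 12), "search": (30, 90, 25)},
    "articles": {"xapian": (5, 10, 4), "search": (50, 60, 3)},
}
query = ("AND", ("TERM", "xapian"), ("TERM", "search"))
counts = per_database_bounds(dbs, query)
print(counts, "total:", sum(counts.values()))
```

In the made-up data, the two terms' docid ranges do not overlap in the articles database, so the AND query cannot match anything there. Merged statistics would hide that, which is why working per database and then summing can tighten the figures.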

    > 1. Is this possible without running the query again against either db?

    No, although this probably won't be very expensive to do as most of the
    database blocks you'll need will be cached from the first query.
    Generally it's the I/O which takes the time (unless the database is
    small, in which case it's quick anyway!)
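
If a tally of the hits actually fetched is enough (this is not the estimate for the whole match set), the source database can be worked out from each document id. The sketch below assumes the interleaved numbering Xapian uses when databases are combined, where the sub-database index is `(docid - 1) % number_of_databases`. Check that assumption against the Xapian version in use. The function names and the sample document ids are hypothetical.

```python
from collections import Counter


def source_database(docid, n_databases):
    """0-based index of the sub-database a combined docid came from.

    Assumes interleaved docids: combined = (sub_docid - 1) * n + index + 1.
    """
    return (docid - 1) % n_databases


def tally_by_database(docids, names):
    """Count fetched hits per database; names are in the order added."""
    counts = Counter({name: 0 for name in names})
    for docid in docids:
        counts[names[source_database(docid, len(names))]] += 1
    return dict(counts)


# Made-up docids standing in for the hits on one page of results.
fetched = [1, 2, 3, 5, 8]
print(tally_by_database(fetched, ["news", "articles"]))
```

Because this only looks at ids already in hand, it costs no extra I/O, but it says nothing about matches beyond the ones retrieved.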

    > 2. As a side question, is there a significant performance hit in combining
    > multiple databases as opposed to using a single db?

    Shouldn't be much. The main hit will be that separate databases will
    usually be smaller, so need less I/O.

    Cheers,
    Olly

Discussion Overview
group: xapian-discuss @
categories: xapian
posted: Jun 13, '05 at 3:55a
active: Jun 13, '05 at 5:55p
posts: 2
users: 2
website: xapian.org
irc: #xapian

2 users in discussion

Olly Betts: 1 post Marco Tabini: 1 post

People

Translate

site design / logo © 2022 Grokbase