I want to build a list of the terms in all documents, along with their frequency data.
It seems the information I need is in the "tis" and "tii" files. However, I haven't found a way to handle them so far.

How can I get the term frequency data?

Thanks,
Serkan


  • Bernhard Messer on Aug 24, 2004 at 3:11 pm
    Serkan,

    It's easier to use the IndexReader class to get the information you need.
    If you just need the document frequency of each term, you could use the following sample:

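    // Imports needed at the top of the file.
    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermEnum;

    // Use the same path for the existence check and for opening the index.
    String indexPath = "/tmp/index";
    IndexReader ir = null;
    try {
        if (!IndexReader.indexExists(indexPath))
            return;
        ir = IndexReader.open(indexPath);
        // Walk every term in the index.
        TermEnum termEnum = ir.terms();
        while (termEnum.next()) {
            Term t = termEnum.term();
            // docFreq(t) is the number of documents that contain the term.
            System.out.println(t.text() + " --> " + ir.docFreq(t));
        }
        termEnum.close();
    }
    catch (IOException e) {
        System.out.println(e.toString());
    }
    finally {
        if (ir != null) {
            try {
                ir.close();
            } catch (IOException e) {
                System.err.println("IOException, opened IndexReader can't be closed: " + e.toString());
            }
        }
    }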

    hope this helps,
    Bernhard
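
A note on terminology, since the question asks about term frequency while the sample reports document frequency: `docFreq` counts how many documents contain a term, whereas term frequency counts how often a term occurs inside one document. The sketch below illustrates the difference in plain Java, without Lucene. The `TermStats` class, its methods, and the whitespace tokenizer are hypothetical stand-ins for illustration only, not Lucene APIs. In Lucene releases of that period, per-document counts are available through `IndexReader.termDocs(Term)` and `TermDocs.freq()`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper (not part of Lucene): term statistics over in-memory documents.
class TermStats {

    // Lower-case and split on whitespace; a simple stand-in for a real analyzer.
    static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\s+");
    }

    // Document frequency: the number of documents that contain each term.
    static Map<String, Integer> docFreq(List<String> docs) {
        Map<String, Integer> df = new TreeMap<>();
        for (String doc : docs) {
            Set<String> seen = new HashSet<>();
            for (String token : tokenize(doc)) {
                if (token.isEmpty()) {
                    continue;
                }
                // Count each term at most once per document.
                if (seen.add(token)) {
                    df.merge(token, 1, Integer::sum);
                }
            }
        }
        return df;
    }

    // Term frequency: the number of occurrences of each term within one document.
    static Map<String, Integer> termFreq(String doc) {
        Map<String, Integer> tf = new TreeMap<>();
        for (String token : tokenize(doc)) {
            if (token.isEmpty()) {
                continue;
            }
            tf.merge(token, 1, Integer::sum);
        }
        return tf;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "lucene index lucene",
                "index reader",
                "lucene term");
        System.out.println("doc frequencies: " + docFreq(docs));
        System.out.println("term frequencies in doc 0: " + termFreq(docs.get(0)));
    }
}
```

With the three sample documents in `main`, "lucene" has a document frequency of 2 (it appears in two documents) and a term frequency of 2 in the first document (it occurs twice there).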



Discussion Overview
group: java-user
category: lucene
posted: Aug 24, '04 at 10:10a
active: Aug 24, '04 at 3:11p
posts: 2
users: 2
website: lucene.apache.org

2 users in discussion

Serkan Oktar: 1 post Bernhard Messer: 1 post
