FastVectorHighlighter.Net returning null on GetBestFragment
Hi

I have a large index on which Highlighter.Net works fine, but
FastVectorHighlighter returns null as the best fragment for some documents.

The searcher works fine; it is just the highlighter. The field has been
indexed in the same manner for all documents, so I fail to understand why it
highlights some documents but not all.

Using FastVectorHighlighter.Net 2.9.2, built from trunk rev942061

  • Digy at May 9, 2010 at 7:18 pm
    With so little info, it is very hard to guess the reason.

    For example,
    doc.Add( new Field("f1", "abc def", Field.Store.NO, Field.Index.ANALYZED,
    Field.TermVector.NO) );
    doc.Add( new Field("f2", "ghi jkl", Field.Store.YES, Field.Index.ANALYZED,
    Field.TermVector.WITH_POSITIONS_OFFSETS) ); //<-- Field used in highlighting

    with a query something like [f1:abc OR f2:abc], you would get a hit but no
    highlight.

    DIGY

  • Midhat Ali at May 10, 2010 at 5:38 am
    Okay, here is some more info.

    Declaration:
    FastVectorHighlighter fvHighlighter = new FastVectorHighlighter(true, true);
    Usage:
    contentString = fvHighlighter.GetBestFragment(fvHighlighter.GetFieldQuery(query),
        searcher.GetIndexReader(), i, "FullContent", 20);

    Query:
    +FullContent:"company has" BranchName:"company has"

    Two example documents that return null:
    stored/compressed,indexed,tokenized,termVector,termVectorOffsets,termVectorPosition<BranchName:ITEM
    9. CHANGES IN AND DISAGREEMENTS WITH ACCOUNTANTS ON ACCOUNTING AND FINANCIAL
    DISCLOSURE>
    stored/compressed,indexed,tokenized,termVector,termVectorOffsets,termVectorPosition<FullContent:
    The Company has nothing to report for this item. >

    stored/compressed,indexed,tokenized,termVector,termVectorOffsets,termVectorPosition<BranchName:
    At the present time, the Company has not issued preferred stock. >
    stored/compressed,indexed,tokenized,termVector,termVectorOffsets,termVectorPosition<FullContent:
    The Company has not declared or paid dividends on its common stock to date,
    and does not anticipate paying dividends in the foreseeable future. Any
    dividends declared would be subject to the prior rights of holders of any
    outstanding cumulative preferred stock. Certain Company loan documents
    restrict the payment of dividends. >

    One example document that returns a highlighted value:
    stored/compressed,indexed,tokenized,termVector,termVectorOffsets,termVectorPosition<BranchName:
    Purchase power contracts bought-out: >
    stored/compressed,indexed,tokenized,termVector,termVectorOffsets,termVectorPosition<FullContent:
    In conjunction with divestiture, the Company has made lump sum payments to
    effectively terminate a number of purchase power contracts. These payments
    are recorded as regulatory assets and are amortized as they are recovered
    from customers. >


    I hope you can now help me understand what's happening. Thanks.
  • Digy at May 10, 2010 at 5:45 pm
    Below is what I can create from your messy data. It works as expected and
    returns "The <b>Company has</b> noth".

    DIGY

    Lucene.Net.Store.RAMDirectory DIR = new Lucene.Net.Store.RAMDirectory();

    IndexWriter wr = new IndexWriter(DIR, new StandardAnalyzer());
    Document doc = new Document();
    doc.Add(new Field("BranchName",
        "ITEM 9. CHANGES IN AND DISAGREEMENTS WITH ACCOUNTANTS ON ACCOUNTING AND FINANCIAL DISCLOSURE",
        Field.Store.COMPRESS, Field.Index.ANALYZED,
        Field.TermVector.WITH_POSITIONS_OFFSETS));
    doc.Add(new Field("FullContent",
        "The Company has nothing to report for this item.",
        Field.Store.COMPRESS, Field.Index.ANALYZED,
        Field.TermVector.WITH_POSITIONS_OFFSETS));
    wr.AddDocument(doc);
    wr.Close();

    IndexSearcher searcher = new IndexSearcher(DIR);
    QueryParser qp = new QueryParser("FullContent", new StandardAnalyzer());
    Query query = qp.Parse("+FullContent:\"company has\" BranchName:\"company has\"");

    TopDocs tdocs = searcher.Search(query, 10);

    FastVectorHighlighter fvHighlighter = new FastVectorHighlighter(true, true);
    for (int i = 0; i < tdocs.scoreDocs.Length; i++)
    {
        string bestfragment = fvHighlighter.GetBestFragment(
            fvHighlighter.GetFieldQuery(query),
            searcher.GetIndexReader(), tdocs.scoreDocs[i].doc, "FullContent", 20);
        MessageBox.Show(bestfragment);
    }
  • Digy at May 10, 2010 at 5:57 pm
    One more thing:

    What is "i"? Is it a loop index or a docID?

    fvHighlighter.GetBestFragment(fvHighlighter.GetFieldQuery(query),
        searcher.GetIndexReader(), i, "FullContent", 20);

    DIGY
  • Midhat Ali at May 11, 2010 at 5:22 am
    Thanks DIGY

    That was the problem. i was the document's sequence number among the hits
    found; it should have been the document number in the Lucene index. Problem
    solved.

    I think the NDoc documentation should be clearer. I will revise it once I
    have a good enough understanding of the library.

    -Midhat
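
A minimal sketch of the fix described in this thread, assuming the Lucene.Net 2.9 API calls already shown in the posts: the loop counter is only a position in the hit list, and the document number handed to GetBestFragment is read from scoreDocs[i].doc. The class and method names (HitHighlighter, HighlightHits) are hypothetical, and the namespace Lucene.Net.Search.Vectorhighlight is an assumption about the contrib build; adjust it if yours differs.

```csharp
using Lucene.Net.Search;
using Lucene.Net.Search.Vectorhighlight; // assumed contrib namespace

// Hypothetical helper, not part of Lucene.Net.
public static class HitHighlighter
{
    public static string[] HighlightHits(IndexSearcher searcher, Query query,
        FastVectorHighlighter highlighter, string field, int maxHits)
    {
        TopDocs hits = searcher.Search(query, maxHits);
        FieldQuery fieldQuery = highlighter.GetFieldQuery(query);
        string[] fragments = new string[hits.scoreDocs.Length];

        for (int i = 0; i < hits.scoreDocs.Length; i++)
        {
            // i is only the position of the hit in the result list;
            // the index-wide document number is scoreDocs[i].doc.
            int docId = hits.scoreDocs[i].doc;
            fragments[i] = highlighter.GetBestFragment(
                fieldQuery, searcher.GetIndexReader(), docId, field, 20);
        }
        return fragments;
    }
}
```

Passing i itself asks the highlighter about whichever document happens to have number i in the index, which need not match the query at all. That would explain a null fragment for some hits and a highlighted one for others.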


Discussion Overview
group: lucene-net-user
categories: lucene
posted: May 7, '10 at 6:03p
active: May 11, '10 at 5:22a
posts: 6
users: 2 (Midhat Ali: 3 posts, Digy: 3 posts)
website: lucene.apache.org
