The following code throws a StringIndexOutOfBoundsException when multiple
matched terms need to be highlighted:

private String makeFragment( WeightedFragInfo fragInfo, String src, int s,
    String[] preTags, String[] postTags, Encoder encoder ){
  StringBuilder fragment = new StringBuilder();
  int srcIndex = 0;
  for( SubInfo subInfo : fragInfo.subInfos ){
    for( Toffs to : subInfo.termsOffsets ){
      fragment
        .append( encoder.encodeText( src.substring( srcIndex, to.startOffset - s ) ) )
        .append( getPreTag( preTags, subInfo.seqnum ) )
        .append( encoder.encodeText( src.substring( to.startOffset - s, to.endOffset - s ) ) )
        .append( getPostTag( postTags, subInfo.seqnum ) );
      srcIndex = to.endOffset - s;
    }
  }
  fragment.append( encoder.encodeText( src.substring( srcIndex ) ) );
  return fragment.toString();
}

--
Weiwei Wang (王巍巍)
Cell: 18911288489
MSN: ww.wang.cs@gmail.com
Blog: http://whisper.eyesay.org
Weibo: http://t.sina.com/lolorosa


  • Steven A Rowe at May 23, 2011 at 5:35 am
    Hi WeiWei,

    Thanks for the report.

    Can you provide a self-contained unit test that triggers the bug?

    Thanks,
    Steve
    -----Original Message-----
    From: Weiwei Wang
    Sent: Monday, May 23, 2011 1:25 AM
    To: java-user@lucene.apache.org
    Subject: FastVectorHighlighter StringIndexOutofBounds bug

  • Weiwei Wang at May 23, 2011 at 5:37 am
    1. Source string: 777777777
    2. Analysis chain: WhitespaceTokenizer + EGramTokenFilter
    3. Highlighter: FastVectorHighlighter
    4. Debug info: subInfos=(777((8,11))777((5,8))777((2,5)))/3.0(2,102)

    srcIndex is not computed correctly on the second iteration of the outer for loop.
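
To make the failure concrete, here is a minimal, Lucene-free sketch of the same index arithmetic. It assumes that s is the fragment start offset (2 in the debug info above) and that src is the 9-character fragment text. It hard-codes <b> and </b> in place of the pre/post tags and skips the encoder. The names FragmentSketch and makeFragmentSorted are hypothetical. With the offsets in the reported order, srcIndex is 11 - 2 = 9 after the first term while the next start index is 5 - 2 = 3, so src.substring(9, 3) throws. Visiting the offsets in ascending start order is one possible workaround, not necessarily how Lucene itself addresses the problem, and it does not handle overlapping offsets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class FragmentSketch {

    // Same index arithmetic as the posted loop; each int[] is {startOffset, endOffset}.
    // Tags are hard-coded and no encoder is applied (simplifying assumptions).
    static String makeFragment(String src, int s, List<int[]> termOffsets) {
        StringBuilder fragment = new StringBuilder();
        int srcIndex = 0;
        for (int[] to : termOffsets) {
            fragment.append(src.substring(srcIndex, to[0] - s))   // text before the match
                    .append("<b>")
                    .append(src.substring(to[0] - s, to[1] - s))  // the matched term
                    .append("</b>");
            srcIndex = to[1] - s;
        }
        return fragment.append(src.substring(srcIndex)).toString();
    }

    // Assumed workaround: copy the offsets and visit them in ascending start order.
    static String makeFragmentSorted(String src, int s, List<int[]> termOffsets) {
        List<int[]> sorted = new ArrayList<>(termOffsets);
        sorted.sort(Comparator.comparingInt((int[] to) -> to[0]));
        return makeFragment(src, s, sorted);
    }

    public static void main(String[] args) {
        String src = "777777777";
        int s = 2;
        // Offsets in the order shown in the debug info: descending by start offset.
        List<int[]> offsets = Arrays.asList(
                new int[]{8, 11}, new int[]{5, 8}, new int[]{2, 5});

        System.out.println(makeFragmentSorted(src, s, offsets));
        try {
            makeFragment(src, s, offsets);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unsorted offsets failed: " + e.getMessage());
        }
    }
}
```

With these inputs the sorted variant returns <b>777</b><b>777</b><b>777</b>, while the unsorted call fails on the second term.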

  • Koji Sekiguchi at May 23, 2011 at 12:53 pm

    (11/05/23 14:36), Weiwei Wang wrote:
    > 1. source string: 777777777
    > 2. WhitespaceTokenizer + EGramTokenFilter
    > 3. FastVectorHighlighter,
    > 4. debug info: subInfos=(777((8,11))777((5,8))777((2,5)))/3.0(2,102),
    > srcIndex is not correctly computed for the second loop of the outer for-loop

    What does your query look like?
    And what is EGramTokenFilter? Is it NGramTokenFilter?
    If so, what are the min and max gram sizes?
    Note that FVH has a restriction: min and max must be equal
    (e.g., min=1 and max=3 is not supported by FVH).

    koji
    --
    http://www.rondhuit.com/en/
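
To see what unequal gram sizes mean for term offsets, the following plain-Java sketch lists character n-grams together with their (start,end) offsets. It is an illustration only: the NGramOffsets class and its grams helper are hypothetical, the code does not use Lucene's NGramTokenFilter, and its output order need not match Lucene's. With min equal to max, every gram has the same length. With min=1 and max=3, grams of different lengths share a start offset and nest inside one another.

```java
import java.util.ArrayList;
import java.util.List;

class NGramOffsets {

    // Lists every character n-gram of text with minGram <= n <= maxGram,
    // formatted as gram(start,end). Plain-Java illustration, not Lucene code.
    static List<String> grams(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int n = minGram; n <= maxGram && start + n <= text.length(); n++) {
                out.add(text.substring(start, start + n)
                        + "(" + start + "," + (start + n) + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Equal sizes: every gram is 3 characters long.
        System.out.println(grams("7777", 3, 3));
        // Unequal sizes: grams of length 1, 2 and 3 start at the same offset.
        System.out.println(grams("7777", 1, 3));
    }
}
```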

