FAQ
Hi,

I have such a problem when creating lucene index for many html files:

It shows "aborted, expected<tagname>....<tagend>" for those html files
which contain java scripts. It seems it cannot parse the tags < \>.
Does anyone has any solution?

Thank you very very much...!!!

Ivy.

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Search Discussions

  • Patrick Burleson at Sep 7, 2004 at 8:58 pm
    Why oh why did you send this to the tomcat lists?

    Don't cross post! Especially when the question doesn't even apply to
    one of the lists.

    Patrick
    On Tue, 7 Sep 2004 16:35:35 -0400, hui liu wrote:
    Hi,

    I have such a problem when creating lucene index for many html files:

    It shows "aborted, expected<tagname>....<tagend>" for those html files
    which contain java scripts. It seems it cannot parse the tags < \>.
    Does anyone has any solution?

    Thank you very very much...!!!

    Ivy.

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
    For additional commands, e-mail: lucene-user-help@jakarta.apache.org
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
    For additional commands, e-mail: lucene-user-help@jakarta.apache.org
  • Sergiu gordea at Sep 8, 2004 at 6:10 am
    maybe you should encode the html code ...

    Patrick Burleson wrote:
    Why oh why did you send this to the tomcat lists?

    Don't cross post! Especially when the question doesn't even apply to
    one of the lists.

    Patrick

    On Tue, 7 Sep 2004 16:35:35 -0400, hui liu wrote:

    Hi,

    I have such a problem when creating lucene index for many html files:

    It shows "aborted, expected<tagname>....<tagend>" for those html files
    which contain java scripts. It seems it cannot parse the tags < \>.
    ?? is < \> a valid tag? I think it should be < />
    Do you want to index the whole HTML file, or just the information i this
    files?
    Maybe you should use a HTML2TXT converter, and then index the resulting
    text.


    All the best,

    Sergiu
    Does anyone has any solution?

    Thank you very very much...!!!

    Ivy.

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
    For additional commands, e-mail: lucene-user-help@jakarta.apache.org


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
    For additional commands, e-mail: lucene-user-help@jakarta.apache.org


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
    For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedSep 7, '04 at 8:36p
activeSep 8, '04 at 6:10a
posts3
users3
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase