Hello,

I'm building a search engine for HTML documents, and I've got an HTML-parsing
problem.

These documents are in German. They contain various special characters, and
each one can be written in different ways, for example "ö", "&ouml;" and
"&#246;". Does anybody know a parsing engine that has no problems with all
these different ways of writing the special characters?

Thanks

b.warzecha
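
One possible approach, sketched here under assumptions rather than taken from any particular parsing engine, is to normalize all three spellings to the same character before the text is indexed. The class name `EntityNormalizer`, its `decode` method, and the small entity table are hypothetical and cover only the German letters plus `&amp;`; a full HTML parser would handle the complete entity list.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntityNormalizer {
    // Illustrative table only: German letters plus &amp;. Not the full HTML list.
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("auml", "\u00e4");
        NAMED.put("ouml", "\u00f6");
        NAMED.put("uuml", "\u00fc");
        NAMED.put("Auml", "\u00c4");
        NAMED.put("Ouml", "\u00d6");
        NAMED.put("Uuml", "\u00dc");
        NAMED.put("szlig", "\u00df");
        NAMED.put("amp", "&");
    }

    // Matches &name; &#246; and &#xF6; the trailing semicolon is optional,
    // so sloppy markup such as "&#246" is accepted as well.
    private static final Pattern ENTITY =
        Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);?");

    public static String decode(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                replacement = fromCodePoint(body.substring(2), 16, m.group());
            } else if (body.startsWith("#")) {
                replacement = fromCodePoint(body.substring(1), 10, m.group());
            } else {
                String known = NAMED.get(body);
                // Unknown names are left exactly as they appeared in the input.
                replacement = (known != null) ? known : m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Converts a numeric reference to its character; returns the original
    // text unchanged if the number is out of range or not a valid code point.
    private static String fromCodePoint(String digits, int radix, String original) {
        try {
            int cp = Integer.parseInt(digits, radix);
            if (!Character.isValidCodePoint(cp)) {
                return original;
            }
            return new String(Character.toChars(cp));
        } catch (NumberFormatException e) {
            return original;
        }
    }

    public static void main(String[] args) {
        String literal = decode("Sch\u00f6n");
        String named = decode("Sch&ouml;n");
        String numeric = decode("Sch&#246n");
        // All three spellings should now be the same string.
        System.out.println(literal.equals(named) && named.equals(numeric)); // prints true
    }
}
```

Running `decode` on the extracted text before handing it to the indexer, and on the query string as well, would make the three spellings match each other in a search.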


Posted to java-user (lucene.apache.org) on May 3, '05 at 8:36a; last active May 3, '05 at 4:13p; 3 posts by 3 users.
