Hi All,

I have text files that each contain several sentences, with a space between
each sentence.
When searching the index, I get the path of each document that matches the
query:

String path = doc.get("path");


Is it possible to get the number of the sentence that matches the query
inside the matched documents?

Thanks in advance
--
View this message in context: http://www.nabble.com/Return-the-sentence-number-in-the-indexed-files-tp18543061p18543061.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Grant Ingersoll at Jul 19, 2008 at 9:39 pm

    On Jul 19, 2008, at 6:00 AM, starz10de wrote:

    > Hi All,
    >
    > I have a text files that contain several sentences, there is space
    > between each sentence.
    > When searching the index , i get the path for the documents that
    > match the query
    >
    > String path = doc.get("path");
    >
    > Is it possible to get the number of the sentence that match the query
    > inside the matched documents?

    Not without some extra work. This kind of thing requires post- (or
    pre-) processing. You can use SpanQuery to find where in a document
    the match occurred, and then do the sentence calculations. Another
    option is to index each sentence as a separate document and then
    post-process to combine the results.
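
The "sentence calculations" step mentioned above can be sketched in plain Java. This is only a minimal sketch, not Lucene code: it assumes sentences are separated by line breaks (one reading of "space between each sentence") and that the character offset of the match is already known. The class `SentenceLocator` and its method are hypothetical names. Spans report token positions rather than character offsets, so with SpanQuery the same lookup would have to run over sentence start positions counted in tokens.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Maps a character offset in a document to a 1-based sentence number. */
final class SentenceLocator {
    // Character offset at which each sentence starts, in ascending order.
    private final List<Integer> starts = new ArrayList<>();

    /** Assumption: sentences are separated by one or more line breaks. */
    SentenceLocator(String text) {
        boolean inSentence = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean isBreak = c == '\n' || c == '\r';
            if (!isBreak && !inSentence) {
                starts.add(i); // first character of a new sentence
            }
            inSentence = !isBreak;
        }
    }

    /**
     * Returns the 1-based number of the last sentence that starts at or
     * before the given offset, or -1 if no sentence starts that early.
     */
    int sentenceNumberAt(int offset) {
        int idx = Collections.binarySearch(starts, offset);
        if (idx >= 0) {
            return idx + 1;            // offset is exactly a sentence start
        }
        int insertionPoint = -idx - 1; // count of sentence starts before offset
        return insertionPoint == 0 ? -1 : insertionPoint;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library.\n\n"
                + "It is written in Java.\n\n"
                + "Spans report match positions.";
        SentenceLocator locator = new SentenceLocator(text);
        // The match offset is found with indexOf here only for illustration.
        System.out.println(locator.sentenceNumberAt(text.indexOf("Java"))); // prints 2
    }
}
```

The sentence starts are computed once per document, so each lookup is a binary search rather than a rescan of the text.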

    If you search the archives on this list and java-dev you'll see
    several discussions on the topic. See:
    http://lucene.markmail.org/message/we25gm32p6qot32c?q=sentence+detection
    and
    http://lucene.markmail.org/message/uq6ffx3oqsulgxys?q=sentence

    HTH,
    Grant


    --------------------------
    Grant Ingersoll
    http://www.lucidimagination.com

    Lucene Helpful Hints:
    http://wiki.apache.org/lucene-java/BasicsOfPerformance
    http://wiki.apache.org/lucene-java/LuceneFAQ








  • Starz10de at Jul 20, 2008 at 11:54 am
    Thanks, Grant, for the answer.

    I already tried indexing each sentence as a separate document, and it
    works fine: I indexed more than 93,000 sentences (documents) in
    approximately 11 minutes. I thought the other option might be more
    efficient.

    Farag
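
The one-document-per-sentence approach discussed in this thread can be sketched as follows. This is a plain-Java sketch of the bookkeeping only and leaves out the Lucene calls: each `SentenceDoc` would become one Lucene document carrying the existing "path" field plus a sentence-number field, and `groupByPath` stands in for the post-processing step that combines sentence hits per file. The names `SentenceRecords`, `SentenceDoc`, `split`, and `groupByPath` are hypothetical, and the code assumes sentences are separated by line breaks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SentenceRecords {
    /** One unit to index; in Lucene this would be one document. */
    static final class SentenceDoc {
        final String path;   // file the sentence came from
        final int sentence;  // 1-based sentence number within that file
        final String text;   // the sentence itself (the searchable content)

        SentenceDoc(String path, int sentence, String text) {
            this.path = path;
            this.sentence = sentence;
            this.text = text;
        }
    }

    /** Pre-processing: split a file's text into numbered sentences. */
    static List<SentenceDoc> split(String path, String fileText) {
        List<SentenceDoc> out = new ArrayList<>();
        int number = 0;
        for (String line : fileText.split("\\R")) { // \R matches any line break
            String sentence = line.trim();
            if (!sentence.isEmpty()) {
                out.add(new SentenceDoc(path, ++number, sentence));
            }
        }
        return out;
    }

    /** Post-processing: combine sentence-level hits into one entry per file. */
    static Map<String, List<Integer>> groupByPath(List<SentenceDoc> hits) {
        Map<String, List<Integer>> byPath = new LinkedHashMap<>();
        for (SentenceDoc hit : hits) {
            byPath.computeIfAbsent(hit.path, k -> new ArrayList<>()).add(hit.sentence);
        }
        return byPath;
    }
}
```

With this layout the sentence number comes back with every hit, at the cost of a larger index and the extra grouping step.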



Discussion Overview
group: java-user
categories: lucene
posted: Jul 19, '08 at 10:00a
active: Jul 20, '08 at 11:54a
posts: 3
users: 2
website: lucene.apache.org

2 users in discussion

Starz10de: 2 posts
Grant Ingersoll: 1 post
