Hi guys,

I'm using a SinkTokenizer to collect some terms from the documents while doing
the main document indexing.
I attached it to a specific field (tokenized, indexed).

writer = new IndexWriter(index, my_analyzer, create,
        new IndexWriter.MaxFieldLength(1000000));
doc.add(new Field("content", reader));
doc.add(new Field("myField", my_analyzer.sinkStream));
writer.addDocument(doc);

I have a set of documents which don't have those terms, so the sink is empty.

writer.addDocument works fine on the first document, but it always fails on
the second.

Any idea what I should look for? I'm kind of stuck, because
understanding what's done under addDocument is tough.

-Raymond-


  • Erick Erickson at Mar 28, 2009 at 7:36 pm
    What kind of failures do you get? And I'm confused by the code. Are
    you creating a new IndexWriter every time? Do you ever close it?

    It'd help to see the surrounding code...

    Best
    Erick
  • Raymond Balmès at Mar 30, 2009 at 8:43 am
    Yes, confusing code indeed... I was also very confused.
    In the meantime I solved my problem by checking, in the tokenStream method of
    myAnalyzer, which field is being processed and applying the right stream to
    the right field. No idea if this is how it is intended to be done, but it
    works perfectly in my case.

    I found out that the fields are processed in alphabetical order, not in
    creation order. Is there any reason for that?

    -Ray-
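
The workaround described in this post, choosing the stream by field name inside the analyzer, can be sketched in plain Java without any Lucene dependency. Everything below is a hypothetical stand-in for illustration: `FieldDispatchSketch`, `tokensFor`, and the "keep capitalized terms" rule are assumptions, not the Lucene 2.4 `Analyzer` or `SinkTokenizer` API. Clearing the sink at the start of each document is also a choice made for this sketch, not something the thread confirms.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in, NOT Lucene's API. All names here are hypothetical.
class FieldDispatchSketch {
    // Plays the role of the sink: terms collected while "content" is analyzed.
    private final List<String> sink = new ArrayList<>();

    // Mimics an analyzer's per-field entry point: pick the stream by field name.
    List<String> tokensFor(String fieldName, String text) {
        if ("content".equals(fieldName)) {
            sink.clear(); // assumed: start with an empty sink for each document
            List<String> tokens = new ArrayList<>();
            for (String t : text.split("\\s+")) {
                if (t.isEmpty()) {
                    continue; // split can yield an empty first element
                }
                tokens.add(t);
                // Assumed collection rule, purely for illustration:
                // keep terms that start with an upper-case letter.
                if (Character.isUpperCase(t.charAt(0))) {
                    sink.add(t);
                }
            }
            return tokens;
        }
        if ("myField".equals(fieldName)) {
            // The sink field replays whatever was collected; its text is ignored.
            return new ArrayList<>(sink);
        }
        throw new IllegalArgumentException("unexpected field: " + fieldName);
    }

    public static void main(String[] args) {
        FieldDispatchSketch analyzer = new FieldDispatchSketch();
        System.out.println(analyzer.tokensFor("content", "indexing with Lucene and Solr"));
        System.out.println(analyzer.tokensFor("myField", ""));
        // A document without matching terms leaves the sink empty.
        analyzer.tokensFor("content", "nothing to collect here");
        System.out.println(analyzer.tokensFor("myField", ""));
    }
}
```

Note that the sketch only yields the collected terms if "content" is handled before "myField" for each document, which is why the processing order of fields matters for this approach.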
  • Grant Ingersoll at Mar 30, 2009 at 12:19 pm

    On Mar 30, 2009, at 4:42 AM, Raymond Balmès wrote:

    > I found out that the fields are processed in alpha order... and not in
    > creation order. Is there any reason for that?

    Hmm, that doesn't sound right (in other words, something must have
    changed). What version of Lucene are you using?

    -Grant
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Raymond Balmès at Mar 30, 2009 at 3:35 pm
    lucene 2.4.0
  • Grant Ingersoll at Mar 31, 2009 at 12:44 pm
    I'm going to bring this over to java-dev.

    -Grant
    --------------------------
    Grant Ingersoll
    http://www.lucidimagination.com/

    Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
    using Solr/Lucene:
    http://www.lucidimagination.com/search


  • Grant Ingersoll at Mar 31, 2009 at 1:25 pm
    I might add that I don't know that we ever explicitly declare that they
    must be in order, but it has always been my understanding that they
    should be, and several past conversations confirm this:
    http://www.lucidimagination.com/search/document/274ec8c1c56fdd54/order_of_field_objects_within_document#5ffce4509ed32511

    http://www.lucidimagination.com/search/document/d6b19ab1bd87e30a/order_of_fields_returned_by_document_getfields#d6b19ab1bd87e30a

    http://www.lucidimagination.com/search/document/deda4dd3f9041bee/the_order_of_fields_in_document_fields#bb26d84091aebcaa

    -Grant
  • Raymond Balmès at Mar 31, 2009 at 1:43 pm
    Well, I wanted a defined order because in my first analysis I'm collecting
    terms which I put in a second field. I can live with either order (creation
    or alphabetical); I just needed to know, and I was also wondering why it is
    that way. It looks like an extra complication to me.

    -Raymond-
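
Why the processing order matters for a field fed from a sink can be illustrated in plain Java. The sketch below only contrasts creation order with alphabetical order by field name; the class and method names are hypothetical, and it makes no claim about how Lucene 2.4 orders fields internally. With the field names used in this thread, "content" sorts before "myField", so both orders happen to agree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
class FieldOrderSketch {
    // Fields visited in the order they were added to the document.
    static List<String> creationOrder(List<String> names) {
        return new ArrayList<>(names);
    }

    // Fields visited in alphabetical (natural String) order of their names.
    static List<String> alphaOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        return sorted;
    }

    // True if the field that fills the sink is visited before the field that reads it.
    static boolean sourceBeforeSink(List<String> order, String source, String sink) {
        int s = order.indexOf(source);
        int k = order.indexOf(sink);
        return s >= 0 && k >= 0 && s < k;
    }

    public static void main(String[] args) {
        List<String> thread = Arrays.asList("content", "myField");
        // "content" < "myField" alphabetically, so the sink is filled first either way.
        System.out.println(sourceBeforeSink(creationOrder(thread), "content", "myField"));
        System.out.println(sourceBeforeSink(alphaOrder(thread), "content", "myField"));

        // With a source field named "text", alphabetical order visits the sink field first.
        List<String> other = Arrays.asList("text", "myField");
        System.out.println(sourceBeforeSink(creationOrder(other), "text", "myField"));
        System.out.println(sourceBeforeSink(alphaOrder(other), "text", "myField"));
    }
}
```

If the order is alphabetical, one option is to name the fields so that the source field sorts before the field that reads the sink.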

Discussion Overview
group: java-user
categories: lucene
posted: Mar 28, '09 at 5:36p
active: Mar 31, '09 at 1:43p
posts: 8
users: 3
website: lucene.apache.org
