Hi,

I am trying to index a large data set spread across many files, using the following code:

if (fileName.EndsWith(".txt"))
{
    try
    {
        StreamReader fstr_in = new StreamReader(folderName + "\\" + fileName);
        String line = null;
        int counter = 0;

        while ((line = fstr_in.ReadLine()) != null)
        {
            try
            {
                counter++;
                String[] details = extractDetailsFromLine(line);
                String paragraph = details[0];
                String coords = details[1];

                Lucene.Net.Documents.Document doc = new Document();
                String name = fileName.Replace(".txt","");
                String[] firstname = name.Split(new char[] { '_' });
                String tifName = firstname[0] + ".tif";

                //doc.Add(Field.UnStored("filename", folderName + "\\" + tifName,
                //    Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
                //doc.Add(Field.keyword("paragraph", paragraph,
                //    Lucene.Net.Documents.Field.Store.YES,
                //    Lucene.Net.Documents.Field.Index.TOKENIZED));
                //doc.Add(Field.Text("coords", coords, Lucene.Net.Documents.Field.Store.YES,
                //    Field.Index.NO));

                doc.Add(new Lucene.Net.Documents.Field("filename", folderName + "\\" + tifName,
                    Lucene.Net.Documents.Field.Store.YES,
                    Lucene.Net.Documents.Field.Index.NO));
                doc.Add(new Lucene.Net.Documents.Field("paragraph", paragraph,
                    Lucene.Net.Documents.Field.Store.NO,
                    Lucene.Net.Documents.Field.Index.TOKENIZED));
                doc.Add(new Lucene.Net.Documents.Field("coords", coords,
                    Lucene.Net.Documents.Field.Store.YES, Field.Index.NO));

                writer.AddDocument(doc); // Throws exception
            }
            catch (System.IO.IOException ex) { }
        }
    }
    catch (System.IO.IOException ex)
    {
    }
}





However, Lucene throws an exception while indexing:

The exception is "Index was outside the bounds of the array."

The current DocCount is 178723.

STACK TRACE

at Lucene.Net.Index.DocumentsWriter.Abort(AbortException ae)
at Lucene.Net.Index.DocumentsWriter.UpdateDocument(Document doc, Analyzer analyzer, Term delTerm)
at Lucene.Net.Index.DocumentsWriter.AddDocument(Document doc, Analyzer analyzer)
at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer)
at Lucene.Net.Index.IndexWriter.AddDocument(Document doc)
at Default2.indexDoc(IndexWriter writer, String folderName, String fileName) in d:\RamThakur\lucene_related\luceneExample\Default2.aspx.cs:line 120
at Default2.indexFolder(String folder, String indexDir) in d:\RamThakur\lucene_related\luceneExample\Default2.aspx.cs:line 69
at Default2.btnIndex_Click(Object sender, EventArgs e) in d:\RamThakur\lucene_related\luceneExample\Default2.aspx.cs:line 33
at System.Web.UI.WebControls.Button.OnClick(EventArgs e)
at System.Web.UI.WebControls.Button.RaisePostBackEvent(String eventArgument)
at System.Web.UI.WebControls.Button.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(String eventArgument)
at System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument)
at System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData)
at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)



Any help in this regard is highly appreciated.



Thanks and Regards,
Ram K. Singh
Datamatics Global Services Limited


  • Shashi Kant at Mar 19, 2010 at 10:27 am
    This was an issue with an older build of Lucene.Net. You might want to
    upgrade to the latest version from Subversion.
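
    If upgrading is not immediately possible, it may help to find out which input line triggers the failure. The posted code only catches IOException, and with empty handlers, so the "Index was outside the bounds of the array" error (an IndexOutOfRangeException in .NET) propagates without any note of the file or line being indexed. The following is a minimal sketch, not part of Lucene.Net: LineIndexer, IndexLines, sourceName and indexLine are hypothetical names, and indexLine stands in for the code that builds the Document and calls writer.AddDocument(doc).

```csharp
using System;
using System.IO;

// Hypothetical helper (not a Lucene.Net class): feeds each line of a reader
// to an indexing callback and, on failure, rethrows with the source name and
// line number attached so the offending input can be inspected.
public static class LineIndexer
{
    // Returns the number of lines processed. indexLine is assumed to wrap
    // the per-line work from the original post (build a Document, then call
    // writer.AddDocument(doc)).
    public static int IndexLines(TextReader reader, string sourceName, Action<string> indexLine)
    {
        int lineNumber = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            try
            {
                indexLine(line);
            }
            catch (Exception ex) // any exception type, not only IOException
            {
                // Stop at the first failure and keep the original exception
                // as InnerException, so its stack trace is not lost.
                throw new InvalidOperationException(
                    string.Format("Indexing failed at {0}, line {1}", sourceName, lineNumber), ex);
            }
        }
        return lineNumber;
    }
}

// Possible usage inside the posted loop's enclosing method (assumed names):
//   using (var reader = new StreamReader(path))
//       LineIndexer.IndexLines(reader, path, line => { /* build doc; writer.AddDocument(doc); */ });
```

    The sketch stops at the first failure instead of continuing, on the assumption that an index writer that has just aborted should not keep receiving documents. The using block in the usage comment also closes the StreamReader, which the posted code never does.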

Discussion Overview
group: lucene-net-user
categories: lucene
posted: Mar 19, '10 at 6:22a
active: Mar 19, '10 at 10:27a
posts: 2
users: 2
website: lucene.apache.org