I'm trying to query a large index using Lucene.NET (version 2.9.2.1).
The index contains ~1.2 million docs and is 1.7 GB in size.

When trying to open an IndexSearcher, I get the following Exception:
Lucene.Net.Index.CurruptIndexException - {"Incompatible format version: 2 expected 1 or lower"}
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
at Lucene.Net.Index.DirectoryReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, IndexCommit commit, Boolean readOnly, Int32 termInfosIndexDivisor)
at Lucene.Net.Index.IndexReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, IndexCommit commit, Boolean readOnly, Int32 termInfosIndexDivisor)
at Lucene.Net.Index.IndexReader.Open(Directory directory, Boolean readOnly)
at Lucene.Net.Search.IndexSearcher..ctor(Directory path, Boolean readOnly)
at CET.KotarIndexBuilder.KotarParagraphsSearcher.Search(String sQueryTerm) in C:\Users\odedo\documents\visual studio 2010\Projects\CET.LucenePOC\CET.KotarIndexBuilder\KotarParagraphsSearcher.cs:line 42
at CET.KotarIndexBuilder.Program.Main(String[] args) in C:\Users\odedo\documents\visual studio 2010\Projects\CET.LucenePOC\CET.KotarIndexBuilder\Program.cs:line 23
at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

The strange thing is that I can open the index for writing/updating, and I can also open it in Luke and query it.
I tried to optimize the index using Luke, but that did not help.

When I rebuild this index with a subset of the data, for example only 500K or 750K documents, I can open and query it successfully... but somewhere over 1M docs, something goes wrong.

I'm running VS2010 on .NET 4.0, on a 64-bit machine running Windows 7 Pro.

Thanks,
Oded

  • Digy at Feb 14, 2011 at 6:16 pm
    Hi Oded,
    I just created a 3M-document index, ran searches, and inspected it with Luke 0.9.9,
    and everything seemed to be all right.
    Index corruption is a very serious matter, and Lucene.Net tries to handle such
    cases so that even if you reset your computer in the middle of a critical
    operation you shouldn't get a corrupted index (some uncommitted docs may be
    lost).
    Is it possible that your index is being modified by an external process, such as
    a virus scanner?

    DIGY
  • Moray McConnachie at Feb 14, 2011 at 6:28 pm
    I saw a problem like this a while ago when we updated our indexing app to a later (major) version of Lucene, while our searching apps stayed at an earlier version.

    The problem went away when we reindexed all our data and made sure the indexing app stayed behind the searching apps in terms of the Lucene version used.

    Could this apply in your case?

    M.

    ------------------
    Moray McConnachie
    Director of IT,
    Oxford Analytica
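
The rule of thumb described in this reply (the indexing side should never run a newer Lucene than the searching side) can be sketched as a simple version comparison. This is only an illustration in Java: `VersionGuard` and its methods are hypothetical names, not part of Lucene or Lucene.Net, and the sketch assumes version strings are purely numeric and dot-separated.

```java
import java.util.Arrays;

public class VersionGuard {
    // Parses "2.9.2.1" into {2, 9, 2, 1}.
    // Assumption: every dot-separated part is a plain integer.
    public static int[] parse(String version) {
        return Arrays.stream(version.split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Compares two versions part by part; missing parts count as 0.
    public static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // True when the writer's version is not newer than the reader's.
    public static boolean readerCanOpen(String writerVersion, String readerVersion) {
        return compare(writerVersion, readerVersion) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(readerCanOpen("2.9.2", "2.9.2.1")); // prints true
        System.out.println(readerCanOpen("3.0.1", "2.9.2.1")); // prints false
    }
}
```

A check like this only encodes the rule of thumb; whether a given pair of versions is really compatible depends on the index format each one writes and reads.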

  • Oded Olberg at Feb 15, 2011 at 8:15 am
    Both creating the index and attempting to query it were done using Lucene.Net 2.9.2.

    Thanks,
    Oded

  • Hugh Spiller at Feb 15, 2011 at 1:38 pm
    Hi Oded,

    If you're editing or optimizing the index using the latest Luke (1.0.1), you're upgrading it to Luke's Lucene 3.0.1 level. Could that be the problem? Try using an older version.

    Hugh
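
The wording of the exception ("Incompatible format version: 2 expected 1 or lower") suggests a plain integer comparison: the reader found a format number on disk that is higher than the newest one it understands. A rough model of such a check in Java follows. It is not Lucene's actual code; `FormatCheck`, `MAX_SUPPORTED_FORMAT`, and the 4-byte big-endian header layout are assumptions made for illustration.

```java
import java.nio.ByteBuffer;

public class FormatCheck {
    // Assumption: the newest format this hypothetical reader understands.
    public static final int MAX_SUPPORTED_FORMAT = 1;

    // Builds an in-memory stand-in for a file header whose first
    // 4 bytes hold the format version (big-endian, ByteBuffer's default).
    public static byte[] headerWithFormat(int format) {
        return ByteBuffer.allocate(4).putInt(format).array();
    }

    // Reads the format from the header and rejects anything newer
    // than the reader supports.
    public static int readFormat(byte[] header) {
        int format = ByteBuffer.wrap(header).getInt();
        if (format > MAX_SUPPORTED_FORMAT) {
            throw new IllegalStateException("Incompatible format version: " + format
                    + " expected " + MAX_SUPPORTED_FORMAT + " or lower");
        }
        return format;
    }

    public static void main(String[] args) {
        // A header written by a tool at the same level is accepted.
        System.out.println(readFormat(headerWithFormat(1))); // prints 1

        // A header written by a newer tool is rejected.
        try {
            readFormat(headerWithFormat(2));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, a newer tool that rewrites the index in its own format would still open it without complaint, while an older reader would refuse it. That matches the symptoms in the original post, if Luke did rewrite the index as suggested above.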

  • Oded Olberg at Feb 16, 2011 at 8:47 am
    Hi all,

    This was it.
    After optimizing the index using Luke, I could no longer search or read it with Lucene.Net.
    I've rebuilt the index overnight (all the 1.5M docs) and was able to query it using Lucene.Net 2.9.2.

    Thank you for your help.

    Oded

  • Oded Olberg at Feb 15, 2011 at 8:14 am
    Hi DIGY,

    I have no doubt that Lucene can handle far more than 1.2M docs.
    I'm also 100% sure that the index is valid, as I'm able to modify it using the .NET API, and to read and query it using the latest version of Luke.

    What I was unable to do using Lucene.NET is open an IndexReader or an IndexSearcher.
    I tried to "play" with some of the constructor overloads (readOnly true/false), but with no success.

    The solution I'm using now is that I've converted the Java version 3.0.3 to a .NET assembly using IKVM, and the whole process works perfectly.
    Do you have experience with IKVM in large-scale production projects? This is the only path that seems to work for my indexes at the moment.

    I can guarantee that no background process was modifying the index when the query attempt failed.
    In addition, there is no antivirus running on my PC.

    Thank you,

    Oded


  • Michael Mitiaguin at Feb 15, 2011 at 10:55 am
    Diverting from the topic: Lucene.Net.Index.CurruptIndexException

    Just wondering how Corrupt became Currupt. Was the spelling error introduced in
    the port from Java? I remember something else misspelled in a method
    name, but I need to check my emails and the current code base to see whether it
    still exists.
    Just something to consider for future ports.
  • Oded Olberg at Feb 15, 2011 at 10:57 am
    Sorry about that. VS2010 lets you copy the stack trace but not the fully qualified name of the exception type, so I had to type it in myself.

    Oded

  • Jean-Francois Beaulac at Feb 15, 2011 at 2:23 pm
    Hi,

    When you open your index with Luke, what index format is it reporting? It
    displays it in the overview tab once you have the index opened.

    jf
    From: OdedO@cet.ac.il
    To: lucene-net-user@lucene.apache.org
    Subject: RE: CurruptIndexException problem
    Date: Tue, 15 Feb 2011 08:14:22 +0000

    Hi DIGY,

    I have no doubt that Lucene can handle a lot more than 1.2M Docs.
    And I'm 100% sure that the index created is valid, as I'm able to modify it using the .NET api, and read and query it using the latest version of Luke.

    What I was unable to do using Lucene.NET is to open an IndexReader or an IndexSearcher.
    I tried to "play" with some of the overloads of the constructors (readonly - true/false) but with no success.

    The solution I'm using now is that I've converted the Java version 3.0.3 to a .NET assembly using IKVM, and the whole process works perfectly.
    Do you have experience with large-scale production projects using IKVM? This is the only path that seems to work for my indexes at the moment.

    I can guarantee that no background processes were modifying the index when the query attempt failed.
    In addition, there is no anti-virus software running on my PC.

    Thank you,

    Oded


    -----Original Message-----
    From: Digy
    Sent: Monday, February 14, 2011 8:15 PM
    To: lucene-net-user@lucene.apache.org
    Subject: RE: CurruptIndexException problem

    Hi Oded,
    I just created a 3M-doc index, ran searches, inspected it with Luke 0.9.9,
    and everything seemed to be all right.
    Index corruption is a very serious matter, and Lucene.Net tries to handle such
    cases so that even if you reset your computer in the middle of a critical
    operation you shouldn't get a corrupted index (some uncommitted docs may get
    lost).
    Is it possible that your index is being modified by an external process such
    as a virus scanner?

    DIGY







Discussion Overview
group: lucene-net-user
categories: lucene
posted: Feb 13, '11 at 9:22a
active: Feb 16, '11 at 8:47a
posts: 10
users: 6
website: lucene.apache.org
