FileNotFound Exception
I am in a tight loop adding items into my index. After running in the loop
just fine for a couple of minutes, I get the error posted below. If I then step
through (don't stop the debugger, just hit F10 to keep stepping), it adds
just fine. If I let it run, it gets the error again immediately. If I keep
stepping through, though, I get no error - it only happens when the loop is
running continuously.

I added a sleep statement in my attempt to "program by coincidence" but it
had no effect. Here is the code I am executing. The error is below that. The
error occurs on the iw.AddDocument line:


public static void AddPostsToIndex(List<Post> posts)
{
    IndexWriter iw = GetIndexWriter();

    foreach (Post post in posts)
    {
        DateTime loopItemStart = DateTime.Now;

        iw.AddDocument(post.ToDocument());

        System.Threading.Thread.Sleep(10);

        log.DebugFormat("Added post for feedItem {0} in {1}", post.FeedItemId,
            DateTime.Now.Subtract(loopItemStart));
    }

    iw.Close();
}

System.IO.FileNotFoundException was unhandled
  Message="Could not find file 'C:\\FeedReader\\FullTextSearch\\_oy.fnm'."
  Source="mscorlib"
  FileName="C:\\FeedReader\\FullTextSearch\\_oy.fnm"
  StackTrace:
    at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
    at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy)
    at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share)
    at Lucene.Net.Store.FSIndexInput.Descriptor..ctor(FSIndexInput enclosingInstance, FileInfo file, FileAccess mode)
    at Lucene.Net.Store.FSIndexInput..ctor(FileInfo path)
    at Lucene.Net.Store.FSDirectory.OpenInput(String name)
    at Lucene.Net.Index.FieldInfos..ctor(Directory d, String name)
    at Lucene.Net.Index.SegmentReader.Initialize(SegmentInfo si)
    at Lucene.Net.Index.SegmentReader.Get(Directory dir, SegmentInfo si, SegmentInfos sis, Boolean closeDir, Boolean ownDir)
    at Lucene.Net.Index.SegmentReader.Get(SegmentInfo si)
    at Lucene.Net.Index.IndexWriter.MergeSegments(Int32 minSegment, Int32 end)
    at Lucene.Net.Index.IndexWriter.MergeSegments(Int32 minSegment)
    at Lucene.Net.Index.IndexWriter.MaybeMergeSegments()
    at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer)
    at Lucene.Net.Index.IndexWriter.AddDocument(Document doc)
    at FullTextSearch.Tasks.IndexManager.AddPostsToIndex(List`1 posts)
    at FullTextSearch.Tasks.IndexManager.ValidateIndex()
    at Indox.Program.RefreshDocsInIndex() in C:\Dev\WebSites\FeedReader\FullTextSearch\System\Indox\Program.cs:line 61
    at Indox.Program.HandleArguments(String[] args) in C:\Dev\WebSites\FeedReader\FullTextSearch\System\Indox\Program.cs:line 40
    at Indox.Program.Main(String[] args) in C:\Dev\WebSites\FeedReader\FullTextSearch\System\Indox\Program.cs:line 23
    at System.AppDomain.nExecuteAssembly(Assembly assembly, String[] args)
    at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
    at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
    at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
    at System.Threading.ThreadHelper.ThreadStart()



  • Patrick Burrows at Jun 24, 2007 at 5:02 pm
    If I call .Optimize() I get the same error...


  • Patrick Burrows at Jun 24, 2007 at 5:21 pm
    I deleted and recreated my index, and things seem to be indexing just fine
    now. I went ahead and deleted it because everything Google turned up amounted
    to "wow, that seems bad" whenever someone else got this error.
  • Torsten Rendelmann at Jun 25, 2007 at 6:28 am
    We also got these kinds of errors - the reason was that we accessed
    the index from multiple threads. I think the same thing happens if you
    access the index from two processes, which is my guess from examining
    the call stack.
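
    One way to serialize writes across both threads and separate processes is a
    system-wide named mutex around all writer work. A minimal sketch (the mutex
    name, index path handling, and helper are illustrative, not from the code
    above):

    using System;
    using System.Threading;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Index;

    public static class IndexWriteLock
    {
        // System-wide named mutex: only one writer at a time, across threads *and* processes.
        private static readonly Mutex writeMutex = new Mutex(false, @"Global\FeedReaderIndexWrite");

        public static void WithWriter(string indexPath, Action<IndexWriter> work)
        {
            writeMutex.WaitOne();
            try
            {
                // false = open the existing index rather than recreating it
                IndexWriter iw = new IndexWriter(indexPath, new StandardAnalyzer(), false);
                try { work(iw); }
                finally { iw.Close(); }
            }
            finally
            {
                writeMutex.ReleaseMutex();
            }
        }
    }

    Each process that writes would funnel every AddDocument/Optimize call through
    something like this instead of opening its own IndexWriter directly.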


    TorstenR
  • Patrick Burrows at Jun 25, 2007 at 1:27 pm
    Ooh... is that right?

    Cause I access it via a website without any sort of sync locking. The site
    isn't live. But, by the very nature of a website, it is multithreaded.

    I also have separate processes which are constantly updating the index.

    And yet another process that validates the index once a week (makes sure
    there are no dupes or missed records).

    Access to the index through all these things must be synchronized? That
    seems... cumbersome. At best.

  • Torsten Rendelmann at Jun 25, 2007 at 1:49 pm
    I'm not the designer of that software, just a user ;-)

    For that scenario the best solution seems to be to maintain one index
    (add, remove documents) using a single index modifier thread (e.g.
    message queued), and to have a second copy of the index with a
    read-only index reader for the public side. Then define a job that
    copies the updated index over to the public one (the updater and the
    reader must both have closed their indexes at that time).
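
    A rough sketch of the single-modifier-thread part, with one background thread
    draining a queue and acting as the only writer (the queue shape, analyzer,
    and path handling are illustrative, not from this thread):

    using System.Collections.Generic;
    using System.Threading;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;

    public class IndexUpdateQueue
    {
        private readonly Queue<Document> pending = new Queue<Document>();
        private readonly string indexPath;

        public IndexUpdateQueue(string indexPath)
        {
            this.indexPath = indexPath;
            Thread worker = new Thread(ProcessQueue);
            worker.IsBackground = true;
            worker.Start();                       // the one and only writer thread
        }

        // Producers (web requests, feed fetchers, ...) only ever enqueue.
        public void Enqueue(Document doc)
        {
            lock (pending)
            {
                pending.Enqueue(doc);
                Monitor.Pulse(pending);
            }
        }

        private void ProcessQueue()
        {
            while (true)
            {
                Document doc;
                lock (pending)
                {
                    while (pending.Count == 0) Monitor.Wait(pending);
                    doc = pending.Dequeue();
                }
                // Only this thread touches the IndexWriter, so writes never overlap.
                IndexWriter iw = new IndexWriter(indexPath, new StandardAnalyzer(), false);
                try { iw.AddDocument(doc); }
                finally { iw.Close(); }
            }
        }
    }

    The copy-to-public-index job described above would then run on a schedule,
    with both the updater and the public reader closed while the files are copied.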

    TorstenR
  • Vijay Santhanam at Jun 25, 2007 at 1:52 pm
    Hi Patrick,

    If you intend to be dynamically updating the index (i.e. while the searcher
    is still alive) you can't avoid synchronization because the IndexSearcher is
    unaware of new docs until it is refreshed/reinstantiated.

    Defining an interface is very important for compartmentalizing the search
    engine. The clients of your search engine (like your ASP.NET website) should
    be hidden from the IndexSearcher, Modifier, Writer and Readers to shield
    them from the synchronization headaches.

    Also, I found that creating wrapper objects and extending Lucene classes
    allowed me to further exclude Lucene.Net.* classes from my search engine
    interface.

    Decoupling Lucene.Net and its wrapping consumption classes
    (YourIndexBuilder, YourIndexUpdater, etc.) is a good start for scaling your
    search engine too.

    During the last Lucene.Net project I was involved with, we put Lucene.Net
    inside a Windows service, and exposed it with a static singleton remoting
    interface.
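
    As a concrete illustration of that interface idea, the website would code
    against something like the following, with all Lucene.Net types kept behind
    the implementation (the names and fields here are hypothetical):

    using System.Collections.Generic;

    // The only surface the ASP.NET site (or remoting client) ever sees.
    public interface ISearchEngine
    {
        IList<SearchHit> Search(string queryText, int maxHits);
        void QueueForIndexing(int feedItemId, string title, string body);
    }

    // Plain result object so callers never touch Lucene's Document or Hits.
    public class SearchHit
    {
        public int FeedItemId;
        public string Title;
        public float Score;
    }

    A Windows-service implementation of ISearchEngine can then own the single
    writer and the shared searcher internally, as described above.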

    Vijay Santhanam
    B.Eng.(Soft.)
    Spectrum Wired - Software Engineer

    T: +61 2 4925 3266
    F: +61 2 4925 3255
    M: +61 407 525 087
    W: www.spectrumwired.com


    Disclaimer: This email and any attached files are intended solely for the
    named addressee, are confidential and may contain legally privileged
    information. The copying or distribution of them or any information they
    contain, by anyone other than the addressee, is prohibited. If you have
    received this email in error, please let us know by telephone or return the
    email to the sender and destroy all copies. Thank you.



    -----Original Message-----
    From: Patrick Burrows
    Sent: Monday, 25 June 2007 11:27 PM
    To: lucene-net-user@incubator.apache.org
    Subject: Re: FileNotFound Exception

    Ooh... is that right?

    Cause I access it via a website without any sort of sync locking. The site
    isn't live. But, by the very nature of a website, it is multithreaded.

    I also have separate processes which are constantly updating the index.

    And yet another process that validates the index once a week (makes sure
    there are no dupes or missed records).

    Access to the index through all these things must be synchronized? That
    seems... cumbersome. At best.

    On 6/25/07, Torsten Rendelmann wrote:

    These kind of errors we also got - the reason was:
    We accessed the index by multiple threads. Think, the same
    happens if you access the index by two processes as
    it seems examining the callstack (guess).


    TorstenR
    -----Original Message-----
    From: Patrick Burrows
    Sent: Sunday, June 24, 2007 7:21 PM
    To: lucene-net-user@incubator.apache.org
    Subject: Re: FileNotFound Exception

    I deleted and recreated my index and things seem to be
    indexing now just
    fine. I went ahead and deleted it because everything google
    said was "wow,
    that seems bad" whenever someone else got this error.
    On 6/24/07, Patrick Burrows wrote:

    If I call .Optimize() I get the same error...


    On 6/24/07, Patrick Burrows wrote:

    I am in a tight loop adding items into my index. After
    running for a
    couple minutes in the loop just fine, I get the error
    posted below. If I
    then step through (don't stop the debugger, just hit F10
    to keep stepping),
    it adds just fine. If I let it run, it will get the error
    again immediately.
    If I keep stepping through, though, I get no error. Only
    when it is running
    continuously.

    I added a sleep statement in my attempt to "program by
    coincidence" but
    it had no effect. Here is the code I am executing. The
    error is below that.
    The error occurs on the iw.AddDocument line:


    public
    static void AddPostsToIndex( List<Post> posts)

    {

    IndexWriter iw = GetIndexWriter();

    foreach (Post post in posts)

    {

    DateTime loopItemStart = DateTime.Now;

    iw.AddDocument(post.ToDocument());

    System.Threading.
    Thread.Sleep(10);

    log.DebugFormat(
    "Added post for feedItem {0} in {1}", post.FeedItemId,

    DateTime.Now.Subtract(loopItemStart));

    }

    iw.Close();

    }

    System.IO.FileNotFoundException was unhandled
    Message="Could not find file
    'C:\\FeedReader\\FullTextSearch\\_oy.fnm'."
    Source="mscorlib"
    FileName="C:\\FeedReader\\FullTextSearch\\_oy.fnm"
    StackTrace:
    at System.IO.__Error.WinIOError(Int32 errorCode, String
    maybeFullPath)
    at System.IO.FileStream.Init(String path, FileMode mode,
    FileAccess access, Int32 rights, Boolean useRights,
    FileShare share, Int32
    bufferSize, FileOptions options, SECURITY_ATTRIBUTES
    secAttrs, String
    msgPath, Boolean bFromProxy)
    at System.IO.FileStream..ctor(String path, FileMode mode,
    FileAccess access, FileShare share)
    at
    Lucene.Net.Store.FSIndexInput.Descriptor..ctor(FSIndexInput
    enclosingInstance, FileInfo file, FileAccess mode)
    at Lucene.Net.Store.FSIndexInput..ctor(FileInfo path)
    at Lucene.Net.Store.FSDirectory.OpenInput(String name)
    at Lucene.Net.Index.FieldInfos..ctor(Directory d,
    String name)
    at Lucene.Net.Index.SegmentReader.Initialize
    (SegmentInfo si)
    at Lucene.Net.Index.SegmentReader.Get(Directory
    dir, SegmentInfo
    si, SegmentInfos sis, Boolean closeDir, Boolean ownDir)
    at Lucene.Net.Index.SegmentReader.Get(SegmentInfo si)
    at
    Lucene.Net.Index.IndexWriter.MergeSegments(Int32 minSegment,
    Int32 end)
    at
    Lucene.Net.Index.IndexWriter.MergeSegments(Int32 minSegment)
    at Lucene.Net.Index.IndexWriter.MaybeMergeSegments()
    at Lucene.Net.Index.IndexWriter.AddDocument(Document doc,
    Analyzer analyzer)
    at Lucene.Net.Index.IndexWriter.AddDocument(Document doc)
    at FullTextSearch.Tasks.IndexManager.AddPostsToIndex(List`1
    posts)
    at FullTextSearch.Tasks.IndexManager.ValidateIndex()
    at Indox.Program.RefreshDocsInIndex() in
    C:\Dev\WebSites\FeedReader\FullTextSearch\System\Indox\Program
    .cs:line 61
    at Indox.Program.HandleArguments (String[] args) in
    C:\Dev\WebSites\FeedReader\FullTextSearch\System\Indox\Program
    .cs:line 40
    at Indox.Program.Main(String[] args) in
    C:\Dev\WebSites\FeedReader\FullTextSearch\System\Indox\Program
    .cs:line 23
    at System.AppDomain.nExecuteAssembly(Assembly
    assembly, String[]
    args)
    at System.AppDomain.ExecuteAssembly(String
    assemblyFile, Evidence
    assemblySecurity, String[] args)
    at
    Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly ()
    at System.Threading.ThreadHelper.ThreadStart_Context(Object
    state)
    at System.Threading.ExecutionContext.Run(ExecutionContext
    executionContext, ContextCallback callback, Object state)
    at System.Threading.ThreadHelper.ThreadStart ()


    --
    -
    P


    --
    -
    P



    --
    -
    P

    --
    -
    P



    __________ NOD32 2220 (20070426) Information __________

    This message was checked by NOD32 antivirus system.
    http://www.eset.com
  • Patrick Burrows at Jun 25, 2007 at 2:10 pm
    Yeah. It is already very much abstracted.

    So, it sounds like everyone is agreeing about the multiple access issues.
    Hm. I hadn't anticipated that at all.

    On thinking about it, I'm not too concerned about writing. I could sync that
    all up.

    I'm very concerned about actually searching, though. Only one search at a
    time? I can't see that ever working.


  • Kurt Mackey at Jun 25, 2007 at 2:22 pm
    Heh, this is where Lucene gets annoying.

    Basically, if you're using an IndexReader (and, by extension, an
    IndexSearcher) purely for read access to the Lucene index, you don't
    really have to worry about synchronization. For performance reasons,
    it's best to keep a single IndexSearcher around and use it for all your
    threads.

    You *do* have to synchronize write operations. So you may not have more
    than one IndexWriter or IndexReader you've used for deletes open at a
    time. Once you've done your writes, you reopen the IndexSearcher.
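
    In practice that tends to look something like this - one shared searcher for
    all request threads, writes serialized behind a lock, and the searcher swapped
    out after each write so new documents become visible (the path and details are
    illustrative, not from any existing library):

    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Lucene.Net.Search;

    public static class SearchIndex
    {
        private static readonly object writeLock = new object();
        private static readonly string path = @"C:\FeedReader\FullTextSearch";
        private static IndexSearcher searcher;

        // All request threads share this searcher; pure reads need no further locking.
        public static IndexSearcher Searcher
        {
            get
            {
                if (searcher == null)
                {
                    lock (writeLock)
                    {
                        if (searcher == null) searcher = new IndexSearcher(path);
                    }
                }
                return searcher;
            }
        }

        // Writes are serialized; afterwards the searcher is reopened so the
        // new documents show up in subsequent searches.
        public static void AddDocument(Document doc)
        {
            lock (writeLock)
            {
                IndexWriter iw = new IndexWriter(path, new StandardAnalyzer(), false);
                try { iw.AddDocument(doc); }
                finally { iw.Close(); }

                IndexSearcher old = searcher;
                searcher = new IndexSearcher(path);
                // NOTE: in a real app, let in-flight searches on 'old' finish
                // before closing it.
                if (old != null) old.Close();
            }
        }
    }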

    I'm actually working on extracting a library to handle all this from an
    existing app. I'll put it up for consumption once it's reasonably
    functional, and hopefully it will help take some of the pain out of
    using Lucene on a web application.

    -Kurt


  • Patrick Burrows at Jun 25, 2007 at 2:45 pm
    I don't even like the idea of keeping the IndexSearcher around. So far my
    app is completely stateless. I guess I will need to do that, though.

    My IndexReaders don't ever delete, so that should be fine. I am already
    queueing most write operations (except for the validations I mentioned
    before). I could just add those to the queue as well. Though...my queue
    reader is multi-threaded, so it can do multiple writes at a time. I guess I
    need to strip out all that code. No point making it multi-threaded if the
    core functionality has to be synchronized.

    My goal was going to be to scale the consumer of the write queue out across
    multiple machines so I could use additional processing power to update the
    Lucene index. Sounds like this is a terrible idea. :-)

    How do other people handle scaling their index writers? It sounds like, in
    every case, writing is a single access thing.

    I guess scaling out to multiple indexes is an option... but then how do you
    search across all those indexes at once?
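
    On the last question: Lucene can present several physical indexes as one
    logical index via MultiSearcher, so splitting the index does not mean issuing
    separate queries. A minimal sketch (the shard paths and field name are made
    up):

    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.QueryParsers;
    using Lucene.Net.Search;

    public static class MultiIndexSearch
    {
        public static Hits SearchAll(string queryText)
        {
            // One searcher per physical index, combined into a single logical view.
            Searchable[] shards = new Searchable[]
            {
                new IndexSearcher(@"C:\Indexes\Shard1"),
                new IndexSearcher(@"C:\Indexes\Shard2")
            };
            MultiSearcher all = new MultiSearcher(shards);

            Query query = new QueryParser("body", new StandardAnalyzer()).Parse(queryText);
            return all.Search(query);   // results are merged across the shards
        }
    }

    In a real app the shard searchers would be kept open and reused rather than
    created per query.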



  • Vijay Santhanam at Jun 26, 2007 at 3:21 am
    " How do other people handle scaling their index writers? It sounds like, in
    every case, writing is a single access thing."

    We communicate updates to our IndexUpdaters via the database, and
    periodically service the updates one record at a time.

    I can't remember the details off the top of my head, but "Lucene in Action"
    makes it clear that concurrent writes to the index are a bad idea (I highly
    recommend that book even if you're using Lucene.Net).


    " My goal was going to be to scale the consumer of the write queue out
    across
    multiple machines so I could use additional processing power to update the
    lucene index. Sounds like this is a terrible idea. :-)"

    AFAIK, you'll have to serialize your writes.
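
    As a sketch of what "serialize your writes" can look like in practice
    (assuming the Lucene.Net 2.0-era API; the path and class name are
    illustrative), funnel every write through one lock so only a single
    IndexWriter ever has the index open:

    using System.Collections.Generic;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;

    public static class SerializedIndexWriter
    {
        private static readonly object writeLock = new object();
        private const string IndexPath = @"C:\FeedReader\FullTextSearch";

        public static void AddDocuments(IEnumerable<Document> docs)
        {
            // One writer at a time; note a lock only covers this process -
            // writers in other processes still need external coordination
            // (e.g. the single Windows service mentioned earlier).
            lock (writeLock)
            {
                // false = append to the existing index rather than recreate it
                IndexWriter writer =
                    new IndexWriter(IndexPath, new StandardAnalyzer(), false);
                try
                {
                    foreach (Document doc in docs)
                        writer.AddDocument(doc);
                }
                finally
                {
                    writer.Close();
                }
            }
        }
    }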


    " I guess scaling out to multiple indexes is an option... but then how do
    you
    search across all those indexes at once?"

    Creating multiple indexes sounds like a nightmare, because MultiSearcher
    interleaves the results, meaning you'd have to filter out duplicate
    documents - that sounds slow and cumbersome. I'm curious what others say
    about this.
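
    For what it's worth, querying several physical indexes at once looks
    roughly like this with MultiSearcher (again the Lucene.Net 2.0-era API; the
    index paths and field name are hypothetical):

    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.QueryParsers;
    using Lucene.Net.Search;

    public static class MultiIndexSearch
    {
        public static Hits SearchAll(string queryText)
        {
            Lucene.Net.Search.Searchable[] searchers = new Lucene.Net.Search.Searchable[]
            {
                new IndexSearcher(@"C:\FeedReader\Index_A"),
                new IndexSearcher(@"C:\FeedReader\Index_B")
            };
            MultiSearcher multi = new MultiSearcher(searchers);
            Query query = new QueryParser("body", new StandardAnalyzer()).Parse(queryText);
            // Hits come back merged/interleaved across the underlying indexes.
            return multi.Search(query);
        }
    }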


    May I ask how big your compacted index is? And how big are you designing
    it for?



    Vijay Santhanam
    B.Eng.(Soft.)
    Spectrum Wired - Software Engineer

  • Patrick Burrows at Jun 26, 2007 at 1:21 pm
    I'm indexing RSS feeds for a hosted RSS aggregator, so, potentially, it
    could be every post in every feed. I'll send the link once it is up and
    running so you can see how I use Lucene. But, essentially, it is exactly
    what it sounds like.

    I could avoid duplicates across multiple indexes by partitioning the
    indexes by feed starting letter (for instance) or any other arbitrary
    scheme.
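
    A tiny sketch of that partitioning idea (a hypothetical helper; only the
    routing is shown) - if every feed maps to exactly one index directory, a
    post can never be duplicated across indexes:

    // e.g. inside IndexManager
    public static string GetIndexPathForFeed(string feedTitle)
    {
        // Bucket by the feed title's first letter; everything else goes to "Other".
        char first = char.ToUpperInvariant(
            string.IsNullOrEmpty(feedTitle) ? '#' : feedTitle[0]);
        string bucket = (first >= 'A' && first <= 'Z') ? first.ToString() : "Other";
        return System.IO.Path.Combine(@"C:\FeedReader\FullTextSearch",
                                      "Index_" + bucket);
    }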

