Mike:
Actually, at first glance there are more issues with the OJVMDirectory integration.
Note this: I am creating an index with two simple documents:
INFO: Performing: SELECT /*+ DYNAMIC_SAMPLING(0) RULE NOCACHE(T1) */
T1.rowid,F1,extractValue(F2,'/emp/name/text()')
"name",extractValue(F2,'/emp/@id') "id" FROM LUCENE.T1 for update nowait
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAA>
indexed,tokenized<F1:001> indexed,tokenized<name:ravi>
indexed,tokenized<id:01>>
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAB>
indexed,tokenized<F1:003> indexed,tokenized<name:murthy>
indexed,tokenized<id:03>>
IW 10 [Root Thread]: flush: segment=_0 docStoreSegment=_0
docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
numDocs=2
numBufDelTerms=0
IW 10 [Root Thread]: index before flush
IW 10 [Root Thread]: DW: flush postings as segment _0 numDocs=2
IW 10 [Root Thread]: DW: oldRAMSize=111616 newFlushedSize=166
docs/MB=12,633.446 new/old=0.149%
IFD [Root Thread]: now checkpoint "segments_1" [1 segments ; isCommit = false]
IW 10 [Root Thread]: LMP: findMerges: 1 segments
IW 10 [Root Thread]: LMP: level -1.0 to 2.2741578: 1 segments
IW 10 [Root Thread]: CMS: now merge
IW 10 [Root Thread]: CMS: index: _0:C2->_0
IW 10 [Root Thread]: CMS: no more merges pending; now return
IW 10 [Root Thread]: now flush at close
IW 10 [Root Thread]: flush: segment=null docStoreSegment=_0
docStoreOffset=2 flushDocs=false flushDeletes=true flushDocStores=true
numDocs=0 numBufDelTerms=0
IW 10 [Root Thread]: index before flush _0:C2->_0
IW 10 [Root Thread]: flush shared docStore segment _0
IW 10 [Root Thread]: DW: closeDocStore: 2 files to flush to segment _0 numDocs=2
IW 10 [Root Thread]: CMS: now merge
IW 10 [Root Thread]: CMS: index: _0:C2->_0
IW 10 [Root Thread]: CMS: no more merges pending; now return
IW 10 [Root Thread]: now call final commit()
IW 10 [Root Thread]: startCommit(): start sizeInBytes=0
IW 10 [Root Thread]: startCommit index=_0:C2->_0 changeCount=2
IW 10 [Root Thread]: now sync _0.fnm
IW 10 [Root Thread]: now sync _0.frq
IW 10 [Root Thread]: now sync _0.prx
IW 10 [Root Thread]: now sync _0.tis
IW 10 [Root Thread]: now sync _0.tii
IW 10 [Root Thread]: now sync _0.nrm
IW 10 [Root Thread]: now sync _0.fdx
IW 10 [Root Thread]: now sync _0.fdt
IW 10 [Root Thread]: done all syncs
IW 10 [Root Thread]: commit: pendingCommit != null
IFD [Root Thread]: now checkpoint "segments_2" [1 segments ; isCommit = true]
IFD [Root Thread]: deleteCommits: now decRef commit "segments_1"
IFD [Root Thread]: delete "segments_1"
IW 10 [Root Thread]: commit: done
IW 10 [Root Thread]: at close: _0:C2->_0
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.LuceneDomainIndex
ODCIIndexCreate
FINER: RETURN 0

Index created.

And when I try to read the index I get:
INFO: Analyzer: org.apache.lucene.analysis.WhitespaceAnalyzer@f2164127
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
INFO: qryStr: DESC(name:ravi)
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
INFO: storing cachingFilter: -1378376940 and searcher: 781713581
qryStr: DESC(name:ravi)
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex getSort
INFO: using sort: <score>,<doc>
Exception in thread "Root Thread" java.lang.IndexOutOfBoundsException: Index: 6, Size: 4
at java.util.ArrayList.RangeCheck(ArrayList.java)
at java.util.ArrayList.get(ArrayList.java)
at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java)
at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java)
at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java)
at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java)
at org.apache.lucene.search.Similarity.idf(Similarity.java)
at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java)
at org.apache.lucene.search.Query.weight(Query.java)
at org.apache.lucene.search.Hits.<init>(Searcher.java)
at org.apache.lucene.indexer.LuceneDomainIndex.ODCIStart(LuceneDomainIndex.java)

Which definitely means that something is not being saved correctly in the
OJVM directory BLOB storage :(
These are my files:
SQL> select file_size,name from it1$t;

 FILE_SIZE NAME
---------- ------------------------------
        10 parameters
         1 updateCount
        28 segments_1
        20 segments.gen
         8 _0.frq
         8 _0.prx
       103 _0.tis
        35 _0.tii
        12 _0.nrm
        22 _0.fnm
        48 _0.fdt
        20 _0.fdx
        62 segments_2
I'll add some debugging information to my classes which save/load
buffers, to see how many calls are made and which arguments are used.
Marcelo.
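
A minimal sketch of that kind of debugging wrapper, for readers following along. It assumes a hypothetical `BufferStore` interface with `save`/`load` methods; the real OJVMDirectory classes are not shown in this thread, so every name below is made up for illustration. The wrapper counts the calls and records the arguments of each one before delegating to the real store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a class that saves/loads buffers (not the real OJVMDirectory API). */
interface BufferStore {
    void save(long position, byte[] buf, int offset, int length);
    int load(long position, byte[] buf, int offset, int length);
}

/** Simple in-memory store, used here only to exercise the tracer. */
class MemoryStore implements BufferStore {
    private byte[] data = new byte[0];

    public void save(long position, byte[] buf, int offset, int length) {
        int end = (int) position + length;
        if (end > data.length) {
            data = Arrays.copyOf(data, end); // grow the backing array as needed
        }
        System.arraycopy(buf, offset, data, (int) position, length);
    }

    public int load(long position, byte[] buf, int offset, int length) {
        long remaining = data.length - position;
        if (remaining <= 0) {
            return 0; // nothing stored at or after this position
        }
        int n = (int) Math.min(length, remaining);
        System.arraycopy(data, (int) position, buf, offset, n);
        return n;
    }
}

/** Records every call and its arguments, then delegates to the wrapped store. */
class TracingStore implements BufferStore {
    private final BufferStore delegate;
    final List<String> calls = new ArrayList<>();
    int saveCalls;
    int loadCalls;

    TracingStore(BufferStore delegate) {
        this.delegate = delegate;
    }

    public void save(long position, byte[] buf, int offset, int length) {
        saveCalls++;
        calls.add("save pos=" + position + " off=" + offset + " len=" + length);
        delegate.save(position, buf, offset, length);
    }

    public int load(long position, byte[] buf, int offset, int length) {
        loadCalls++;
        int n = delegate.load(position, buf, offset, length);
        calls.add("load pos=" + position + " off=" + offset + " len=" + length + " got=" + n);
        return n;
    }
}

class TracingDemo {
    public static void main(String[] args) {
        TracingStore store = new TracingStore(new MemoryStore());
        store.save(0, new byte[] {1, 2, 3, 4, 5}, 0, 5);
        store.load(2, new byte[8], 0, 8); // asks for 8 bytes, only 3 are stored
        for (String call : store.calls) {
            System.out.println(call);
        }
    }
}
```

A load that returns fewer bytes than requested (the `got=` value) is the kind of mismatch such a trace would make visible.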

On Fri, Sep 26, 2008 at 1:41 PM, Michael McCandless wrote:
This one looks spooky!

Is it easily repeated? If you could print out which 2 terms you had tried
to delete, and then zip up the index just before deleting those docs (after
closing the writer) and send to me, I can try to understand what's wrong
with the index. It looks as if the *.tis file for one of the segments is
truncated.

If you capture the series of add/update/delete documents, can you get a
standalone Java test to show this?

Does this test create an entirely new index?

We did change the index format in 2.4 to use "true" UTF8 encoding for all
text content; not sure that this applies here (to BufferedIndexInput it's
all bytes) but it may.
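
For context on what "true" UTF8 means here: as far as I know, earlier Lucene releases wrote strings in a Java-style "modified UTF-8". The sketch below does not use Lucene at all; it only compares the byte lengths produced by the JDK's standard UTF-8 encoder and by `DataOutputStream.writeUTF` (which uses modified UTF-8), to show where the two encodings differ. The class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8Compare {
    /** Byte length of s in standard UTF-8. */
    static int standardLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Byte length of s in Java's modified UTF-8, excluding writeUTF's 2-byte length prefix. */
    static int modifiedLength(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.size() - 2;
    }

    public static void main(String[] args) {
        // Plain ASCII, the NUL character, and one supplementary character (U+1D11E).
        String[] samples = {"ravi", "\u0000", "\uD834\uDD1E"};
        for (String s : samples) {
            System.out.println(standardLength(s) + " vs " + modifiedLength(s));
        }
        // prints "4 vs 4", "1 vs 2", "4 vs 6"
    }
}
```

Plain ASCII terms such as the ones in this thread encode identically either way; only NUL and characters outside the Basic Multilingual Plane differ.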

BufferedIndexInput in general can do random IO, especially when reading the
term dict file (*.tis), when you

Mike
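
To illustrate how a truncated file can surface as "read past EOF" under random access, here is a simplified sketch of a buffered reader with seek and refill. It is not Lucene's actual BufferedIndexInput; the class names, the 4-byte buffer, and the 103-byte and 60-byte sizes are assumptions chosen only for the demonstration.

```java
import java.io.IOException;
import java.util.Arrays;

/** Simplified, hypothetical buffered reader; not Lucene's actual BufferedIndexInput. */
class SketchBufferedInput {
    private final byte[] file;      // bytes actually present in storage
    private final byte[] buffer;    // read-ahead buffer
    private long bufferStart = 0;   // file position of buffer[0]
    private int bufferLength = 0;   // number of valid bytes in buffer
    private int bufferPosition = 0; // next byte to hand out from buffer

    SketchBufferedInput(byte[] file, int bufferSize) {
        this.file = file;
        this.buffer = new byte[bufferSize];
    }

    /** Random access: reuse the buffer if the target is inside it, otherwise discard it. */
    void seek(long pos) {
        if (pos >= bufferStart && pos < bufferStart + bufferLength) {
            bufferPosition = (int) (pos - bufferStart);
        } else {
            bufferStart = pos; // the next read triggers a refill from here
            bufferLength = 0;
            bufferPosition = 0;
        }
    }

    byte readByte() throws IOException {
        if (bufferPosition >= bufferLength) {
            refill();
        }
        return buffer[bufferPosition++];
    }

    /** Loads the next chunk; fails if the requested position lies beyond the stored bytes. */
    private void refill() throws IOException {
        long start = bufferStart + bufferPosition;
        long end = Math.min(start + buffer.length, file.length);
        int wanted = (int) (end - start);
        if (wanted <= 0) {
            throw new IOException("read past EOF");
        }
        System.arraycopy(file, (int) start, buffer, 0, wanted);
        bufferStart = start;
        bufferLength = wanted;
        bufferPosition = 0;
    }
}

class ReadPastEofDemo {
    /** Seeks to 'from', reads 'count' bytes, and returns "ok" or the IOException message. */
    static String tryRead(byte[] file, long from, int count) {
        SketchBufferedInput in = new SketchBufferedInput(file, 4);
        try {
            in.seek(from);
            for (int i = 0; i < count; i++) {
                in.readByte();
            }
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        byte[] complete = new byte[103];                 // a complete file
        byte[] truncated = Arrays.copyOf(complete, 60);  // same file cut short in storage
        System.out.println(tryRead(complete, 90, 10));   // prints "ok"
        System.out.println(tryRead(truncated, 90, 10));  // prints "read past EOF"
    }
}
```

The point of the sketch is that a reader following an offset taken from elsewhere in the index fails only when it seeks into the missing part, so a truncated file can go unnoticed until a particular term is looked up.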

Marcelo Ochoa wrote:
Michael:
I just started testing 2.4rc2 running inside OJVM.
I found a similar stack trace during indexing:
IW 3 [Root Thread]: flush: segment=_3 docStoreSegment=_3
docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
numDocs=2 numBufDelTerms=2
IW 3 [Root Thread]: index before flush _1:C2->_1 _2:C2->_2
IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2
IW 3 [Root Thread]: DW: oldRAMSize=111616 newFlushedSize=264
docs/MB=7,943.758 new/old=0.237%
IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted
docIDs and 0 deleted queries on 3 segments.
IW 3 [Root Thread]: hit exception flushing deletes
Exception in thread "Root Thread" java.io.IOException: read past EOF
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java)
at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java)
at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java)
at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java)
at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918)
at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
at org.apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java:1308)

I'll reinstall with full debug info to see all line numbers in the
Lucene Java code.
Is there a list of semantic changes in the BufferedIndexInput code?
I mean, does it do sequential or random writes, for example?
Anyway, I just compiled with the latest code and ran my test suites;
I'll investigate the problem a bit more.
Best regards, Marcelo.

On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless wrote:
Can you describe the sequence of steps that your replication process goes
through?

Also, which filesystem is the index being accessed through?

Mike

rahul_k123 wrote:
First of all, thanks to all the people who helped me get the Lucene
replication setup working; it is now live in our production :-)

Everything is working fine, except that I am seeing some exceptions on
the slaves.

The following is the one which occurs more often on the slaves:

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index [data_dir/index]: [read past EOF]
at com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
... 12 more
Caused by: java.io.IOException: read past EOF
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66)
at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)

and the second one is

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.IllegalArgumentException: attempt to access a deleted document
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:657)
at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
This is on the master index.



Any help is appreciated

Thanks.

--
View this message in context:
http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



--
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Do you Know DBPrism? Look @ DB Prism's Web Site
http://www.dbprism.com.ar/index.html
More info?
Chapter 17 of the book "Programming the Oracle Database using Java &
Web Services"
http://www.amazon.com/gp/product/1555583296/
Chapter 21 of the book "Professional XML Databases" - Wrox Press
http://www.amazon.com/gp/product/1861003587/
Chapter 8 of the book "Oracle & Open Source" - O'Reilly
http://www.oreilly.com/catalog/oracleopen/



--
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Want to integrate Lucene and Oracle?
http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
Is Oracle 11g REST ready?
http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html


Discussion Overview
group: java-user
categories: lucene
posted: Sep 26, '08 at 5:23a
active: Sep 30, '08 at 1:14p
posts: 13
users: 4
website: lucene.apache.org
