FAQ
well actually I am doing a kind of a thesis regarding information
retrieval.and my tutor wanted me to be able to create a program that
firstly index a document in memory using RAMDirectory and then
flushing it to disk using FSDirectory periodically. I was having a
hard time implementing it in my source code. so that's why I am asking
what's the best way to put RAMDirectory methods in the source code
below? maybe you can give me an idea.eventhough I know it'll take
longer time to index due to the use of RAMDirectory and FSDirectory
but nonetheless I'd still be interested in how to integrate
RAMDirectory into FSDirectory.

thanks

PS : regarding the hijack thread, I thought because I was the one who
created it so I think I could hijack it.but thanks for the advice
though and now I am creating a new thread. :-)
On 10/12/10, Erick Erickson wrote:
It's a good idea to start a new thread when asking a different question,
see:
http://people.apache.org/~hossman/#threadhijack

<http://people.apache.org/~hossman/#threadhijack>I have to ask why you want
to integrate the RAM directory. If you're using it
to speed up indexing, you're probably making way more work for yourself
than you need to. If you're trying to do something with Near Real Time, one
suggestion is to just not bother. Add docs to the RAM directory AND your
FSDirectory simultaneously. The data you index to FSDir won't be visible
until you reopen the FSDir reader, so your flush could then be just
reopen everything...

Best
Erick
On Mon, Oct 11, 2010 at 1:34 PM, Yakob wrote:
On 9/29/10, Erick Erickson wrote:
Nope, never used jNotify, so I don't have any code handy...

Good luck!
Erick
so I did try JNotify but there is seems to be some bug in it that I
find it hards to integrate in my lucene source code.so I had to try a
looping option instead.


http://stackoverflow.com/questions/3840844/error-exception-access-violation-in-jnotify

so anyway, I had another question now. I was trying to make a lucene
source code that can do indexing and store them first in a memory
using RAMDirectory and then flush this index in a memory into a disk
using FSDirectory. I had done some modifications of this code but to
no avail. maybe some of you can help me out a bit.
here is the source code again.

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;


public class SimpleFileIndexer {


public static void main(String[] args) throws Exception {

int i=0;
while(i<10) {
File indexDir = new
File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
File dataDir = new
File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
String suffix = "txt";

SimpleFileIndexer indexer = new SimpleFileIndexer();

int numIndex = indexer.index(indexDir, dataDir, suffix);

System.out.println("Total files indexed " + numIndex);
i++;
Thread.sleep(10000);

}
}



private int index(File indexDir, File dataDir, String suffix)
throws
Exception {

IndexWriter indexWriter = new IndexWriter(
FSDirectory.open(indexDir),
new SimpleAnalyzer(),
true,
IndexWriter.MaxFieldLength.LIMITED);
indexWriter.setUseCompoundFile(false);

indexDirectory(indexWriter, dataDir, suffix);

int numIndexed = indexWriter.maxDoc();
indexWriter.optimize();
indexWriter.close();

return numIndexed;

}

private void indexDirectory(IndexWriter indexWriter, File dataDir,
String suffix) throws IOException {
File[] files = dataDir.listFiles();
for (int i = 0; i < files.length; i++) {
File f = files[i];
if (f.isDirectory()) {
indexDirectory(indexWriter, f, suffix);
}
else {
indexFileWithIndexWriter(indexWriter, f,
suffix);
}
}
}

private void indexFileWithIndexWriter(IndexWriter indexWriter, File
f, String suffix) throws IOException {
if (f.isHidden() || f.isDirectory() || !f.canRead() ||
!f.exists()) {
return;
}
if (suffix!=null && !f.getName().endsWith(suffix)) {
return;
}
System.out.println("Indexing file " +
f.getCanonicalPath());

Document doc = new Document();
doc.add(new Field("contents", new FileReader(f)));
doc.add(new Field("filename", f.getCanonicalPath(),
Field.Store.YES,
Field.Index.ANALYZED));

indexWriter.addDocument(doc);
}

}

so what's the best way for me to integrate RAMDirectory in that source
code before putting them in FSDirectory. any help would be appreciated
though.
thanks


--
http://jacobian.web.id

--
http://jacobian.web.id

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search Discussions

  • Suman Holani at Oct 12, 2010 at 5:39 am
    I am also having a similar requirement .But its other way round.

    Basically ,I have indexes in FSDirectory and which is transferred to
    another machine on regular basis. But now for the reason of faster
    searches, it would be better to copy the indexes onto RAM. (RAMDirectory).

    Not sure of how it can be achieved.


    TIA!


    regds,
    Suman


    On Tue, 12 Oct 2010 08:37:12 +0700, Yakob wrote:
    well actually I am doing a kind of a thesis regarding information
    retrieval.and my tutor wanted me to be able to create a program that
    firstly index a document in memory using RAMDirectory and then
    flushing it to disk using FSDirectory periodically. I was having a
    hard time implementing it in my source code. so that's why I am asking
    what's the best way to put RAMDirectory methods in the source code
    below? maybe you can give me an idea.eventhough I know it'll take
    longer time to index due to the use of RAMDirectory and FSDirectory
    but nonetheless I'd still be interested in how to integrate
    RAMDirectory into FSDirectory.

    thanks

    PS : regarding the hijack thread, I thought because I was the one who
    created it so I think I could hijack it.but thanks for the advice
    though and now I am creating a new thread. :-)
    On 10/12/10, Erick Erickson wrote:
    It's a good idea to start a new thread when asking a different
    question,
    see:
    http://people.apache.org/~hossman/#threadhijack

    <http://people.apache.org/~hossman/#threadhijack>I have to ask why you
    want
    to integrate the RAM directory. If you're using it
    to speed up indexing, you're probably making way more work for yourself
    than you need to. If you're trying to do something with Near Real Time,
    one
    suggestion is to just not bother. Add docs to the RAM directory AND
    your
    FSDirectory simultaneously. The data you index to FSDir won't be
    visible
    until you reopen the FSDir reader, so your flush could then be just
    reopen everything...

    Best
    Erick
    On Mon, Oct 11, 2010 at 1:34 PM, Yakob wrote:
    On 9/29/10, Erick Erickson wrote:
    Nope, never used jNotify, so I don't have any code handy...

    Good luck!
    Erick
    so I did try JNotify but there is seems to be some bug in it that I
    find it hards to integrate in my lucene source code.so I had to try a
    looping option instead.

    http://stackoverflow.com/questions/3840844/error-exception-access-violation-in-jnotify
    so anyway, I had another question now. I was trying to make a lucene
    source code that can do indexing and store them first in a memory
    using RAMDirectory and then flush this index in a memory into a disk
    using FSDirectory. I had done some modifications of this code but to
    no avail. maybe some of you can help me out a bit.
    here is the source code again.

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;


    public class SimpleFileIndexer {


    public static void main(String[] args) throws Exception {

    int i=0;
    while(i<10) {
    File indexDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    File dataDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    String suffix = "txt";

    SimpleFileIndexer indexer = new SimpleFileIndexer();

    int numIndex = indexer.index(indexDir, dataDir,
    suffix);
    System.out.println("Total files indexed " + numIndex);
    i++;
    Thread.sleep(10000);

    }
    }



    private int index(File indexDir, File dataDir, String suffix)
    throws
    Exception {

    IndexWriter indexWriter = new IndexWriter(
    FSDirectory.open(indexDir),
    new SimpleAnalyzer(),
    true,
    IndexWriter.MaxFieldLength.LIMITED);
    indexWriter.setUseCompoundFile(false);

    indexDirectory(indexWriter, dataDir, suffix);

    int numIndexed = indexWriter.maxDoc();
    indexWriter.optimize();
    indexWriter.close();

    return numIndexed;

    }

    private void indexDirectory(IndexWriter indexWriter, File
    dataDir,
    String suffix) throws IOException {
    File[] files = dataDir.listFiles();
    for (int i = 0; i < files.length; i++) {
    File f = files[i];
    if (f.isDirectory()) {
    indexDirectory(indexWriter, f, suffix);
    }
    else {
    indexFileWithIndexWriter(indexWriter,
    f,
    suffix);
    }
    }
    }

    private void indexFileWithIndexWriter(IndexWriter indexWriter,
    File
    f, String suffix) throws IOException {
    if (f.isHidden() || f.isDirectory() || !f.canRead() ||
    !f.exists()) {
    return;
    }
    if (suffix!=null && !f.getName().endsWith(suffix)) {
    return;
    }
    System.out.println("Indexing file " +
    f.getCanonicalPath());

    Document doc = new Document();
    doc.add(new Field("contents", new FileReader(f)));
    doc.add(new Field("filename", f.getCanonicalPath(),
    Field.Store.YES,
    Field.Index.ANALYZED));

    indexWriter.addDocument(doc);
    }

    }

    so what's the best way for me to integrate RAMDirectory in that source
    code before putting them in FSDirectory. any help would be appreciated
    though.
    thanks


    --
    http://jacobian.web.id
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Ian Lea at Oct 12, 2010 at 5:59 am
    Start by reading the javadocs for RAMDirectory.


    --
    Ian.

    On Tue, Oct 12, 2010 at 6:39 AM, wrote:

    I am also having a similar requirement .But its other way round.

    Basically ,I have indexes in FSDirectory and which is transferred to
    another machine on regular basis. But now for the reason of faster
    searches, it would be better to copy the indexes onto RAM. (RAMDirectory).

    Not sure of how it can be achieved.


    TIA!


    regds,
    Suman


    On Tue, 12 Oct 2010 08:37:12 +0700, Yakob wrote:
    well actually I am doing a kind of a thesis regarding information
    retrieval.and my tutor wanted me to be able to create a program that
    firstly index a document in memory using RAMDirectory and then
    flushing it to disk using FSDirectory periodically. I was having a
    hard time implementing it in my source code. so that's why I am asking
    what's the best way to put RAMDirectory methods in the source code
    below? maybe you can give me an idea.eventhough I know it'll take
    longer time to index due to the use of RAMDirectory and FSDirectory
    but nonetheless I'd still be interested in how to integrate
    RAMDirectory into FSDirectory.

    thanks

    PS : regarding the hijack thread, I thought because I was the one who
    created it so I think I could hijack it.but thanks for the advice
    though and now I am creating a new thread. :-)
    On 10/12/10, Erick Erickson wrote:
    It's a good idea to start a new thread when asking a different
    question,
    see:
    http://people.apache.org/~hossman/#threadhijack

    <http://people.apache.org/~hossman/#threadhijack>I have to ask why you
    want
    to integrate the RAM directory. If you're using it
    to speed up indexing, you're probably making way more work for yourself
    than you need to. If you're trying to do something with Near Real Time,
    one
    suggestion is to just not bother. Add docs to the RAM directory AND
    your
    FSDirectory simultaneously. The data you index to FSDir won't be
    visible
    until you reopen the FSDir reader, so your flush could then be just
    reopen everything...

    Best
    Erick
    On Mon, Oct 11, 2010 at 1:34 PM, Yakob wrote:
    On 9/29/10, Erick Erickson wrote:
    Nope, never used jNotify, so I don't have any code handy...

    Good luck!
    Erick
    so I did try JNotify but there is seems to be some bug in it that I
    find it hards to integrate in my lucene source code.so I had to try a
    looping option instead.

    http://stackoverflow.com/questions/3840844/error-exception-access-violation-in-jnotify
    so anyway, I had another question now. I was trying to make a lucene
    source code that can do indexing and store them first in a memory
    using RAMDirectory and then flush this index in a memory into a disk
    using FSDirectory. I had done some modifications of this code but to
    no avail. maybe some of you can help me out a bit.
    here is the source code again.

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;


    public class SimpleFileIndexer {


    public static void main(String[] args) throws Exception   {

    int i=0;
    while(i<10) {
    File indexDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    File dataDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    String suffix = "txt";

    SimpleFileIndexer indexer = new SimpleFileIndexer();

    int numIndex = indexer.index(indexDir, dataDir,
    suffix);
    System.out.println("Total files indexed " + numIndex);
    i++;
    Thread.sleep(10000);

    }
    }



    private int index(File indexDir, File dataDir, String suffix)
    throws
    Exception {

    IndexWriter indexWriter = new IndexWriter(
    FSDirectory.open(indexDir),
    new SimpleAnalyzer(),
    true,
    IndexWriter.MaxFieldLength.LIMITED);
    indexWriter.setUseCompoundFile(false);

    indexDirectory(indexWriter, dataDir, suffix);

    int numIndexed = indexWriter.maxDoc();
    indexWriter.optimize();
    indexWriter.close();

    return numIndexed;

    }

    private void indexDirectory(IndexWriter indexWriter, File
    dataDir,
    String suffix) throws IOException {
    File[] files = dataDir.listFiles();
    for (int i = 0; i < files.length; i++) {
    File f = files[i];
    if (f.isDirectory()) {
    indexDirectory(indexWriter, f, suffix);
    }
    else {
    indexFileWithIndexWriter(indexWriter,
    f,
    suffix);
    }
    }
    }

    private void indexFileWithIndexWriter(IndexWriter indexWriter,
    File
    f, String suffix) throws IOException {
    if (f.isHidden() || f.isDirectory() || !f.canRead() ||
    !f.exists()) {
    return;
    }
    if (suffix!=null && !f.getName().endsWith(suffix)) {
    return;
    }
    System.out.println("Indexing file " +
    f.getCanonicalPath());

    Document doc = new Document();
    doc.add(new Field("contents", new FileReader(f)));
    doc.add(new Field("filename", f.getCanonicalPath(),
    Field.Store.YES,
    Field.Index.ANALYZED));

    indexWriter.addDocument(doc);
    }

    }

    so what's the best way for me to integrate RAMDirectory in that source
    code before putting them in FSDirectory. any help would be appreciated
    though.
    thanks


    --
    http://jacobian.web.id
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Erick Erickson at Oct 12, 2010 at 11:23 pm
    I'm not at all sure this is true. We all start by thinking "of course if the
    entire
    directory were in RAM it would be faster". But lots of work has gone into
    making efficient use of RAM for things like caches etc. So it may not
    make any difference.

    But that doesn't mean you can't try. It's trivial to instantiate a
    RAMdirectory
    by calling the constructor with a Directory. See Ian's comment.

    Best
    Erick
    On Tue, Oct 12, 2010 at 1:39 AM, wrote:


    I am also having a similar requirement .But its other way round.

    Basically ,I have indexes in FSDirectory and which is transferred to
    another machine on regular basis. But now for the reason of faster
    searches, it would be better to copy the indexes onto RAM. (RAMDirectory).

    Not sure of how it can be achieved.


    TIA!


    regds,
    Suman


    On Tue, 12 Oct 2010 08:37:12 +0700, Yakob wrote:
    well actually I am doing a kind of a thesis regarding information
    retrieval.and my tutor wanted me to be able to create a program that
    firstly index a document in memory using RAMDirectory and then
    flushing it to disk using FSDirectory periodically. I was having a
    hard time implementing it in my source code. so that's why I am asking
    what's the best way to put RAMDirectory methods in the source code
    below? maybe you can give me an idea.eventhough I know it'll take
    longer time to index due to the use of RAMDirectory and FSDirectory
    but nonetheless I'd still be interested in how to integrate
    RAMDirectory into FSDirectory.

    thanks

    PS : regarding the hijack thread, I thought because I was the one who
    created it so I think I could hijack it.but thanks for the advice
    though and now I am creating a new thread. :-)
    On 10/12/10, Erick Erickson wrote:
    It's a good idea to start a new thread when asking a different
    question,
    see:
    http://people.apache.org/~hossman/#threadhijack

    <http://people.apache.org/~hossman/#threadhijack>I have to ask why you
    want
    to integrate the RAM directory. If you're using it
    to speed up indexing, you're probably making way more work for yourself
    than you need to. If you're trying to do something with Near Real Time,
    one
    suggestion is to just not bother. Add docs to the RAM directory AND
    your
    FSDirectory simultaneously. The data you index to FSDir won't be
    visible
    until you reopen the FSDir reader, so your flush could then be just
    reopen everything...

    Best
    Erick
    On Mon, Oct 11, 2010 at 1:34 PM, Yakob wrote:
    On 9/29/10, Erick Erickson wrote:
    Nope, never used jNotify, so I don't have any code handy...

    Good luck!
    Erick
    so I did try JNotify but there is seems to be some bug in it that I
    find it hards to integrate in my lucene source code.so I had to try a
    looping option instead.

    http://stackoverflow.com/questions/3840844/error-exception-access-violation-in-jnotify
    so anyway, I had another question now. I was trying to make a lucene
    source code that can do indexing and store them first in a memory
    using RAMDirectory and then flush this index in a memory into a disk
    using FSDirectory. I had done some modifications of this code but to
    no avail. maybe some of you can help me out a bit.
    here is the source code again.

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;


    public class SimpleFileIndexer {


    public static void main(String[] args) throws Exception {

    int i=0;
    while(i<10) {
    File indexDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    File dataDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    String suffix = "txt";

    SimpleFileIndexer indexer = new SimpleFileIndexer();

    int numIndex = indexer.index(indexDir, dataDir,
    suffix);
    System.out.println("Total files indexed " + numIndex);
    i++;
    Thread.sleep(10000);

    }
    }



    private int index(File indexDir, File dataDir, String suffix)
    throws
    Exception {

    IndexWriter indexWriter = new IndexWriter(
    FSDirectory.open(indexDir),
    new SimpleAnalyzer(),
    true,
    IndexWriter.MaxFieldLength.LIMITED);
    indexWriter.setUseCompoundFile(false);

    indexDirectory(indexWriter, dataDir, suffix);

    int numIndexed = indexWriter.maxDoc();
    indexWriter.optimize();
    indexWriter.close();

    return numIndexed;

    }

    private void indexDirectory(IndexWriter indexWriter, File
    dataDir,
    String suffix) throws IOException {
    File[] files = dataDir.listFiles();
    for (int i = 0; i < files.length; i++) {
    File f = files[i];
    if (f.isDirectory()) {
    indexDirectory(indexWriter, f, suffix);
    }
    else {
    indexFileWithIndexWriter(indexWriter,
    f,
    suffix);
    }
    }
    }

    private void indexFileWithIndexWriter(IndexWriter indexWriter,
    File
    f, String suffix) throws IOException {
    if (f.isHidden() || f.isDirectory() || !f.canRead() ||
    !f.exists()) {
    return;
    }
    if (suffix!=null && !f.getName().endsWith(suffix)) {
    return;
    }
    System.out.println("Indexing file " +
    f.getCanonicalPath());

    Document doc = new Document();
    doc.add(new Field("contents", new FileReader(f)));
    doc.add(new Field("filename", f.getCanonicalPath(),
    Field.Store.YES,
    Field.Index.ANALYZED));

    indexWriter.addDocument(doc);
    }

    }

    so what's the best way for me to integrate RAMDirectory in that source
    code before putting them in FSDirectory. any help would be appreciated
    though.
    thanks


    --
    http://jacobian.web.id
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Erick Erickson at Oct 12, 2010 at 11:20 pm
    Try IndexWrite.addIndexes. I confess that I haven't used that since
    2.4, but I suspect it's what you want.

    Best
    Erick

    P.S. hijacking. No problem. It's more a matter of helping find topics
    rather than who started the conversation.....
    On Mon, Oct 11, 2010 at 9:37 PM, Yakob wrote:

    well actually I am doing a kind of a thesis regarding information
    retrieval.and my tutor wanted me to be able to create a program that
    firstly index a document in memory using RAMDirectory and then
    flushing it to disk using FSDirectory periodically. I was having a
    hard time implementing it in my source code. so that's why I am asking
    what's the best way to put RAMDirectory methods in the source code
    below? maybe you can give me an idea.eventhough I know it'll take
    longer time to index due to the use of RAMDirectory and FSDirectory
    but nonetheless I'd still be interested in how to integrate
    RAMDirectory into FSDirectory.

    thanks

    PS : regarding the hijack thread, I thought because I was the one who
    created it so I think I could hijack it.but thanks for the advice
    though and now I am creating a new thread. :-)
    On 10/12/10, Erick Erickson wrote:
    It's a good idea to start a new thread when asking a different question,
    see:
    http://people.apache.org/~hossman/#threadhijack

    <http://people.apache.org/~hossman/#threadhijack>I have to ask why you want
    to integrate the RAM directory. If you're using it
    to speed up indexing, you're probably making way more work for yourself
    than you need to. If you're trying to do something with Near Real Time, one
    suggestion is to just not bother. Add docs to the RAM directory AND your
    FSDirectory simultaneously. The data you index to FSDir won't be visible
    until you reopen the FSDir reader, so your flush could then be just
    reopen everything...

    Best
    Erick
    On Mon, Oct 11, 2010 at 1:34 PM, Yakob wrote:
    On 9/29/10, Erick Erickson wrote:
    Nope, never used jNotify, so I don't have any code handy...

    Good luck!
    Erick
    so I did try JNotify but there is seems to be some bug in it that I
    find it hards to integrate in my lucene source code.so I had to try a
    looping option instead.

    http://stackoverflow.com/questions/3840844/error-exception-access-violation-in-jnotify
    so anyway, I had another question now. I was trying to make a lucene
    source code that can do indexing and store them first in a memory
    using RAMDirectory and then flush this index in a memory into a disk
    using FSDirectory. I had done some modifications of this code but to
    no avail. maybe some of you can help me out a bit.
    here is the source code again.

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;


    public class SimpleFileIndexer {


    public static void main(String[] args) throws Exception {

    int i=0;
    while(i<10) {
    File indexDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    File dataDir = new
    File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
    String suffix = "txt";

    SimpleFileIndexer indexer = new SimpleFileIndexer();

    int numIndex = indexer.index(indexDir, dataDir, suffix);

    System.out.println("Total files indexed " + numIndex);
    i++;
    Thread.sleep(10000);

    }
    }



    private int index(File indexDir, File dataDir, String suffix)
    throws
    Exception {

    IndexWriter indexWriter = new IndexWriter(
    FSDirectory.open(indexDir),
    new SimpleAnalyzer(),
    true,
    IndexWriter.MaxFieldLength.LIMITED);
    indexWriter.setUseCompoundFile(false);

    indexDirectory(indexWriter, dataDir, suffix);

    int numIndexed = indexWriter.maxDoc();
    indexWriter.optimize();
    indexWriter.close();

    return numIndexed;

    }

    private void indexDirectory(IndexWriter indexWriter, File
    dataDir,
    String suffix) throws IOException {
    File[] files = dataDir.listFiles();
    for (int i = 0; i < files.length; i++) {
    File f = files[i];
    if (f.isDirectory()) {
    indexDirectory(indexWriter, f, suffix);
    }
    else {
    indexFileWithIndexWriter(indexWriter, f,
    suffix);
    }
    }
    }

    private void indexFileWithIndexWriter(IndexWriter indexWriter,
    File
    f, String suffix) throws IOException {
    if (f.isHidden() || f.isDirectory() || !f.canRead() ||
    !f.exists()) {
    return;
    }
    if (suffix!=null && !f.getName().endsWith(suffix)) {
    return;
    }
    System.out.println("Indexing file " +
    f.getCanonicalPath());

    Document doc = new Document();
    doc.add(new Field("contents", new FileReader(f)));
    doc.add(new Field("filename", f.getCanonicalPath(),
    Field.Store.YES,
    Field.Index.ANALYZED));

    indexWriter.addDocument(doc);
    }

    }

    so what's the best way for me to integrate RAMDirectory in that source
    code before putting them in FSDirectory. any help would be appreciated
    though.
    thanks


    --
    http://jacobian.web.id

    --
    http://jacobian.web.id

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Yakob at Oct 16, 2010 at 4:48 pm

    On 10/13/10, Erick Erickson wrote:
    Try IndexWrite.addIndexes. I confess that I haven't used that since
    2.4, but I suspect it's what you want.
    ok after I started asking around in stackoverflow regarding this
    problem. someone was kind enough to give an answer. well sort of.

    private int index(File indexDir, File dataDir, String suffix) throws Exception {
    RAMDirectory ramDir = new RAMDirectory(); // 1
    IndexWriter indexWriter = new IndexWriter(
    ramDir, // 2
    new SimpleAnalyzer(),
    true,
    IndexWriter.MaxFieldLength.LIMITED);
    indexWriter.setUseCompoundFile(false);
    indexDirectory(indexWriter, dataDir, suffix);
    int numIndexed = indexWriter.maxDoc();
    indexWriter.optimize();
    indexWriter.close();

    Directory.copy(ramDir, FSDirectory.open(indexDir)); // 3

    return numIndexed;
    }

    what I am asking now is that the last line.

    Directory.copy(ramDir, FSDirectory.open(indexDir)); // 3

    the above line is wrong when I try to compile it, as lucene didn't
    reserve any keyword "copy" in its class. so is there any workaround
    this problem? what's the best solution to copy directory from
    RAMDirectory to FSDirectory besides using copy? thanks.

    http://stackoverflow.com/questions/3913180/how-to-integrate-ramdirectory-into-fsdirectory-in-lucene/3923914#3923914
    --
    http://jacobian.web.id

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedOct 12, '10 at 1:37a
activeOct 16, '10 at 4:48p
posts6
users4
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase