Hello everyone,

My name is Lars. I'm new to Java and especially to Lucene. To give you some
context, here is a short summary of what I'm trying to do:

I'm a student from Germany, currently working at our library as a student
assistant. I study Bioinformatics/Biosystems Engineering, and because of that
the head of our library asked me to help him with some technical matters on
our homepage. We currently have only a MySQL-based search script for our
users and now want to add full-text search to improve search performance for
them. My idea was to use Lucene, which I think is the best way. I don't want
to use the Zend Framework for that; I need to program in Java in any case. I
want to build a small program that works in the background, with a frontend
for the users.

Here is what I have done so far:

1. I downloaded Lucene 3.1.0 core (I think it's currently the newest version?)
2. I searched the web for a small tutorial to get into the subject and
found the page http://www.lucenetutorial.com
3. I copy-pasted the example code from that page to see whether I can run
it as-is

Here is my first problem: I'm not able to compile the code. The problem lies
in the imported packages. I tried to compile the code with the following
command line:

javac TextFileIndexer.java -classpath ../Lucene/lucene-core-3.1.0/org/apache/lucene

I tried multiple paths to the Lucene package, but the result was always the
same:


TextFileIndexer.java:3: package org.apache.lucene.analysis.standard does not exist
import org.apache.lucene.analysis.standard.StandardAnalyzer;
^
TextFileIndexer.java:4: package org.apache.lucene.document does not exist
import org.apache.lucene.document.Document;
^
TextFileIndexer.java:5: package org.apache.lucene.document does not exist
import org.apache.lucene.document.Field;
^
TextFileIndexer.java:6: package org.apache.lucene.index does not exist
import org.apache.lucene.index.IndexWriter;
^
TextFileIndexer.java:17: cannot find symbol
symbol : class IndexWriter
location: class com.lucenetutorial.apps.TextFileIndexer
private IndexWriter writer;
^
TextFileIndexer.java:69: cannot find symbol
symbol : class IndexWriter
location: class com.lucenetutorial.apps.TextFileIndexer
writer = new IndexWriter(indexDir, new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
^
TextFileIndexer.java:69: cannot find symbol
symbol : class StandardAnalyzer
location: class com.lucenetutorial.apps.TextFileIndexer
writer = new IndexWriter(indexDir, new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
^
TextFileIndexer.java:69: package IndexWriter does not exist
writer = new IndexWriter(indexDir, new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
^
TextFileIndexer.java:89: cannot find symbol
symbol : class Document
location: class com.lucenetutorial.apps.TextFileIndexer
Document doc = new Document();
^
TextFileIndexer.java:89: cannot find symbol
symbol : class Document
location: class com.lucenetutorial.apps.TextFileIndexer
Document doc = new Document();
^
TextFileIndexer.java:95: cannot find symbol
symbol : class Field
location: class com.lucenetutorial.apps.TextFileIndexer
doc.add(new Field("contents", fr));
^
TextFileIndexer.java:100: cannot find symbol
symbol : class Field
location: class com.lucenetutorial.apps.TextFileIndexer
doc.add(new Field("path", fileName,
^
TextFileIndexer.java:101: package Field does not exist
Field.Store.YES,
^
TextFileIndexer.java:102: package Field does not exist
Field.Index.NOT_ANALYZED));
^
14 errors


So my questions:

Can you please help me a little and tell me why I'm not able to compile
the code?
Is Lucene the best choice for this task, or should I use a Lucene port
like Solr?
What files do I need to ship with an application that uses Lucene?

Thank you very much for your help.

Best regards

Lars

P.S.:

Here is the code of TextFileIndexer.java:

package com.lucenetutorial.apps;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

import java.io.*;
import java.util.ArrayList;

/**
 * This terminal application creates an Apache Lucene index in a folder and adds files into this index
 * based on the input of the user.
 */
public class TextFileIndexer {

    private IndexWriter writer;
    private ArrayList<File> queue = new ArrayList<File>();

    public static void main(String[] args) throws IOException {
        System.out.println("Enter the path where the index will be created: ");

        BufferedReader br = new BufferedReader(
                new InputStreamReader(System.in));
        String s = br.readLine();

        TextFileIndexer indexer = null;
        try {
            indexer = new TextFileIndexer(s);
        } catch (Exception ex) {
            System.out.println("Cannot create index..." + ex.getMessage());
            System.exit(-1);
        }

        //===================================================
        //read input from user until he enters q for quit
        //===================================================
        while (!s.equalsIgnoreCase("q")) {
            try {
                System.out.println("Enter the file or folder name to add into the index (q=quit):");
                System.out.println("[Acceptable file types: .xml, .htm, .html, .txt]");
                s = br.readLine();
                if (s.equalsIgnoreCase("q")) {
                    break;
                }

                //try to add file into the index
                indexer.indexFileOrDirectory(s);
            } catch (Exception e) {
                System.out.println("Error indexing " + s + " : " + e.getMessage());
            }
        }

        //===================================================
        //after adding, we always have to call the
        //closeIndex, otherwise the index is not created
        //===================================================
        indexer.closeIndex();
    }

    /**
     * Constructor
     * @param indexDir the name of the folder in which the index should be created
     * @throws java.io.IOException
     */
    TextFileIndexer(String indexDir) throws IOException {
        // the boolean true parameter means to create a new index everytime,
        // potentially overwriting any existing files there.
        writer = new IndexWriter(indexDir, new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
    }

    /**
     * Indexes a file or directory
     * @param fileName the name of a text file or a folder we wish to add to the index
     * @throws java.io.IOException
     */
    public void indexFileOrDirectory(String fileName) throws IOException {
        //===================================================
        //gets the list of files in a folder (if user has submitted
        //the name of a folder) or gets a single file name (if user
        //has submitted only the file name)
        //===================================================
        listFiles(new File(fileName));

        int originalNumDocs = writer.numDocs();
        for (File f : queue) {
            FileReader fr = null;
            try {
                Document doc = new Document();

                //===================================================
                // add contents of file
                //===================================================
                fr = new FileReader(f);
                doc.add(new Field("contents", fr));

                //===================================================
                //adding second field which contains the path of the file
                //===================================================
                doc.add(new Field("path", fileName,
                        Field.Store.YES,
                        Field.Index.NOT_ANALYZED));

                writer.addDocument(doc);
                System.out.println("Added: " + f);
            } catch (Exception e) {
                System.out.println("Could not add: " + f);
            } finally {
                if (fr != null) { fr.close(); } // fr stays null if the file could not be opened
            }
        }

        int newNumDocs = writer.numDocs();
        System.out.println("");
        System.out.println("************************");
        System.out.println((newNumDocs - originalNumDocs) + " documents added.");
        System.out.println("************************");

        queue.clear();
    }

    private void listFiles(File file) {
        if (!file.exists()) {
            System.out.println(file + " does not exist.");
        }
        if (file.isDirectory()) {
            for (File f : file.listFiles()) {
                listFiles(f);
            }
        } else {
            String filename = file.getName().toLowerCase();
            //===================================================
            // Only index text files
            //===================================================
            if (filename.endsWith(".htm") || filename.endsWith(".html") ||
                    filename.endsWith(".xml") || filename.endsWith(".txt")) {
                queue.add(file);
            } else {
                System.out.println("Skipped " + filename);
            }
        }
    }

    /**
     * Close the index.
     * @throws java.io.IOException
     */
    public void closeIndex() throws IOException {
        writer.optimize();
        writer.close();
    }
}

--
View this message in context: http://lucene.472066.n3.nabble.com/General-Questions-and-some-Problems-tp2858378p2858378.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Tales at Apr 24, 2011 at 10:07 pm
    bump

  • Tales at Apr 24, 2011 at 10:09 pm
    test

  • Em at Apr 24, 2011 at 10:21 pm
    Hi Lars,

    In short, without completely reading through your code, I suggest you
    use Solr. Solr is better for beginners like you with little experience
    in Lucene and Java, and it gives you many built-in options, from caches
    to facets, out of the box.

    All you need to use Solr is a servlet container (I prefer Tomcat over
    Jetty for production, but some people say that Jetty is okay too), and
    that's it.

    Regards,
    Em

  • Simon Willnauer at Apr 25, 2011 at 10:12 am
    I briefly read through it, and it seems that the classpath is wrong. It should be:

    javac TextFileIndexer.java -classpath ../Lucene/lucene-core-3.1.0/

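    provided this is the directory that contains org/apache/lucene/*.class | **/*.class.
    The classpath entry has to be the root of the package tree (the directory
    that contains org/), not a directory inside it.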

    simon
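
To illustrate the classpath point: javac resolves an import by turning the package name into a directory path beneath each classpath entry. The following is a minimal sketch and not part of the original thread. The class name ClasspathRootCheck and its two helper methods are made up for illustration, and it only handles directory entries, not jar files.

```java
import java.io.File;

public class ClasspathRootCheck {

    // Turn a fully qualified class name into the relative path that is
    // looked up beneath each directory on the classpath.
    public static String toRelativePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True if the given directory contains the class file at the location
    // implied by the class's package name.
    public static boolean containsClass(File classpathRoot, String className) {
        return new File(classpathRoot, toRelativePath(className)).isFile();
    }

    public static void main(String[] args) {
        // One of the classes the compiler could not find in this thread.
        String className = "org.apache.lucene.index.IndexWriter";
        System.out.println("Looking for: " + toRelativePath(className));

        // Each argument is a candidate classpath directory to test.
        for (String dir : args) {
            System.out.println(dir + " -> " + containsClass(new File(dir), className));
        }
    }
}
```

Run with both directories from this thread as arguments, it should report true only for a directory that directly contains org/. If the Lucene download is a jar file rather than unpacked class files, the jar file itself has to be the classpath entry instead of a directory.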

Discussion Overview
group: java-user @
categories: lucene
posted: Apr 24, '11 at 7:17p
active: Apr 25, '11 at 10:12a
posts: 5
users: 3
website: lucene.apache.org

3 users in discussion

Tales: 3 posts, Simon Willnauer: 1 post, Em: 1 post
