Hi
I have been playing around with the SpellChecker class and so far it looks
really good. While developing a test case to show it working I came across a
couple of issues. I have resolved them, but I'm not certain my approach is
correct, so I would be grateful if anyone could tell me whether it is or
whether I should try something else.

1) Multiple indexes:
I have multiple indexes which store different documents based on subject
matter. In order to run the spell checking against all of the indexes, I did
something like this:

IndexReader spellReader = IndexReader.open(fsDirectory1);
IndexReader spellReader2 = IndexReader.open(fsDirectory2);

// Present both document indexes as a single reader
MultiReader multiReader = new MultiReader(new IndexReader[] {spellReader, spellReader2});

// Use the terms of the "content" field as the dictionary
LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader, "content");

// One spell index shared by all document indexes
Directory spellDirectory = FSDirectory.getDirectory(<single index for spellcheck>);
SpellChecker spellChecker = new SpellChecker(spellDirectory);
spellChecker.indexDictionary(luceneDictionary);

Is this an acceptable approach, or should there be a spellcheck index for
each separate document index?

2) Composite queries, e.g. luciene OR doqument

In order to handle a query like this, I did the following:

QueryParser queryParser = new AnalyzingQueryParser("content", analyzer);
String input = "luciene OR doqument";
Query query = queryParser.parse(input);

// Render the parsed query as text, then split it on spaces
String input2 = query.toString("content");
String[] splitString = input2.split(" ");

For each string in the array I then called suggestSimilar(..).

Is this the most appropriate way of doing this?
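
One way to picture the per-term approach described above is the minimal sketch below. It is an illustration, not Lucene code: `QueryTermCorrector` is a hypothetical helper, and the `suggester` function is an assumed stand-in for a call such as `spellChecker.suggestSimilar(term, 1)`. Unlike the snippet above, it works on the raw input string rather than on `query.toString("content")`, so the boolean operators are still present and have to be skipped. It does not handle phrases, field prefixes, or wildcards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Lucene.
class QueryTermCorrector {
    // Boolean operators that must be passed through without spell checking.
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    /**
     * Rebuilds the query with each plain term replaced by the suggester's answer.
     * The suggester is an assumed stand-in for the spell checker; it returns
     * null when it has no suggestion, in which case the term is kept as typed.
     */
    static String correct(String query, Function<String, String> suggester) {
        List<String> out = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty() || OPERATORS.contains(token)) {
                out.add(token); // keep operators (and the empty query) untouched
                continue;
            }
            String suggestion = suggester.apply(token);
            out.add(suggestion != null ? suggestion : token);
        }
        return String.join(" ", out);
    }
}
```

Keeping the suggester behind a plain function makes the splitting logic testable without a spell index on disk.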



Any help would be appreciated.


Cheers

Amin


  • Amin Mohammed-Coleman at Apr 11, 2009 at 9:00 pm
    Hi
    Another thing I was wondering about is where the construction of the
    spell index belongs. Where is the most appropriate place to create it?


    For example:

    IndexReader spellReader = IndexReader.open(fsDirectory1);
    IndexReader spellReader2 = IndexReader.open(fsDirectory2);

    // Present both document indexes as a single reader
    MultiReader multiReader = new MultiReader(new IndexReader[] {spellReader, spellReader2});

    // Use the terms of the "content" field as the dictionary
    LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader, "content");

    // One spell index shared by all document indexes
    Directory spellDirectory = FSDirectory.getDirectory(<single index for spellcheck>);
    SpellChecker spellChecker = new SpellChecker(spellDirectory);
    spellChecker.indexDictionary(luceneDictionary);

    Should this be applied when doing a search or when a document is indexed?
    Should I clear the spell index when the main index changes?


    I also noticed when running some tests that the spell index contained
    numbers from the text extracted from a document. Is there a way to
    include only alphabetic characters in the indexDictionary process?
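
The filtering being asked about could look something like the sketch below. It assumes the dictionary words can be obtained as an `Iterator<String>`; `AlphabeticWordFilter` is a hypothetical helper, and wiring the filtered words back into whatever is passed to `indexDictionary` is left out because it depends on the Lucene version in use.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene.
class AlphabeticWordFilter {
    /** True when the word is non-empty and every character is a letter. */
    static boolean isAlphabetic(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        for (int i = 0; i < word.length(); i++) {
            if (!Character.isLetter(word.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Copies only the purely alphabetic words from the source iterator, in order. */
    static List<String> filter(Iterator<String> words) {
        List<String> kept = new ArrayList<>();
        while (words.hasNext()) {
            String word = words.next();
            if (isAlphabetic(word)) {
                kept.add(word);
            }
        }
        return kept;
    }
}
```

`Character.isLetter` accepts letters from any script, so accented and non-Latin words are kept; a stricter ASCII-only check would be a different design choice.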



    Any help would be appreciated.


    Cheers
  • Amin Mohammed-Coleman at Apr 15, 2009 at 6:58 am
    Hi

    Apologies for bringing this mail up again. I have resolved some of the
    issues that I originally started with, including composite queries.
    However, I have one remaining question which I would be grateful if
    someone could assist me with.

    I have a class which creates the spell index, but I'm not sure where to
    apply it. Do I run this process whenever a user uploads a new file (which
    kicks off the indexing process)? That may not be the most appropriate
    place, as I have one spell index and 4 document indexes. I'm wondering
    what the general approach is. Also, whenever the indexes change, should I
    clear the spell index and start again?
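
One possible arrangement, offered only as a sketch and not as an established practice, is to decouple the spell-index rebuild from uploads: each indexing run just marks the spell index as stale, and a separate periodic job rebuilds it at most once per check. `SpellIndexRefresher` is a hypothetical class, and the `rebuild` action is an assumed stand-in for the class that creates the spell index.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper for illustration; not part of Lucene.
class SpellIndexRefresher {
    private final AtomicBoolean dirty = new AtomicBoolean(false);
    private final Runnable rebuild; // assumed stand-in for the spell-index builder

    SpellIndexRefresher(Runnable rebuild) {
        this.rebuild = rebuild;
    }

    /** Called after any of the document indexes has changed. */
    void markDirty() {
        dirty.set(true);
    }

    /**
     * Called from a periodic job. Runs the rebuild only if something changed
     * since the last run, and reports whether a rebuild happened.
     */
    boolean refreshIfDirty() {
        if (dirty.compareAndSet(true, false)) {
            rebuild.run();
            return true;
        }
        return false;
    }
}
```

With this shape, several uploads between two checks cost a single rebuild, regardless of how many document indexes they touched.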


    Once again apologies for bringing this up.


    Cheers
    Amin

Discussion Overview
group: java-user
category: lucene
posted: Apr 10, '09 at 1:28p
active: Apr 15, '09 at 6:58a
posts: 3
users: 1
website: lucene.apache.org
