FAQ
Hi,
thanks for the reply. See: http://lucene.apache.org/java/2_4_1/api/index.html
There you will find the SimilarityQueries class, which I used to build and run a query that gets the similarity between the two strings. I did the following:
I created a doc:
doc.add(new Field("term","this expression of galectin-1 in blood vessel walls was correlated with vascular", Field.Store.YES,Field.Index.TOKENIZED));
Then I indexed it and ran the following similarity query to get the cosine similarity:
query=SimilarityQueries.formSimilarQuery("this expression of galectin-1 in blood vessel walls was correlated with vascular",analyzer,"term",null);
ScoreDoc[] scoreDocs = searcher.search(query,5).scoreDocs;
I got the score mentioned above (0.3044460713863373).
Thanks.
Kamal
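
For comparison, here is a minimal sketch of a plain cosine similarity over term-frequency vectors, which is the kind of measure that yields roughly 1.0 for two identical texts. It does not use Lucene at all: the class name CosineSketch and the crude tokenizer are assumptions made for illustration, not part of the Lucene API. As far as I know, the raw score returned by searcher.search(...) is not normalized to the range 0..1, so it is not directly comparable to this value.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: plain cosine similarity
// between the term-frequency vectors of two strings.
public class CosineSketch {

    // Lowercase and split on non-alphanumeric characters.
    // This is only a crude stand-in for a real analyzer.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> tf) {
        double sum = 0.0;
        for (int count : tf.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Dot product of the two vectors divided by the product of their lengths.
    // Returns 0.0 if either text has no tokens.
    static double cosine(String a, String b) {
        Map<String, Integer> ta = termFrequencies(a);
        Map<String, Integer> tb = termFrequencies(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            Integer other = tb.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(ta);
        double nb = norm(tb);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }

    public static void main(String[] args) {
        String text = "this expression of galectin-1 in blood vessel walls "
                + "was correlated with vascular";
        // Identical inputs give a value of approximately 1.0
        // (up to floating-point rounding).
        System.out.println(cosine(text, text));
    }
}
```

If a value in the range 0..1 is what you need, computing it this way outside the index is one option. The Lucene score itself is better inspected with the explain capabilities than compared against 1.0.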
Original Message:

What is SimilarityQueries? I'd try the explain capabilities to see
more.

On May 5, 2009, at 2:23 PM, Kamal Najib wrote:

> hi all,
> I got the similarity score 0.3044460713863373 between two docs which
> have the same text content. Is that correct? I expected 1.0. Here is
> my result line:
>
> doc:"this expression of galectin-1 in blood vessel walls was
> correlated with vascular"
> doc2 :"this expression of galectin-1 in blood vessel walls was
> correlated with vascular" Score :"0.3044460713863373"
> is the score correct?
> my method is:
> public double getSimilarity(String v1,String v2) throws Exception
> {
>     float result=0;
>     directory = new RAMDirectory();
>     Analyzer analyzer = new StandardAnalyzer();
>     IndexWriter writer = new IndexWriter(directory, analyzer,
>         true, IndexWriter.MaxFieldLength.LIMITED);
>
>     Document doc1 = new Document();
>     doc1.add(new Field("term",v1, Field.Store.YES,
>         Field.Index.TOKENIZED));
>     writer.addDocument(doc1);
>     writer.close();
>     IndexReader ir=IndexReader.open(directory);
>     IndexSearcher searcher = new IndexSearcher(directory);
>     Query query=SimilarityQueries.formSimilarQuery(v2,analyzer,"term",null);
>     ScoreDoc[] scoreDocs = searcher.search(query,5).scoreDocs;
>     int docNum = scoreDocs[0].doc;
>     result = scoreDocs[0].score;
>     Document hitDoc = searcher.doc(docNum);
>     System.out.println("Term 1 :"+v2+" Term2:"+hitDoc.get("term")+" Score :"+result);
>     return result;
> }
> please help.
> thanks in advance.
> Kamal
> --
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search

--


  • Grant Ingersoll at May 7, 2009 at 8:21 pm
    What does the searcher.explain() method say?

    -Grant
    On May 6, 2009, at 2:18 AM, Kamal Najib wrote:

    > Hi,
    > thanks for the reply. See: http://lucene.apache.org/java/2_4_1/api/index.html
    > There you will find the SimilarityQueries class, which I used to build
    > and run a query that gets the similarity between the two strings.
    > I did the following:
    > I created a doc:
    > doc.add(new Field("term","this expression of galectin-1 in blood vessel walls was correlated with vascular", Field.Store.YES,Field.Index.TOKENIZED));
    > Then I indexed it and ran the following similarity query to get the
    > cosine similarity:
    > query=SimilarityQueries.formSimilarQuery("this expression of galectin-1 in blood vessel walls was correlated with vascular",analyzer,"term",null);
    > ScoreDoc[] scoreDocs = searcher.search(query,5).scoreDocs;
    > I got the score mentioned above (0.3044460713863373).
    > Thanks.
    > Kamal

Discussion Overview
group: java-user
categories: lucene
posted: May 6, '09 at 9:19a
active: May 7, '09 at 8:21p
posts: 2
users: 2
website: lucene.apache.org
