Hi,

I have built an index using the StandardAnalyzer, and am now querying
that index, also with the StandardAnalyzer. However, I get results I don't
understand.

I know there are a few documents about Brazil in my corpus. My corpus
is in Dutch, and the Dutch term used is "Brazilië".

I query the index in three ways for the string "Brazilië":
1) by entering a query on the command line
2) by hardcoding the query in my Java code
3) by reading in a query file, with one query in it.

The first two give me documents back, but the third doesn't.

###CODE SNIPPET####

// The field that I need is called skos:prefLabel.
QueryParser parser = new QueryParser("skos:prefLabel", new StandardAnalyzer());
Directory fsDir = FSDirectory.getDirectory(indexDir, false);
IndexSearcher is = new IndexSearcher(fsDir);

//1) Query is command line argument
Query query = parser.parse(queryarg);
System.out.println(query.toString());
Hits hits = is.search(query);
System.out.println(hits.length());

//2) Query is hardcoded
query = parser.parse("Brazilië");
System.out.println(query.toString());
hits = is.search(query);
System.out.println(hits.length());

//3) Query is in separate queryfile called testquery.txt
// queries is a String[] into which I read all (in this case only one)
// queries from the file testquery.txt
query = parser.parse(queries[0]);
System.out.println(query.toString());
hits = is.search(query);
System.out.println(hits.length());

#################


Now when I execute my code with:
java MyQueryPoser2 Brazilië

I get:
skos:prefLabel:brazili?
4
skos:prefLabel:brazili?
4
skos:prefLabel:brazili


The parsed queries are different, and so are the result lists, across
the three methods. Why doesn't the query from the file give me any results?

I am working on a Mac. Could this be due to some special file encoding?

The query file does contain the right string:
lauras-macbook-pro:MetLucene laura$ cat testquery.txt
Brazilië
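
One way to check what the program actually received is to dump the code points of each query string. Seeing the right glyphs in `cat` only indicates that the terminal and the file agree on an encoding; it says nothing about how Java decoded the bytes. The sketch below is illustrative only: the class name `CodePointDump` and the helper `dump` are hypothetical, it assumes Java 8 or later, and it uses ISO-8859-1 purely as a stand-in for whatever the platform default charset happens to be.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original program.
class CodePointDump {

    // Formats every code point of s as U+XXXX, separated by single spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // The bytes a UTF-8 encoded testquery.txt would contain for "Brazilië".
        byte[] utf8 = "Brazili\u00EB".getBytes(StandardCharsets.UTF_8);

        // Decoded with the matching charset: the last code point is U+00EB.
        System.out.println(dump(new String(utf8, StandardCharsets.UTF_8)));

        // Decoded with a mismatched charset (ISO-8859-1 as an assumed example):
        // the two UTF-8 bytes of "ë" become two separate characters, U+00C3 U+00AB.
        System.out.println(dump(new String(utf8, StandardCharsets.ISO_8859_1)));
    }
}
```

If the query read from the file shows two code points where a single U+00EB is expected, the file was probably decoded with a charset other than the one it was saved in.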


Thanks for any help.

Laura


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

  • Ian Lea at Apr 27, 2009 at 1:25 pm
    Hi


    The problem may well lie with the reading of the queries from disk
    into your program.
    Using an InputStreamReader with the correct encoding (UTF-8?) should solve it.
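
A minimal sketch of that suggestion, assuming the query file is saved as UTF-8 and holds one query per line. The class `QueryFileReader` and the method `readQueries` are hypothetical names, not code from the original program, and the sketch assumes Java 7 or later for `java.nio.file` and try-with-resources. The point is only that the charset is named explicitly instead of being left to the platform default.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original program.
class QueryFileReader {

    // Reads one query per line, always decoding the bytes as UTF-8
    // (an assumption about how the file was saved). Blank lines are skipped.
    static List<String> readQueries(Path file) throws IOException {
        List<String> queries = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    queries.add(line);
                }
            }
        }
        return queries;
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 query file to a temporary location, then read it back.
        Path file = Files.createTempFile("testquery", ".txt");
        Files.write(file, "Brazili\u00EB\n".getBytes(StandardCharsets.UTF_8));

        List<String> queries = readQueries(file);
        // True when the "ë" survived the round trip as a single character.
        System.out.println(queries.get(0).equals("Brazili\u00EB"));

        Files.delete(file);
    }
}
```

The resulting strings could then be passed to `parser.parse(...)` in place of `queries[0]` from the snippet in the question.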


    --
    Ian.

    On Mon, Apr 27, 2009 at 12:32 PM, Laura Hollink wrote:
    [snip]
  • Laura Hollink at May 28, 2009 at 3:21 pm
    Thanks Ian!
    You were right, the problem was simply the InputStreamReader.

    (Sorry this reply is a bit off-topic now, I didn't have time to work
    on this specific problem earlier).

    Laura
    On Apr 27, 2009, at 3:24 PM, Ian Lea wrote:
    [snip]

Discussion Overview
group: java-user
category: lucene
posted: Apr 27, '09 at 12:26p
active: May 28, '09 at 3:21p
posts: 3
users: 2
website: lucene.apache.org

2 users in discussion

Laura Hollink: 2 posts Ian Lea: 1 post