Get all unique values of specific field
Hello everyone,

I have created a Lucene index of a students database. The database has five
fields: Name, Address, Class, PhoneNo and ScholarNo.
I opened a Searcher and ran the query "Name:Menaria", which returns 100
results. Now I want all the unique "Class" values that appear in the returned
Hits object. How can I get the unique Class list without using a loop?

Please advise.

  • Michael Mitiaguin at May 30, 2007 at 7:33 am
    Laxmilal,

    What's the problem with iterating over the hit list? It should be very fast.
    I am not aware of an equivalent of SQL "distinct" in Lucene.
    You didn't say whether you index (and also store) all fields.
    In effect, you could store just a primary key (say an incremental int id,
    for the sake of example; your table doesn't look normalised, but that
    doesn't matter here) as a stored, non-indexed field. The rest may be
    indexed as one combined field (Name + Address + Class ...) or as separate
    fields if you need that, and either stored or not stored.
    Having run the search "Name:Menaria" (and having written all this, I
    realised that you still need to iterate over the hit list, but let me
    finish :) ), you can build a string of comma-delimited IDs and then run

    "select distinct class from mytable where id in (" + formedstring + ")"

    Of course, if you store everything in the Lucene index, there is no need
    to query the database, and you can use collections to pick out the distinct
    "Class" values. As I understand it, though, you still need to iterate
    through Hits.

    Regards
    Michael
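
The collection-based approach mentioned above can be sketched roughly as follows. This is only an illustration, not Lucene code: each hit is modelled as a plain map from stored field name to value, and the names `DistinctFromHits`, `distinctValues` and `hit` are made up for the example. With a real Hits object the loop body would read the stored "Class" field of each hit instead.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for iterating a Lucene Hits object: each hit is
// modelled as a map from stored field name to stored value.
final class DistinctFromHits {

    // Returns the distinct values of `field` in first-seen order.
    static Set<String> distinctValues(List<Map<String, String>> hits, String field) {
        Set<String> seen = new LinkedHashSet<>();
        for (Map<String, String> hit : hits) {
            String value = hit.get(field);
            if (value != null) {   // skip hits that did not store the field
                seen.add(value);
            }
        }
        return seen;
    }

    // Small helper that builds one stand-in hit (assumed data, not from the thread).
    static Map<String, String> hit(String name, String cls) {
        Map<String, String> doc = new HashMap<>();
        doc.put("Name", name);
        doc.put("Class", cls);
        return doc;
    }

    public static void main(String[] args) {
        List<Map<String, String>> hits = Arrays.asList(
            hit("Menaria", "10A"), hit("Menaria", "9B"), hit("Menaria", "10A"));
        System.out.println(distinctValues(hits, "Class"));   // prints [10A, 9B]
    }
}
```

The set does the de-duplication, so the cost is one pass over the hits plus one hash lookup per hit.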
  • Erich Eichinger at May 30, 2007 at 8:43 am
    Hi,

    I guess it doesn't work without any iteration.

    Another option that comes to mind: if the list of possible class names isn't too long, you could do (pseudocode)

    Hashtable resultCount = new Hashtable();
    foreach( string classname in possibleClassNames )
    {
        resultlist = index.SearchFor("Name:Menaria AND Class:" + classname);
        resultCount[classname] = resultlist.Count;
    }

    Cheers,
    Erich
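
To make the per-class counting idea concrete, here is a small self-contained sketch. It does not use the Lucene API: `PerClassCounts` is a hypothetical toy inverted index in which every "field:value" term maps to a bit set of document ids, and the AND query is an intersection of two bit sets. The sample documents are invented.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy inverted index, NOT the Lucene API: each "field:value" term maps to
// the set of document ids that contain it.
final class PerClassCounts {
    private final Map<String, BitSet> postings = new HashMap<>();

    void add(int docId, String field, String value) {
        postings.computeIfAbsent(field + ":" + value, k -> new BitSet()).set(docId);
    }

    // Number of documents containing both terms (the AND query).
    int countAnd(String termA, String termB) {
        BitSet a = postings.get(termA);
        BitSet b = postings.get(termB);
        if (a == null || b == null) {
            return 0;
        }
        BitSet both = (BitSet) a.clone();
        both.and(b);
        return both.cardinality();
    }

    // One AND query per candidate class; classes with no matches are left out.
    Map<String, Integer> countsFor(String nameTerm, List<String> possibleClassNames) {
        Map<String, Integer> resultCount = new LinkedHashMap<>();
        for (String className : possibleClassNames) {
            int n = countAnd(nameTerm, "Class:" + className);
            if (n > 0) {
                resultCount.put(className, n);
            }
        }
        return resultCount;
    }

    public static void main(String[] args) {
        PerClassCounts index = new PerClassCounts();
        index.add(0, "Name", "Menaria"); index.add(0, "Class", "10A");
        index.add(1, "Name", "Menaria"); index.add(1, "Class", "9B");
        index.add(2, "Name", "Sharma");  index.add(2, "Class", "10A");
        System.out.println(index.countsFor("Name:Menaria", Arrays.asList("10A", "9B", "8C")));
    }
}
```

The number of queries grows with the number of candidate class names rather than with the number of hits, which is why this only pays off when that list is short and known in advance.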

  • Laxmilal Menaria at May 30, 2007 at 8:52 am
    Thanks,

    That's okay for a small index, but if the index has millions of records or
    gigabytes of data it will get slow. What would be a better approach?

    --
    Thanks,
    Laxmilal menaria

    http://www.minalyzer.com/
    http://www.chambal.com/
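
One common way to keep the loop cheap on a large index is to avoid loading stored documents for every hit and to look the value up in a per-document array instead. The sketch below assumes a hypothetical `ClassOrdinalCache` that is built once and maps each document id to a small integer (an ordinal) for its Class value. As far as I know, Lucene's FieldCache follows a similar idea, but nothing here calls Lucene and the input data is invented.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-document value cache: built once per index, then reused
// for every query. Not part of Lucene.
final class ClassOrdinalCache {
    private final List<String> values = new ArrayList<>();   // ordinal -> value
    private final int[] ordinalByDoc;                         // doc id -> ordinal

    ClassOrdinalCache(String[] classByDoc) {
        Map<String, Integer> ordinals = new HashMap<>();
        ordinalByDoc = new int[classByDoc.length];
        for (int doc = 0; doc < classByDoc.length; doc++) {
            Integer ord = ordinals.get(classByDoc[doc]);
            if (ord == null) {                 // first time this value is seen
                ord = values.size();
                ordinals.put(classByDoc[doc], ord);
                values.add(classByDoc[doc]);
            }
            ordinalByDoc[doc] = ord;
        }
    }

    // Distinct values among the matching doc ids: one array read and one
    // bit set per hit, no stored-document loading.
    List<String> distinctFor(int[] matchingDocIds) {
        BitSet seen = new BitSet(values.size());
        for (int doc : matchingDocIds) {
            seen.set(ordinalByDoc[doc]);
        }
        List<String> out = new ArrayList<>();
        for (int ord = seen.nextSetBit(0); ord >= 0; ord = seen.nextSetBit(ord + 1)) {
            out.add(values.get(ord));
        }
        return out;
    }

    public static void main(String[] args) {
        ClassOrdinalCache cache = new ClassOrdinalCache(new String[] {"10A", "9B", "10A", "8C"});
        System.out.println(cache.distinctFor(new int[] {0, 2, 1}));   // prints [10A, 9B]
    }
}
```

This is still a loop over the hits, but each step is an array access, so a few million hits stay inexpensive; the trade-off is the memory held by the array.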
  • Laxmilal Menaria at May 30, 2007 at 10:22 am

    Thanks, Karl.

    That works if I implement faceted classification and know what the class
    names are. But if I don't know the class names, what should I do?

    On 5/30/07, karl wettin wrote:

    On 30 May 2007 at 10:51, Laxmilal Menaria wrote:
    (quoting Michael) What's the problem with a hit list iteration (it
    should be very fast)?
    That's okay for a small index, but if the index has millions of records
    or gigabytes of data it will get slow.

    Iterate only the top n results when you gather the unique values. If
    you get a million hits, ask the user to narrow down the search a bit.

    Searching the forum archives for facets or faceted classification
    might also be helpful.

    --
    karl
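
A minimal sketch of the "top n only" suggestion, under the assumption that the Class value of each hit is already available in rank order (here a plain list stands in for the hit list; `TopNDistinct` and its method names are hypothetical):

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-in for a ranked hit list: classOfHit.get(i) is the Class value of
// the i-th best hit. Not Lucene code.
final class TopNDistinct {

    // Collects distinct values from at most the first n hits.
    static Set<String> distinctInTopN(List<String> classOfHit, int n) {
        Set<String> seen = new LinkedHashSet<>();
        int limit = Math.min(n, classOfHit.size());
        for (int i = 0; i < limit; i++) {
            seen.add(classOfHit.get(i));
        }
        return seen;
    }

    // True when hits beyond the cap were ignored, so the caller can ask the
    // user to narrow the search.
    static boolean truncated(List<String> classOfHit, int n) {
        return classOfHit.size() > n;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("10A", "9B", "10A", "8C");
        System.out.println(distinctInTopN(ranked, 3));   // prints [10A, 9B]
        System.out.println(truncated(ranked, 3));        // prints true
    }
}
```

The result is only complete when nothing was truncated; values that occur solely in lower-ranked hits (here "8C") are missed by design.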



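
On the question of what to do when the class names are not known in advance: an inverted index already holds the list of distinct values of each indexed field in its term dictionary, so the candidate names can be read from the index itself. If I recall correctly, Lucene of that era exposed this through IndexReader.terms(); the sketch below does not call Lucene and instead uses a hypothetical toy index (`FieldTermWalker`) backed by a sorted map, with invented sample data.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy sorted term dictionary, NOT the Lucene API: "field:value" -> doc ids.
final class FieldTermWalker {
    private final TreeMap<String, BitSet> terms = new TreeMap<>();

    void add(int docId, String field, String value) {
        terms.computeIfAbsent(field + ":" + value, k -> new BitSet()).set(docId);
    }

    // Doc ids matching a single term, e.g. Name:Menaria.
    BitSet docsFor(String field, String value) {
        BitSet b = terms.get(field + ":" + value);
        return b == null ? new BitSet() : (BitSet) b.clone();
    }

    // Every value of `field` that occurs in at least one matching document.
    // The candidate values come from the index, not from a predefined list.
    List<String> valuesIn(String field, BitSet matchingDocs) {
        String prefix = field + ":";
        List<String> out = new ArrayList<>();
        // tailMap starts at the first term >= prefix; stop once the field changes.
        for (Map.Entry<String, BitSet> e : terms.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            if (e.getValue().intersects(matchingDocs)) {
                out.add(e.getKey().substring(prefix.length()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        FieldTermWalker index = new FieldTermWalker();
        index.add(0, "Name", "Menaria"); index.add(0, "Class", "10A");
        index.add(1, "Name", "Menaria"); index.add(1, "Class", "9B");
        index.add(2, "Name", "Sharma");  index.add(2, "Class", "8C");
        System.out.println(index.valuesIn("Class", index.docsFor("Name", "Menaria")));
    }
}
```

The work here is proportional to the number of distinct Class values rather than the number of hits, which suits fields with few distinct values and many matching documents.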
  • Digy at May 30, 2007 at 4:54 pm
    Hi Laxmilal,

    Why don't you use a database for your application? It seems like a DB
    application.

    DIGY

  • Laxmilal Menaria at May 31, 2007 at 5:02 am
    Thanks Digy,

    But if I index files (e.g. xls files or logs) instead of a database, how
    can I find the distinct values from the index?

  • Digy at May 31, 2007 at 4:34 pm
    Hi Laxmilal,

    I am sorry, but I don't understand what you are trying to do. You wrote:

    "I have created a Lucene index of a students database. The database has
    five fields: Name, Address, Class, PhoneNo and ScholarNo. I opened a
    Searcher and ran the query "Name:Menaria", which returns 100 results.
    Now I want all the unique "Class" values that appear in the returned
    Hits object. How can I get the unique Class list without using a loop?"

    This is a typical DB application, where you would query like
    "select distinct class from students where name = 'Menaria'".
    You can use, for example, an embedded database of your choice for this
    purpose.

    "Indexing files", on the other hand, is about full-text search, and it
    will probably not provide the functionality you want unless you write
    some code. In addition, some databases claim to have text search
    capabilities, but I don't think they will be as good as Lucene when it
    comes to full-text search.






Discussion Overview
group: lucene-net-user @
categories: lucene
posted: May 30, '07 at 7:01a
active: May 31, '07 at 4:34p
posts: 8
users: 4
website: lucene.apache.org
