93 discussions - 389 posts

  • Hi I am looking at building a faceted search using Lucene. I know that Solr comes with this built in, however I would like to try this by myself (something to add to my CV!). I have been looking ...
    Amin Mohammed-Coleman
    Feb 16, 2009 at 8:38 pm
    Mar 2, 2009 at 9:40 pm
  • Hi, Has there been any work done on getting confidence scores at runtime, so that scores of documents can be compared across queries? I found one reference in the mailing list to some work in 2003, ...
    Ken Williams
    Feb 20, 2009 at 6:01 pm
    Mar 6, 2009 at 12:10 am
  • Hello everybody, In my research work, I use Lucene to index and search into text documents. At present, I just index and search for single words. I want to extend this to phrases (or nGrams). Could ...
    Nada Mimouni
    Feb 18, 2009 at 9:59 am
    Feb 23, 2009 at 7:17 pm
  • Hi, I am trying to implement a kind of faceted search using Lucene 2.4.0. I have a list of configuration rules that tell me how to generate this facets and the corresponding queries (that can range ...
    Raffaella Ventaglio
    Feb 7, 2009 at 6:57 pm
    Feb 17, 2009 at 6:12 pm
  • Hi, what is the best approach to merge a database index with a lucene fulltext index? Both databases store a unique ID per doc. This is the join criteria. requirements: * both resultsets may be very ...
    Feb 28, 2009 at 7:08 pm
    Mar 2, 2009 at 12:02 pm
  • The explanation of scores from the same document returned from 2 similar queries differ in an unexpected way. There are 2 fields involved, 'contents' and 'literals'. The 'literals' field has setBoost ...
    Peter Keegan
    Feb 20, 2009 at 9:03 pm
    Mar 2, 2009 at 4:25 pm
  • Hi, I've actually posted this message in the dev mailing list earlier, because I thought my 'issue' is a limitation of the functionality of Lucene, but they redirected me to this mailing list, so I hope ...
    Feb 13, 2009 at 8:05 am
    Feb 14, 2009 at 11:28 am
  • Hi all, My search backends are only able to eke out 13-15 qps even with the entire index in memory (this makes it very expensive to scale). According to my YourKit profiler 80% of the program's time ...
    Michael Stoppelman
    Feb 3, 2009 at 7:24 am
    Feb 5, 2009 at 10:13 pm
  • We have our own Analyzer which has the following: public final TokenStream tokenStream(String fieldname, Reader reader) { TokenStream result = new StandardTokenizer(reader); result = new ...
    Philip Puffinburger
    Feb 17, 2009 at 12:19 am
    Feb 21, 2009 at 5:19 pm
  • Hi, in what order does search(Query query, HitCollector results) return the results? By relevance? Thank you.
    Feb 15, 2009 at 5:32 pm
    Feb 15, 2009 at 8:58 pm
  • Hi List, How to find if an empty lucene index has been created for the very first time? Is the generation number 1 enough to determine this? -- Regards, Akshay K. Ukey.
    Feb 11, 2009 at 3:52 pm
    Feb 11, 2009 at 6:23 pm
  • Hi, I've got a weird problem with a Lucene index, using 2.3.1. The index contains 6660 files. I don't know how this happened. Maybe someone can tell me something about the files themselves? (examples ...
    John Byrne
    Feb 3, 2009 at 3:27 pm
    Feb 4, 2009 at 12:20 pm
  • I am using netbeans on windows to test lucene. I have added all the lib files from the /lib directory to my project library. down the end of Indexer.java program, it states the Field.Text method is ...
    Seid Mohammed
    Feb 19, 2009 at 3:41 pm
    Feb 21, 2009 at 9:54 am
  • Hi, I have a number of documents that each relate to a client. I would like to use an index and queries to answer two question: - Find relevant documents - Find relevant clients The first one is ...
    Feb 16, 2009 at 8:01 pm
    Feb 17, 2009 at 9:43 pm
  • Hi, Let's say I have a single document with 2 fields (namely Field1 and Field2). 2 values are added to each field like below. // Add 2 values to Field1. doc.Add (new Field ("Field1", "A", ...
    Dragon Fly
    Feb 11, 2009 at 10:56 pm
    Feb 12, 2009 at 5:58 pm
  • As per Lucene documentation - "For good search performance, implementations of this method should not call Searcher.doc(int) or IndexReader.document(int) on every document number encountered. Doing ...
    Feb 2, 2009 at 11:54 am
    Feb 3, 2009 at 1:39 pm
  • Why is this code not returning any results? //Create the query and search QueryParser queryParser = new QueryParser("contents", new StandardAnalyzer()); Query query = ...
    Chetan Shah
    Feb 26, 2009 at 6:45 pm
    Mar 5, 2009 at 11:27 pm
  • Looking into TopDocCollector code, I have some questions: * How can a hit have a score of <=0? * What happens if the first hit has the highest score of all hits? It seems that topDocs would then ...
    Feb 27, 2009 at 11:43 am
    Feb 28, 2009 at 2:04 pm
  • Hello everybody, 1) What is the difference between : - inverted index - nextword index - common index 2) Which one(s) is(are) supported by Lucene? 3) Which class(es) create this(those) index(es)? ...
    Nada Mimouni
    Feb 24, 2009 at 10:36 am
    Feb 24, 2009 at 11:13 pm
  • I am new to Lucene and reading the Lucene in Action book. Sometimes I understand better when someone tells me an answer than from a book. My question is: when indexing, what is Lucene actually doing? If I have a ...
    Seid Mohammed
    Feb 19, 2009 at 11:10 am
    Feb 19, 2009 at 1:58 pm
  • R2.4 I have been looking through the soon-to-be-superseded (by its 2nd ed.) book "Lucene In Action" (hope it's ok on this newsgroup to say I like that book); also at these two tutorials: ...
    Feb 17, 2009 at 7:19 pm
    Feb 19, 2009 at 3:04 am
  • Hi, I have a weird problem. I use Lucene 2.4 in a web application (Tomcat 5.5.x), running under JDK 1.5. After a while (from 1 day to a couple depending on traffic) all memory gets eaten up by a lot ...
    Feb 9, 2009 at 1:57 pm
    Feb 10, 2009 at 8:17 pm
  • Dear Lucene community, I am playing around with Lucene right now and have come to a very bad problem. Given environment: a signal source gives signals with eventids and eventdescriptions, for example ...
    Christian Brennsteiner
    Feb 18, 2009 at 3:21 pm
    Feb 20, 2009 at 7:46 am
  • Hi, We have an application which manages the data of multiple customers. A customer can only search its own data, never the data of other customers. So what is more efficient in respect of ...
    Feb 14, 2009 at 1:27 pm
    Feb 14, 2009 at 4:48 pm
  • Hi all, Apologies for being slightly off-topic, we are looking at novel visualization approaches for rendering results from Lucene queries. I was wondering if you have any recommendations for ...
    Shashi Kant
    Feb 12, 2009 at 8:54 am
    Feb 12, 2009 at 3:56 pm
  • Hi all, I am a new lucene user and got started with in a really quick time! Its been really nice and I love it :) - I am still trying to do a few things the right way and digging through ...
    Vinubalaji Gopal
    Feb 11, 2009 at 7:50 am
    Feb 11, 2009 at 11:55 pm
  • Hi All, Is it possible to somehow ensure that a document will be returned only once when collecting from HitCollector?
    Feb 5, 2009 at 12:18 pm
    Feb 7, 2009 at 2:18 pm
  • Hi, I've been looking into the FieldCache API because of memory problems we've been seeing in our production environment. We use various different sorts so over time the cache builds up and servers ...
    Todd Benge
    Feb 4, 2009 at 4:46 pm
    Feb 4, 2009 at 6:01 pm
  • Hi All, We face serious performance issues when users do 2 letter search e.g ho, jo, pa ma, um ar, ma fi etc. time taken between 10 - 15 secs. Below is our implementation details: 1. Search performs ...
    Mittal, Sourabh (IDEAS)
    Feb 2, 2009 at 11:49 am
    Feb 3, 2009 at 11:47 am
  • Hi, I have two indexes, each has a tokenized field and I would like to combine them both into one field in a new index. How can it be done? (Is it a good approach or is it better to hold them as ...
    Liat oren
    Feb 26, 2009 at 12:07 pm
    Feb 27, 2009 at 9:02 pm
  • All, I have a list of java objects and would like to index the contents of those objects. And would like to update the index whenever list of objects is changed. The big question is when users ...
    Ambati, Ravi BGI SF
    Feb 26, 2009 at 5:07 pm
    Feb 26, 2009 at 6:58 pm
  • I am doing some research in vertical search. Therefore, I defined some weights of several keywords in my corpus expressing a certain theme. Later, how can I use these to compute the similarity with the ...
    Feb 12, 2009 at 6:32 am
    Feb 25, 2009 at 7:47 pm
  • I'm having trouble applying IndexWriter 2-phase commit to make a transaction involving two different indexes. The scenario, 1. Open index1 2. Open index2 3. Make change1 to index1 4. Make change2 to ...
    An Hong
    Feb 24, 2009 at 2:07 am
    Feb 24, 2009 at 12:05 pm
  • Hello, I'm new to Lucene and I have the following problem: I have a Users with first name, last name etc. and User Jobs (collection) with job title, start date, end date. I need to perform search on ...
    Mykola Peleshchyshyn
    Feb 23, 2009 at 9:12 am
    Feb 23, 2009 at 3:10 pm
  • From a Lucene index, how can we search for a sentence or a paragraph which satisfies our query? Thanks a lot, Seid M -- "RABI ZIDNI ILMA"
    Seid Mohammed
    Feb 19, 2009 at 1:30 pm
    Feb 20, 2009 at 12:33 pm
  • R2.4 So, I may well be missing something here, but: I use <pseudoCode>IndexSearcher.search(someQuery, null, count, new Sort());</pseudoCode> to get an instance of TopFieldDocs (the "Hits" is ...
    Feb 19, 2009 at 3:30 am
    Feb 19, 2009 at 2:19 pm
  • Hi I'm probably going to get shot down for asking this simple question. Although I think I understand the basic concept of Field I feel there is something that I am missing and I was wondering if ...
    Amin Mohammed-Coleman
    Feb 5, 2009 at 8:31 am
    Feb 5, 2009 at 11:47 am
  • Hi every body: I am using wordnet to index my document taking in account the synonyms with wordnet. After I indexed the whole documents collections I made a query with the word "snort" but documents ...
    Feb 4, 2009 at 8:27 pm
    Feb 5, 2009 at 2:35 am
  • Hi, By way of clarification, when a filter is used with a search query, is the filter applied only to documents that matched the search query or is it applied to all documents in the index before the ...
    Joel Halbert
    Feb 19, 2009 at 11:54 am
    Mar 6, 2009 at 12:04 pm
  • Hi All: Is there any study / research done on using scanned paper documents as images (may be PDF), and then use some OCR or other technique for extracting text, and the resultant index quality? ...
    Sudarsan, Sithu D.
    Feb 26, 2009 at 4:30 pm
    Feb 27, 2009 at 12:57 pm
  • Hi all, We have a business requirement that needs Lucene to search similar to contains (of SQL) such that we can have something like *ucen* which should return lucene and lucent ... unfortunately ...
    Joseph Syjuco
    Feb 26, 2009 at 1:00 pm
    Feb 26, 2009 at 2:35 pm
  • I'm subclassing MultiSearcher and writing a customized searcher on my own. The search( Weight, Filter, int, Sort ) method on MultiSearcher should return TopFieldDocs, but I cannot instantiate one ...
    Cheolgoo Kang
    Feb 24, 2009 at 2:15 am
    Feb 25, 2009 at 2:45 pm
  • Hi, What is the best tool (free software) to extract text from Microsoft Office 2007: Word 2007, Excel 2007, Power Point 2007 so that we can index them by lucene? Thanks very much for helps, Lisheng ...
    Zhang, Lisheng
    Feb 22, 2009 at 6:27 am
    Feb 22, 2009 at 10:21 am
  • Once my app gets the query string from the user, is there a way to tell the query engine to only return documents where these words are at most 5 words apart? I can't tell the user to change their ...
    Ian Vink
    Feb 19, 2009 at 12:20 pm
    Feb 22, 2009 at 10:05 am
  • I am new to Lucene and reading the Lucene in Action book. Sometimes I understand better when someone tells me an answer than from a book. My question is: when indexing, what is Lucene actually doing? If I have a ...
    Seid Mohammed
    Feb 19, 2009 at 11:17 am
    Feb 22, 2009 at 10:05 am
  • Hi Everyone, My question is related to the arabic analysis package under: org.apache.lucene.analysis.ar It is cool and it is doing a great job, but it uses a special tokenizer: ArabicLetterTokenizer ...
    Yusuf Aaji
    Feb 20, 2009 at 11:23 am
    Feb 20, 2009 at 3:08 pm
  • Hello, I was using lucene 2.3.2 with hits and switch to lucene 2.4.0 and now I am using TopDocCollector. I have two queries which are running against the same index. One query is returning 80bytes ...
    Feb 4, 2009 at 1:37 am
    Feb 19, 2009 at 1:57 am
  • Hi, Are there free Hebrew and Hindi language analyzers for lucene? I searched archive and found some discussions, but did not see clear pointers to downloadable classes. Thanks very much for helps, ...
    Zhang, Lisheng
    Feb 18, 2009 at 2:55 am
    Feb 18, 2009 at 7:06 pm
  • As far as I know, the time efficiency of creating an index is non-linear in the size of the documents. For example, if the size of indexes is 1G, the time cost is 2 hours. If the size of indexes is 10G, the ...
    治江 王
    Feb 16, 2009 at 5:49 am
    Feb 18, 2009 at 11:25 am
  • Hi, I am creating a tracker for web applications. I am indexing all the user credentials while they are logging. The problem is, a user might hit the same web page many times during the action ...
    Feb 17, 2009 at 6:33 am
    Feb 18, 2009 at 5:08 am
Group: java-user (Feb 2009)

118 users for February 2009

Erick Erickson: 39 posts, Michael McCandless: 31 posts, Grant Ingersoll: 16 posts, Spring: 12 posts, Amin Mohammed-Coleman: 12 posts, Michael Stoppelman: 11 posts, Seid Mohammed: 11 posts, Nada Mimouni: 9 posts, Mark Miller: 8 posts, Joel Halbert: 7 posts, Chris Hostetter: 6 posts, Karl Wettin: 6 posts, Chris Lu: 5 posts, D-fader: 5 posts, Paul Elschot: 5 posts, Philip Puffinburger: 5 posts, Robert Muir: 5 posts, Yonik Seeley: 5 posts, Ian Lea: 4 posts, Mark harwood: 4 posts