131 discussions - 699 posts

  • Hi I started to migrate my Analyzers, Tokenizer, TokenStreams and TokenFilters to the new API. Since the entire set of classes handled Token before, I decided to not change it for now, and was happy ...
    Shai Erera
    Nov 22, 2009 at 11:12 am
    Nov 22, 2009 at 7:30 pm
  • Hello all, I am using Lucene v2.9.1. If I open my reader with a positive value for termInfosIndexDivisor, the search works fine. If I set it to -1, the search throws the exception "terms index was not ...
    Nov 27, 2009 at 5:01 am
    Dec 8, 2009 at 1:46 pm
  • Hi all, I am using Lucene 2.3.2 as the search engine for my e-paper site so that users can search the news. I achieved that objective, but now I am trying to implement autosuggest so that the user can ...
    Nov 25, 2009 at 7:06 am
    Dec 1, 2009 at 10:24 pm
  • Hi, we use Lucene to store around 300 million records. We use the index not only for conventional searching but also for all the system's data - we replaced MySQL with Lucene because it was simply ...
    Michel Nadeau
    Nov 30, 2009 at 3:48 pm
    Nov 30, 2009 at 5:42 pm
  • Hello, I would like to search for all documents that contain both "plan" and "_v" (my part of speech token for verb) at the same position. I have tokenized the documents accordingly so these tokens ...
    Christopher Tignor
    Nov 19, 2009 at 10:28 pm
    Nov 30, 2009 at 2:20 pm
  • As some of you may recall I've been working on getting the SEN Japanese morphological analyzer working with 2.9. (and also with Solr 1.4, but that's not for this list) I'm getting close to having a ...
    Mark Bennett
    Nov 9, 2009 at 7:55 pm
    Nov 10, 2009 at 1:23 am
  • In Turkish alphabet lowercase of I is not i. It is LATIN SMALL LETTER DOTLESS I. LowerCaseFilter which uses Character.toLowerCase() makes mistake just for that character. ...
    Nov 30, 2009 at 7:00 pm
    Dec 1, 2009 at 8:35 pm
  • Hi, The documentation of org.apache.lucene.search.Collector uses the obscure term "out of order". What does "order" mean? The natural order of document IDs, a scoring order, or some other order? -- ...
    Alexander Veit
    Nov 27, 2009 at 12:41 pm
    Dec 1, 2009 at 2:12 pm
  • Hi, I have a need to filter my queries using a rather large subset of terms (can be 10K or even 50K). All these terms are sure to exist in the index so the number of results can be about the same ...
    Eran Sevi
    Nov 22, 2009 at 2:49 pm
    Nov 24, 2009 at 10:18 am
  • Hi all. We updated to Lucene 2.9, and now we find that after closing our text index, it is not possible to rename the directory in which it resides (we are actually renaming a directory further up ...
    Daniel Noll
    Nov 9, 2009 at 3:30 am
    Nov 13, 2009 at 11:47 am
  • Hi, I am using lucene 2.9.1 to index a continuous flow of events. My server keeps an index writer open at all time and write events as groups of a few hundred followed by a commit. While writing, ...
    Nov 23, 2009 at 8:21 pm
    Nov 29, 2009 at 11:10 am
  • Hi, all I'm facing a large index, on a x86 win platform which may not have big enough jvm heap space to hold the entire index. So, I think it's possible to split the index into several smaller ...
    Wenbo Zhao
    Nov 16, 2009 at 3:12 am
    Nov 16, 2009 at 5:20 pm
  • Hi, I don't know if this should be here or in java-dev, so I am posting to this one first. In one of our installations, we have encountered an exception: Exception in thread "Lucene Merge Thread #0" ...
    Nov 26, 2009 at 10:55 am
    Nov 30, 2009 at 2:35 pm
  • Hi, What are the typical scenarios in which the index becomes corrupt? E.g. can a simple JVM crash during indexing cause it? What is the best way to minimize the possibility of a corrupt index? ...
    Istvan Soos
    Nov 25, 2009 at 3:13 pm
    Nov 26, 2009 at 8:44 pm
  • I have a custom query object whose scorer uses the 'AllTermDocs' to get all non-deleted documents. AllTermDocs returns the docId relative to the segment, but I need the absolute (index-wide) docId to ...
    Peter Keegan
    Nov 16, 2009 at 6:39 pm
    Nov 17, 2009 at 5:37 pm
  • I'm trying to use a two phase commit involving a Lucene index and an external file derived from the index. Here are the steps: 1. prepare commit on Lucene index 2. prepare commit on external file 3. ...
    Peter Keegan
    Nov 6, 2009 at 3:59 pm
    Nov 9, 2009 at 12:12 am
  • Hi, I am trying to modify the indexing chain of Lucene. To start, I have extracted and modified the default indexing chain. I have just removed the TermVectorsTermsWriter from the chain, i.e., I ...
    Renaud Delbru
    Nov 6, 2009 at 5:29 pm
    Jan 7, 2010 at 1:44 pm
  • Sir, I actually meant auto suggest as such available for google suggest similar to autocomplete. Where, users need not type the entire text and instead can go with the suggestions available. Thanks ...
    Nov 23, 2009 at 10:59 am
    Nov 25, 2009 at 1:57 am
  • Hello Lucene users, On behalf of the Lucene dev community (a growing community far larger than just the committers) I would like to announce the first release candidate for Lucene Java 3.0. Please ...
    Uwe Schindler
    Nov 17, 2009 at 10:25 pm
    Nov 18, 2009 at 2:40 pm
  • Hi, I am trying to move from a system where I counted the frequency of terms by hand in a highlighter to determine if a result was useful to me. In an earlier post on this list someone suggested I ...
    Max Lynch
    Nov 13, 2009 at 10:09 pm
    Nov 14, 2009 at 12:27 am
  • I am pondering a way to allow closing of an index searcher and releasing the pointer to it so that it automatically cleans up by itself when all threads stop using the index searcher. Inspired by the ...
    Jacob Rhoden
    Nov 11, 2009 at 10:12 pm
    Nov 13, 2009 at 12:24 am
  • Hi all, I got OutOfMemoryError at org.apache.lucene.search.Searcher.search(Searcher.java:183) My index is 43G bytes. Is that too big for Lucene ? Luke can see the index has over 1800M docs, but the ...
    Wenbo Zhao
    Nov 13, 2009 at 7:23 am
    Nov 15, 2009 at 10:17 pm
  • hello all, This is my situation , i've multiple indexes , for example , index1 , index2 , index3 ... i've to update the indexes every night . If i open my IndexWriter create=false (since i want to ...
    Nov 10, 2009 at 9:06 am
    Nov 10, 2009 at 10:08 am
  • Hi, I want to calculate a tag cloud for search results. I have seen that it is possible to extract the top 20 words out of the Lucene index. Is there also a possibility to extract the top 20 words ...
    Mathias Bank
    Nov 5, 2009 at 3:21 pm
    Nov 6, 2009 at 8:40 am
  • Given two parallel indexes one with slowly changing fields and one with fields which are updated regularly. Is it possible to periodically merge these to form a single index? (thereby representing a ...
    Nov 3, 2009 at 4:25 pm
    Nov 4, 2009 at 1:42 pm
  • Hi, Recently, there is a requirement to sort the hits by both the scores of documents and the updateTime which is a field of document to mark the document's update time. We want the new document in ...
    Wilson Wu
    Nov 26, 2009 at 12:01 pm
    Nov 27, 2009 at 12:08 pm
  • What is the best way to iterate across all the documents in the search results? Previously I was using the deprecated Hits object but changed the implementation, as recommended in the javadocs, to ScoreDoc. ...
    Nov 19, 2009 at 3:37 pm
    Nov 21, 2009 at 10:02 am
  • Hi, In our application, we will allow the user to create a primary key defined in the document. We are using lucene 2.9. In this case, when we index the data coming from the client, if the metadata ...
    Java8964 java8964
    Nov 16, 2009 at 5:16 pm
    Nov 16, 2009 at 8:19 pm
  • I know this has been asked before, but I couldn't find the thread. The jar file produced from a build of 2.9.0 is 'lucene-core-2.9.jar'. For 2.9.1, it is 'lucene-core-2.9.1-dev.jar'. When does the ...
    Peter Keegan
    Nov 9, 2009 at 11:38 pm
    Nov 10, 2009 at 1:34 am
  • Hi all, I want to query part of a digital string: say indexed token is "123456789" I want to query 56789 to match this token The "Query Parser Syntax" says wildcard search can not be the first char. ...
    Wenbo Zhao
    Nov 9, 2009 at 4:44 am
    Nov 9, 2009 at 7:20 am
  • Hi, say I have: - Indexreader[] readers = {reader1, reader2, reader3} //containing all different docs - I know the internal docids of documents in reader1, reader2, reader3 separately Does doing ...
    Nov 4, 2009 at 2:23 pm
    Nov 7, 2009 at 2:36 pm
  • Hi, I want to change the default scoring formula of lucene and one of the changes I want to perform is on the idf term. What I want to do is to include the average number of terms of the documents ...
    Nov 10, 2009 at 12:32 pm
    Dec 18, 2009 at 11:59 am
  • We are going to add full-text search for our mailbox service . The problem is we have more than 1 PB mails there , and obviously we don't want to add another PB storage for search service , so we ...
    Fulin tang
    Nov 24, 2009 at 2:36 am
    Nov 26, 2009 at 9:58 am
  • hello all i've a doubt in spell checker , when i search for a keyword hoem am getting the spell results as in the following order (in which am retrieving 4 suggested words) form hold home them my ...
    Nov 19, 2009 at 6:22 am
    Nov 24, 2009 at 1:13 pm
  • Hello everyone, I'm a little bit confused about the docBase parameter of Collector.setNextReader. Imagine the following: - Create new Index - Index 5 docs - Call IndexWriter.commit() - Index 7 docs - ...
    Benjamin Heilbrunn
    Nov 12, 2009 at 9:25 pm
    Nov 13, 2009 at 10:36 am
  • Hello List. I'm having a problem when I add a Sort object to my searcher: docs = searcher.search(parser.parse(search), null, 50, sort); Every time I execute a query I get an OutOfMemoryError ...
    Nuno Seco
    Nov 12, 2009 at 4:17 pm
    Nov 12, 2009 at 7:06 pm
  • Hi I index documents with numeric fields using the new Numeric package. I execute two types of queries: range queries (for example, [1 TO 20}) and equality queries (for example 24.75). Don't mind the ...
    Shai Erera
    Nov 11, 2009 at 1:55 pm
    Nov 11, 2009 at 2:32 pm
  • Hello all, I am using Lucene 2.4.1 and My app is running inside Tomcat. In Windows, after database optimization, the old db files are not getting deleted. I enabled the info stream and found the ...
    Nov 2, 2009 at 11:55 am
    Nov 6, 2009 at 7:22 am
  • Hi, I've a requirement that involves frequent, batched update of my Lucene index. This is done by a memory queue and process that periodically wakes and process that queue into the Lucene index. If I ...
    Istvan Soos
    Nov 27, 2009 at 9:23 am
    Nov 27, 2009 at 12:38 pm
  • I was experimenting how Lucene handles 2-phase commit. Then I noticed I am not catching all Exceptions from Lucene. And I think this is because Lucene's default MergeScheduler is ...
    Teruhiko Kurosaka
    Nov 21, 2009 at 12:03 am
    Nov 23, 2009 at 5:45 pm
  • Hello, I have indexed words in my documents with part of speech tags at the same location as these words using a custom Tokenizer as described, very helpfully, here: ...
    Christopher Tignor
    Nov 18, 2009 at 10:20 pm
    Nov 19, 2009 at 4:26 pm
  • Is there any limit on how many IndexWriter can I keep open at same time? What does it depends on (RAM?) Can I keep 100 or 200 IndexWriters open in say HashMap and use them as I process documents? ...
    Hrishikesh Agashe
    Nov 14, 2009 at 4:35 pm
    Nov 16, 2009 at 1:11 pm
  • I'm writing a TokenFilter and am confused about why class Token has both an *endOffset* and a *termLength* field. It would appear that the following invariant should always hold for a Token instance: ...
    Babak Farhang
    Nov 13, 2009 at 10:50 pm
    Nov 14, 2009 at 7:39 am
  • Hi all. I am trying to clean up some deprecated calls which are showing up on upgrading to 2.9.0 (from 2.3.2...), and I have just come across Directory.list(), which says this: * We have files in ...
    Daniel Noll
    Nov 6, 2009 at 5:39 am
    Nov 10, 2009 at 10:08 am
  • Hi, i've got a problem concerning encoding of norms. I want to use int values (0-255) instead of float interpreted bytes. In my own Similarity-Class, which I use for indexing and searching, I ...
    Benjamin Heilbrunn
    Nov 9, 2009 at 4:04 pm
    Nov 10, 2009 at 9:46 am
  • Yes - please share your test programs and I can investigate (ApacheCon this week, so I'm not sure when). And its best to keep communications on the list - that allows others with similar issues (now ...
    Mark Miller
    Nov 2, 2009 at 1:05 pm
    Nov 3, 2009 at 10:36 pm
  • Index optimization fails if we don't have enough space on the drive and leaves the hard drive almost full. Is there a way not to even start optimization if we don't have enough space on drive? ...
    Siraj Haider
    Nov 30, 2009 at 10:51 pm
    Nov 30, 2009 at 11:05 pm
  • Hi, i'm just using Lucene 2.4 and have a problem with a "." within a field. This field contains a filename and obviously a filename can contain a "." (or multiple of them)... So if i do a search ...
    Karl Heinz Marbaise
    Nov 25, 2009 at 5:57 pm
    Nov 29, 2009 at 6:35 pm
  • I'm interested in getting the payload information from the matching span, however it's unclear from the javadocs why NearSpansUnordered is different than NearSpansOrdered in this regard. ...
    Jason Rutherglen
    Nov 20, 2009 at 11:50 pm
    Nov 25, 2009 at 7:29 pm
  • Hi, I have a requirement where I have a list of Suppliers(documents for lucene index) and a list of Products(documents again). Each Product has a supplier. e.g. : Product - RouterX, Supplier - DLink, ...
    Nov 23, 2009 at 9:06 am
    Nov 23, 2009 at 8:14 pm
Group Overview
group: java-user @

114 users for November 2009

  • Michael McCandless: 83 posts
  • Uwe Schindler: 64 posts
  • Erick Erickson: 40 posts
  • Robert Muir: 20 posts
  • Christopher Tignor: 19 posts
  • Shai Erera: 19 posts
  • Peter Keegan: 17 posts
  • Ian Lea: 16 posts
  • Ganesh: 15 posts
  • Jake Mannix: 15 posts
  • Wenbo Zhao: 15 posts
  • Simon Willnauer: 13 posts
  • DHIVYA M: 12 posts
  • AHMET ARSLAN: 11 posts
  • Anshum: 11 posts
  • Britske: 11 posts
  • M.harig: 11 posts
  • Vsevel: 11 posts
  • Grant Ingersoll: 10 posts
  • Mark Miller: 10 posts