95 discussions - 470 posts

  • I recently posted some questions about performance problems with large indexes. One key thing about our situation is that we don't need sorted results (either by relevance or any other key). I've ...
    Jun 26, 2009 at 2:12 am
    Jul 13, 2009 at 2:17 pm
  • Hi All, I'm indexing some non-english content. But the page also contains english content. As of now I'm using WhitespaceAnalyzer for all content and I'm storing the full webpage content under a ...
    Jun 3, 2009 at 2:16 pm
    Jun 11, 2009 at 9:05 am
  • Hi, I am using Lucene 2.4.1 to index a database with less than a million records. The resulting index is about 50MB in size. I keep getting an OutOfMemory Error if I re-use the same IndexWriter to ...
    Jun 24, 2009 at 8:09 am
    Jun 26, 2009 at 6:55 am
  • Hello all, I am building a search application on Lucene. It works fine when my index is small, but I get a Java heap space error when using a large index. I came to know about Hadoop with ...
    Jun 29, 2009 at 10:50 am
    Jun 30, 2009 at 12:55 pm
  • Our query performance is surprisingly inconsistent, and I'm trying to figure out why. I've realized that I need to better understand what's going on internally in Lucene when we're searching. I'd be ...
    Jun 23, 2009 at 8:53 pm
    Jun 26, 2009 at 9:14 am
  • Hi. I currently have an index which is 16GB per machine (8 machines = 128GB) (data is stored externally, not in index) and is growing like crazy (we are indexing blogs which is crazy by nature) and ...
    Marcus Herou
    Jun 26, 2009 at 10:07 pm
    Jul 3, 2009 at 3:33 am
  • I have existing code that's like: final Term t = /* ... */; final Iterator i = searcher.search( new TermQuery( t ) ).iterator(); while ( i.hasNext() ) { final Hit hit = (Hit)i.next(); // "FILE" is ...
    Paul J. Lucas
    Jun 10, 2009 at 1:05 am
    Jun 11, 2009 at 2:40 pm
  • Hi All! I'm new to Lucene so forgive me if this question was asked before. I have a database with records in the same table in many different languages (up to 70) it includes all W-European, Arabic, ...
    OBender Hotmail
    Jun 15, 2009 at 5:11 pm
    Jun 16, 2009 at 12:24 pm
  • I am seeing my Lucene application's search time grow pretty much linearly with the number of Documents. Is this how Lucene is supposed to work, or does it depend on the nature of the query? I am not using ...
    Teruhiko Kurosaka
    Jun 17, 2009 at 3:48 pm
    Jun 19, 2009 at 8:18 am
  • Hej hej, I have a question regarding Lucene's memory usage when launching a query. When I execute my query, Lucene eats up over 1 GB of heap memory even when my result set is only a single hit. I found ...
    Benedikt Boss
    Jun 10, 2009 at 12:24 pm
    Dec 25, 2009 at 10:14 pm
  • Hello. I've tried to highlight a string using Highlighter (2.4.1) and JapaneseAnalyzer, but the following code extract shows the problem: String F = "f"; String CONTENTS = "AAA :BBB CCC"; JapaneseAnalyzer ...
    Jun 30, 2009 at 2:34 pm
    Jul 2, 2009 at 10:28 pm
  • Hello all, I am using Lucene v2.4.1. 1) I have built multiple indexes totaling 30 million documents. My memory limit is 512 MB. In this case I frequently get an OOME. If I increase the memory ...
    Jun 25, 2009 at 7:11 am
    Jun 26, 2009 at 10:14 am
  • Hi, how can I access the index concurrently, so that I can add/update/delete documents concurrently? Cheers, João -- Regards, João Carlos Galaio da Silva
    João Silva
    Jun 16, 2009 at 9:03 am
    Jun 16, 2009 at 12:05 pm
  • Hi, I have an index that is a multi-segment index (how come it is created this way?). When I try to get the freq of a term in the following way: TermDocs tDocs = this.indexReader.termDocs(term); tf = ...
    Liat oren
    Jun 28, 2009 at 12:40 pm
    Jun 30, 2009 at 8:27 pm
  • Hi All - What is the best way to load a RAM Directory from a FS Directory, and periodically reload the RAM Directory to pick up new documents? The scenario I have is I create several large ...
    Diamond, Greg
    Jun 9, 2009 at 5:59 pm
    Jun 12, 2009 at 3:48 pm
  • Hello all, I've gone through most of the posts on this forum. I need a code snippet for searching a large index. Currently I am iterating: hits = searcher.search(query); for (int inc = 0; inc < ...
    Jun 30, 2009 at 6:01 am
    Jun 30, 2009 at 2:23 pm
  • Hi, I am prototyping the following situation: 1) Multiple nodes in a cluster 2) Each node has a local index 3) Search requests are made against the local index 4) Index updates are sent to a JMS ...
    Amin Mohammed-Coleman
    Jun 20, 2009 at 5:32 pm
    Jun 23, 2009 at 10:09 am
  • I am trying to figure out the best way to add to a lucene index across a clustered app server. I cannot grab an IndexWriter for each node in the cluster, because I would run into lock file problems. ...
    Newman, Billy
    Jun 9, 2009 at 8:03 pm
    Jun 15, 2009 at 9:33 am
  • Hi list, I'm having trouble achieving good performance when indexing an XML Wikipedia dump. The indexing process works as follows: 1. setup FSDirectory 2. setup IndexWriter 3. setup custom ...
    Mateusz Berezecki
    Jun 8, 2009 at 10:23 am
    Jun 10, 2009 at 10:36 am
  • The Lucene FAQ says... What is the order of fields returned by Document.fields()? * Fields are returned in the same order they were added to the document. (now getFields() as fields is deprecated) ...
    Matt Turner
    Jun 25, 2009 at 8:33 pm
    Jul 1, 2009 at 9:24 am
  • Hello! What is the appropriate way to obtain Lucene internal IDs for _all_ the tuples stored in a Lucene index? Thank you for your help Dmitry ...
    Dmitry Lizorkin
    Jun 19, 2009 at 3:37 pm
    Jun 19, 2009 at 5:56 pm
  • Hi, is there any API for Hits pagination? For example, if I want to retrieve the hits within an interval. -- Regards, João Carlos Galaio da Silva
    João Silva
    Jun 19, 2009 at 10:57 am
    Jun 19, 2009 at 1:22 pm
  • Hi, I want to update a specific document, but I didn't find updateDocument(Query) or updateDocument(Term[]), so to make an update, I will need to have a term with a unique id, so I retrieve a ...
    João Silva
    Jun 18, 2009 at 10:20 pm
    Jun 19, 2009 at 9:06 am
  • Is there a Mac port of the Lucene engine?
    Ian Vink
    Jun 8, 2009 at 9:55 pm
    Jun 9, 2009 at 1:50 am
  • Hello everyone, I am quite new to development with Nutch, so you must forgive my question if it is amateurish. After some reading of Luke's source code, I found to my dismay that obtaining the ...
    House Less
    Jun 8, 2009 at 12:14 am
    Jan 10, 2012 at 4:49 pm
  • I have a web application which uses Lucene for search functionality. Lucene search requests are served by web services sitting on 2 application servers (IIS 7). The 2 application servers are Load ...
    Jun 19, 2009 at 4:11 am
    Jun 20, 2009 at 6:02 pm
  • In my environment, one of the concerns is that new documents are constantly being added (and some documents may be deleted). This means that when a user does a search and pages through results, it is ...
    Scott Smith
    Jun 19, 2009 at 6:40 pm
    Jun 19, 2009 at 10:19 pm
  • Hey there, I have noticed I am experiencing a sort of memory leak with a CustomComparatorSource (which implements SortComparatorSource). I have a HashMap declared as variable of class in ...
    Marc Sturlese
    Jun 12, 2009 at 10:09 pm
    Jun 13, 2009 at 1:39 pm
  • Hi all, I am writing to gauge the group's interest level in building a P2P application using Lucene. Nothing fancy, just good old-fashioned P2P search across one's social-network or work-network ...
    Shashi Kant
    Jun 4, 2009 at 12:04 pm
    Jun 6, 2009 at 8:14 am
  • Hi, When I use lucene 2.4.1 QueryParser with CJKAnalyzer, somehow it always generates an extra space, for example, if the input is "ABC", the query would be: myfield"AB BC " // should be myfield:"AB ...
    Zhang, Lisheng
    Jun 2, 2009 at 5:13 am
    Jun 2, 2009 at 5:14 pm
  • At the end of the day, I build the stats of the top indexed terms. I enabled term frequency for the single field. It is working fine. I was able to get the top terms and their frequencies. It ...
    Jun 30, 2009 at 7:37 am
    Jul 2, 2009 at 12:46 pm
  • Hello, Is there any class in lucene which will do encoding for term? Thanks -- View this message in context: http://www.nabble.com/Lucene-Term-Encoder-tp24228145p24228145.html Sent from the Lucene - ...
    John Seer
    Jun 26, 2009 at 10:27 pm
    Jun 29, 2009 at 6:41 pm
  • Hi, I have a case where deleting documents by doc id makes sense (I know beforehand the docs I want to delete based on the doc id). I am wondering why the API is not exposed in the IndexWriter (as it ...
    Shay Banon
    Jun 28, 2009 at 9:21 am
    Jun 29, 2009 at 1:33 pm
  • Dear all, my problem is that I want to apply a weight to each word I add to the Lucene document, so that when I want to index a sentence like this "Hello how you doing" I want to add Hello with a boost ...
    Jun 26, 2009 at 8:59 am
    Jun 29, 2009 at 10:53 am
  • Hello- We're looking at memory issues we're having with a fair-sized web app that uses Lucene for search. While looking at heap dumps, we discovered that there were 3 instances of ...
    Ulf Dittmer
    Jun 25, 2009 at 9:14 pm
    Jun 26, 2009 at 4:49 am
  • Hello list, I'm puzzling over the following problem. In my index I can't find the word BE, but it exists in two documents. I'm using Lucene 2.4 with the StandardAnalyzer. Other queries with words like ...
    Timon Roth
    Jun 24, 2009 at 10:51 pm
    Jun 25, 2009 at 7:31 am
  • Of late I started using Lucene as the main search library for all documents in our intranet. It works extremely well. I am trying to use similarity-style functionality to find the similarity between two ...
    Cool The Breezer
    Jun 23, 2009 at 9:38 am
    Jun 23, 2009 at 11:23 am
  • Hey, I was wondering if there is a way to read the index and generate n-grams of words for a document in lucene? I am quite new to it and am using pylucene. Thanks, Neha
    Neha Gupta
    Jun 19, 2009 at 2:15 am
    Jun 20, 2009 at 4:29 am
  • Is there any way to programmatically determine the version of lucene being loaded?
    Scott Smith
    Jun 16, 2009 at 10:36 pm
    Jun 17, 2009 at 8:51 am
  • Hi, on 99470 documents (I mean Lucene documents) a FuzzyQuery needs approx. 30 seconds but a PrefixQuery less than one. All Lucene files together need 65MB. I'm a bit surprised by that. Is that possible? ...
    Zsolt Koppany
    Jun 15, 2009 at 2:19 pm
    Jun 15, 2009 at 3:24 pm
  • Hi, Mentioned below are snippets from my indexing and searching code. For some reason, I get zero hits all the time even for terms present in the document collection. Can somebody point out where I'm ...
    Delip Rao
    Jun 6, 2009 at 10:26 pm
    Jun 7, 2009 at 9:33 pm
  • Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wiki.apache.org/hadoop/DistributedLucene ...
    Tarandeep Singh
    Jun 1, 2009 at 4:55 pm
    Jun 2, 2009 at 4:23 pm
  • Hi, it's my first experiment with Lucene. Please help me. I'm going to index a set of documents and create a feature vector for each of them. This vector contains all terms belonging to the document ...
    Amir Hossein Jadidinejad
    Jun 29, 2009 at 7:14 pm
    Jun 29, 2009 at 8:26 pm
  • Hi, I am working on a "US based nearest city search within a given radius" functionality using the Lucene API. I am indexing each city's lat and long values in Lucene as follows: doc.Add(new Field("latitude", ...
    Jun 28, 2009 at 5:39 pm
    Jun 28, 2009 at 6:29 pm
  • Hi, my Lucene index has latitude and longitude fields indexed as follows: doc.Add(new Field("latitude", latitude.ToString() , Field.Store.YES, Field.Index.UN_TOKENIZED)); doc.Add(new ...
    Jun 27, 2009 at 8:08 pm
    Jun 27, 2009 at 10:58 pm
  • Hi all, I am using a Standard analyzer on both my search field and my query. I use a SpanNearQuery to search on the search field. One of the query terms has special characters like ( - round open ...
    Radha Sreedharan
    Jun 24, 2009 at 5:30 pm
    Jun 24, 2009 at 6:24 pm
  • Hi, I know a similar subject has been discussed on this list and this is not a "windows file system" list ;-) But maybe someone has encountered the "thing"... and perhaps solved it! I have a web ...
    Malo Pichot
    Jun 19, 2009 at 7:16 am
    Jun 22, 2009 at 10:10 am
  • Hello all, I have a situation where a field is indexed like this (FAC_NAME(Field.Store.NO, Field.Index.NO_NORMS)) and keyword analyzer is used on this field. Although, I'm aware that NO_NORMS doesn't ...
    Jun 17, 2009 at 7:17 pm
    Jun 18, 2009 at 9:43 am
  • The last few versions of lucene have deprecated several of the interfaces we were using and this is necessitating a fairly major upgrade of our code (which hasn't had much done to it for several ...
    Scott Smith
    Jun 17, 2009 at 12:15 am
    Jun 17, 2009 at 7:41 am
  • I know this has been covered a number of times before but I am still confused. I am using all the default values for IndexWriter when writing my index. I loop over all my documents 1000 at a time. For ...
    Newman, Billy
    Jun 12, 2009 at 3:12 pm
    Jun 12, 2009 at 10:11 pm
Group Overview
group: java-user @

123 users for June 2009

Michael McCandless: 50 posts
Simon Willnauer: 21 posts
Uwe Schindler: 20 posts
KK: 19 posts
Robert Muir: 18 posts
Erick Erickson: 14 posts
Ian Lea: 13 posts
M.harig: 13 posts
Nigel: 12 posts
Stefan: 12 posts
João Silva: 11 posts
Otis Gospodnetic: 11 posts
Ganesh: 10 posts
Grant Ingersoll: 9 posts
Scott Smith: 9 posts
Amin Mohammed-Coleman: 6 posts
Eks dev: 6 posts
Marcus Herou: 6 posts
Mark Miller: 6 posts
OBender Hotmail: 6 posts