101 discussions - 522 posts

  • Hi All, I have successfully used Lucene in the "traditional" way to provide full-text search for various websites. Now I am tasked with developing a data-store to back a web crawler. The crawler can ...
    John Evans
    Jul 29, 2008 at 1:53 am
    Aug 3, 2008 at 3:09 pm
  • Hi, I have a number of fields that are used to filter documents from a search. They should not contribute to the score of the document but merely decide which documents are valid. i.e. it doesn't ...
    John Patterson
    Jul 15, 2008 at 8:08 am
    Jul 16, 2008 at 5:33 pm
  • Hi Everyone, I am getting the following error when executing Hits hits = searchers.search(query, queryFilter, sort): 18007414-java.io.IOException: Bad file descriptor 18007455- at ...
    Jul 23, 2008 at 4:20 pm
    Jul 28, 2008 at 9:56 am
  • Hi all, I am new to Lucene. Is it possible to index different files in different folders in Lucene? For example, I have two folders, a and b, each containing several files. In the Lucene args I wrote: ...
    Jul 5, 2008 at 6:42 pm
    Jul 15, 2008 at 5:33 pm
  • I'd like to shorten the time it takes to optimize my index and am willing to sacrifice search and indexing performance. Which parameters (e.g. merge factor) should I change? Thank you. Stay in touch ...
    Dragon Fly
    Jul 28, 2008 at 6:00 pm
    Jul 30, 2008 at 10:45 pm
  • Hi All, Is there any possibility to avoid duplicate records in lucene 2.3.1? -- View this message in context: http://www.nabble.com/How-to-avoid-duplicate-records-in-lucene-tp18543588p18543588.html ...
    Jul 19, 2008 at 11:30 am
    Jul 23, 2008 at 11:19 pm
  • Hi. There's a ThreadLocal field in SegmentReader (it's called termVectorsLocal). Some value is put to it, but it's never cleared. Is it ok? It looks like sometimes this behavior may lead to leaks. ...
    Roman Puchkovskiy
    Jul 6, 2008 at 8:33 pm
    Jul 12, 2008 at 4:38 am
  • Hi, I had recently found out that Lucene will retrieve the content of a document from a file ".fdt". I am trying to retrieve the entire file in one go instead of retrieving it based on document ...
    Jul 10, 2008 at 1:01 am
    Jul 11, 2008 at 9:34 am
  • Hi all. This may seem a longish and informal mail, but do correct me if my assumptions are wrong anywhere, otherwise my actual doubt will make no sense. Say I opened an IndexWriter on an initially ...
    Jul 26, 2008 at 4:18 am
    Jul 28, 2008 at 9:18 am
  • Hi all, I had a question related to the write locks created by Lucene. I use Lucene 2.3.2. Will this newer version create locks while indexing as older ones did? Or is there any other way that Lucene ...
    Sandeep K
    Jul 23, 2008 at 6:59 am
    Jul 25, 2008 at 12:39 pm
  • If I have a SortField with a type of STRING, is there any way to sort in a case-insensitive manner? - Paul
    Paul J. Lucas
    Jul 1, 2008 at 12:59 am
    Jul 18, 2008 at 5:33 pm
  • I need to perform a query for a term that may or may not have values, and I need to check for the conditions where either no terms are indexed OR any and ALL indexed terms match a wildcard. For ...
    Ronald Rudy
    Jul 10, 2008 at 9:54 pm
    Jul 21, 2008 at 7:06 pm
  • Hi, Every time I send a mail to this list, I get the error below. Any idea where the problem is? It also appears that my mails are actually reaching the list. Any help in rectifying this is ...
    Preetam Rao
    Jul 15, 2008 at 8:55 am
    Jul 17, 2008 at 8:03 am
  • Hi, I am indexing content and searching using lucene. It is working fine when I use the simple servlet and jsp mechanism. I am able to search on the indexed content. I tried to implement the same ...
    Jul 3, 2008 at 4:42 am
    Jul 4, 2008 at 3:54 pm
  • Can someone explain this to me? After indexing I can see the terms I expect in the top terms using Luke but then when I search I get no results?? This is really bizarre and is blocker for me. Thanks. ...
    Jul 24, 2008 at 6:46 pm
    Jul 24, 2008 at 7:15 pm
  • Hi, I am indexing lots of text files and need to see how many times a certain word comes up in each text file. Right now I have this constructor for "search": static void search(Searcher searcher, ...
    Jul 9, 2008 at 1:50 pm
    Jul 11, 2008 at 6:40 pm
  • According to SVN history on the next version this will be available: LUCENE-1044: IndexWriter with autoCommit=true now commits (such that a reader can see the changes) far less often than it used to. ...
    Eric Diaz
    Jul 8, 2008 at 3:40 pm
    Jul 10, 2008 at 9:45 pm
  • I just did an update from lucene 2.2.0 to 2.3.2 and thought I'd give some kudos for the indexing performance enhancements. The lucene indexing portion is about 6-8 times faster. Previously we were ...
    Beard, Brian
    Jul 9, 2008 at 1:04 pm
    Jul 10, 2008 at 3:21 pm
  • Hi, I have been using a RAMDirectory for indexing without any problem, but I then moved to a file-based directory to reduce memory usage. This has been working fine on Windows and OSX and my version ...
    Paul Taylor
    Jul 8, 2008 at 8:04 am
    Jul 8, 2008 at 4:15 pm
  • Hi there, I want to index email addresses in such a way that I can do wildcard, phrase and simple searches on those items. For each document I will have an email-address string, just like in the case of CC ...
    Jul 3, 2008 at 11:31 am
    Jul 7, 2008 at 5:23 am
  • Hi, I'm implementing a custom IndexDeletionPolicy. An IndexCommit object does not have any information about whether its index is optimized or not. How can an IndexDeletionPolicy know which IndexCommit ...
    Shalin Shekhar Mangar
    Jul 1, 2008 at 10:48 am
    Jul 2, 2008 at 5:39 pm
  • Dear fellow Java/Lucene developers: I have a question on creating an index from an XML document for the purpose of searching using the Lucene API in Java. I am searching Shakespeare's "Hamlet" which ...
    Jul 27, 2008 at 5:59 pm
    Jul 29, 2008 at 4:41 am
  • Hi, I am writing a class to report on an index. This index has documents updated using the IndexWriter.updateDocument(Term, Document) method. That is, documents were deleted and added again. My aim ...
    ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)
    Jul 25, 2008 at 11:18 am
    Jul 25, 2008 at 2:57 pm
  • Is there some sort of listing of scaling strategies available? I think there is a Wiki page missing. What are the typical problems I'll encounter when distributing the search over multiple machines? ...
    Karl Wettin
    Jul 16, 2008 at 1:42 pm
    Jul 18, 2008 at 4:52 pm
  • Hi, Sorry if you get this mail a second time. I am having some trouble with my mailbox. Is there a query in Lucene which matches sub-phrases? For example, if the document text is "new york existing homes *3 ...
    Preetam Rao
    Jul 14, 2008 at 5:15 pm
    Jul 15, 2008 at 10:06 am
  • Hi, I have some questions about indexing: 1. Is it possible to open indexes with MultiReader+IndexSearcher and add documents to these indexes simultaneously? 2. Is it possible to open indexes with ...
    Jul 13, 2008 at 1:00 pm
    Jul 15, 2008 at 9:51 am
  • hello - if I make a query and get the document ids and delete with the document id - could there be a side effect? my index is committed periodically, but i can not say when it is committed. best ...
    Cam Bazz
    Jul 23, 2008 at 8:09 pm
    Aug 1, 2008 at 9:18 pm
  • Could anyone please tell me how to print the content of a document after reading the index? For example, if I want to print the index terms, I do: IndexReader ir = IndexReader.open(index); ...
    Jul 22, 2008 at 6:53 pm
    Jul 30, 2008 at 6:57 pm
  • Hi, we have a system to archive mails and are facing some issues with search and indexing performance. The following is what we are currently facing challenges with: we are ...
    Mazhar Lateef
    Jul 27, 2008 at 8:41 pm
    Jul 28, 2008 at 9:34 am
  • Hi all, I am searching for a way to ignore XML tags in the input when indexing. Is there built-in functionality in Lucene to get this done? I am sorry if this was discussed before. I searched but ...
    Kalani Ruwanpathirana
    Jul 24, 2008 at 6:18 am
    Jul 25, 2008 at 12:48 pm
  • Everything I've read and seen about Lucene is searching for keywords in documents; I want to do the reverse. I have a huge list of keywords ("big boy", "red ball", "computer") and I have phrases that I ...
    Ryan Detzel
    Jul 23, 2008 at 7:31 pm
    Jul 23, 2008 at 9:17 pm
  • If a complicated query is running in a Thread, how does Lucene respond to Thread.interrupt()? I want to be able to interrupt an in-progress query. - Paul ...
    Paul J. Lucas
    Jul 16, 2008 at 8:44 am
    Jul 22, 2008 at 9:52 pm
  • If a SpanQuery is constructed from one or more BoostingTermQuery(s), the payloads on the terms are never processed by the SpanScorer. It seems to me that you would want the SpanScorer to score the ...
    Peter Keegan
    Jul 9, 2008 at 6:56 pm
    Jul 19, 2008 at 2:49 pm
  • Hi. Currently using Lucene 2.3.2 in a tomcat webapp. We have an action configured that performs reindexing on our staging server. However, our live server can not reindex since it does not have the ...
    Christopher Kolstad
    Jul 10, 2008 at 1:16 pm
    Jul 11, 2008 at 12:42 pm
  • The best strategy. Hello. I want to ask your opinion about how to store multiple fields of the same document. I now see two possibilities: 1. Multiple fields in the document 2. One field: for example named ...
    Sergey Kabashnyuk
    Jul 31, 2008 at 2:37 pm
    Aug 1, 2008 at 6:51 pm
  • Hello, I've filled an index with 1100 text files with the names "monisys1" to "monisys1100". If I start a WildcardQuery WildcardQuery query = new WildcardQuery(new Term("fileId","monisys*")); Hits ...
    Jul 30, 2008 at 7:31 am
    Jul 31, 2008 at 7:05 pm
  • FYI -- there is a nasty bug that affects Lucene in Sun's 1.6 hotspot compiler, starting with 1.6.0_04. At least 3 known cases have been seen on this list. Details are here: ...
    Michael McCandless
    Jul 30, 2008 at 6:10 pm
    Jul 31, 2008 at 11:50 am
  • I need to execute a boolean query and get back just the bits of all the matching documents. I do additional filtering (date ranges and entitlements) and then do my own sorting later on. I know that ...
    Robert Stewart
    Jul 22, 2008 at 7:41 pm
    Jul 28, 2008 at 2:40 pm
  • Hi all, I need to replace some db queries with lucene due to response time issues for sure. In this special case I need to do a range query on a field and a prefix query. I'm trying to prepare and ...
    Thomas Becker
    Jul 25, 2008 at 8:54 am
    Jul 25, 2008 at 9:36 am
  • Hi there, I know Lucene is for indexing and not for frequent updates and deletes. But I have been using Lucene to store my matrix as a document. Since with my algorithm the value of the matrix can change ...
    Jul 8, 2008 at 7:34 pm
    Jul 15, 2008 at 7:32 am
  • Hi, Can someone point me in the right direction please? How can I trap this situation correctly? I receive user queries like this (quotes included): /from:"fred flintston*"/ Which produces a query ...
    Chris Bamford
    Jul 3, 2008 at 1:39 pm
    Jul 4, 2008 at 4:19 pm
  • Hello, I don't have a good understanding of the options for avoiding the corrupted index problem described in LUCENE-1282. It seems to me that I either downgrade JRE from 1.6.0_06 to 1.6.0_03, or wait ...
    Jul 1, 2008 at 8:02 pm
    Jul 2, 2008 at 1:23 am
  • I seem to recall some discussion about updating a payload, but I can't find it. I was wondering if it were possible to use a payload to implement 'modify' of a Lucene document. For example, I have an ...
    Antony Bowesman
    Jul 30, 2008 at 11:06 am
    Jul 31, 2008 at 9:34 am
  • Hello, wasn't there a Lucene delete-by-query feature coming up? I remember something like that, but I could not find any references. Best regards, -c.b.
    Cam Bazz
    Jul 23, 2008 at 1:53 pm
    Jul 24, 2008 at 10:21 am
  • Hello all, in my project we are indexing the US states. When we try to search on oregon ; state:OR, the search on OR throws an error. I know OR is a logical operator in Lucene. Is there a way to escape such ...
    Aravind Yarram
    Jul 22, 2008 at 1:29 pm
    Jul 22, 2008 at 2:13 pm
  • Hi all, this is the exception raised when I am indexing the records (I have 10 million records, and after indexing 4 million records I got this exception): java.io.IOException: background merge hit ...
    Jul 12, 2008 at 10:09 am
    Jul 21, 2008 at 10:10 am
  • Hi, I have a set of indices in different languages (very small indices: on average each index directory has 10,000 documents, with an overall size of less than 2 MB). I want to know if this is a ...
    Mohsen Saboorian
    Jul 17, 2008 at 6:30 am
    Jul 21, 2008 at 2:05 am
  • Hello, Could someone please confirm that calling indexWriter.optimize() is the only way to clean out the deleted documents from the disk? I understand that indexWriter.deleteDocuments() does not ...
    Jul 18, 2008 at 5:48 pm
    Jul 18, 2008 at 9:51 pm
  • Hi, I'm in the process of trying to optimize searches and avoid the dreaded OutOfMemoryErrors. We currently return the entire document from each of the search results and then filter the results ...
    Declan Newman
    Jul 14, 2008 at 6:47 pm
    Jul 18, 2008 at 6:57 pm
  • You need to include ISOLatinFilter in your analyzer. That will convert all accented characters to their non-accented version. ------Original Message------ From: Aamir.Yaseen@globaldatapoint.com To: ...
    Anand Jain
    Jul 16, 2008 at 9:02 am
    Jul 16, 2008 at 1:34 pm
Group Overview
group: java-user @

116 users for July 2008

  • Michael McCandless: 74 posts
  • Erick Erickson: 23 posts
  • Karl Wettin: 22 posts
  • Grant Ingersoll: 14 posts
  • Starz10de: 14 posts
  • Yonik Seeley: 14 posts
  • John Griffin: 13 posts
  • Chris Hostetter: 12 posts
  • John Patterson: 11 posts
  • Steven A Rowe: 11 posts
  • Paul J. Lucas: 10 posts
  • Ian Lea: 9 posts
  • ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S): 9 posts
  • Blazingwolf7: 8 posts
  • Jamie: 8 posts
  • Chris Bamford: 7 posts
  • Karsten Fissmer: 7 posts
  • Miztaken: 7 posts
  • Mark Miller: 6 posts
  • Matthew Hall: 6 posts