-
I occasionally get a FileNotFoundException like: Exception in thread "Thread-44" org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: /Stuff/Caches/ ...
Paul J. Lucas
May 30, 2008 at 12:06 am
Jul 6, 2008 at 7:35 pm -
Any recent changes that would expose index corruption? I am getting two new errors when trying to search: nullpointer (fieldsreaders line 260) and indexoutofbounds (on fieldinfo line 185). I am kind of ...
Mark Miller
May 5, 2008 at 7:34 pm
May 6, 2008 at 9:30 am -
Hi all, I have index of size 85MB. My query looks as follows: +(t:boss* d:boss* dd:boss* tg:boss*) +st:act +ntid:0 +cid:1 +dr:[20080410 TO 20081010] +rT:[002 TO 005] All the fields used in the query ...
Rakesh Shete
May 22, 2008 at 5:16 pm
May 30, 2008 at 9:19 pm -
Hi Lucene experts: I am working on upgrading the Lucene-Oracle integration project to the latest Lucene 2.3.1 code. After correcting a minor issue in the OJVMDirectory file implementation I have the integration ...
Marcelo Ochoa
May 6, 2008 at 9:37 pm
May 10, 2008 at 9:13 am -
Hi: We are experiencing a memory leak when calling IndexReader.reopen(). From eyeballing the Lucene source code, I see that normCache is not cleared. Is anyone else experiencing this? Thanks -John
John Wang
May 28, 2008 at 6:25 am
Jun 1, 2008 at 2:07 pm -
Hi, I need to find a reliable way to extract content from Word, Excel and PowerPoint formats prior to indexing, and I am not sure if POI is the best way to go. Can anybody share experience with ...
Lukas Vlcek
May 12, 2008 at 2:04 pm
May 13, 2008 at 3:50 pm -
Hi all, I have a problem with how to "combine" two scores to sort the search result documents. For example, I have 10 million pages in a Lucene index, and I know their PageRank scores. I give a query ...
过佳
May 28, 2008 at 10:03 am
Jun 2, 2008 at 5:24 am -
"Don't iterate over more hits than needed. Iterating over all hits is slow for two reasons. Firstly, the search() method that returns a Hits object re-executes the search internally when you need ...
Stephane Nicoll
May 10, 2008 at 1:36 pm
May 24, 2008 at 8:59 am -
Hello all, I have been doing some evaluation of Lucene on a TReC collection and get a rather disappointing mean average precision (MAP) of 11%. Other sources seem to report a MAP of about 20%. So I ...
DanaWhite
May 4, 2008 at 6:14 pm
May 15, 2008 at 11:20 am -
It would appear that to see all results (including low scoring) I need to pass a different Filter to Searcher.search[1]. If filter is null, only the highest-scoring results are returned. How do I ...
Hasan Diwan
May 16, 2008 at 3:55 am
May 17, 2008 at 1:05 am -
Hi, I have an issue with boolean queries. I am using Lucene-core-2.3.1. I have run a test on a boolean query with 3 terms (data, store, variable) in my TTL field. The TTL field is indexed and searched ...
Sonu Sudhakar
May 28, 2008 at 11:44 am
Jun 3, 2008 at 5:31 am -
Hi: What is the current status on the distributed lucene project proposed at: http://www.mail-archive.com/general@lucene.apache.org/msg00338.html Thanks -John
John Wang
May 15, 2008 at 4:37 am
Jun 3, 2008 at 4:46 am -
Hi, other than the in-memory terms (.tii) and the few kilobytes of open file buffers, what are some other sources of significant memory consumption when searching a large index (100GB)? The ...
Alex
May 29, 2008 at 10:18 pm
May 29, 2008 at 11:16 pm -
Hi all, I've got a bit of a niggling problem with how one of my searches is working as opposed to how my users would like it to work. We're indexing on UK postcodes, which are in the format of a 3 or ...
Chris Mannion
May 6, 2008 at 4:29 pm
May 24, 2008 at 9:16 am -
Hi, I am looking for a way to filter a SpanQuery according to some other query (on another field from the one used for the SpanQuery). I need to get access to the spans themselves of course. I don't ...
Eran Sevi
May 6, 2008 at 8:15 am
May 12, 2008 at 10:17 pm -
Hi there, We're using Lucene with Hibernate Search and we're very happy so far with the performance and the usability of Lucene. We have, however, a specific use case that prevents us from using only ...
Stephane Nicoll
May 1, 2008 at 8:01 am
May 2, 2008 at 4:59 pm -
I get the following error trace - java.io.FileNotFoundException: no segments* file found in ...
Ravi_116
May 29, 2008 at 3:04 am
Jan 5, 2011 at 7:44 am -
Hi, I haven't been able to find the answer to this question easily so any help would be appreciated. Thanks, Tom ...
Tom Conlon
May 24, 2008 at 10:42 am
May 26, 2008 at 1:55 pm -
Dear Fellow Java/Lucene developers: I am trying to use the Highlighter class to return the keywords that the user is searching for in bold. However, instead of returning a fragment of the block of ...
Syedfa
May 21, 2008 at 1:34 am
May 24, 2008 at 4:25 am -
Hi, I've got an application which stores ratings for content in a Lucene index. It works a treat for the most part, apart from the use-case I have for being able to filter out ratings that have less ...
Dan Hardiker
May 12, 2008 at 5:40 pm
May 14, 2008 at 2:43 pm -
Hi, I am a newbie to Lucene. I have a question about making a query that associates 2 index files: - One index has the content index for a list of documents and a key to the document. That means the ...
Michael Siu
May 6, 2008 at 4:14 pm
May 7, 2008 at 12:02 am -
Bravo Grant! Rajesh, I believe the following will work: - delete your small index - optimize your big index (needed? Not 100% sure, but I think it is) - loop through the docs in your "big" index - ...
Otis Gospodnetic
May 1, 2008 at 1:20 am
May 2, 2008 at 1:41 am -
Hello everybody, sorry for posting to the list but I’m kinda helpless. I’m trying to unsubscribe from the mailing list but my unsubscribe email is treated as spam :) SMTP error from remote server ...
Daniel Freudenberger
May 30, 2008 at 11:12 am
Jun 1, 2008 at 8:45 pm -
Hello out there, We have implemented an open source desktop search app based on Lucene: http://sourceforge.net/projects/dynaq Development keeps moving forward, and currently we are experimenting ...
Christian Reuschling
May 27, 2008 at 4:37 pm
May 28, 2008 at 11:33 am -
hi, I have a ValueSourceQuery that makes use of a stored field. The field contains roughly 27.27 million untokenized terms. The average length of each term is 8 digits. The first search always takes ...
Alex
May 19, 2008 at 6:57 pm
May 20, 2008 at 6:12 pm -
Hi All, I am dealing with a situation where a document could possibly have multiple attachments to it, and they are all added to the index under a document-id (not lucene doc-id). Now if one of the ...
Dino Korah
May 19, 2008 at 8:17 am
May 19, 2008 at 6:36 pm -
As far as I know, Lucene only handles single-word synonyms at index time. My life would be much simpler if it were possible to add synonyms that span multiple tokens, such as "lucene in ...
Karl Wettin
May 17, 2008 at 6:29 pm
May 18, 2008 at 6:00 pm -
Dear all, I'd like to do document clustering using full-text with Lucene. In other words, I would like to group similar documents in their respective groups. I searched the mailing list and found ...
Supheakmungkol SARIN
May 15, 2008 at 3:24 am
May 18, 2008 at 1:53 am -
Hi, I have a field in the index which has been indexed using StandardAnalyzer and as TOKENIZED. Now I would like to write a query which returns a hit if there is an exact match on the field value. Say, ...
Gauri Shankar
May 13, 2008 at 1:44 pm
May 15, 2008 at 5:52 pm -
I have an index with several million documents that each contains between a few hundred terms and up to about a million terms. To me it feels like there would be a rather big difference between the ...
Karl Wettin
May 14, 2008 at 11:41 pm
May 15, 2008 at 3:55 pm -
Hello All, Any suggestions for extracting text from PDF? I have tried PDFBox, and it works nicely; however, if the PDF is structured, it won't provide good results. For example, consider the PDF: P1 Lorem ...
Cam Bazz
May 14, 2008 at 9:32 am
May 15, 2008 at 3:49 pm -
For some reason it seems that either Lucene or Snowball has a problem with the color purple. According to the Snowball experts the problem is with Lucene. Can anyone shed any light? Thanks, Steve ...
Stephen Cresswell
May 9, 2008 at 11:27 pm
May 10, 2008 at 8:22 am -
-- OS: Linux lg99 2.6.5-7.276-smp #1 SMP Fri Sep 28 20:33:22 AKDT 2007 x86_64 x86_64 x86_64 GNU/Linux -- Lucene: 2.3.2 (tried 2.2.0 as well, since the index was built around 2.2.0, jdk1.6.0_01 ) -- ...
Crspan
May 6, 2008 at 6:03 pm
May 7, 2008 at 3:33 pm -
Hi - trying to execute a search in Lucene and getting results I don't understand :( The index contains fields search_text and type - both indexed tokenized. I'm attempting to execute the query: ...
Casey Dement
May 22, 2008 at 10:21 pm
Jun 2, 2008 at 1:22 pm -
Hi All, I am trying to figure out a quick way to find the top N documents sorted by frequency of a term. I found IndexReader.termDocs(), which provides an enumeration of doc() and freq(), but it returns ...
Hider, Sandy
May 28, 2008 at 2:49 pm
Jun 1, 2008 at 8:30 pm -
Hi, I'm running a SpanQuery and get the Spans result which tell me the documents and positions of what I searched for. I would now like to get the payloads in those documents and positions without ...
Eran Sevi
May 22, 2008 at 8:03 am
May 26, 2008 at 7:27 am -
We have a requirement to inform users on a regular basis of new material on which they have expressed interest. How are we to know what is "new" from the point of view of a particular user? Our idea ...
Lucene user
May 22, 2008 at 9:45 am
May 22, 2008 at 8:23 pm -
After upgrading to version 2.3.x from 2.2.0, we started experiencing issues with our index searches. Some searches produced false positives, while others produced no hits for terms known to be in ...
Dan Rugg
May 16, 2008 at 5:49 pm
May 20, 2008 at 6:07 pm -
Hi, I have an application where I need to issue queries with a large number of or-terms with individual boosts. Currently I just construct a BooleanQuery with a large number (often 1000) of ...
John Jensen
May 18, 2008 at 12:26 am
May 18, 2008 at 7:20 pm -
Greetings, I'm searching against a data set using lucene that contains searches such as the following: *ache* *aChe* etc and so forth, sadly this part of the dataset is imported via an external ...
Matthew Hall
May 15, 2008 at 4:35 pm
May 16, 2008 at 6:39 pm -
Hello there! We are starting with Lucene, and to justify its usage, one of the benefits we want to show is performance. I do know that Lucene (like other full-text search engines) provides many more benefits ...
Vinicius Carvalho
May 16, 2008 at 5:20 pm
May 16, 2008 at 5:41 pm -
Hello, We have been using Lucene for a while, and we are happy with it. Now we want to optimize some space. We are parsing versions of files and we want to keep track of history and also know which one is ...
Jean-Claude Antonio
May 15, 2008 at 5:16 pm
May 16, 2008 at 12:02 am -
Hi, I have a TokenStream that inserts synonym tokens into the stream when matched. One thing I am wondering about is what is the effect of the startOffset and endOffset. I have something like this: ...
Brendan Grainger
May 12, 2008 at 4:06 pm
May 12, 2008 at 9:12 pm -
What is the limit of Lucene: # of docs per index? RangeFilter.bits(), for example, initializes a bitset to the size of maxDoc from the IndexReader. I wonder what happens if the # of docs is ...
Michael Siu
May 8, 2008 at 5:23 pm
May 8, 2008 at 6:29 pm -
I'm new to lucene and have a question on how to create a query for the following example... Say I have two fields, Title and Description, with the following data Item 1 Title: The greatest hits ...
Kelvin Foo Chuan Lyi
May 6, 2008 at 4:07 pm
May 6, 2008 at 4:41 pm -
: Hi Lucene-user and Lucene-dev, Please do not cross post -- java-user is the suitable place for your question. : Obviously there is something wrong with the above approach (as to get the : correct ...
Chris Hostetter
May 2, 2008 at 3:57 pm
May 4, 2008 at 3:14 pm -
I am using Hibernate Search in my application. The first time I attempt to index records from the database it works, and the second time I attempt to add records I notice that it does not work ...
Oyesiji
May 2, 2008 at 11:24 pm
May 3, 2008 at 3:31 am -
I am new to web services. This is the situation: We have a document/corpus indexed by Lucene and say it resides on C:\Lucene\Index We are hosting Lucene as a web service (following the instructions ...
Vatsan
May 30, 2008 at 6:42 pm
Jun 3, 2008 at 5:48 am -
Hi, Folks: What are some average search and retrieval times for Lucene queries in real production use? Would people include relevant stuff like the number of documents in your index, etc.? Thanks for ...
Lucene user
May 31, 2008 at 12:25 pm
Jun 1, 2008 at 8:17 pm -
I have a couple of quick questions about how Lucene indexes metadata: - Does it do anything special with metadata or treat it as a supplement to the words in the document? I have a feeling that the ...
Tod
May 20, 2008 at 5:36 pm
May 21, 2008 at 10:59 am
Group Overview
group | java-user |
categories | lucene |
discussions | 86 |
posts | 424 |
users | 116 |
website | lucene.apache.org |
116 users for May 2008