-
Hi all, I have to index about 4.5 million txt files. When I run my indexing application through Eclipse, I get this error: "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space" ...
Sahin Buyrukbilen
Oct 20, 2010 at 4:11 am
Oct 22, 2010 at 5:02 pm -
Hi, I am facing some problems in using Lucene. The index I am using is constructed like this: try { Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English"); Directory dir = ...
Subwayne
Oct 14, 2010 at 9:07 am
Oct 15, 2010 at 9:03 am -
Hello all, I would like to ask how to add new documents to an existing Lucene index. I mean, which class should I use to achieve this goal? Thanks -- http://jacobian.web.id ...
Yakob
Oct 27, 2010 at 12:05 pm
Oct 29, 2010 at 7:21 am -
Hello I am trying to use a TermFreqVector to get a count of all words in a Document as follows: // Search. int hitsPerPage = 10; IndexSearcher searcher = new IndexSearcher(index, true); ...
Martin O'Shea
Oct 20, 2010 at 6:23 pm
Oct 22, 2010 at 2:09 pm -
We are working with a large read-only Lucene index (single segment) with a large number of fields and documents and are running into memory usage problems. We found that when using a ...
Cabansag, Ronald-Alvin R
Oct 29, 2010 at 1:27 pm
Oct 31, 2010 at 3:26 am -
Hi All, Can anyone help with this issue? I have about 2000 PDF files from which I extract text using PDFBox, then index them in a for loop. The indexing stopped after the fdt file reached 7,061 KB ...
Ching
Oct 13, 2010 at 2:39 am
Oct 14, 2010 at 5:33 am -
Hello All: Can anyone suggest the best way to implement both sentence-specific and non-sentence-specific phrase search? The user is going to have a check box for phrase search on the screen that ...
Sirish Vadala
Oct 6, 2010 at 6:33 pm
Oct 8, 2010 at 2:28 am -
I'm stepping through an RDF file (the Project Gutenberg catalog) and sending data to a Lucene index to allow searches of titles, authors and such. However the Gutenberg RDF is a little bit "special". It ...
Paulo Levi
Oct 31, 2010 at 5:20 pm
Nov 1, 2010 at 8:54 am -
Hello, I have been looking at the SearcherManager example provided in the "Lucene In Action 2nd Edition" book. It seems like a great way to manage IndexReaders but I had a few questions about the ...
Pulkit Singhal
Oct 27, 2010 at 4:58 pm
Oct 27, 2010 at 5:47 pm -
Hi Group, I have an issue when using MultiFieldQueryParser. I would like to use one query against a number of fields, however I get a java.lang.IllegalArgumentException: queries.length != ...
Lev Bronshtein
Oct 14, 2010 at 1:05 am
Oct 25, 2010 at 10:58 am -
Dear All, Currently, I'm using PHP/Java Bridge to have Lucene in my PHP web application, and also using the Java extension for PHP. FYI, I set up Lucene on my PC several months ago and my code below ...
Dian puma
Oct 23, 2010 at 4:01 pm
Oct 25, 2010 at 6:13 am -
Hello I have a StandardAnalyzer working which retrieves words and frequencies from a single document using a TermVectorMapper which is populating a HashMap. But if I use the following text as a field ...
Martin O'Shea
Oct 24, 2010 at 7:59 pm
Oct 25, 2010 at 2:31 am -
Well, actually I am doing a kind of thesis on information retrieval, and my tutor wanted me to be able to create a program that first indexes a document in memory using RAMDirectory and then ...
Yakob
Oct 12, 2010 at 1:37 am
Oct 16, 2010 at 4:48 pm -
Hi, is there a way to store additional metadata with fields? My problem is as follows: I'm extracting extended HTML with Tika. This extended HTML contains references to pages, x,y values of the text ...
Christoph Hermann
Oct 14, 2010 at 10:18 am
Oct 15, 2010 at 7:23 pm -
Hi, I am curious. Do you know why the book Lucene in Action, Second Edition is not available on sale (as new) on Amazon UK? http://www.amazon.co.uk/Lucene-Action-Michael-McCandless/dp/1933988177 Do ...
Paolo Castagna
Oct 12, 2010 at 5:25 am
Oct 12, 2010 at 8:27 am -
Hi all, I'm having some issues with Numeric Range queries not working as expected. My underlying storage medium is the Lucandra index reader and writer, so I'm not sure if this is an issue within ...
Todd Nine
Oct 4, 2010 at 4:14 am
Oct 6, 2010 at 7:15 pm -
Hi all, I need to retrieve the score of a term in a document. I don't want to play with different scoring schemes. I just checked my index with Luke and it shows me a score for each term in each document ...
Sahin Buyrukbilen
Oct 1, 2010 at 3:33 pm
Oct 2, 2010 at 6:49 pm -
I'd like to provide myself with a searchable index of email. I'm familiar with the Javamail library, so will use this to fetch the mail. Anyone out there done any indexing of email? On Sourceforge, ...
Hasan Diwan
Oct 27, 2010 at 9:58 pm
Oct 28, 2010 at 6:46 pm -
I've written a blog post about a workaround for updating an index in Lucene using a parallel reader. It's explained with results and pictures. It would be great if you could have a look at it. The link: ...
Nilesh Vijaywargiay
Oct 20, 2010 at 6:58 pm
Oct 28, 2010 at 5:53 am -
Hello, I've been running into a problem during a merge. Would appreciate knowing what to look for since the exception doesn't seem too explanatory. I get: -- --- Nested Exception --- ...
Cristian Vat
Oct 20, 2010 at 6:46 pm
Oct 20, 2010 at 8:19 pm -
Hello I would like to store data retrieved hourly from RSS feeds in a database or in Lucene so that the text can be easily indexed for word frequencies. I need to get the text from the title and ...
Appy74
Oct 14, 2010 at 2:18 pm
Oct 15, 2010 at 6:16 pm -
Hi there, I'm currently trying to work out how I can determine the type (string/number/date/etc.) of a term. I've not seen any off-the-shelf way to do it, so I am trying to store a payload against each ...
Sykes, Derek
Oct 13, 2010 at 3:38 pm
Oct 15, 2010 at 2:29 pm -
I have two indexes, A and B. Can two documents, doc1 (in index A) and doc2 (in index B), have a common field? doc1 and doc2 have the same document IDs.
Nilesh Vijaywargiay
Oct 14, 2010 at 3:43 pm
Oct 15, 2010 at 1:43 am -
Hi all, I only want to index the latest week's data; the previous data can be deleted. So I'd like to know about Lucene's delete performance and whether it will have an impact on the search ...
Jeff Zhang
Oct 13, 2010 at 1:38 pm
Oct 13, 2010 at 8:55 pm -
Hi Group, I understand that the process of updating a document in a Lucene index is to delete the document and add it again. But I do not want to delete the document. I was thinking of an approach where ...
Nilesh Vijaywargiay
Oct 12, 2010 at 6:06 am
Oct 12, 2010 at 6:50 pm -
When running the application on a Windows XP 32 bit machine the search time is 0.5 second. JVM is IBM Java 5 for 32 bit. But when running the same application on a much more powerful Windows Server 2007 64 ...
Sergey
Oct 6, 2010 at 10:22 am
Oct 7, 2010 at 5:09 am -
Having upgraded a live system from 2.4 to 2.9.3 the client is reporting a change in merge behaviour that is causing some issues with their update monitoring logic. The suggestion is that any merge ...
Mark Harwood
Oct 5, 2010 at 10:27 pm
Oct 6, 2010 at 3:25 pm -
Hello, I'd like to know which field got hit in each doc in the hit results. To implement it, I thought I could use Scorer.freq(), which was introduced in 3.1/4.0: ...
Koji Sekiguchi
Oct 4, 2010 at 1:59 am
Oct 4, 2010 at 5:17 pm -
Hi guys, I am trying to get some information on what enterprise hardware folks use out there. We are using Lucene extensively. Our total catalogs size is roughly 50GB between roughly 8 various ...
Kovnatsky, Eugene
Oct 26, 2010 at 12:17 am
Oct 27, 2010 at 10:02 am -
Hey, is there an API that returns the number of terms indexed? I found the API that returns the number of documents indexed (IndexWriter.docCount) but can't find an API for the number of terms in the index. ...
APOLO_11
Oct 16, 2010 at 5:53 am
Oct 16, 2010 at 10:16 am -
Hi, my original problem is to index a large number of documents which contain 360 integers in the range 0-90K. Searching is a little bit complicated: I need to find the most similar documents where ...
Zaharije Pasalic
Oct 15, 2010 at 8:31 am
Oct 15, 2010 at 1:27 pm -
Hi, I am keeping a ConcurrentMap of o.a.l.index.IndexReader which I use in my system. These readers are retrieved by multiple threads and I have no knowledge when these readers are actively used and ...
Mindaugas Žakšauskas
Oct 5, 2010 at 10:14 am
Oct 5, 2010 at 4:01 pm -
Hi all, The JavaDocs do not appear to mention that only stored fields persist IndexWriter.updateDocument. When opening new readers, from either IndexWriter.getReader or IndexReader.open, neither ...
Justin
Oct 4, 2010 at 6:03 pm
Oct 4, 2010 at 6:20 pm -
I need to auto-categorize a large number of documents. They are basically news articles from major news sources (nytimes, npr, abcnews, etc). I'd like to categorize them automatically. Any ...
Maria Vazquez
Oct 27, 2010 at 8:12 pm
Oct 28, 2010 at 2:13 am -
Am about to implement a custom query that is sort of mash-up of Facets, Highlighting, and SpanQuery - but thought I'd see if anyone has done anything similar. In simple words, I need facet on the ...
Lucene
Oct 26, 2010 at 12:32 pm
Oct 26, 2010 at 12:53 pm -
I got stuck on a problem using NumericFields with Lucene 2.9.3. I add values to the document by doc.add(new NumericField("minprice").setDoubleValue(net_price)); If I want to search with a sorter ...
Uwe Goetzke
Oct 26, 2010 at 6:33 am
Oct 26, 2010 at 7:47 am -
I'm currently working on building a Geocoder. The purpose of a Geocoder is to find the coordinates belonging to any given input address. I have a rather simple version based on Lucene working, ...
Jasper de Barbanson
Oct 20, 2010 at 6:48 am
Oct 20, 2010 at 9:05 am -
I have many fields in my document and want to parse my query including each of them QueryParser parser = new QueryParser(Version.LUCENE_29, "Field2",new StandardAnalyzer(Version.LUCENE_29)); Should I ...
Nilesh Vijaywargiay
Oct 18, 2010 at 5:55 pm
Oct 18, 2010 at 10:24 pm -
Is there interest in having a Meetup at ApacheCon? Who's going? Would anyone like to present? We could do something less formal, too, and just have drinks and Q&A/networking. Thoughts? -Grant ...
Grant Ingersoll
Oct 18, 2010 at 6:57 pm
Oct 18, 2010 at 10:05 pm -
Hi, I would like to change the IDF value of the Lucene similarity computation to "inverse document frequency inside category". Not the complete collection should be considered, but only the documents ...
Max Jakob
Oct 18, 2010 at 11:27 am
Oct 18, 2010 at 2:33 pm -
Hello, We're currently evaluating Lucene to index a large English corpus and we are optimizing for space. We're basically concerned that the size of the postings lists will become ...
Mahmoud Abdelkader
Oct 17, 2010 at 7:17 am
Oct 17, 2010 at 8:31 pm -
Hello, how can I copy the Payload from the current token to the following token in a TokenFilter? I have implemented a TokenFilter and thought that I could use input.incrementToken() to advance the ...
Christoph Hermann
Oct 16, 2010 at 6:32 pm
Oct 17, 2010 at 6:03 pm -
Hi all, I am having issues building Lucene and Solr from svn checkout. I had this problem earlier but I was able to figure out the combination of ant and maven-ant-tasks that worked. Last few months ...
Pradeep Singh
Oct 11, 2010 at 3:54 am
Oct 11, 2010 at 3:51 pm -
Hi Guys, Is there a way to detect the org.apache.lucene.util.Version of an index given an IndexReader or just an FSDirectory? I know I can open the segments file and read the proper bytes according to rules of ...
Ivan Vasilev
Oct 8, 2010 at 12:35 pm
Oct 8, 2010 at 1:00 pm -
Hi Everyone, Recently we migrated from Lucene 2.2 to Lucene 2.9.3. We are having some issues in search. During the load, searchers are getting hung up. When we took a process stack, we found ...
Shailendra Mudgal
Oct 7, 2010 at 5:13 am
Oct 7, 2010 at 3:46 pm -
Hi, I was indexing some documents, but my program crashed after several days of work. If I reopen this index it is empty. I guess the reason is that auto-commit was not set and I never performed a ...
Philippe Thomas
Oct 6, 2010 at 10:02 am
Oct 6, 2010 at 10:30 am -
In lucene 3, is there an equivalent to obtaining a BitSet of documents from an Index as there was in version 2.x? I'm trying to put together an upgrade path. Thanks! ...
Jordon Saardchit
Oct 4, 2010 at 7:10 pm
Oct 5, 2010 at 1:47 pm -
Let's say the segment infos file is missing. I'm aware of CheckIndex, but is there a tool to recreate a segment infos file? ...
Jason Rutherglen
Oct 4, 2010 at 7:25 pm
Oct 5, 2010 at 12:39 pm -
How do I use MemoryIndex or RAMDirectory, but score using term statistics from a corpus given during preprocessing? Let's say I want to use a MemoryIndex or RAMDirectory to store a *single* document, ...
Joseph Turian
Oct 29, 2010 at 12:07 am
Nov 1, 2010 at 8:40 pm -
Dear All, I was setting up a web search with a query language that uses (, !, ), ^, *, ?, {, } and < in its syntax. For example: hot dog: Looks for documents with hot and dog in close vicinity. (hot ...
Jan Burse
Oct 28, 2010 at 7:05 pm
Oct 28, 2010 at 11:36 pm
Group Overview
group | java-user |
categories | lucene |
discussions | 73 |
posts | 295 |
users | 102 |
website | lucene.apache.org |
102 users for October 2010