Hi, I tried stressing Lucene in a controlled environment: one static IndexSearcher for an index that doesn't change, and in the same process I create a number of Threads that call this Searcher ...
Oren Shir
Nov 21, 2005 at 3:23 pm
Apr 5, 2006 at 9:52 pm -
28
Memory Usage
Hi. What is the expected memory usage of Lucene these days? I dug up an old email [1] from 2001 which gave the following summary of memory usage: An IndexReader requires: one byte per field per ...
Daniel Noll
Nov 10, 2005 at 12:43 am
Nov 18, 2005 at 2:38 am -
13
Lucene + LSI
Has anyone implemented LSI in Lucene? Kindly let me know how hard/easy it is. Thanks, Chandana
Chandana
Nov 30, 2005 at 11:42 pm
Dec 20, 2005 at 9:20 am -
Hello, My analyzer sometimes gives multiple terms for the same word. This makes them generated at the same position. When I use PhraseQuery to search for this term, it matches documents with all ...
Ahmed El-dawy
Nov 4, 2005 at 8:46 am
Nov 7, 2005 at 9:08 pm -
13
Creating document fields by providing termvector directly (bypassing the analyzing/tokenizing stage)
Hi, I'm using lucene (which rocks, btw ;) behind the scenes at www.last.fm for various things, and i've run into a situation that seems somewhat inelegant regarding populating fields which i already ...
Richard Jones
Nov 2, 2005 at 1:11 pm
Nov 2, 2005 at 5:21 pm -
I'm working on building a custom highlighter for a client, which may eventually be generalizable. In my work, I've come across some issues I'd like to discuss. One issue is of appended fields ...
Erik Hatcher
Nov 20, 2005 at 1:48 pm
Nov 22, 2005 at 8:55 am -
When I boost fields while indexing, the fields still have a boost of 1.0 during searching. When I view the values via Luke, it confirms the value of 1.0. Do I have to boost it again during search? I ...
Daniel Clark
Nov 17, 2005 at 10:39 am
Nov 19, 2005 at 4:22 pm -
Hi all, Can somebody please suggest ways to optimize the execution time of the query below (or to use some of the Trunk BooleanScorers)... I am probably missing something obvious. Use Case: Here I have ...
Eks dev
Nov 9, 2005 at 5:34 pm
Dec 23, 2005 at 12:04 pm -
Hi. I was wondering if anyone else has seen this before. I'm using lucene 1.4.3 and have indexed about 3000 text documents using the statement: doc.add(Field.Text("contents", new FileReader(f), ...
Marigoldcc
Nov 21, 2005 at 7:53 am
Nov 21, 2005 at 8:44 pm -
Dear all, I'd like to extract each term and its frequency in the index and each file in order to get the potential keywords of each file. Does Lucene provide any built-in method to do that? Thank you ...
Supheakmungkol SARIN
Nov 14, 2005 at 3:22 am
Nov 15, 2005 at 4:55 pm -
Hi, I have 1,00,000 documents in an index, but in the near future it will be 3 million and more. I am somewhat concerned about the searching process with this many documents. I am giving order ...
Manoj Kr. Sheoran
Nov 4, 2005 at 6:07 am
Nov 5, 2005 at 4:10 am -
I am a brand new newbie with respect to Lucene, and I am just figuring out how to include it into an application I am building. (personal blog) In essence I have a set of articles that reside in a ...
Alan Chandler
Nov 23, 2005 at 7:30 pm
Jan 28, 2006 at 8:43 pm -
Hi, I have a requirement to highlight search keywords in the results and display the matching fragment of the text with the results. I am using the Hits highlighting mentioned in Lucene in Action. ...
Harini Raghavan
Nov 30, 2005 at 12:47 pm
Dec 5, 2005 at 12:35 pm -
I'm trying to reverse sort a result set by its date field (YYYYMMDDhhmm). pseudocode: Boolean order = {false | true}; Sort sorter = new Sort(new SortField("date", order)); hits = ...
Michael Pow
Nov 29, 2005 at 1:59 am
Dec 1, 2005 at 5:57 am -
Hi, I am not that experienced with Java and am attempting to implement the commit method for the IndexReader for the application I'm developing. I am trying to extend the IndexReader class but it ...
Malcolm Clark
Nov 25, 2005 at 12:32 pm
Nov 28, 2005 at 4:25 pm -
Is using a QueryParser to parse a query using the same, single instance of Analyzer thread-safe? Or should I create a new Analyzer each time? ...
Sharma, Siddharth
Nov 1, 2005 at 3:46 pm
Nov 1, 2005 at 8:54 pm -
Hello, I use Lucene in a long-running server application on a Linux server, and the other day I got the "Too many open files" exception. I've increased the number of allowed file handles, but was ...
Matt Magoffin
Nov 18, 2005 at 1:08 am
Nov 18, 2005 at 6:22 pm -
I am indexing persons that have the usual fields: name, address, etc. I need to keep track of which names and addresses are active now and which ones are old. I do that by having two sets of fields ...
Lasse L
Nov 11, 2005 at 6:27 pm
Nov 14, 2005 at 8:23 am -
Hello all, I am wondering how many of you actually work with your own scoring mechanism (overriding Lucene's standard scoring) and how many of you work on how to normalise this score. I would like to ...
Karl Koch
Nov 5, 2005 at 8:26 pm
Nov 7, 2005 at 6:43 pm -
Hello group, the scoring formula for Lucene is well explained in "Lucene in Action". However, is this formula also valid for Lucene 1.2 (which I am using)? I need to know that for documentation ...
Karl Koch
Nov 4, 2005 at 7:52 pm
Nov 6, 2005 at 1:35 am -
Hi I am trying to download the source code for tm-extractors-0.4.jar from http://www.textmining.org/ Looks like the site has been hacked. Does anyone know the location of the CVS or SVN repository? ...
Patrick Kimber
Nov 24, 2005 at 4:47 pm
Jan 5, 2006 at 10:55 am -
Hi, I use Lucene to index stuff that changes very often but doesn't need to be real-time for searchers, e.g. the search result can change a couple of times per minute, but I only need to show the ...
Victor Lee
Nov 24, 2005 at 10:51 pm
Nov 25, 2005 at 9:31 pm -
Hi, To finish up the work on my project, we intend to do search clustering. I have already read that there is no clear-cut way of doing that in Lucene. Wondering if anyone has tackled ...
Supreet Sethi
Nov 23, 2005 at 12:05 pm
Nov 24, 2005 at 11:33 am -
Hello, Luceners! What is "stemming"? I have Lucene in Action and found the following definitions on page 103: - reducing words to a root form (stemming) - changing words into the basic form ...
Koji Sekiguchi
Nov 20, 2005 at 3:48 pm
Nov 21, 2005 at 6:34 am -
I have indexed a set of documents that do not have fields. I want to use the getTermFreqVector method from IndexReader to get the frequencies. However when I do that as: TermFreqVector[] z = ...
Anna Buczak
Nov 18, 2005 at 3:37 am
Nov 18, 2005 at 5:41 pm -
Hi Folks. I downloaded the Lucene and tried to do an ant. It initially gave me the following error: BUILD FAILED file:/home/parikpol/downloads/lucene-1.4.3/build.xml:11: Unexpected element "tstamp" I ...
Pol, Parikshit
Nov 16, 2005 at 8:02 pm
Nov 17, 2005 at 7:44 pm -
Hi all. I have a question about sorting. Lucene in Action says: "For numeric types, each field being sorted for each document in the index requires that four bytes be cached. For String types, each ...
Monsur Hossain
Nov 10, 2005 at 1:36 am
Nov 11, 2005 at 12:12 am -
Hi, If I understand correctly, when sorting by Sort.INDEXORDER the oldest documents that were added to the index will be returned first. I want the reverse, because I'm more interested in newer ...
Oren Shir
Nov 3, 2005 at 2:46 pm
Nov 3, 2005 at 4:31 pm -
Hi all, I am interested in developing a system which will use Lucene to implement the search functionality. A key characteristic of this system is that certain information about the indexed documents ...
Marios Skounakis
Nov 16, 2005 at 9:47 am
Jun 17, 2008 at 8:03 pm -
This is another good reason for buying the book ultimately! Thx stefan -----Original Message----- From: Erik Hatcher Sent: Tuesday, November 29, 2005 15:57 To: java-user@lucene.apache.org ...
Stefan Gusenbauer
Nov 29, 2005 at 3:06 pm
Nov 30, 2005 at 8:57 pm -
Hello, I am searching over multiple indices using MultiSearcher. Thus I get hits from various indices. Is it possible to determine from which index a hit comes? The solution I found is to store the ...
Pbatcoi
Nov 29, 2005 at 1:48 pm
Nov 30, 2005 at 12:41 pm -
I've read many comments from users on the list indicating that sorting may/will be performance-heavy. Is high CPU utilization with a sorted search one of the expected performance hits? In tests for ...
Jeff Rodenburg
Nov 20, 2005 at 9:28 pm
Nov 21, 2005 at 7:04 am -
I indexed dates using Field.Keyword(String,Date). The values seem to be encoded when I retrieve them via document.get("date"). Luke confirmed it. How do I decode the Date when retrieving from ...
Daniel Clark
Nov 17, 2005 at 10:43 am
Nov 17, 2005 at 12:14 pm -
Hi All, I'm using a bunch of SpanNearQueries combined in a SpanOrQuery to do a set of searches that match a phrase with a prefix search at the end, i.e. a "phrase with prefix s*" kind of thing that ...
Greg K
Nov 16, 2005 at 8:20 pm
Nov 17, 2005 at 7:48 am -
Howdy all, have a quick question for you... I am seeing quite a difference between optimized index and one that is not optimized. I have read a few papers that say that it shouldn't matter, but I am ...
Aigner, Thomas
Nov 16, 2005 at 7:23 pm
Nov 16, 2005 at 10:28 pm -
Dear all, I'd like to add some other stopwords to the StandardAnalyzer. How do I do this? Thanks a lot in advance, Mungkol
Supheakmungkol SARIN
Nov 14, 2005 at 6:02 am
Nov 15, 2005 at 2:45 am -
I have several indexes I want to search together. What performs better: a single searcher on a multi reader, or a single multi searcher on multiple searchers (1 per index)? Thanks Mike
Mike Streeton
Nov 11, 2005 at 8:49 am
Nov 14, 2005 at 8:54 am -
Hi I'm using highlighter and have this problem: The query is over two or more fields, like: *body:home AND title:sale* I want to highlight over body field, but not highlight "sale" if "sale" is in ...
Ernesto De Santis
Nov 11, 2005 at 3:34 pm
Nov 11, 2005 at 6:40 pm -
I've used the example posted at http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-a801793d7479264e29157d92440199b35266dc18 to find all terms used in a complete index, but was wondering if there is ...
Matt Magoffin
Nov 9, 2005 at 11:50 pm
Nov 11, 2005 at 12:37 am -
Is there a way to limit the number of hits I want returned? Sometimes I just want one document.
Daniel Clark
Nov 10, 2005 at 12:55 am
Nov 10, 2005 at 11:34 pm -
If I have a Document object (doc), and I also have an IndexReader open, how can I find out IndexReader's docid corresponding to (doc)? IndexReader has a map from docid - Document, but I don't see the ...
Tlittell
Nov 9, 2005 at 3:55 pm
Nov 9, 2005 at 9:15 pm -
Hi, I am looking at Lucene to index and search file metadata - filename, size, permissions, mtime, ctime, atime, etc. I do not need to index and search the contents of the file. I was wondering if ...
Lmuxer-mailinglists
Nov 2, 2005 at 10:33 pm
Nov 3, 2005 at 9:22 am -
I have a body of text which is being added to a document as unstored. All the words in the body text are coming through in the token stream for analyzing. For some reason I can search on some of ...
Combs, Craig
Nov 30, 2005 at 1:05 pm
Nov 30, 2005 at 8:44 pm -
Hi, I am stumped. I can't seem to get Word docs indexed. I have tried both the POI and textmining libraries to little or no real effect. I dump the doc files into a text file with the same variable I use ...
Steven Bell
Nov 27, 2005 at 5:49 am
Nov 28, 2005 at 4:28 am -
These are the results for the StandardTokenizer: input - output token - output type 1. 1.2 - 1.2 - <HOST> 2. 1.2. - 1.2 - <HOST> 3. a.b - a.b - <HOST> 4. a.b. - a.b. - <ACRONYM> 5. www.apache.org - ...
Yahootintin 11533894
Nov 22, 2005 at 12:40 am
Nov 22, 2005 at 6:44 pm -
Hi I have checked out the latest version of Lucene from CVS and have found a change in the results compared to version 1.4.3. The issue is with the deprecated API in the BooleanQuery class. The ...
Patrick Kimber
Nov 15, 2005 at 10:24 am
Nov 18, 2005 at 11:09 am -
Hi Our index contains articles with special characters. For instance, the string P&O is indexed as P&amp;O. The correct entity codes are indexed for all the special characters we use. My question is ...
Lucene User
Nov 15, 2005 at 4:00 pm
Nov 16, 2005 at 5:05 pm -
All, In the project I'm working on we have a separate index for each database. There are 12 databases now. but in the future there may be as many as 20. They all have their own release cycle so I ...
Jason Calabrese
Nov 14, 2005 at 10:52 pm
Nov 15, 2005 at 11:16 pm -
I got this nagging problem that I can't figure out in the source code of Lucene. In the file org/apache/lucene/queryParser/QueryParser.java, there's a method called parse that returns a Query (see ...
Eugene Ezekiel
Nov 13, 2005 at 6:39 pm
Nov 13, 2005 at 9:56 pm -
Hi, Was wondering if someone could help me out with a few things in Korean as related to Lucene: 1. Which Analyzer do you recommend? From the list, I see that some have had success with the ...
Grant Ingersoll
Nov 11, 2005 at 1:37 pm
Nov 12, 2005 at 9:47 am
Group Overview
group | java-user |
categories | lucene |
discussions | 127 |
posts | 529 |
users | 140 |
website | lucene.apache.org |
140 users for November 2005