102 discussions - 471 posts

  • I've got a DB of about 20000 pages which I thought I'd update to Lucene 2.2. I removed the old index (2.0 based) completely, and started re-indexing all the documents. I do this in stages, of about ...
    Bill Janssen
    Nov 28, 2007 at 5:42 pm
    Dec 18, 2007 at 4:16 pm
  • Using Solr, we have been running an indexing process for a while and when I checked on it today, it spits out an error: java.lang.RuntimeException: java.io.FileNotFoundException: ...
    Ryan McKinley
    Nov 10, 2007 at 9:06 pm
    Nov 13, 2007 at 7:03 pm
  • Hi. I have a situation where I'm searching amongst some 100K feeds and only want one result per site in return. I have developed a really simple method of grouping which just scrolls through the ...
    Marcus Herou
    Nov 5, 2007 at 5:58 am
    Aug 2, 2009 at 8:56 am
  • I have a performance problem when I need to group the search results. I have the code below: for (int i = 0; i < hits.length(); i++) { doc = hits.doc(i); obj1 = ...
    Haroldo Nascimento
    Nov 18, 2007 at 9:32 pm
    Nov 19, 2007 at 10:41 pm
  • Hi everyone, I am trying to obtain the score for each document in the index relative to a given query. For example, if I have the query "search file", I am trying to get the list of all documents in ...
    Nov 19, 2007 at 4:06 pm
    Nov 20, 2007 at 11:10 pm
  • In Lucene 2.x, in method Lock#obtain(long lockWaitTimeout) I see the following line: int maxSleepCount = (int)(lockWaitTimeout / LOCK_POLL_INTERVAL); Since I wanted to set the lock timeout to the ...
    Nikolay Diakov
    Nov 7, 2007 at 1:22 pm
    Nov 8, 2007 at 11:33 am
  • We've run into a blocking problem with our use of Lucene: we get OutOfMemoryError when performing a one-term search in our index. The search, if completed, should give only a few thousand hits, but ...
    Lars Clausen
    Nov 13, 2007 at 2:38 pm
    Jan 25, 2008 at 10:00 pm
  • Hi, Optimizing my index of 1.5 million documents takes days and days. I have a collection of 10 million documents that I am trying to index with Lucene. I've divided the collection into chunks of ...
    Barry Forrest
    Nov 11, 2007 at 11:17 pm
    Nov 12, 2007 at 10:49 pm
  • Each of our documents contains a total of 23 fields, and we STORE all of them in the Lucene index. We have recently had some performance issues and our analysis has shown the bottleneck to be ...
    Nov 20, 2007 at 11:30 am
    Nov 22, 2007 at 4:51 pm
  • Hi All, I'll explain what I'm working on, and then I'll ask my two questions. I'm working on the issue https://issues.apache.org/jira/browse/SOLR-380 which is a feature request that allows one to ...
    Tricia Williams
    Nov 17, 2007 at 3:37 am
    Nov 20, 2007 at 6:39 pm
  • I have posted before about a problem with TermDocs.skipTo() but never managed to reproduce it. I have now got it to fail using the following program; please can someone try it and see if they get ...
    Mike Streeton
    Nov 9, 2007 at 10:52 am
    Nov 14, 2007 at 1:27 pm
  • Hi I have indexed this html document =============z1======================== <html> <body> <h1> zo zo zo zo zo zo zo zo zo zo zo zo </h1> <br> <h1> zo zo zo zo zo zo zo zo zo zo zo zo </h1> <br> <h1> zo zo zo ...
    Jamal jamalator
    Nov 1, 2007 at 1:24 am
    Nov 2, 2007 at 5:26 pm
  • Hi! I search a 1.5 GB index and fuzzy queries are really slow; something like avg. ~500ms (IndexSearcher.search(Query, HitCollector)). When performing exact queries I achieve response times <25ms. ...
    Timo Nentwig
    Nov 24, 2007 at 4:36 pm
    Nov 26, 2007 at 6:36 pm
  • I was wondering about methods for analyzing various languages, and this is what I understand (please correct me if I'm wrong): 1. To analyze a non-English language I need to use a specific analyzer. Link to ...
    Nov 18, 2007 at 5:09 pm
    Nov 26, 2007 at 4:31 pm
  • Hallo *; I went through some examples of the Lucene in Action book to find that the API has changed and then applied the corrections with the help of this forum. One runtime problem however remains. ...
    Nov 18, 2007 at 10:16 pm
    Nov 20, 2007 at 10:53 am
  • Hi! I have different document types (Books, Magazines, Author, whatever) in the index and a FieldSelector is document type specific (for Books LOAD isbn and title, for Author name, ...). The ...
    Timo Nentwig
    Nov 30, 2007 at 11:28 am
    Dec 5, 2007 at 7:22 pm
  • Folks, I have some additional textual data that is user specific, basically annotations about documents. I would like to be able to do **combined** searches, looking for some words in the document and ...
    Lucene user
    Nov 26, 2007 at 10:34 pm
    Nov 28, 2007 at 3:23 am
  • Boosting a one-term query does not have an effect on the score. For example: apple has the same score as: apple^3. But repeating the term will up the score: apple apple apple. I expected the score to go ...
    Nov 29, 2007 at 9:34 pm
    Dec 13, 2007 at 12:41 am
  • Hi there. I am a new Lucene user and I have been searching the group archives but couldn't solve the problem. I have just joined a project that uses Lucene. We use the StandardAnalyzer for indexing ...
    Ruchi Thakur
    Nov 30, 2007 at 6:53 pm
    Dec 3, 2007 at 1:19 pm
  • This is exactly right. That final true (which is the "create" arg) will clear out the index. This looks right to me! Mike
    Michael McCandless
    Nov 25, 2007 at 9:52 am
    Nov 26, 2007 at 10:16 am
  • Is there a problem with the term frequency count (tf) and the IndexSearcher.explain method? I'm searching the following string (fieldname is description) for the term 'salesman' and receive the ...
    John Griffin
    Nov 23, 2007 at 11:23 pm
    Nov 25, 2007 at 3:00 am
  • Hi, I want to implement a search that finds the exact phrase I provide. If the phrase I am searching for is "green tree", I do NOT want to get results for "green" or "tree", but only results for "green ...
    Nov 22, 2007 at 11:07 am
    Nov 24, 2007 at 3:13 pm
  • Hello all, I don't know if this is a somewhat naive question, but here we go: Does Lucene support indexing by sections? Like having a text document with three sections divided by XML tags indexed in a ...
    Cláudio Fernandes
    Nov 13, 2007 at 12:22 pm
    Nov 13, 2007 at 5:10 pm
  • Hi, We are having an issue while indexing Chinese documents in Lucene. Some background first: Since CJK languages don't have spaces between words, we first have to determine the words from ...
    Cedric Ho
    Nov 9, 2007 at 3:00 am
    Nov 11, 2007 at 2:45 am
  • I've been looking at the highlighter examples. All of them seem to deal with fragments. I need to highlight an entire document as it is displayed (i.e., highlight all of the keywords in it). Can ...
    Scott Smith
    Nov 28, 2007 at 7:13 am
    Nov 29, 2007 at 5:01 am
  • Hi Guys, I have run into a problem implementing some extra scores besides the VSM model. My work entails re-ranking the returned documents using extra scores such as page quality or page property (good ...
    Zhou Qi
    Nov 15, 2007 at 3:15 am
    Nov 17, 2007 at 5:25 am
  • Is anyone on this list using the gdata server? I have been trying to get it working and have been running into some problems.
    Lyth, Christopher [USA]
    Nov 15, 2007 at 10:32 pm
    Nov 16, 2007 at 9:09 am
  • Hi: I want to build a custom term frequency vector and add it to the field to store it in the index. I want to use Lucene for research; I'm planning some experiments, so I need to store a term ...
    Nov 6, 2007 at 10:28 pm
    Nov 10, 2007 at 4:22 am
  • Hi, I tried to use the CheckIndex tool (the latest svn code) and I was surprised to notice that all my indexes from production (around 30) are corrupt. This is highly unlikely because they were ...
    Bogdan Ghidireac
    Nov 27, 2007 at 2:44 pm
    Nov 28, 2007 at 1:54 pm
  • Hi there, I am currently using an FSDirectory to build my index. The reason for using a file system based index is that a full index rebuild takes around 30 minutes and I want to keep a persistent ...
    Hardy Ferentschik
    Nov 27, 2007 at 6:55 am
    Nov 27, 2007 at 3:10 pm
  • Hi, I have a question: does Lucene offer a mixed index storage structure, that is, one that first searches in memory (RAMDirectory) and, if nothing is found, automatically searches the index file? For ...
    Haroldo Nascimento
    Nov 24, 2007 at 3:26 pm
    Nov 26, 2007 at 6:56 pm
  • Hi: What is the right way to set a customized position value on a token at indexing time? Thanks -John
    John Wang
    Nov 19, 2007 at 9:14 pm
    Nov 20, 2007 at 3:12 am
  • Hi All, Can someone explain this line to me? I encountered it while setting up Lucene: "Connect to the top-level of your Lucene installation". Kindly guide me in this regard. Liaqat Ali ...
    Liaqat Ali
    Nov 19, 2007 at 10:43 am
    Nov 19, 2007 at 1:56 pm
  • Hi there, Currently I am trying to get synonyms to work. I have gotten as far as injecting them into the index as Token.type SYNONYM. Lucene then finds the original word and synonym and points to the ...
    Matthijs Bierman
    Nov 12, 2007 at 2:55 pm
    Nov 15, 2007 at 2:13 pm
  • I have briefly reviewed the SimpleFSLock of Lucene 2.1 and 2.2. I see that the lock release mechanism does not check the return value of delete: public void release() { lockFile.delete(); } On most ...
    Nikolay Diakov
    Nov 9, 2007 at 3:18 pm
    Nov 10, 2007 at 2:53 pm
  • Hi, We have been developing an enterprise logging service at the Wachovia bank. The logs (business, application, error) for all the bank-related applications are consolidated at one single location ...
    Sandeep Mahendru
    Nov 4, 2007 at 6:49 pm
    Nov 6, 2007 at 2:36 am
  • Hi, I have an application using Lucene 2.2.0 that opens an IndexSearcher only once to optimize performance, because opening the index is a heavy operation. My question is, if I modify the index with ...
    Enrique Lamas
    Nov 5, 2007 at 1:46 pm
    Dec 10, 2007 at 9:34 am
  • Hi, I ran into a problem with prefix query search when the prefix text contains a hyphen. I'm using lucene-2.1. The search query is like this: ttl:co-operative. It returns more than 50 results, but if I ...
    Nov 27, 2007 at 7:48 am
    Nov 30, 2007 at 8:54 am
  • Hi all, I want to compute the co-occurrence frequency between a word and a phrase (this phrase contains some words, and the words in it should be successive and in order). It's like a NEAR operation ...
    Nov 28, 2007 at 11:42 am
    Nov 29, 2007 at 2:21 pm
  • Hi all, I am having a problem with Lucene 2.2.0 with regard to the contents of the Explanation objects after a PhraseQuery search. I indexed two documents doc1 and doc2 and then issue an OR Boolean ...
    Ng Vinny
    Nov 27, 2007 at 8:55 pm
    Nov 27, 2007 at 10:23 pm
  • Hi, I sort search results by two criteria ("priority" and then "score") for each document. I need to present results that have the same score in randomized order. For example: *Result of ...
    Haroldo Nascimento
    Nov 26, 2007 at 11:32 pm
    Nov 27, 2007 at 3:50 pm
  • Is there such a thing as a JDBC driver for Lucene that allows you to run SQL to select from an index? Many thanks, Mike
    Mike Streeton
    Nov 26, 2007 at 9:13 am
    Nov 26, 2007 at 4:40 pm
  • How do I force the MultiFieldQueryParser to interpret a string like "dock boat" as "dock* boat*" and therefore use PrefixQuery instead of TermQuery? The customer wants always to search with <word * as ...
    Anders Lybecker
    Nov 21, 2007 at 7:16 pm
    Nov 23, 2007 at 10:36 am
  • Hi, I would like to have a query parser which is fault tolerant. I have searched the archive and found this: http://www.nabble.com/Error-tolerant-query-parsing-tf108987.html#a300382 I ...
    Nicolas Lalevée
    Nov 20, 2007 at 12:56 pm
    Nov 22, 2007 at 9:49 pm
  • Hi: It was interesting hearing about the need for real time indexing at the BirdsOfAFeather round table. We also needed to solve this problem. We took this approach: A large disk index that indexes ...
    John Wang
    Nov 16, 2007 at 6:44 am
    Nov 16, 2007 at 12:06 pm
  • Hi, I have a question regarding the way I got around the 'TooManyClauses' exception when using wild card queries ...
    Hardy Ferentschik
    Nov 12, 2007 at 9:45 pm
    Nov 14, 2007 at 9:53 pm
  • Hi Everyone, We are planning on scaling our current web server by adding a machine with a similar specification. Both machines will be running Lucene searches. What we plan to do is add a load balancer ...
    Nov 12, 2007 at 9:48 am
    Nov 13, 2007 at 7:41 am
  • Hello, I know this has gone around a bit, but has anyone had any success with pulling text from Office 2007 files? Any recommendations? Thanks, Michael ...
    Michael Prichard
    Nov 8, 2007 at 1:34 pm
    Nov 8, 2007 at 3:44 pm
  • I want to retrieve Lucene search results from the web page and put them into an Oracle database through JDBC, and after some manipulation I want to display the results again after fetching them from ...
    Nov 8, 2007 at 12:47 am
    Nov 8, 2007 at 9:38 am
  • Hi, I'm using IndexReader.deleteDocuments(Term) to delete documents in batches. I need the deleted count, so I cannot use IndexWriter.deleteDocuments(). What I want to do is delete documents based on ...
    Antony Bowesman
    Nov 26, 2007 at 5:53 am
    Dec 4, 2007 at 9:18 am
Group Overview
group: java-user @

127 users for November 2007

Grant Ingersoll: 41 posts
Michael McCandless: 31 posts
Erick Erickson: 27 posts
Bill Janssen: 17 posts
Mark Miller: 14 posts
Yonik Seeley: 13 posts
Haroldo Nascimento: 12 posts
Cool Coder: 10 posts
Daniel Naber: 10 posts
Liaqat Ali: 10 posts
Chhabra, Kapil: 9 posts
Chris Hostetter: 9 posts
Mark harwood: 9 posts
Mike Streeton: 8 posts
Nikolay Diakov: 8 posts
Shai Erera: 8 posts
German Kondolf: 7 posts
John Wang: 7 posts
Matthijs Bierman: 7 posts
Ryan McKinley: 7 posts