135 discussions - 552 posts

  • I'm trying to burn an index of 14M documents. I have two problems. 1. I have to run optimize() every 50k documents or I run out of file handles. this takes TIME and of course is linear to the size of ...
    Kevin A. Burton
    Jul 7, 2004 at 5:45 am
    Jul 26, 2004 at 5:23 pm
  • Hello All, I have got all the answers from this fantastic mailing list. I have another question ;) What is the best way (Best Practices) to integrate Lucene with live database, Oracle to be more ...
    Hetan Shah
    Jul 15, 2004 at 12:27 am
    Jul 19, 2004 at 3:18 pm
  • When rc3 came out, I modified the classes used for Sorting to, in addition to Integer, Float and String-based sort keys, use Long values. All I did was add extra statements in 2 classes (SortField ...
    Greg Gershman
    Jul 21, 2004 at 1:22 pm
    Jul 22, 2004 at 3:14 pm
  • I have seen that default lucene core has GermanAnalyzer, Russian analyzer. Also snowball has German and Russian Stemmers. If I want to support german characters, which analyzer should I be using ...
    Praveen Peddi
    Jul 2, 2004 at 6:06 pm
    Aug 6, 2004 at 3:28 pm
  • So.. the other day I sent an email about building an index with 14M documents. That went well but the optimize() was taking FOREVER. It took 7 hours to generate the whole index and when complete as ...
    Kevin A. Burton
    Jul 8, 2004 at 6:02 pm
    Jul 9, 2004 at 4:05 pm
  • Are the package information and import paths ready to deploy on a Tomcat server? I tried extracting Lucene on the server, but when I compile files, it just throws numerous no class definition errors and ...
    Ian McDonnell
    Jul 21, 2004 at 12:10 pm
    Jul 22, 2004 at 4:04 am
  • I was going to create a new IDField class which just calls super( name, value, false, true, false) but noticed I was prevented because Field.java is final? Why is this? I can't see any harm in making ...
    Kevin A. Burton
    Jul 11, 2004 at 12:38 am
    Jul 14, 2004 at 7:08 am
  • Hi guys, I have a question dealing with addIndexes. When an addIndexes method is done merging indexes together, it calls optimize. We would like to avoid the optimize and had thought about just opening ...
    Jul 2, 2004 at 10:16 pm
    Jul 8, 2004 at 6:49 am
  • Hello all, I need to search for words in URLs which have been indexed. For example, I have "www.jakarta.org". If I search "jakarta", Lucene won't return a result. If I search "www.jakarta*", Lucene ...
    Thomas QUESTE
    Jul 23, 2004 at 11:54 am
    Jul 23, 2004 at 1:58 pm
  • Hi: I am trying to store some database-like field values in Lucene. I have my own way of storing field values in a customized format. I guess my question is whether we can make the Reader/Writer ...
    John Wang
    Jul 20, 2004 at 3:29 pm
    Jul 21, 2004 at 8:03 am
  • My search results are only displaying the top portion of the indexed documents. It does match the query in the later part of the document. Where should I look to change the code in demo3 of default ...
    Hetan Shah
    Jul 2, 2004 at 8:15 pm
    Jul 14, 2004 at 12:32 pm
  • Hi all, First let me explain what I found out. I'm running Lucene on a 4 CPU server. While doing some stress tests I've noticed (by doing full thread dump) that searching threads are blocked on the ...
    Jul 12, 2004 at 2:10 pm
    Jul 14, 2004 at 9:55 am
  • As of 1.3 (or was it 1.4) Lucene migrated to using java.io.tmpdir to store the locks for the index. While under most situations this is safe, a lot of application servers change java.io.tmpdir at ...
    Kevin A. Burton
    Jul 8, 2004 at 4:34 am
    Jul 12, 2004 at 7:14 am
  • Hi gurus: I am trying to be able to control the indexing process. While lucene tokenizes the words in the document, it counts the frequency and figures out the position, we are trying to bypass this ...
    John Wang
    Jul 7, 2004 at 6:28 pm
    Jul 8, 2004 at 9:09 pm
  • Hi, Is it possible to retrieve token offsets (Token.startOffset(), Token.endOffset()) later, when a document is found and returned in the hit collection? I need these values for highlighting. I've already ...
    Stepan Mik
    Jul 21, 2004 at 10:59 am
    Jul 23, 2004 at 8:08 am
  • Hi, What is the best way to get Lucene to assign weightings to certain fields from a database? For example, the 'name' field should be weighted higher than the 'description' field. Thanks, John. ...
    John Patterson
    Jul 21, 2004 at 2:07 pm
    Jul 22, 2004 at 2:37 pm
  • See the explain functionality in the Javadocs and previous threads. You can ask Lucene to explain why it got the results it did for a given hit. I search the index on multiple fields. Could the search ...
    Grant Ingersoll
    Jul 12, 2004 at 8:59 pm
    Jul 14, 2004 at 7:18 am
  • Possibly a silly question - but how would I go about searching multiple indexes using lucene? Do I need to basically repeat the code I use to search one index for each one, or is there a better way ...
    Toby Tremayne
    Jul 1, 2004 at 10:39 pm
    Jul 2, 2004 at 12:36 am
  • That's not quite right. If you use the same IndexSearcher (or IndexReader) for all of the searches, then only 175MB are used. The arrays in question (the norms) are read-only and can be shared by all ...
    Doug Cutting
    Jul 1, 2004 at 7:37 pm
    Sep 20, 2004 at 4:14 pm
  • Hi, If I do this: - open index writer - add document - open reader - search - close reader - close writer Will the reader pick up the document that was added to the index since it was opened after ...
    Yahootintin 1247688
    Jul 28, 2004 at 3:19 pm
    Aug 6, 2004 at 2:42 pm
  • Hi all, I am trying to make an automatic index update based on a background thread, but it gives errors when deleting the existing index if (and only if) the server accesses the index at the same time ...
    Jitender ahuja
    Jul 28, 2004 at 9:31 am
    Jul 29, 2004 at 3:49 am
  • Hi, I want to join two Lucene indexes but I don't know how to do that. For example, I have a student index and a school index. In the school index I have the studentId field. How can I do that? Any idea ...
    Jul 20, 2004 at 5:19 pm
    Jul 23, 2004 at 1:00 am
  • Hi, I've gone through all of the past messages regarding the CJKAnalyzer but I still must be doing something wrong because my searches don't work. I'm using the IndexHTML application from the ...
    Jon Schuster
    Jul 2, 2004 at 8:50 pm
    Jul 15, 2004 at 3:08 pm
  • Hallo, I have documents that only have numeric values (and dates) and I want to be able to do the following: given e.g that the document represents a Person the fields are ...
    Akmal Sarhan
    Jul 26, 2004 at 3:18 pm
    Nov 17, 2004 at 4:13 pm
  • Hello, Can someone on the mailing list send me a copy of sample code of how to implement the phrase query for my search. Regular Query is working fine, but the Phrase Query does not seem to work. ...
    Hetan Shah
    Jul 26, 2004 at 11:12 pm
    Jul 30, 2004 at 7:51 am
  • Dear All, I need help updating the index created for the database search. I created the index with three fields mapping to the three columns of the database (oid (primary key), title, contents). Then I created ...
    Jul 26, 2004 at 9:32 am
    Jul 27, 2004 at 12:51 pm
  • Someone came into my office today and asked me about the project I am trying to use Lucene for -- "why aren't you just using a MySQL full-text index to do that" -- after thinking about it for a few ...
    Tim Brennan
    Jul 20, 2004 at 7:29 pm
    Jul 22, 2004 at 6:34 pm
  • Hi, I have multiple threads reading an index. Should they all be using the same IndexReader and using a pool of IndexSearchers? Or should they be using a pool of IndexReaders? Basically, one reader ...
    Yahootintin 1247688
    Jul 11, 2004 at 5:43 am
    Jul 13, 2004 at 4:14 pm
  • I've been working with the Field class doing index conversions between an old index format to my new external content store proposal (thus the email about the 14M convert). Anyway... I find the whole ...
    Kevin A. Burton
    Jul 11, 2004 at 9:19 am
    Jul 13, 2004 at 1:46 am
  • Hi Guys, Finally I have sorted the problem of hits score thanks to the great help of Franck. I have hit another problem with the boolean operators now. When I search for "Winston and churchill" i get ...
    Niraj Alok
    Jul 7, 2004 at 3:00 pm
    Jul 9, 2004 at 7:44 am
  • Howdy, I am new to Lucene and thus far I am very impressed. Thanks to all who have worked on this project! I am working on a project where I want to do the following: 1.) Index a bunch of documents. ...
    Matt Galloway
    Jul 28, 2004 at 8:47 pm
    Jul 29, 2004 at 4:37 pm
  • Hi, I have a serious performance problem while extracting text from pdf. Here is the code (w/o try/catch blocks): File file = new File("test.pdf"); FileInputStream reader = new FileInputStream(file); ...
    Miroslaw Milewski
    Jul 28, 2004 at 9:30 pm
    Jul 29, 2004 at 2:24 pm
  • Can Lucene's indexer be used to store info in fields in a MySQL db? If so, can anybody point me to an example or some documentation relating to it? Ian
    Ian McDonnell
    Jul 20, 2004 at 12:45 pm
    Jul 20, 2004 at 2:31 pm
  • Just trying to do a src build using ant on lucene 1.4 final. and getting compile error for SortComparator.java. Any ideas? ##################################### D:\lucene-1.4-final ant Buildfile: ...
    Juan dix
    Jul 16, 2004 at 5:58 pm
    Jul 16, 2004 at 9:24 pm
  • How do I remove document normalization from scoring in Lucene? I just want to stick to TF IDF. Thanks.
    Jones G
    Jul 14, 2004 at 8:52 pm
    Jul 15, 2004 at 5:20 pm
  • Hi All, I am looking for a Java spell checker, open source or not. Can anyone recommend a good one? Thanks in advance, Lynn
    Lynn Li
    Jul 13, 2004 at 2:46 pm
    Jul 13, 2004 at 3:54 pm
  • I have a Lucene Document with a field named Code which is stored and indexed but not tokenized. The value of the field is ABC5-LB. The only way I can match the field when searching is by entering ...
    Polina Litvak
    Jul 7, 2004 at 8:16 pm
    Jul 9, 2004 at 8:55 pm
  • I traverse a series of files under a parent directory (similar to the demo sample) and store the filename in a Document Keyword field called 'Filename'. I am using the StandardAnalyzer for both ...
    Robert Brown
    Jul 5, 2004 at 4:21 pm
    Jul 5, 2004 at 9:04 pm
  • Hi there, my question is a pretty short one! How can I prevent Lucene from cutting out special characters (i.e. the "_") during tokenization of a text? It's quite essential for me to have some non ...
    Marcus Rau
    Jul 29, 2004 at 10:47 am
    Jul 30, 2004 at 9:16 am
  • Is there any way to cache TermDocs? Is this a good idea?
    John Patterson
    Jul 26, 2004 at 7:41 pm
    Jul 27, 2004 at 3:05 pm
  • Hi: Maybe this has been asked before. Is there a plan to support ACL check on the documents in lucene? Say I have a customized ACL check module, e.g.: boolean ACLCheck(int docID,String user,String ...
    John Wang
    Jul 22, 2004 at 5:59 pm
    Jul 23, 2004 at 10:02 am
  • I would like to implement the following functionality: - Search a specific field (category) and limit the search where the title field begins with a given letter, and return the results sorted in ...
    O'Hare, Thomas
    Jul 9, 2004 at 2:28 am
    Jul 12, 2004 at 5:46 pm
  • Hi, I'm trying to search for a term that contains an asterisk. This is the field that I indexed: - new Field("testField", "Hello *foo bar", true, true, true); I'm trying to find this document by ...
    Yahootintin 1247688
    Jul 7, 2004 at 6:27 pm
    Jul 7, 2004 at 7:53 pm
  • This demo runs on a handful of boxes. It was originally running on three dual-processor boxes, but I think Yahoo! subsequently moved it to six or eight single-processor boxes. Queries are broadcast ...
    Doug Cutting
    Jul 1, 2004 at 7:24 pm
    Jul 1, 2004 at 11:35 pm
  • Hi, Is there anything in lucene that would help with the implementation of a progress bar. Somewhere I could throw an event that says the search is 10%, 20% complete etc. Or is there already an ...
    Hannah c
    Jul 28, 2004 at 5:32 pm
    Jul 29, 2004 at 4:28 pm
  • Dear All, how can I find out when (last-modified time) the last document was added to the index? Thanks and regards, Raju
    Jul 27, 2004 at 10:41 am
    Jul 27, 2004 at 1:42 pm
  • FYI, I am using PDFBox.jar to convert PDF to text. The problem is that at runtime it prints a lot of object messages. How can I avoid this? import java.io.InputStream; ...
    Jul 23, 2004 at 2:58 pm
    Jul 23, 2004 at 3:15 pm
  • Hello guys, What are some general techniques to make lucene search faster? I'm thinking about splitting up the index. My current index has approx 1.8 million documents (small documents) and index ...
    Anson Lau
    Jul 21, 2004 at 3:00 am
    Jul 22, 2004 at 11:59 pm
  • I want to use the * in the middle of a phrase search like "java j2*". Does anyone know how I can achieve that? Thanks, Albert
    Albert Vila
    Jul 5, 2004 at 9:07 am
    Jul 6, 2004 at 11:24 am
  • I have read many emails in lucene mailing list regarding analyzers. Following is the list of languages lucene supports out of box. So they will be supported with no change in our code but just a ...
    Praveen Peddi
    Jul 1, 2004 at 9:15 pm
    Jul 2, 2004 at 4:42 pm
Group Overview
period: Jul 2004
group: java-user @

124 users for July 2004

  • Erik Hatcher: 39 posts
  • Doug Cutting: 36 posts
  • Kevin A. Burton: 23 posts
  • Karthik N S: 18 posts
  • Otis Gospodnetic: 18 posts
  • Daniel Naber: 17 posts
  • John Wang: 17 posts
  • Aviran: 15 posts
  • Hetan Shah: 15 posts
  • Ian McDonnell: 12 posts
  • Grant Ingersoll: 11 posts
  • Yahootintin 1247688: 10 posts
  • Praveen Peddi: 10 posts
  • Wallen: 9 posts
  • Anson Lau: 9 posts
  • Don Vaillancourt: 9 posts
  • Lingaraju: 9 posts
  • Sergiu Gordea: 9 posts
  • Jones G: 8 posts
  • David Spencer: 7 posts