102 discussions - 515 posts

  • Hi, I am trying to delete a document without using the hits object. What is the unique field in the index that I can use to delete the document? I am trying to make a web interface where index can be ...
    Varun sood
    Mar 12, 2008 at 9:00 pm
    Mar 17, 2008 at 7:57 pm
  • Hi all. We're using the document ID to associate extra information stored outside Lucene. Some of this information is being stored at load-time and some afterwards; later on it turns out the ...
    Daniel Noll
    Mar 11, 2008 at 5:50 am
    Mar 17, 2008 at 10:24 pm
  • Hi When bulk loading into a new index I'm seeing this exception Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc ...
    Ian Lea
    Mar 18, 2008 at 11:39 am
    Mar 24, 2008 at 9:12 am
  • This is my situation. I have an index, which has a lot of search requests coming into it. I use just a single instance of IndexSearcher to process these requests. At the same time, this index is also ...
    Sridhar Raman
    Mar 6, 2008 at 10:31 am
    Mar 14, 2008 at 8:10 am
  • I'm running Lucene 2.3.1 with Java 1.5.0_14 on 64 bit linux. We have fairly large collections (~1gig collection files, ~1,000,000 documents). When I try to load test our application with 50 users, ...
    Richard Bolen
    Mar 17, 2008 at 9:58 pm
    Mar 18, 2008 at 1:23 pm
  • Hi, I am trying to use a Lucene index to implement a tag cloud system. I add a new field named "tags" in the index to store all the tags, and we don't support tags with more than one word, so different tags ...
    Mar 31, 2008 at 7:09 am
    Apr 7, 2008 at 5:56 am
  • Hi, I browsed the forum searching for a way to make a query that retrieves documents that do not have any value for a given field (say MY_FIELD_NAME). I read several posts advising to use this syntax ...
    Mar 11, 2008 at 2:17 pm
    Sep 29, 2010 at 6:15 pm
  • Hello all! I had a problem this week, and I'd like to share it with you all. My WebLogic server that generates my index throws its logs in a shared storage. During my indexing process (SOLR+Lucene), this ...
    Lucas F. A. Teixeira
    Mar 26, 2008 at 2:45 pm
    Mar 27, 2008 at 12:10 pm
  • This week I switched the Lucene library version on one customer system. The indexing time went down from 46m32s to 16m20s for the complete task including optimisation. Great job! We index product ...
    Uwe Goetzke
    Mar 1, 2008 at 8:45 am
    Mar 26, 2008 at 3:04 am
  • this is my search content QueryParser parser = new QueryParser("keyword",new StandardAnalyzer()); Query query = parser.parse("1"); Sort sort = new Sort(new SortField(sortField)); Hits hits = ...
    Mar 18, 2008 at 1:24 pm
    Mar 19, 2008 at 11:53 pm
  • Hi, I'm planning to implement a search infrastructure on a P2P overlay. To achieve this, I want to first distribute the indices to various nodes connected by this overlay. My approach is to partition ...
    Yin Qiu
    Mar 1, 2008 at 6:17 pm
    Mar 3, 2008 at 12:47 pm
  • Hi Everyone, We are using Lucene to search on an index of around 20G size with around 3 million documents. We are facing performance issues loading large results from the index. Based on the various ...
    Shailendra Mudgal
    Mar 25, 2008 at 12:43 pm
    Mar 26, 2008 at 5:55 pm
  • Hi all: Maybe this has been asked before: I am building an index consists of multiple languages, (stored as a field), and I have different analyzers depending on the language of the language to be ...
    John Wang
    Mar 13, 2008 at 12:41 am
    Mar 14, 2008 at 6:09 am
  • Hey Guys, just a quick question to confirm an assumption I have. Is it correct that I can have around 100 Indexes each at its Integer.MAX_VALUE limit of documents, but can happily search them all ...
    Mar 6, 2008 at 3:29 pm
    Mar 11, 2008 at 7:19 pm
  • Hi all, Breaking proximity data has been discussed several times before, and concluded that setPositionIncrement is the way to go. In regards of it: 1. Where should it be called exactly to create the ...
    Itamar Syn-Hershko
    Mar 26, 2008 at 12:01 pm
    May 16, 2008 at 10:25 pm
  • Time for another dose of inspiration for investigating Solid State Drives. And no, I don't get percentages from the chip manufacturers :-) This time I'll argue that there's little gain in using a ...
    Toke Eskildsen
    Mar 13, 2008 at 11:04 am
    Apr 18, 2008 at 4:16 pm
  • Dear all, I'm trying to sort query results using a date criterion. My dates are stored as "long" in the database (I cannot change this) and indexed as untokenized. The sorted results I get aren't ...
    Legrand thomas
    Mar 8, 2008 at 6:58 pm
    Mar 20, 2008 at 12:08 am
  • Hi, I currently use multiple fieldable instances for indexing sentences of a document. When there is only one single fieldable instance, the token offset generation performed in DocumentWriter is ...
    Renaud Delbru
    Mar 4, 2008 at 6:05 pm
    Jul 2, 2008 at 5:22 pm
  • What's the easiest way to extract the values of 2 fields from each document in the index? For example, each document has 5 fields: Id, Name, Address, Phone, Preference. I'd like to extract the values for ...
    Dragon Fly
    Mar 20, 2008 at 1:55 pm
    Mar 25, 2008 at 11:34 am
  • Hello, I am a newbie here and still experimenting with Lucene. I have annotations and features generated by GATE for many documents and would like to index the original content of the documents in ...
    Lucene-seme1 s
    Mar 17, 2008 at 12:55 pm
    Mar 19, 2008 at 5:10 am
  • Could anyone provide any insight on why someone would use nutch/lucene or any other search engines to index relational databases? With use cases if possible? Shouldn't the database's own indexing ...
    Duan, Nick
    Mar 4, 2008 at 6:32 pm
    Mar 5, 2008 at 11:28 am
  • Hi Guys, Has anybody integrated the Spell Checker contributed to Lucene? I need advice on where to get a free dictionary file (one that contains all words in English) that could be used to create ...
    Ivan Vasilev
    Mar 25, 2008 at 12:58 pm
    Mar 26, 2008 at 4:22 pm
  • Let's say "the" is considered a stopword. And for example the two documents are document A, content: "... search the database..." and document B, content: "... search database..." So when the user's input is ...
    Chris Lu
    Mar 22, 2008 at 1:21 am
    Mar 23, 2008 at 2:48 am
  • Hi List, Thanks in advance for any help. I'm working with the contrib highlighting class and am having issues when doing searches with a phrase. I've been able to duplicate this behaviour in the ...
    Spencer Tickner
    Mar 18, 2008 at 8:28 pm
    Mar 19, 2008 at 10:53 pm
  • Hi all, I guess this question is a bit off track. Are there any language identification modules inside Lucene? If not, can somebody please suggest a good one? Thank you.
    Raghu Ram
    Mar 14, 2008 at 5:29 am
    Mar 14, 2008 at 3:50 pm
  • I'm using Lucene 2.2.0. I'm in the process of rewriting some queries to build BooleanQueries instead of using the query parser. Bypassing the query parser provides almost an order of magnitude improvement ...
    Beard, Brian
    Mar 5, 2008 at 8:09 pm
    Mar 10, 2008 at 1:09 pm
  • hey everybody, I'm wondering if it's possible to combine wildcards and phrase query. For example "term1 term*" I know that the documentation says "Lucene supports single and multiple character ...
    Mar 6, 2008 at 10:42 am
    Mar 8, 2008 at 3:53 am
  • Is there a direct way to ask an IndexReader what segment it is pointing at? That would make implementing custom deletion policies a LOT easier. It seems like it should be pretty simple -- keep a list ...
    Tim Brennan
    Mar 1, 2008 at 3:04 am
    Mar 5, 2008 at 10:32 pm
  • Hi there, I keep getting the following error when simultaneously reindexing my documents and searching through the index. java.io.IOException: Cannot overwrite: C:\index9121\_2.cfs at ...
    Mar 29, 2008 at 6:28 am
    Mar 29, 2008 at 10:57 am
  • Jay, Have a look at Lucene config, it's all there, including tests. This filter will take a token such as "foobar" and chop it up into n-grams (e.g. foobar - fo oo ob ba ar would be a set of ...
    Otis Gospodnetic
    Mar 25, 2008 at 10:02 pm
    Mar 26, 2008 at 8:49 pm
  • Hi, I'm trying to write to a specific index from several different processes and encounter problems with locked files (deletable for example). I don't perform any specific locking because as I ...
    Eran Sevi
    Mar 19, 2008 at 1:55 pm
    Mar 25, 2008 at 1:48 am
  • Hi All, Can someone please guide me on how to use IndexReader's getFieldNames() method properly? I want to get all the field names in the index. Currently I am getting them via the Document object but that ...
    Varun sood
    Mar 19, 2008 at 1:17 pm
    Mar 19, 2008 at 5:54 pm
  • Hi, I have some questions about the index size on a single machine: What is the biggest index you use in production? Do you use MultiReader/Searcher? What hardware do you need to serve it? What kind ...
    Mar 10, 2008 at 9:06 pm
    Mar 17, 2008 at 3:15 am
  • Hi, I would like to ask for suggestions of the best design for the following scenario: I have a very large number of XML files (around 1M). Each file contains several sections. Each section contains ...
    Eran Sevi
    Mar 11, 2008 at 1:24 pm
    Mar 12, 2008 at 8:55 am
  • Hello, My machine is Ubuntu 7.10. I am working with Apache Lucene. I have done with indexer and tried with command line Searcher (the default command line included in Lucene package: ...
    Mar 21, 2008 at 8:25 pm
    Apr 12, 2008 at 1:19 pm
  • Hi, I'm using 2.3.0 Lucene build and have following merge parameters, mergeFactor = 100 maxMergeDocs = 99999 maxBufferedDocs = 10000 maxRAMBufferSizeMB = 200 After running with this setting for a ...
    Vivek sar
    Mar 30, 2008 at 11:42 pm
    Apr 1, 2008 at 7:03 am
  • Hi All, I need help with retrieving results based on relevance + freshness. As of now, I get results based on only one of the two, either relevance or freshness. How can I achieve this? Lucene retrieves ...
    Mar 19, 2008 at 3:36 pm
    Mar 20, 2008 at 3:21 am
  • Hello there! I have just started with lucene. Bought the Lucene in action book [right now I'm at chap 4, plus the 10th chapter, great explanation by Terence from jGuru, really nice stuff], also I'm ...
    Vinicius Carvalho
    Mar 19, 2008 at 2:17 pm
    Mar 19, 2008 at 6:06 pm
  • Hi, Thanks for spending time on the problem. Sorting the results of a Lucene search takes more time and does not look that useful. Can anyone help? Below is my code: sort = new Sort(new ...
    Mar 13, 2008 at 10:50 am
    Mar 17, 2008 at 12:10 pm
  • Hi, I'd like to find out if I can do the following with Lucene (on Windows). On server A: - An index writer creates/updates the index. The index is physically stored on server A. - An index searcher ...
    Dragon Fly
    Mar 14, 2008 at 12:18 pm
    Mar 14, 2008 at 5:16 pm
  • Hi, I want to create an index with one unique field. Before inserting a document I must be sure that "unique field" is unique. John ...
    Ion Badita
    Mar 11, 2008 at 3:07 pm
    Mar 13, 2008 at 1:49 pm
  • Hi all, I'm trying to index documents so that a) I have all the documents indexed 'normally' (in that I can search for documents that match certain words), and b) parts of the document that I consider ...
    Steve Suppe
    Mar 7, 2008 at 6:39 pm
    Mar 7, 2008 at 9:52 pm
  • Hi, I need to use stop-word bigrams, like the Nutch analyzer, as described in LIA 4.8 (Nutch Analysis). What I don't understand is, why does it keep the original stop word intact? I can see great ...
    John Byrne
    Mar 3, 2008 at 10:41 am
    Mar 3, 2008 at 5:19 pm
  • Hello, I just joined the list and need some help. I have a database of music tracks. These tracks have been added to an index. They are classified using keywords, so a track can have up to 20 keywords ...
    Fiaz Khan
    Mar 31, 2008 at 9:41 am
    Mar 31, 2008 at 2:55 pm
  • Suppose I have two fields, field1 and field2, and let the scores for a query from field1 and field2 be score1 and score2 respectively. Now, when computing the final Lucene score, instead of score1, I want ...
    Mar 24, 2008 at 6:17 am
    Mar 30, 2008 at 1:36 am
  • Hi all, our problem is to choose the best (the fastest) way to iterate over a huge set of documents (the basic and most important case is to iterate over all documents in the index). Some slow process ...
    Wojtek H
    Mar 26, 2008 at 9:49 am
    Mar 27, 2008 at 7:23 pm
  • Hi: Is there a way to randomly access a term value in a field? E.g. in my field, content, the terms are: lucene, is, cool. Is there a way to access content[2] - cool? Thanks -John
    John Wang
    Mar 25, 2008 at 5:32 pm
    Mar 26, 2008 at 3:20 pm
  • Hi all, I posted this in Solr mailing but then I thought it would be more appropriate to have it here. I thought many people would encounter the situation I'm having here. Basically, we'd like to ...
    Mar 23, 2008 at 11:37 pm
    Mar 24, 2008 at 11:42 am
  • Hi, I am using the Directory class's copy method to periodically sync my RAM based index to a file based index that's supposed to serve as a hot backup. I want to know if this is the right way to ...
    Roger dimitri
    Mar 20, 2008 at 6:48 pm
    Mar 21, 2008 at 5:46 pm
  • Hey folks, I was wondering what the status of LUCENE-933 is (stop words can cause the queryparser to end up with no results, due to an e.g. +(the) clause in the resultant BooleanQuery). According to the ...
    Jake Mannix
    Mar 18, 2008 at 5:14 pm
    Mar 19, 2008 at 12:03 am
Group: java-user

119 users for March 2008

  • Michael McCandless: 62 posts
  • Erick Erickson: 33 posts
  • Grant Ingersoll: 24 posts
  • Chris Hostetter: 18 posts
  • Mathieu Lecarme: 18 posts
  • Daniel Noll: 13 posts
  • Mark Miller: 12 posts
  • Otis Gospodnetic: 10 posts
  • Cam Bazz: 9 posts
  • JensBurkhardt: 9 posts
  • John Wang: 9 posts
  • Yonik Seeley: 9 posts
  • Spring: 8 posts
  • Dragon Fly: 8 posts
  • Ian Lea: 8 posts
  • Itamar Syn-Hershko: 8 posts
  • Jamie: 8 posts
  • Sandyg: 8 posts
  • Eran Sevi: 7 posts
  • Toke Eskildsen: 7 posts