118 discussions - 542 posts

  • Hi All, I have the following problem - we have OutOfMemoryException when searching on the indexes that are of size 20 - 40 GB and contain 10 - 15 million docs. When we make searches we perform query ...
    Ivan Vasilev
    Apr 6, 2007 at 11:10 am
    Apr 25, 2007 at 7:28 pm
  • Hello, I have the following three documents in my index: - Java programming is required to write Lucene application. - Java is a popular computer language. I like Java. - Perl is not a kind of ...
    Koji Sekiguchi
    Apr 11, 2007 at 5:08 pm
    Apr 13, 2007 at 4:20 am
  • All, Sorry for long email. I have two questions on indexing. My data consists of an id, short headline and story text. Story text has some html tags. Here is an example. In early 2005, it seemed that ...
    Tony Qian
    Apr 12, 2007 at 3:24 pm
    Apr 19, 2007 at 4:35 pm
  • Hello Lucene users, I'm rather new to lucene and java but have done work with other search engines some time before. Right now I'm trying my hands (and luck) on a 'search as you type'- sort of high ...
    Steffen Heinrich
    Apr 11, 2007 at 8:33 pm
    Apr 12, 2007 at 7:29 pm
  • I'm trying to understand the specifics behind the notation +(...) and -(...) as it applies to the standard parser. I have three lists of words. I want documents that have at least one word from list ...
    Walt Stoneburner
    Apr 9, 2007 at 3:28 am
    Apr 11, 2007 at 11:16 pm
  • Hi, I wonder if there is a way to search for documents containing only a certain word in a specified field. For example, I would like to search for documents that contain only "the" in title field. ...
    Kun Hong
    Apr 27, 2007 at 6:22 am
    May 1, 2007 at 3:33 am
  • For those who may be interested, DBSight 1.4.0 now has unlimited index size with Free version! Basically DBSight is more like SOLR + database adapter. You just point it with one or several SQLs to ...
    Chris Lu
    Apr 24, 2007 at 9:22 am
    Apr 26, 2007 at 5:22 pm
  • Hi, I've just started using Lucene. Can anybody assist me in calculating the term frequencies of the terms (words) that occur in a document (*.txt), when a particular doc is submitted. Say when I ...
    Sai hariharan
    Apr 11, 2007 at 7:23 pm
    Apr 12, 2007 at 7:21 pm
  • Hello all, I would like to extract the term freq vector from the hit results as a total vector, not by document. I have searched the mailing list and I found many have talked about this issue but I still ...
    Sengly Heng
    Apr 10, 2007 at 2:02 pm
    Apr 11, 2007 at 1:37 pm
  • All, After playing around with Lucene, we decided to replace old full-text search engine with Lucene. I got "Lucene in Action" a week ago and finished reading most of the book. I got several ...
    Tony Qian
    Apr 27, 2007 at 2:07 pm
    Sep 4, 2007 at 8:14 pm
  • I am trying to index a huge set of documents in batches. The batch size is a parameter of the application, say X docs, which means it will hold X docs in RAM before I flush to the file system using ...
    Chandan Tamrakar
    Apr 29, 2007 at 11:53 am
    May 11, 2007 at 10:39 pm
  • Hello, I am having some issues with the SpanQuery functionality. As a matter of fact, I index a single French file containing for instance "climatisation automatique" (which means automatic ...
    Apr 30, 2007 at 9:23 am
    May 1, 2007 at 7:32 am
  • Hi, is it possible (or is there a trick) to search with a given query in which we can set an equality for two fields? For example: Document: field1 field2 field3 field4 Query: field1:"test phrase" AND ...
    Mohammad Norouzi
    Apr 11, 2007 at 9:19 am
    Apr 19, 2007 at 9:44 pm
  • My documents are cars... i.e., Nissan Altima Sports Package Nissan Altima Standard The problem I have is when I search "Nissan Altima", I want to get the 2nd hit back first, i.e. "Nissan Altima ...
    John Kleven
    Apr 2, 2007 at 6:39 pm
    Apr 6, 2007 at 6:10 am
  • I asked this question on the Solr user list because that is the current lucene server implementation I'm using, but I didn't get any feedback there and the problem isn't really Solr specific so I ...
    Daniel Einspanjer
    Apr 11, 2007 at 12:04 am
    May 31, 2007 at 1:31 am
  • Hi, I tried using MoreLikeThis contrib feature to extract "interesting terms" from a document. This works very well - but only for SINGLE words. I am looking for a way to extract "keyPHRASES" from a ...
    Apr 29, 2007 at 9:24 pm
    May 10, 2007 at 1:13 am
  • I have built a blog project under tomcat5.5 with Lucene2.0. And I want to search my blog by full text, but there is something wrong: ---------------------------------------------- The project flow: ...
    Apr 9, 2007 at 9:27 am
    Apr 12, 2007 at 2:51 pm
  • Hi! I feel somewhat stupid for asking this but...I let two threads build an index and then merge it into one on disk via addIndexes(), optimize() and close(). This is what it looks like on disk: ...
    Apr 8, 2007 at 4:59 pm
    Apr 8, 2007 at 6:15 pm
  • Hi, I need the id of the document that is returned by Hits as a result of a query. Hits result = searchable.find(myQuery....); now I need something like result.getId(). Is there any way to get it? Thanks ...
    Mohammad Norouzi
    Apr 5, 2007 at 10:08 am
    Apr 5, 2007 at 11:59 am
  • Hi, Is there a way to emulate paged search in Lucene? I can use the following piece of code for returning the first page (10 items in each page), but don't know how to navigate to the next page ...
    Mohsen Saboorian
    Apr 1, 2007 at 6:30 am
    Apr 4, 2007 at 3:37 am
  • I need urgent help. I want to change the page ranking algorithm in Lucene and I do not know where to start or which file I should change. I do not know what classes are involved. I have only ...
    Apr 14, 2007 at 4:16 am
    Aug 30, 2007 at 2:56 pm
  • Hi, all, I'm looking for a simple, straightforward example of how to use the Snowball stemmer to make Lucene search results return all variants of the terms searched for. For example, if I search for ...
    Apr 24, 2007 at 7:32 pm
    Apr 26, 2007 at 5:36 pm
  • Hi, Is there a goal for lucene to always be able to read indexes written by older versions of Lucene? For instance, I noticed that I could read 2.0 and 1.9 indexes with a 2.1 Lucene jar. (I also ...
    Lucifer Hammer
    Apr 23, 2007 at 4:11 am
    Apr 24, 2007 at 3:23 am
  • Hi there, I am new to Lucene and would appreciate any help on this. Thank you in advance. I want the order of the search results based on the keywords mentioned in the meta information of the ...
    Apr 19, 2007 at 9:07 pm
    Apr 20, 2007 at 12:24 am
  • I'd like to share index merge performance data and have a couple of questions about it... We (AXS-One, www.axsone.com) build one "master" index per day. For backup and recovery purposes, we also ...
    D m
    Apr 18, 2007 at 2:57 pm
    Apr 19, 2007 at 6:46 pm
  • Hi Ratnesh, 1. There is no need to use that many question marks, really. 2. Use java-user list, not java-dev 3. You cannot delete using negative criteria. You can delete 1 Document using its document ...
    Otis Gospodnetic
    Apr 6, 2007 at 3:28 pm
    Apr 10, 2007 at 1:16 pm
  • I was wondering if anyone has done people name matching using Lucene. For example, I have a name coming from some external source that I would like to match with the one I have in my DB. Let's say my ...
    Apr 5, 2007 at 7:59 pm
    Apr 6, 2007 at 2:11 pm
  • I want to modify the norms to only include values between 0 and 100. Currently, I have a custom implementation of the default similarity. Is it sufficient to override the encodeNorm and decodeNorm ...
    Apr 30, 2007 at 7:33 pm
    May 1, 2007 at 8:25 pm
  • I'm pondering on long term maintenance issues with Lucene indexes and would like to know of anyone's suggestions or recommendations to backing up these indexes. My goal is to have a weekly, or even ...
    Larry hughes
    Apr 27, 2007 at 4:29 pm
    May 1, 2007 at 8:12 pm
  • Hi list. I am trying to implement some TopScoreHitCollector class; a kind of TopDocCollector which collects the documents the score of which is higher than a threshold. The threshold will be ...
    Apr 22, 2007 at 10:29 am
    Apr 23, 2007 at 8:23 am
  • I've been on this list long enough to have a vast repository of information about using a MultiSearcher versus an IndexSearcher that works on a MultiReader. However, after looking through several ...
    Kirk Roberts
    Apr 17, 2007 at 6:49 pm
    Apr 21, 2007 at 11:23 pm
  • Hi, I was wondering if the addIndexes() method in IndexWriter can be used for updating documents. Specifically, I'd like to leave my primary index alone during the update process. Instead, I want to ...
    Moti Nisenson
    Apr 16, 2007 at 8:00 am
    Apr 21, 2007 at 3:46 pm
  • I am following all the points which are mentioned in the following link: http://wiki.apache.org/lucene-java/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71 I am having the following issues: ...
    Apr 11, 2007 at 4:38 am
    Apr 12, 2007 at 5:14 pm
  • I am using an RMI architecture for calling a remote service which uses an IndexSearcher in its own JVM. I am starting the service with the following provisions for memory allocation and garbage ...
    Craig W Conway
    Apr 4, 2007 at 5:26 pm
    Apr 6, 2007 at 2:44 am
  • Hi, I'm looking at benchmarking Paul's http://issues.apache.org/jira/browse/LUCENE-584 code. I'd like to compare either: HitCollector.collect(doc, score) vs. MatchCollector.collect(doc) or ...
    Otis Gospodnetic
    Apr 2, 2007 at 10:36 pm
    Apr 5, 2007 at 1:36 am
  • I'm copying this reply from a topic with the same title from the defunct 'lucene-user' list. My comments follow it. : I thought of putting empty strings instead of null values but I think : empty ...
    Peter Keegan
    Apr 12, 2007 at 8:25 pm
    May 10, 2007 at 7:27 pm
  • Hello, I have a scenario, where we need to set up our application, that uses Lucene (and has on-demand indexing of documents) in Disaster-recovery site. The simple files/attachments used by our ...
    Rajendranath, Divya
    Apr 10, 2007 at 4:06 pm
    May 4, 2007 at 3:45 pm
  • When we use IndexModifier's docCount() method, it calls its underlying IndexReader's numDocs() or IndexWriter's docCount() method. Here is the problem that IndexReader.numDocs() cares about deleted ...
    Cheolgoo Kang
    Apr 6, 2007 at 2:41 pm
    May 3, 2007 at 2:18 pm
  • Hi, all, Another quick request for succinct code examples, or an explanation of what we're doing wrong here. We've successfully gotten the Snowball Spanish stemmer working in our test harness. An ...
    Andrew Green
    Apr 26, 2007 at 7:41 pm
    Apr 30, 2007 at 7:14 pm
  • Hello All, Could any one help me find solution to the following problem ? I am facing problems while trying to add files of size 50MB to my application. The application has on-demand indexing of ...
    Divya Rajendranath
    Apr 24, 2007 at 11:03 am
    Apr 25, 2007 at 11:32 pm
  • Hi all. I'm considering making a kind of IndexReader where each time terms() is called it might return a different sequence even though the reader hasn't been reopened. Would that kind of thing ...
    Daniel Noll
    Apr 20, 2007 at 1:25 am
    Apr 24, 2007 at 7:01 am
  • I can't get TestSpellCheck to work. Documents appear to be added but all queries return zero hits. Is this TestCase working for anyone? ...
    Apr 3, 2007 at 6:55 pm
    Apr 20, 2007 at 8:26 am
  • Hi, I have been using Lucene "out of the box" since 1.4.3, wonderful full text engine, I love it. But I can't use it "out of the box" any more, I am going to have to write some code (Oh no! Mr ...
    Jim shirreffs
    Apr 18, 2007 at 5:16 pm
    Apr 19, 2007 at 3:34 pm
  • I know this is a relatively fundamental thing to arrange, but I'm having trouble. Can I instantiate a standard analyzer with an argument containing my own stop words? If so, how? Will they be ...
    Michael Barbarelli
    Apr 13, 2007 at 5:03 am
    Apr 13, 2007 at 5:50 pm
  • Hello. I am using Lucene to submit fuzzy queries against an index. I have noticed that relevant matches are often retrieved, but the scoring is not at all what I expected. For example, if my query is ...
    Michael Barbarelli
    Apr 11, 2007 at 6:06 am
    Apr 11, 2007 at 3:16 pm
  • Hi, I'm indexing documents, and some of them are provided in several languages. Thanks to this mailing list's participants, I know that I have two choices to index these multiple instances of ...
    Melanie Langlois
    Apr 10, 2007 at 8:56 am
    Apr 10, 2007 at 4:56 pm
  • Hey, I was just wondering if you are supposed to be able to search on UN_TOKENIZED fields? It seems like you can from the docs, but I have been unsuccessful. I want to do exact string matching on a ...
    Ryan O'Hara
    Apr 5, 2007 at 6:57 pm
    Apr 9, 2007 at 11:46 pm
  • We are running a search service on the internet using two machines. We have a crawler machine which crawls the web and merges new documents found into the Lucene index. We have a searcher machine ...
    Chun Wei Ho
    Apr 3, 2007 at 2:40 pm
    Apr 7, 2007 at 2:22 am
  • I'm hoping someone can offer some insight into the FunctionQuery. I've just discovered this, and I think it's exactly what I've been looking for, but I'm having some trouble getting it to work. I can ...
    Annona Keene
    Apr 4, 2007 at 2:58 pm
    Apr 6, 2007 at 6:35 pm
  • Hello everybody, I need to index and search real numbers in Lucene. I found NumberUtils class in Solr project which permits one to encode doubles into string so that alpha numeric ordering would ...
    Apr 5, 2007 at 5:38 pm
    Apr 6, 2007 at 5:20 pm
Group Overview
group: java-user @

144 users for April 2007

Erick Erickson: 50 posts; Chris Hostetter: 35 posts; Otis Gospodnetic: 30 posts; Karl wettin: 24 posts; Grant Ingersoll: 16 posts; Doron Cohen: 14 posts; Yonik Seeley: 14 posts; Lokeya: 12 posts; Mohammad Norouzi: 11 posts; Daniel Noll: 9 posts; Michael McCandless: 9 posts; Tony Qian: 9 posts; Bublic Online: 8 posts; Jafarim: 8 posts; Sengly Heng: 7 posts; Steffen Heinrich: 7 posts; Lucene: 6 posts; Mike Klaas: 6 posts; Mohsen Saboorian: 6 posts; Nilesh Bansal: 6 posts