88 discussions - 231 posts

  • Hi, I've been reading the API and I couldn't figure out a nice and fast way to solve the following problem: I'd like to enumerate the tokens of a document (or document field). Do the internal ...
    Nestel, Frank
    Oct 10, 2001 at 7:34 am
    Oct 13, 2001 at 7:36 pm
  • Greetings, I have to apologize for so many messages to the list, but I really have to get the TermVector stuff working within the next few days because the next release of our application is going to ...
    Dmitry Serebrennikov
    Oct 13, 2001 at 7:50 am
    Oct 16, 2001 at 1:33 am
  • A while back I wrote a CachingDirectory implementation for Lucene which allows for caching an index on a local machine other than the "root" machine. This can be very useful for handling heavy load ...
    Maik Schreiber
    Oct 8, 2001 at 2:03 am
    Oct 9, 2001 at 4:16 pm
  • Hello All, I'm trying to get word count information for exact phrases, i.e., to know how many times a given form occurs in the index. Does anyone know how I can do this in a clean way? Does it ...
    Nioche, Julien
    Oct 19, 2001 at 4:21 pm
    Oct 19, 2001 at 9:35 pm
  • This fixes a potential race condition in SegmentsReader where a call to numDocs() could return -1 if a document is being deleted at the same time by another thread. (Caveat: I don't actually have a ...
    Dmitry Serebrennikov
    Oct 10, 2001 at 8:36 pm
    Oct 14, 2001 at 9:40 pm
  • Hi, I just joined the list, and I am trying to locate 1.02, but it doesn't seem to be in the Jakarta dist server, either source or binary. Anyone have a link? -Matt Bishop
    Matthew Bishop
    Oct 1, 2001 at 12:18 am
    Oct 2, 2001 at 6:24 am
  • I figured that I might as well be adding comments as I am reading and figuring out the code. One thing I was not clear on - characters are stored with 1 to 3 bytes. Is that sufficient to represent ...
    Dmitry Serebrennikov
    Oct 11, 2001 at 6:44 pm
    Oct 12, 2001 at 6:07 pm
  • Is everyone getting 2 copies of every email sent on lucene-dev @ jakarta? I am getting one with a [Lucene-dev] header and one without.
    Eugene Gluzberg
    Oct 9, 2001 at 10:50 pm
    Oct 11, 2001 at 1:02 am
  • Lucene has term query boosting for fields. But does anyone know how to do individual Document boosting? So basically I want to put a numerical value into a document and depending on its weight have ...
    Oct 18, 2001 at 6:55 pm
    Oct 19, 2001 at 7:01 pm
  • The latest build.xml works fine with Ant and without the batch files, but it has a classpath statement that fails if anakia is not present. I don't have it and I don't think developers typically need ...
    Dmitry Serebrennikov
    Oct 13, 2001 at 7:56 pm
    Oct 15, 2001 at 6:25 pm
  • Lucene does not directly support paragraph-based searching. Lucene does support proximity searches, e.g., exact phrases, and within-N words (slop). Please see the documentation for PhraseQuery, ...
    Doug Cutting
    Oct 8, 2001 at 5:36 pm
    Oct 10, 2001 at 10:23 pm
  • Here's one vote for putting locks in a separate directory. Anyone dislike that? Doug -----Original Message----- From: Snyder, David Sent: Friday, October 05, 2001 11:23 AM To: Doug Cutting Subject: ...
    Doug Cutting
    Oct 5, 2001 at 6:54 pm
    Oct 5, 2001 at 8:12 pm
  • Hi, I used Lucene as a search engine in a project for which fuzzy searching was crucial. Therefore I implemented Soundex, Metaphone and DoubleMetaphone and integrated them into Lucene with special ...
    Claus Engel
    Oct 18, 2001 at 2:43 pm
    Oct 18, 2001 at 6:52 pm
  • Hello all, I am trying to search using a DateFilter so I get a resultset that lies between two dates. Without the DateFilter the search works perfectly, but when I use the DateFilter I get the ...
    Anders Nielsen
    Oct 11, 2001 at 2:51 pm
    Oct 11, 2001 at 3:34 pm
  • The mask logic in the BooleanQuery.scorer is no longer needed because it has moved to the BooleanScorer.add. Besides, the 32-clause limitation may need to go in the future and it would be simpler to ...
    Dmitry Serebrennikov
    Oct 8, 2001 at 12:09 am
    Oct 10, 2001 at 5:58 pm
  • I just checked out everything from the jakarta-lucene CVS and tried to build it. I'm running on Win 2K and using Sun JDK 1.3. The first problem was that the lib directory includes ant-1.3.jar but ...
    Dmitry Serebrennikov
    Oct 6, 2001 at 11:34 pm
    Oct 8, 2001 at 5:19 pm
  • Hi, I've been lurking around the Lucene source code for about a week now... There are a couple of things I can't work out how to do properly I'd be grateful for any help with. I'm having a bit of ...
    Lee Mallabone
    Oct 4, 2001 at 4:00 pm
    Oct 8, 2001 at 9:40 am
  • I have been having the same problems reported by Scott a few weeks ago and I was hoping to look at the latest CVS tree for the lucene source code to try to download just the relevant files that Doug ...
    Joanne Sproston
    Oct 5, 2001 at 10:38 am
    Oct 5, 2001 at 11:00 am
  • ---------------------------------------------------- This email is autogenerated from the output from: <http://jakarta.apache.org/builds/gump/2001-10-03/jakarta-lucene.html ...
    Jason van Zyl
    Oct 3, 2001 at 11:27 am
    Oct 3, 2001 at 5:11 pm
  • Ted, Jason, Any progress on getting a milestone &/or nightly Lucene release together? That and posting the Javadoc are the last things needed before we can announce the move of Lucene to Jakarta. Is ...
    Doug Cutting
    Oct 1, 2001 at 5:42 pm
    Oct 1, 2001 at 6:51 pm
  • Hello, I would like to be able to make a search that is like a phrase-query, except that the last term in the phrase can be a prefix. An example would be a search for this is a t* which would match ...
    Anders Nielsen
    Oct 31, 2001 at 9:28 pm
    Oct 31, 2001 at 11:07 pm
  • Hi there, I tested the new token definitions with the Lucene sources from 2001-10-19. The query works fine with search terms starting with the German umlauts 'ä', 'ö', 'ü'. Ralf Zimmermann Yes. The ...
    Ralf Zimmermann
    Oct 31, 2001 at 8:37 am
    Oct 31, 2001 at 10:06 am
  • Hello, I think the token definition list has a problem that causes the ParseException if a term starts with any non-English character. Joanne's solution helps in case of three other chars but do ...
    Halácsy Péter
    Oct 27, 2001 at 9:10 am
    Oct 31, 2001 at 8:25 am
  • Hi, yes, I can confirm this bug. I have the same problem with query terms starting with German umlauts like 'ä', 'ö' and 'ü': Exception occurred during event dispatching: ...
    Ralf Zimmermann
    Oct 22, 2001 at 4:04 pm
    Oct 26, 2001 at 7:39 pm
  • Hi! There seems to be a bug in the lucene-1.2-rc1.jar distribution. Searching for the following string returns an error message from the query parser. String katakana = "\u30AB\u30BF\u30AB\u30CA"; - - ...
    Geir Ove Grønmo
    Oct 11, 2001 at 9:06 am
    Oct 22, 2001 at 12:53 pm
  • Greetings, everyone! I have the first version of the term vector support ready to go. I'm attaching a file with release notes that explain briefly what the new capabilities are and what the changes ...
    Dmitry Serebrennikov
    Oct 18, 2001 at 8:55 pm
    Oct 20, 2001 at 6:18 am
  • I've found a pretty good solution for retrieving the un-stemmed version of index terms, in case anyone is interested. This uses only the features already in the 1.2-rc1 release. The trick is to create an ...
    Dmitry Serebrennikov
    Oct 18, 2001 at 9:03 pm
    Oct 19, 2001 at 8:56 am
  • First, I would like to apologise for this message being so long, but I have tried to provide sufficient information for someone to potentially help me diagnose my problem - this being that using the ...
    Joanne Sproston
    Oct 11, 2001 at 4:34 pm
    Oct 12, 2001 at 2:51 pm
  • Greetings, Doesn't this implementation of skipTo(int target) fail when the TermDocs is already set on the target? I got this one out of SegmentsTermDocs in SegmentsReader.java, but I think there are ...
    Dmitry Serebrennikov
    Oct 11, 2001 at 1:29 am
    Oct 11, 2001 at 5:32 pm
  • I'm getting a NullPointerException when a requested Term is greater than (using compareTo) all existing Terms in a given segment. Here are my proposed changes: [Index: SegmentsReader.java] ...
    Scott Ganyo
    Oct 10, 2001 at 4:07 pm
    Oct 10, 2001 at 7:21 pm
  • I'm looking through the TermQuery code (and generally trying to understand exactly how the searching works) and I found this code that looks suspicious to me. It is very likely that I just don't ...
    Dmitry Serebrennikov
    Oct 7, 2001 at 9:19 pm
    Oct 8, 2001 at 6:00 pm
  • I second that having the lock location configurable would be a "nice to have"... but this brings up the problem of the searcher process being configured differently than the updating process, which ...
    Snyder, David
    Oct 5, 2001 at 7:32 pm
    Oct 6, 2001 at 1:07 am
  • Greetings, I'm doing some stress testing and optimization for our application for high concurrency rates and I'm seeing a lot of contention over the synchronization monitor in ...
    Dmitry Serebrennikov
    Oct 27, 2001 at 8:26 pm
    Oct 29, 2001 at 6:09 pm
  • Testing mailing list from new location
    Pier Fumagalli
    Oct 29, 2001 at 2:13 am
    Oct 29, 2001 at 2:15 am
  • Does anyone have any stories of how scalable Lucene is? I mean how many documents, or size of index, that type of thing. I know it can index 200 meg an hour on reasonable hardware. Thanks Martin
    Schray, Martin
    Oct 26, 2001 at 2:13 pm
    Oct 27, 2001 at 2:19 am
  • In case anyone is thinking of incorporating an RDF parser with Lucene, here is a project that uses Java to do that on top of the SAX2 API. http://sourceforge.net/projects/rdf-filter/ Otis
    Otis Gospodnetic
    Oct 25, 2001 at 12:33 pm
    Oct 26, 2001 at 6:57 am
  • I made this change when I realized that I could make releases by specifying -Dversion=xxx on the command line. It's good to keep the version that folks build themselves different from an official ...
    Doug Cutting
    Oct 22, 2001 at 6:02 pm
    Oct 22, 2001 at 6:22 pm
  • cutting 01/10/19 10:15:19 Modified: . build.properties build.xml src/java overview.html src/test/org/apache/lucene HighFreqTerms.java Log: Added source code into distribution. Revision Changes Path ...
    Oct 19, 2001 at 5:22 pm
    Oct 22, 2001 at 5:31 pm
  • Hi all, I am using Lucene for indexing and then searching for files available on a LAN. I am not indexing the contents of the files as I just want to search on the basis of file names. Now, I would ...
    Rahul Sawhney
    Oct 13, 2001 at 1:31 pm
    Oct 14, 2001 at 10:52 pm
  • Here are the corrected comments per Doug's remarks. =================================================================== RCS file: ...
    Dmitry Serebrennikov
    Oct 13, 2001 at 5:41 am
    Oct 13, 2001 at 5:53 am
  • Hi, this was a thread when Lucene was still on Sourceforge. I've done a rough but working port of the text_cat Perl script for n-gram based language guessing to Java. If this is useful, it can be ...
    Frank Nestel
    Oct 12, 2001 at 10:01 am
    Oct 12, 2001 at 4:52 pm
  • Greetings Doug, I saw the patents on your site a long time ago, and wanted to ask you if any of those patents cover the technology that Lucene is built on? Thanks. Dmitry.
    Dmitry Serebrennikov
    Oct 9, 2001 at 11:23 pm
    Oct 10, 2001 at 5:49 pm
  • Hi... I'm actually interested in a search module for a forum. I am trying to use the Jive forum that happens to use the Lucene search module, but the binary Jive distribution brings support only for ...
    Jacob Gutierrez
    Oct 7, 2001 at 3:30 am
    Oct 8, 2001 at 4:59 pm
  • Lucene has now officially moved to Jakarta. The new website is: http://jakarta.apache.org/lucene There is a new version of Lucene available for download from Jakarta, 1.2rc1. The Lucene mailing lists ...
    Doug Cutting
    Oct 2, 2001 at 9:24 pm
    Oct 5, 2001 at 7:58 pm
  • Hi! I need information about the implementation of the "suffix tree clustering" algorithm.... does anybody have information about implementation details? Thanks for your help! Gerardo PS. After I ...
    Gerardo Arroyo
    Oct 2, 2001 at 1:07 am
    Oct 2, 2001 at 1:26 am
    Oct 31, 2001 at 10:58 pm
    Oct 31, 2001 at 10:58 pm
  • cutting 01/10/30 16:12:30 Modified: src/java/org/apache/lucene/store RAMDirectory.java Log: Fixed a bug where RAMInputStream could not read across more than a single buffer boundary. Revision ...
    Oct 31, 2001 at 12:24 am
    Oct 31, 2001 at 12:24 am
  • What is the best way to store a field that you may want to have phrases in? For example, let's say a field called keywords in a document has these values: keywords: small cars, compact cars, little ...
    Oct 29, 2001 at 11:54 pm
    Oct 29, 2001 at 11:54 pm
  • As you might have noticed, one-by-one, all mailing lists for Jakarta.Apache.ORG have been moved off to a new server, to split the load of all Apache Mailing Lists on two separate machines. What does ...
    Pier Fumagalli
    Oct 29, 2001 at 3:44 am
    Oct 29, 2001 at 3:44 am
    Oct 24, 2001 at 6:45 pm
    Oct 24, 2001 at 6:45 pm
Group Overview
group: dev @

40 users for October 2001

  • Dmitry Serebrennikov: 51 posts
  • Doug Cutting: 47 posts
  • Cutting: 14 posts
  • Dave Kor: 9 posts
  • Joanne Sproston: 8 posts
  • Bugzilla: 8 posts
  • Jason van Zyl: 6 posts
  • Scott Ganyo: 6 posts
  • Anders Nielsen: 5 posts
  • Maurits van Wijland: 5 posts
  • Otis Gospodnetic: 5 posts
  • Jon Stevens: 4 posts
  • Lee Mallabone: 4 posts
  • Maik Schreiber: 4 posts
  • Nelson Minar: 4 posts
  • Ralf Zimmermann: 3 posts
  • Soshima: 3 posts
  • Brian Goetz: 3 posts
  • Eugene Gluzberg: 3 posts
  • Frank Nestel: 3 posts