78 discussions - 452 posts

  • if you didn't have to change the index then you haven't got all the factors needed to do it well. terms can't cross sentence boundaries and the index doesn't store sentence boundaries. Herb... ...
    Chong, Herb
    Nov 14, 2003 at 6:13 pm
    Nov 18, 2003 at 2:55 pm
  • Hello, now that the topic is clustering methods: has there been any effort to implement latent semantic indexing in Lucene? Google only turns up someone else asking this in February. Is there an ...
    Thomas Krämer
    Nov 11, 2003 at 7:38 pm
    Nov 14, 2003 at 1:19 pm
  • Hi, does Lucene implement a Vector Space Model? If yes, does anybody have an example of how to use it? Cheers, Ralf
    Nov 12, 2003 at 6:11 pm
    Nov 16, 2003 at 5:49 pm
  • Hi, does anyone have any sample code/documentation available for doing document based clustering using lucene? Thanks, Marc
    Nov 11, 2003 at 1:22 pm
    Nov 12, 2003 at 9:03 am
  • Hi, I am new here. May I know how to refresh indexes in Lucene immediately after new documents have been added without re-compiling again to reindex the documents in that particular directory? Thanks.
    Tun Lin
    Nov 22, 2003 at 4:22 pm
    Nov 28, 2003 at 2:27 pm
  • Hello, we (DENIC) are the world's second largest domain registry (.de-zone has almost 6.9 million domains) and are using Lucene to index and search our website in a high-traffic scenario. Most of our ...
    Ulrich Mayring
    Nov 27, 2003 at 11:32 am
    Dec 2, 2003 at 5:38 pm
  • Hi guys - First off I want to just give the Lucene project credit for producing an API like this. Truly great stuff. I was just wondering if anyone could share some wisdom on a couple of issues: 1. ...
    Dion Almaer
    Nov 23, 2003 at 2:34 am
    Dec 2, 2003 at 11:35 am
  • Hi, I have a huge data file with 4 gb data. The data in the file never changes. The format of the file is as follows: Col1,col2,col3,Value ---------------------------- abababc,xyzzzzzza,ccccc,100 ...
    Kumar Mettu
    Nov 12, 2003 at 3:00 am
    Nov 13, 2003 at 2:23 pm
  • I had a thought on my earlier post on "Poor Performance when searching for 500+ terms". The problem is on how to improve the performance when searching for 500+ OR search terms. i.e. enter a search ...
    Jie Yang
    Nov 13, 2003 at 4:01 pm
    Nov 14, 2003 at 12:06 am
  • Hi. I need to tokenize text while indexing but I don't want space to be delimiter. Delimiter should be my custom character (for example comma). I understand that I would probably need to implement my ...
    Dragan Jotanovic
    Nov 25, 2003 at 11:35 am
    Nov 26, 2003 at 12:17 pm
  • My only concern with this being integrated into lucene is that it be done in a way that doesn't make its use mandatory. Lucene is powerful enough that it can be used for a lot of cases where NLP ...
    Dan Quaroni
    Nov 17, 2003 at 2:47 pm
    Nov 18, 2003 at 3:12 am
  • I raised the question two days ago. My question was too specific to the application that I have been working on. I have decided to re-phrase my question. People say that Lucene is very flexible. I ...
    Caroline Jen
    Nov 4, 2003 at 2:53 am
    Nov 7, 2003 at 9:19 am
  • The documents I have indexed also contain information about file names. For instance, 'return_results.pl' or something like that may be in the document fields. I do not understand Lucene's way of ...
    Pleasant, Tracy
    Nov 25, 2003 at 5:20 pm
    Nov 26, 2003 at 5:07 pm
  • Hello, I'm trying to move a Document from one Index to another, without necessarily reindexing it... The Document is composed of one Field.Keyword and a bunch of Field.UnStored. Reading such a ...
    Nov 20, 2003 at 12:09 pm
    Nov 20, 2003 at 2:54 pm
  • I thought Erik's article was great. There was one unanswered brainbender I had which I was hoping was in there, but... Maybe you can add this topic to the next one, Erik? Here is my issue: When using ...
    Tomcat Programmer
    Nov 13, 2003 at 4:53 am
    Nov 17, 2003 at 4:37 am
  • Hi, A couple questions... 1). If I delete a term using an IndexReader, can I use an existing IndexWriter to write to the index? Or do I need to close and reopen the IndexWriter? 2). Is it safe to ...
    Wilton, Reece
    Nov 11, 2003 at 6:02 pm
    Nov 13, 2003 at 3:36 pm
  • Hi all, I'm thinkin' about writing a search tool for my filesystem. I know such things exist already but programming it myself is much more fun ;-) So, I would have Lucene crawl through my filesystem ...
    Marcel Stor
    Nov 5, 2003 at 8:51 am
    Nov 5, 2003 at 11:24 pm
  • Thank you for the replies! My indexes are currently looking like they might be 12GB when finished on the current run. I have spotted a tool on the Lucene site for listing the most frequently occurring ...
    Jt oob
    Nov 4, 2003 at 11:10 am
    Nov 5, 2003 at 9:57 pm
  • When I first read this changelog entry: I just assumed that this was an optional feature. I think this is a dangerous change and should be disabled by default (or only enabled with lock files can't ...
    Kevin A. Burton
    Nov 9, 2003 at 8:29 pm
    Nov 24, 2003 at 4:05 am
  • Hi, I am thinking to give spell check functionality to the search. I am trying to achieve two things to complement search. 1. Spell check where dictionary will be composed of all text I am creating ...
    Sam s
    Nov 21, 2003 at 12:07 am
    Nov 21, 2003 at 7:59 pm
  • Dear All, All I would like to know is how many times a query was found in a particular document. I have no problems getting the score from hits.score(). hits.length is the number of times in total ...
    Kent Gibson
    Nov 29, 2003 at 11:37 pm
    Nov 30, 2003 at 5:55 pm
  • Dear Group Members, I have looked in the archives for a simple tutorial that could guide me through the process of integrating Lucene with a website based on Struts. The website uses tiles, the content of ...
    Michal S
    Nov 27, 2003 at 8:37 am
    Nov 28, 2003 at 7:48 am
  • If I have words within a document like red_car, and I search for 'red', would it return documents containing 'red_car'?
    Pleasant, Tracy
    Nov 25, 2003 at 5:09 pm
    Nov 25, 2003 at 5:53 pm
  • Hi, does anyone know when the full release of Lucene 1.3 will be available? Please advise. Thanks.
    Tun Lin
    Nov 24, 2003 at 2:39 am
    Nov 25, 2003 at 11:08 am
  • Rules of linguistics? Is there such a thing? :) Yes there are. How can you expect communication (the goal of the game that natural language is about) to work if the game has no rules? Anyway, Herb is ...
    Karsten Konrad
    Nov 15, 2003 at 12:16 pm
    Nov 17, 2003 at 9:25 pm
  • I have created the following "log4j.properties" and put it in your classpath but it still has that error. Anyone can help? log4j.rootCategory=stdout ...
    Tun Lin
    Nov 26, 2003 at 6:06 am
    Nov 27, 2003 at 1:24 am
  • Hello all, I am asking this group because I think people here might know about this since it is a similar approach. Is there a Java-based API which assists developers of collaborative filtering ...
    Nov 25, 2003 at 3:53 pm
    Nov 26, 2003 at 8:34 pm
  • Are there any hits highlighting functions? I have a simple one, but it gets complicated with searching multiple words, having tokens, etc. ...
    Pleasant, Tracy
    Nov 25, 2003 at 5:25 pm
    Nov 25, 2003 at 5:59 pm
  • Hi folks, I've got a feeling the answer to this has either been posted on here recently, or is on the site somewhere - but i can't find it. Apologies if i'm going over old ground. What is the best ...
    Jt oob
    Nov 19, 2003 at 2:44 pm
    Nov 19, 2003 at 5:37 pm
  • Hello, I am considering using the document id in order to implement a fast 'join' during relational search. My first question is: should I steer clear of this all together? And why? If not, I need to ...
    Tate Avery
    Nov 17, 2003 at 5:23 pm
    Nov 18, 2003 at 8:49 pm
  • If you have a lot of terms in that range, you can see that there is obviously some cycles spinning to do the work needed. If the number of different date terms causes this effect, why not "round" the ...
    Karsten Konrad
    Nov 15, 2003 at 4:39 pm
    Nov 17, 2003 at 11:28 pm
  • Hi, We're seeing slow response time when we apply datefilter. A search that takes 7 msec with no datefilter takes 368 msec when I filter on the last fifteen days, and 632 msec on the last 30 days. ...
    Dror Matalon
    Nov 15, 2003 at 12:16 am
    Nov 16, 2003 at 12:39 am
  • Classes for indexing PDF and Word files in Lucene. Ernesto. ...
    Ernesto De Santis
    Nov 11, 2003 at 7:02 pm
    Nov 12, 2003 at 5:37 pm
  • I would appreciate some clarification on how to generate multiple tokens from a single input token. In a previous message: (see: ...
    Peter Keegan
    Nov 10, 2003 at 2:46 pm
    Nov 10, 2003 at 3:36 pm
  • What is the easiest way to eliminate duplicate documents if one is doing two searches on the same index? Has anybody done something similar?
    Dragan Jotanovic
    Nov 25, 2003 at 4:12 pm
    Nov 26, 2003 at 4:38 pm
  • Hi, I'm using the Multi Field Search to search all the fields of my documents during the search. When it returns results the scores are numerically low - .06, .17, etc. I would think if I searched ...
    Pleasant, Tracy
    Nov 24, 2003 at 7:03 pm
    Nov 25, 2003 at 7:08 pm
  • My lucene indexes contain fields with values like this www.xxx.yyy.zzz which are treated as HOST tokens. My problem is the following : search results never contain documents with such fields when ...
    Pascal Nadal
    Nov 12, 2003 at 10:58 am
    Nov 12, 2003 at 4:04 pm
  • Hi Lucene experts, Can you help on this? I have included the following code in FileDocument to print out the summary but I have funny output like: The result after searching, the summary is displayed ...
    Tun Lin
    Nov 28, 2003 at 3:59 pm
    Nov 28, 2003 at 10:33 pm
  • Hi, I'm new to Lucene and have just started to familiarise myself with the API. I'm trying the web demo out, running IndexHTML. I'm wondering if it is possible to include the protocol and host as ...
    Justyna Lubkowski
    Nov 26, 2003 at 6:13 am
    Nov 26, 2003 at 6:46 am
  • If I search for "like" I would want the search to return documents containing "like", "liked", "likes", etc.. variations of the word. Is there a way to tell Lucene to do this? ...
    Pleasant, Tracy
    Nov 25, 2003 at 4:48 pm
    Nov 25, 2003 at 6:43 pm
  • Hello group, does Lucene offer an effective and flexible way to treat XML files? I know that as soon as an InputStream is provided Lucene can basically index (possibly after cleaning) everything. How ...
    Nov 25, 2003 at 4:12 pm
    Nov 25, 2003 at 4:24 pm
  • Hi, http://jakarta.apache.org/lucene/docs/queryparsersyntax.html The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR ...
    Dror Matalon
    Nov 24, 2003 at 8:33 am
    Nov 24, 2003 at 7:04 pm
  • Hi, I am a complete beginner with Lucene and have started to look into some articles and the API documentation. I know the theories behind Information Retrieval and want to find out about Lucene. I think it is ...
    Ralf B
    Nov 24, 2003 at 5:23 pm
    Nov 24, 2003 at 6:22 pm
  • Hi, Occasionally I get an "Illegal seek" error while loading a document into Lucene. I am new to Lucene so I am not sure what to look for. Does anyone have an idea of what may cause this error? Can ...
    Dan Pelton
    Nov 18, 2003 at 9:33 pm
    Nov 18, 2003 at 9:42 pm
  • Hi, I used lucene 1.2/it.unige.csita.lucene.RODirectory inside an applet on CD-ROM. In lucene 1.3 the system property 'disableLuceneLocks' was introduced to make it.unige.csita.lucene.RODirectory or ...
    Thomas Fuchs
    Nov 6, 2003 at 6:55 pm
    Nov 6, 2003 at 11:51 pm
  • Hi, May I know how do I analyse Chinese input from Chinese text in Lucene? Do I use Analyser function in Lucene? If yes, how to go about using it?
    Tun Lin
    Nov 26, 2003 at 5:14 am
    Nov 26, 2003 at 11:42 am
  • Hi, assume a field has the following text "Adenylate kinase (mitochondrial GTP:AMP phosphotransferase) " the following searches all return this document AMP &AMP &AMP; can someone explain this to ...
    Nov 26, 2003 at 12:53 am
    Nov 26, 2003 at 9:18 am
  • Hi, I looked around the archives and didn't see anything about this subject. Is there a Lucene command line tool that lets you look at an index, run queries, and look at different metrics? Something ...
    Dror Matalon
    Nov 25, 2003 at 7:02 pm
    Nov 25, 2003 at 10:12 pm
  • This was raised in http://www.mail-archive.com/[email protected]/msg04696.html and not really answered. If I do +(contents:luc* description:luc*) Things work fine. However if I do ...
    Dror Matalon
    Nov 25, 2003 at 6:39 pm
    Nov 25, 2003 at 9:29 pm
  • I'm having difficulty creating an IndexSearcher from an FSDirectory in 1.3-rc2. The code is as follows (log.writeToLog is a convenience method): log.writeToLog(Log.DEBUG,"directory path ="+hitPath); ...
    Nov 24, 2003 at 9:31 pm
    Nov 24, 2003 at 9:54 pm
Group Overview
group: java-user @

88 users for November 2003

  • Erik Hatcher: 56 posts
  • Chong, Herb: 41 posts
  • Otis Gospodnetic: 31 posts
  • Dror Matalon: 26 posts
  • Petite_abeille: 24 posts
  • Pleasant, Tracy: 19 posts
  • Doug Cutting: 18 posts
  • Tun Lin: 14 posts
  • Victor Hadianto: 13 posts
  • Stefan Groschupf: 10 posts
  • Jie Yang: 9 posts
  • Ralf Bierig: 8 posts
  • Hackl, Rene: 7 posts
  • Eric Jain: 6 posts
  • Gerret Apelt: 6 posts
  • MOYSE Gilles (Cetelem): 6 posts
  • Andrzej Bialecki: 5 posts
  • Caroline Jen: 5 posts
  • Dan Quaroni: 5 posts
  • Dragan Jotanovic: 5 posts