95 discussions - 469 posts

  • Hello, is there a way to influence Lucene's generation of ids while indexing? My requirement is: I want to have different indexes where no index should have ids that have been assigned to an index ...
    Waheed Mohammed
    Dec 11, 2006 at 3:15 pm
    Dec 19, 2006 at 6:13 pm
  • Hello! I wrote a custom analyzer that has synonyms of some words to help with search. I use the analyzer when searching for the keyword the user entered. What is happening, and what I don't understand, is that ...
    Dec 5, 2006 at 7:01 pm
    Dec 6, 2006 at 2:58 pm
  • Hi All, I am using Lucene 1.9.1 for search functionality in my J2EE application using JBoss as app server. The Lucene index directory size is almost 20G right now. There is a Quartz job that is ...
    Harini Raghavan
    Dec 21, 2006 at 4:14 pm
    Dec 27, 2006 at 5:50 pm
  • Hello group, The coord(q,d) normalisation is "a score factor based on how many of the query terms are found in the specified document." and described here: ...
    Karl Koch
    Dec 12, 2006 at 10:32 am
    Dec 19, 2006 at 8:57 pm
  • I'm investigating some performance issues with the way we're using Lucene in our web app and am interested if anyone could shed some light on what might be going on. Hopefully I can provide enough ...
    Bryan Dotzour
    Dec 20, 2006 at 8:59 pm
    Dec 22, 2006 at 7:25 pm
  • Hi, I'm attempting to use Lucene under ColdFusion MX, however when I try to create an index I am coming up against the following error message when I try to add a document: The selected method ...
    Aaron Shaw
    Dec 4, 2006 at 10:08 am
    Dec 5, 2006 at 3:54 pm
  • I've been having a hard time finding any kind of reasonable documentation on Lucene. It seems that the javadocs are mostly empty, and the stuff on the wiki really doesn't explain anything. Is there a ...
    Eric Haszlakiewicz
    Dec 26, 2006 at 7:35 pm
    Dec 29, 2006 at 3:53 pm
  • I am not sure if this is a problem with Lucene or if I am building my Query object improperly. It seems to me, when performing a search that should exclude certain terms, MultiFieldQueryParser ...
    Scott Sellman
    Dec 19, 2006 at 10:06 pm
    Dec 22, 2006 at 2:49 pm
  • I have not really looked into this yet, but maybe you can save me some time -- Is it feasible/simple to sort by the number of hits found per document? Would this require changing the scoring system ...
    Mark Miller
    Dec 16, 2006 at 5:08 pm
    Dec 21, 2006 at 3:27 am
  • I have a collection of documents for which I've always returned the results sorted on the date/time of the document (using a sort object in the search method on my Searcher). It works great. ...
    Scott Smith
    Dec 9, 2006 at 1:26 am
    Dec 13, 2006 at 5:51 pm
  • I'm currently investigating the best ways of clustering Lucene. I've heard of both Solr and Terracotta but do not know how well they scale. Their examples talk of a 4 node cluster. This is way too small ...
    Dec 27, 2006 at 3:57 pm
    Dec 28, 2006 at 9:35 pm
  • At the bottom of this email is the sample xml file that we are using today. We have about 10 million of these. We need to know whether Lucene can support the following functionalities. (1) Each field ...
    Mark Mei
    Dec 13, 2006 at 5:43 pm
    Dec 14, 2006 at 2:06 pm
  • I've noticed that after stress-testing my application (uses Lucene 2.0) for a while, I have almost 200mb of byte[]s hanging around, the top two culprits being: 24 x SegmentReader.Norm.bytes = 112mb 2 ...
    Eric Jain
    Dec 11, 2006 at 11:03 pm
    Dec 12, 2006 at 4:39 am
  • Hi, I have some questions about the scoring function and about how different scores can be compared. I use Lucene for indexing an archive of the web. /archive/day1/differentsites = indexday1 ...
    Nils Höller
    Dec 1, 2006 at 10:32 am
    Dec 4, 2006 at 2:02 pm
  • Hi, I first indexed 220,000 docs, all with a special keyword. I did a simple query and only fetched 5 docs, with Hits.length()=220,000. Then I indexed 440,000 docs, with the same keyword, query it again ...
    Zhang, Lisheng
    Dec 5, 2006 at 2:50 am
    Mar 21, 2007 at 6:55 pm
  • Hi, Please see the following data-structure +--------+----------+ +--------+----------+ My requirement is to find all values in FIELD1 where FIELD2 contains all values of FIELD1 where FIELD2 contains ...
    Kapil Chhabra
    Dec 26, 2006 at 9:33 am
    Dec 28, 2006 at 4:23 pm
  • Hello all, I am trying to boost more recent Docs, i.e. Docs with a greater year Value like this: if (title.getEJ() != null) { titleDocument.setBoost(new Float("1." + title.getEJ())); } so a doc from ...
    Martin Braun
    Dec 20, 2006 at 4:32 pm
    Dec 26, 2006 at 7:19 pm
  • Consider the following interesting situation: a library has around 100K books and wants them indexed by Lucene. This seems to be straightforward, but.... The target is: 0. You can search all books ...
    Howard chen
    Dec 16, 2006 at 10:30 am
    Dec 16, 2006 at 8:58 pm
  • Hello Lucene users, in the past, I asked a number of times about the scoring that was applied for Lucene 1.2 (which might also still be valid in current Lucene versions). At that time I was ...
    Dec 9, 2006 at 1:24 pm
    Dec 13, 2006 at 5:14 am
  • Hi, In my test case, four Quartz jobs are starting each third minute storing records in a database followed by an index update. After doing a test run over a period of 16 hours, I got this exception ...
    Trond Lindanger
    Dec 5, 2006 at 9:05 am
    Dec 5, 2006 at 3:13 pm
  • Hi, Can anyone please tell me how to specify multiple character wildcard searches in "Term"? Below is my requirement: 1) I want to search all names that start with Z (Z*) 2) My programme will receive ...
    Eshwaramoorthy Babu
    Dec 4, 2006 at 8:45 am
    Dec 4, 2006 at 6:50 pm
  • Hi, we have a requirement to compare 2 XML files and generate a result (reconciliation report). The XML file size is 6MB each and the format is as below: <Data><Id>123</Id><Amount>123</Amount></Data> I ...
    Eshwaramoorthy Babu
    Dec 4, 2006 at 7:29 am
    Dec 4, 2006 at 3:04 pm
  • I have an index that's approximately 875MB. I'm using JBoss Application Server 4.04 w/ Apache HTTP Server 2.2. My min/max JVM size is: 128MB/512MB. On initial startup, everything works fine. I'm able ...
    Van Nguyen
    Dec 20, 2006 at 11:40 pm
    Dec 23, 2006 at 5:37 am
  • Hey All, I am very interested in indexing a 3NF Data Structure. Is there any advice that someone can provide with this? From what I have seen Lucene is typically a flat "First Normal Form" (Flat) ...
    Andrew Hughes
    Dec 13, 2006 at 6:58 am
    Dec 21, 2006 at 2:43 pm
  • Hi! I have a problem: I must create a term-by-document matrix in which every element represents the number of occurrences of that term in the document. How can I do this? Can someone ...
    Dec 13, 2006 at 6:13 pm
    Dec 15, 2006 at 2:26 am
  • Hi, Just wondering whether anyone has used Digester to extract XML content and index the XML file? Is there any source that I can refer to on how to extract the XML contents? Or is there any other xml ...
    Dec 14, 2006 at 10:54 am
    Dec 15, 2006 at 12:23 am
  • Hi, Has anyone indexed an Excel file before? I took a look at the API classes provided by POI HSSF, however, I did not find any method to extract the text from an Excel file and index it. Please assist ...
    Dec 14, 2006 at 1:21 am
    Dec 14, 2006 at 4:32 am
  • Hi, I have asked this question before, but maybe the question wasn't clear. How can I delete the particular index that I want to and keep the rest? For instance, I have indexed document Id, date, user ...
    Dec 12, 2006 at 8:35 am
    Dec 13, 2006 at 2:13 am
  • Well, the performance isn't bad considering you're executing the *search* around 1,000 times....... One of the characteristics of a Hits object is that it's optimized for getting the top 100 docs or ...
    Erick Erickson
    Dec 7, 2006 at 6:31 pm
    Dec 9, 2006 at 4:23 am
  • Hi everybody: I am having a problem during the indexing process. I am indexing large amounts of text, most of it in PDF format, using PDFBox version 0.6. The space in hard disk before that the ...
    Ariel Isaac Romero Cartaya
    Dec 4, 2006 at 2:39 pm
    Dec 7, 2006 at 6:24 pm
  • Hi, About 2-3 weeks ago I emailed about a memory leak in my application. I then found some problems in my code (I wasn't closing IndexSearchers explicitly) and took care of those. Now I see my app is ...
    Otis Gospodnetic
    Dec 15, 2006 at 5:29 pm
    Feb 23, 2007 at 6:54 pm
  • I have finally delved back into the Lucene Query parser that I started a few months back. I am very close to wrapping up its initial development. I am currently looking for anybody willing to help ...
    Mark Miller
    Dec 6, 2006 at 1:22 am
    Jan 3, 2007 at 5:32 pm
  • Greetings All, In the past week, we have released the first public beta of our first Lucene based product. Minalyzer (Miner and Analyzer) is a unique product that combines search with data analysis. ...
    Saurabh Dani
    Dec 30, 2006 at 11:25 pm
    Dec 31, 2006 at 1:46 am
  • I would like to use the data stored in the Lucene indexes, like the words and their frequencies and store them in a database. Can anyone suggest a way of going about it or is it possible at all? TIA ...
    Dec 13, 2006 at 11:03 am
    Dec 26, 2006 at 4:08 am
  • How can I remove duplicate records in the search results? I.e., I have multiple results with the same title in the 'title' field, and I want only 1 record per title. How can I achieve that? ...
    Qaz zaq
    Dec 14, 2006 at 10:18 pm
    Dec 15, 2006 at 5:53 am
  • Hello All, Apologies if it is a naive question a) Indexing large file (more than 4MB): Do I need to read the entire file as string using java.io and create a Document object? The file contains ...
    Abdul aleem
    Dec 13, 2006 at 1:10 pm
    Dec 14, 2006 at 8:31 am
  • Hi, I'm running load tests with Lucene 2.0, SUN's JDK 6 on Windows XP2, dual core CPU. I have 8 worker threads adding a few hundred K documents, split between two Lucene indexes, I've started getting ...
    Antony Bowesman
    Dec 22, 2006 at 2:58 am
    Jan 2, 2007 at 5:55 pm
  • Hi All, I'm getting a 'TooManyClauses' Exception and I'm not sure how to fix this. Here's a sample query that I'm using: +(+freeform_text:exhibit* +(+freeform_text:dispaly +freeform_text:event*) ...
    Chris Salem
    Dec 27, 2006 at 2:43 pm
    Dec 27, 2006 at 4:34 pm
  • Hey Luceners, There have been several changes to the website (http://lucene.apache.org/java/docs/index.html) that may or may not affect how people use Lucene documentation. Previously, the website ...
    Grant Ingersoll
    Dec 23, 2006 at 12:56 pm
    Dec 23, 2006 at 11:08 pm
  • Hello, Does anyone know about a modified version of the French Stemmer? This one has too many bad results. For example, if I use the word "ours" (bear), the stemmer stems it into "our".....which ...
    Renaud Paquay
    Dec 22, 2006 at 9:54 am
    Dec 22, 2006 at 2:18 pm
  • I have a problem sorting words. Somehow it sorts in a strange way. The sort result is below: ... BILLIARD & CAFE BIZIM CAFE BOLSA CAFE BIDA BONAMICO CAFE BONESSIMO CAFE CAFE BAR AZZURRI A BICA CAFE ...
    Dec 21, 2006 at 3:01 pm
    Dec 21, 2006 at 4:12 pm
  • I am bothered about security problems with Lucene. Is it vulnerable to any kind of injection like MySQL injection? Many times the query from the user is passed to Lucene for search without validating. -- ...
    Dec 21, 2006 at 9:59 am
    Dec 21, 2006 at 11:31 am
  • Hi, I'm working on learning Lucene for my job, and the book one of my professors purchased for myself and her is Lucene In Action, which is a good book but it is based on version 1.4.3 (I believe). I ...
    JT Kimbell
    Dec 19, 2006 at 3:36 pm
    Dec 20, 2006 at 7:39 pm
  • Hi, I am trying to figure out how to give different weights to different terms in the same document. Does anybody know how to do this? For example, doc A contains field1 : term1 (weight C) field1 : term2 ...
    Eun Yong Kang
    Dec 20, 2006 at 8:42 am
    Dec 20, 2006 at 7:07 pm
  • So, I am still new to Lucene, so please take this into consideration when reading this. Up until now, a novice like myself has been able to finagle Lucene into doing what we want. But now we have a ...
    Dec 15, 2006 at 9:17 pm
    Dec 16, 2006 at 6:14 am
  • Yahoo! has a search suggestion feature so that if you search for say 'shoes' then it also recommends payless shoes, jordan shoes, aldo shoes, nike shoes, bakers shoes and a bunch of others. Has ...
    Simon Wistow
    Dec 14, 2006 at 10:23 pm
    Dec 16, 2006 at 1:12 am
  • Hello, how can I make a query to bring documents between timestamp begin and timestamp end, given that I have stored my dates using DateTools.timeToString(long)? Best regards, -C.B.
    Cam Bazz
    Dec 14, 2006 at 4:21 pm
    Dec 15, 2006 at 11:52 am
  • Hi, Does anyone know if Lucene can handle complex boolean queries like the following ones? 1. T: (A OR NOT B) 2. (T:A AND NOT T:B) OR NOT T:C Cause I figured out in some tests that the results were ...
    Marcelo Ohashi
    Dec 6, 2006 at 6:35 pm
    Dec 7, 2006 at 6:27 pm
  • Dear .., I am a new user of Lucene. I have a requirement as follows. I am using a SQL Server 2005 database. The database has a table named Product, and its columns are 1 Prod_id 2 Prod_name ...
    Saroj K M
    Dec 5, 2006 at 5:59 am
    Dec 5, 2006 at 8:58 pm
  • Hi, I have a query related to the full text searching on documents saved in database as BLOB. In our application, we are planning to save our documents in the database as BLOB and we have a ...
    Inderjeet Kalra
    Dec 1, 2006 at 7:17 am
    Dec 1, 2006 at 7:58 pm
Group Navigation
period: Dec 2006
Group Overview
group: java-user @

131 users for December 2006

Erick Erickson: 41 posts
Chris Hostetter: 24 posts
Mark Miller: 15 posts
Doron Cohen: 14 posts
Otis Gospodnetic: 14 posts
Grant Ingersoll: 13 posts
Michael McCandless: 13 posts
Yonik Seeley: 13 posts
Daniel Naber: 12 posts
Karl Koch: 10 posts
Spinergywmy: 10 posts
Abdul aleem: 9 posts
Soeren Pekrul: 9 posts
Alice: 8 posts
Erik Hatcher: 8 posts
Aaron Shaw: 7 posts
Steven Rowe: 7 posts
Eshwaramoorthy Babu: 6 posts
Grant Ingersoll: 6 posts
Kapil Chhabra: 6 posts