125 discussions - 598 posts

  • I am pleased to announce the launch of Monster's new job search Beta web site, powered by Lucene, at: http://jobsearch.beta.monster.com (notice the Lucene logo at the bottom of the page!). The jobs ...
    Peter Keegan
    Oct 27, 2006 at 7:29 pm
    Mar 16, 2007 at 10:06 pm
  • Hello, I would like to make a link database using lucene. Similar to one that nutch uses. I have read the basic documentation and understood how document indexing, search, and scoring works. But what ...
    Cam Bazz
    Oct 8, 2006 at 11:26 am
    Oct 9, 2006 at 8:45 am
  • Hi all, I am trying to index a field that contains more than one word separated by a space, e.g. "My Word". I am indexing it UN_TOKENIZED, but when I use a TermQuery to query "My Word" it does not yield any results. If ...
    Ismail Siddiqui
    Oct 9, 2006 at 12:22 am
    Oct 9, 2006 at 5:43 pm
  • Hi chaps, Just looking for some ideas/experience as to how to improve our current architecture. We have a single-index system containing approx. 2.5 million docs of about 1-3k each. The Lucene ...
    Paul Waite
    Oct 17, 2006 at 7:32 pm
    Oct 26, 2006 at 6:32 pm
  • Hi, Is there a way to add / replace the text for the boolean operators used by the query parser? We would like to replace (or even better, add), "AND", "OR" and "NOT" by "ET", "OU" and "SAUF". Is ...
    Patrick Turcotte
    Oct 3, 2006 at 3:17 pm
    Oct 13, 2006 at 8:56 pm
  • I'd have to check CHANGES.txt, but I don't think that many bugs have been fixed and not that many new features added that anyone is itching for a new release. Otis ----- Original Message ---- From: ...
    Otis Gospodnetic
    Oct 15, 2006 at 4:41 am
    Dec 19, 2006 at 8:31 pm
  • Well, we defined this problem away for one of our products, but it's back for a different product. Siiiiigggghhhh...... I'm valiantly trying to get our product manager (hereinafter PM) to define this ...
    Erick Erickson
    Oct 6, 2006 at 12:37 pm
    Oct 11, 2006 at 6:57 pm
  • Hi I have a custom-built Analyzer where I tokenize all non-whitespace characters as well available in the field "TERM" (which is the only field being tokenised). If I now query my index file for a ...
    Oct 1, 2006 at 11:05 am
    Oct 10, 2006 at 11:06 am
  • Hi, I have a question about ParallelMultiSearcher performance. I want to search documents on about 10 gigabytes of index. (The index has 10,000,000 documents.) I get very slow performance using ...
    Oct 3, 2006 at 4:42 am
    Oct 4, 2006 at 2:45 pm
  • How to eliminate near duplicates from the index? Someone suggested that I could look at the TermVectors and do a comparison to remove the duplicates. One major problem with this is the structure of ...
    Find Me
    Oct 17, 2006 at 3:54 pm
    Oct 26, 2006 at 7:23 am
  • Hi, (I am using Lucene 2.0.0) I have been looking at a way to use stable IDs with Lucene. The reason I want this is so I can efficiently store and retrieve information outside of Lucene for filtering ...
    Johan Stuyts
    Oct 17, 2006 at 2:38 pm
    Oct 19, 2006 at 1:38 pm
  • I have been reading the lists for a couple of weeks now, and I noticed people asking about placing their indexes into an RDBMS. What is the advantage of that? So far Lucene was able to solve all my ...
    Mag Gam
    Oct 5, 2006 at 4:19 am
    Oct 11, 2006 at 9:49 am
  • Hello, If I want to make sure that only documents that contain at least two of the N TermQueries A, B, C, and D (N=4) are considered matches, what is the best way to approach this? I know I can expand ...
    Ryan Heinen
    Oct 6, 2006 at 5:26 pm
    Oct 7, 2006 at 1:38 am
  • Hi all, I'm having trouble with a FileNotFoundException that pops up every once in a while. Everything works fine in my application (description below), but after running for some time (e.g. 20 hours) ...
    Hes Siemelink
    Oct 4, 2006 at 9:21 am
    Oct 5, 2006 at 3:29 pm
  • Hi list. I am using Lucene search for the site search on one of my sites. I fetch values from the database and then index them. The fields that the document contains are: 1. hwid 2. title 3. author ...
    Amit Soni
    Oct 30, 2006 at 6:11 am
    Oct 31, 2006 at 11:13 pm
  • I'm developing an application used by scientists -- people who have a pretty good idea of what logic is -- and they were shocked to find out that neither of these queries return the same results: 1- ...
    Renaud Waldura
    Oct 12, 2006 at 11:11 pm
    Oct 15, 2006 at 1:03 pm
  • Hi, Can you tell me how indexing takes place in Lucene (in depth)? If a document has 1....n indices, which algorithm does it use, and which information retrieval model does it use... Thanks & Regards, Akil Ajani ...
    Ajani, Akil (Cognizant)
    Oct 3, 2006 at 9:05 am
    Oct 12, 2006 at 10:35 am
  • Has anyone dealt with the problem of constructing sub-queries given a multi-word query ? Here is an example to illustrate what I mean: user queries for - A B C D right now I change that query to "A B ...
    Oct 16, 2006 at 6:00 am
    Oct 20, 2006 at 12:04 am
  • Hi - Can someone explain the reason why I'm getting the TooManyClauses exception? I have a general understanding of the issue based on my reading, but I don't understand the mechanics of it. ...
    Bushey, John
    Oct 16, 2006 at 5:44 pm
    Oct 17, 2006 at 6:30 pm
  • Hi - I'm having a bit of trouble building a query to match a range of values in a field that is not continuous. For an example, say I want to find all people with last names starting with A-C, and ...
    Tom Hill
    Oct 4, 2006 at 8:35 pm
    Oct 5, 2006 at 7:13 pm
  • Hi, Is there a way to query all numbers that are close to a particular number (query), and score by how close they are to that number (query)? To illustrate further, assume document with single field ...
    Oct 4, 2006 at 12:51 am
    Oct 4, 2006 at 11:59 pm
  • Hi, can anybody be so kind as to tell me if it is possible to search for a Term by its position? I search for a term (for example "soccer") and get back the doc IDs and positions as follows: TermPositions ...
    Renzo Scheffer
    Oct 2, 2006 at 9:10 pm
    Oct 3, 2006 at 2:20 pm
  • Hi, I have tried these options too and the Term Vector returns null. What do you think the problem is? 2006/10/24, beatriz ramos <beatriz.ramos.moreno@gmail.com>:
    Paz Belmonte
    Oct 26, 2006 at 7:23 am
    Oct 26, 2006 at 12:24 pm
  • Here's my problem: We're indexing books. I need to a) return books ordered by relevancy, and b) for any single book, return the number of hits in each chapter (which, of course, may be many pages). 1) If I ...
    Erick Erickson
    Oct 18, 2006 at 8:51 pm
    Oct 19, 2006 at 12:41 pm
  • I hope I am now on the right mailing list. On the -dev mailing list Steven Parkes said that I have to change this: But it seems that there isn't such a method declaration. Where is the mistake? -- Jan ...
    Jan Pieper
    Oct 5, 2006 at 9:53 pm
    Oct 6, 2006 at 6:27 am
  • Hi, I'm new to Lucene and IR in general. I'm a bit confused on the concept of fields. From what I've read, a field does not have to be indexed but its value can be stored in an index. Likewise a ...
    Los Morales
    Oct 2, 2006 at 6:09 pm
    Oct 3, 2006 at 3:44 am
  • I am working on the development of a product that is using Lucene. A corrupt index was reported by testers and it is in an odd state. The indexes are built in batches (to multiple ram indexes in ...
    Nick Puz
    Oct 10, 2006 at 6:24 pm
    Jan 25, 2007 at 10:19 pm
  • hello all, i would like to retrieve during query time, the part of speech of each word in a query, does anyone know of an implementation of a java part of speech api? thanks in advance, ...
    Zzzzz shalev
    Oct 20, 2006 at 10:25 am
    Nov 21, 2006 at 11:05 pm
  • Hello, I'm new to Lucene and wanted some advice on analyzers, stemmers and language analysis. I've got LIA, so have read its chapters. I am writing a framework that needs to be able to index ...
    Antony Bowesman
    Oct 13, 2006 at 7:43 am
    Nov 21, 2006 at 10:30 pm
  • Hi Martin, ----- Original Message ---- From: Martin Braun <mbraun@uni-hd.de> To: java-user@lucene.apache.org Sent: Monday, October 23, 2006 4:29:03 AM Subject: experiences with lingpipe hi all, does ...
    Otis Gospodnetic
    Oct 26, 2006 at 12:27 am
    Nov 3, 2006 at 3:04 pm
  • Hi, Newbie question. How do we index floating point numbers in Lucene, so that they are sortable? There is a built-in utility class 'NumberTools' which helps with indexing integers. Does Lucene have the ...
    Oct 30, 2006 at 9:58 am
    Nov 2, 2006 at 4:35 pm
  • Hi, I have a program to create a Lucene index, and another program for searching that index. The Search program creates an IndexSearcher object once in its constructor, and I created a method doSearch ...
    Sunil Kumar PK
    Oct 26, 2006 at 10:57 am
    Oct 27, 2006 at 6:33 am
  • When I run this java code: Long dates = new Long("1154481345000"); Date dada = new Date(dates.longValue()); System.out.println(dada.toString()); System.out.println(DateTools.dateToString(dada, ...
    Michael J. Prichard
    Oct 18, 2006 at 7:26 pm
    Oct 18, 2006 at 8:00 pm
  • Hi everybody: I have a big problem making parallel searches in big indexes. I have indexed over 60,000 articles with Lucene, and I have distributed the indexes across 10 computer nodes so each index not ...
    Ariel Isaac Romero Cartaya
    Oct 11, 2006 at 9:37 pm
    Oct 17, 2006 at 4:21 pm
  • Hi, I'm looking for a stemmer that is capable of returning all morphological variants of a query term (to be used for high-recall search). For example, given a query term of 'cares', I would like to ...
    Jong Kim
    Oct 14, 2006 at 7:58 pm
    Oct 16, 2006 at 3:27 pm
  • Hi folks, I am using Lucene 2.0. In our application, I am indexing a stream of documents. Each document is fairly small (< 1 KB), but there can be tens of millions of documents. Each document has a ...
    Oct 12, 2006 at 7:05 pm
    Oct 16, 2006 at 1:50 pm
  • Suppose I want to index 500,000 documents (average document size is 4 KB). Let's assume I create a single index and that the index is static (I'm not going to add any new documents to it). I would ...
    Scott Smith
    Oct 12, 2006 at 9:17 pm
    Oct 14, 2006 at 5:20 am
  • Hello, This is a design question: For Lucene to be able to process a million documents, and for the search application to be scalable and still have a good response time, do we need to ...
    Chenini, Mohamed
    Oct 12, 2006 at 2:26 pm
    Oct 13, 2006 at 12:38 pm
  • Hi, we are using a search system based on Lucene and have recently tried to add incremental updating of the index instead of building a new index every now and then. However we now run into problems ...
    Rickard Bäckman
    Oct 9, 2006 at 9:19 am
    Oct 11, 2006 at 10:09 am
  • Hello all, I'm new to Lucene and wondering where I can start with Lucene? :) Basically my application is: when a user inputs some keywords (can be more than one word) within an academic research site, ...
    Lily yan
    Oct 5, 2006 at 11:54 pm
    Oct 6, 2006 at 5:41 pm
  • Hello! I've indexed HTML pages and stored html codes as UN_TOKENIZED fields. So, I need to search for specific tags in those documents, for example: <option name="test" Do I need to write some custom ...
    John Bugger
    Oct 2, 2006 at 12:50 pm
    Oct 4, 2006 at 11:37 am
  • I'm using DateTools with Resolution.DAY. I know that dates internally are converted to GMT. Converting dates "2006-10-01 00:00" and "2006-10-01 15:00" from "Etc/GMT-2" timezone will give us ...
    Volodymyr Bychkoviak
    Oct 2, 2006 at 2:55 pm
    Oct 3, 2006 at 10:36 am
  • Hi, I have a program to create a Lucene index, and another program for searching that index. The Search program creates an IndexSearcher object once in its constructor, and I created a method doSearch ...
    Sunil Kumar PK
    Oct 26, 2006 at 10:57 am
    Oct 28, 2006 at 4:35 am
  • Hi - I'm currently looking into adding full text search capabilities to our site. While some threads in this list had the same basic question (RDBMS full-text versus lucene), their configurations and ...
    Rene Pineda
    Oct 17, 2006 at 4:02 pm
    Oct 18, 2006 at 5:13 am
  • I have a few questions regarding writing a custom analyzer. My situation is that I would like to use the StandardAnalyzer but with some data-specific rules. I was wondering if there was a way of ...
    Ryan O'Hara
    Oct 16, 2006 at 8:29 pm
    Oct 16, 2006 at 10:14 pm
  • Hi guys, I am a newbie so excuse me if this is a repost. From the javadoc it seems Reader.deleteDocuments deletes only documents that have the provided term, but the implementation examples that I ...
    Oct 15, 2006 at 4:49 am
    Oct 16, 2006 at 4:18 am
  • did someone delete the shared doc ? sachin.khaire@noemacorp.com wrote: --------------------------------------------------------------------- To unsubscribe, e-mail: ...
    Prasenjit Mukherjee
    Oct 12, 2006 at 10:52 am
    Oct 13, 2006 at 12:47 pm
  • Hi there! I have an index structure like this: document_id some_text ..... when searching for some set of documents, there could be a case when several comments for the same document match the search ...
    Eugeny N Dzhurinsky
    Oct 11, 2006 at 2:47 pm
    Oct 11, 2006 at 4:33 pm
  • Hello All, I have a question regarding the use of the compound file for the index: what are the advantages and disadvantages of enabling the use of the compound file (which is the default, I think) or disabling the use of it? ...
    Supriya Kumar Shyamal
    Oct 10, 2006 at 10:23 am
    Oct 10, 2006 at 7:25 pm
  • Hello, I'm currently running a site which allows users to post. Lately posts have been getting out of hand. I was wondering if anyone knows of an open source spam filter that I can add to my project ...
    Rajiv Roopan
    Oct 4, 2006 at 8:32 pm
    Oct 7, 2006 at 5:08 pm
Group Overview
Group: java-user
Period: Oct 2006

154 users for October 2006

  • Erick Erickson: 67 posts
  • Doron Cohen: 37 posts
  • Chris Hostetter: 25 posts
  • Chris Lu: 20 posts
  • Erik Hatcher: 16 posts
  • Yonik Seeley: 15 posts
  • Otis Gospodnetic: 14 posts
  • Ismail Siddiqui: 13 posts
  • Mark Miller: 12 posts
  • Karl wettin: 8 posts
  • Paul Elschot: 8 posts
  • Vasu shah: 8 posts
  • Hes Siemelink: 7 posts
  • KEGan: 7 posts
  • Michael McCandless: 7 posts
  • Peter Keegan: 7 posts
  • Steven Rowe: 7 posts
  • Volodymyr Bychkoviak: 7 posts
  • Antony Bowesman: 6 posts
  • Bill Taylor: 6 posts