88 discussions - 405 posts

  • Hi Doron Cohen, thanks for your reply, but I am facing a small problem here. Since I am using Notepad for coding, in which format should the file be saved? public static final String[] ...
    Liaqat Ali
    Dec 26, 2007 at 7:37 pm
    Dec 27, 2007 at 10:48 am
  • Hi, I am using Lucene for the very first time and want to manipulate the results by adding some more factors to them. Which file should I edit to manipulate the search results? Thanks, Sumit Tyagi ...
    Dec 21, 2007 at 4:51 am
    Feb 21, 2008 at 12:47 am
  • Hi, I need to index a Wikipedia dump. I know there is code in contrib/benchmark for indexing *English* Wikipedia for benchmarking purposes. However, I'd like to index a non-English dump, and I ...
    Otis Gospodnetic
    Dec 12, 2007 at 5:35 am
    Jan 3, 2008 at 3:35 am
  • I have an index of about 10mb. Since it's so small, I would like to keep it loaded in memory, and reload it about every minute or so, assuming that it has changed on disk. I have the following code, ...
    Ruslan Sivak
    Dec 11, 2007 at 10:38 pm
    Dec 13, 2007 at 5:27 pm
  • Hi, I have a large index, and when searching for a term I want the newer documents to be at the beginning of the result set. I don't need a real order by time, but Lucene should prioritize the newer ...
    Dominik Bruhn
    Dec 30, 2007 at 12:54 am
    Jan 12, 2008 at 1:07 am
  • I have the following code for search: BooleanQuery bQuery = new BooleanQuery(); Query queryAuthor; queryAuthor = new TermQuery(new Term(IFIELD_LEAD_AUTHOR, author.trim().toLowerCase())); ...
    Sirish Vadala
    Dec 17, 2007 at 6:38 pm
    Dec 18, 2007 at 9:07 pm
  • Suppose I have an index containing the terms impostor, imposter, fraud, and fruad, then presumably regardless of whether I spell impostor and fraud correctly, Lucene SpellChecker will offer the ...
    Dec 3, 2007 at 3:14 am
    Dec 11, 2007 at 7:27 pm
  • Hi there, I am having a problem using escape characters with the Lucene demo code. I used the following code for IndexFiles and SearchFiles. The code works fine for regular searching and also with ...
    Baljeet Dhaliwal
    Dec 19, 2007 at 10:07 pm
    Dec 21, 2007 at 6:39 pm
  • Hello All, I am seeing this issue and would like to understand whether it's a bug or I am missing something and doing it the wrong way: (Note that I am doing all exception handling - but deleted the exception ...
    Tushar B
    Dec 19, 2007 at 10:10 am
    Dec 20, 2007 at 12:57 pm
  • Hi all, I am facing a problem with the following multifield query: i_title:indoor* i_description:indoor* -i_published:false +i_topicsClasses.id:1_1_*_* The above query even returns results which ...
    Rakesh Shete
    Dec 18, 2007 at 6:22 pm
    Dec 19, 2007 at 12:34 pm
  • Dear Fellow Java & Lucene developers: I am a Java developer learning lucene and I am currently going through the book Lucene in Action. At present, I am trying to run the sample code for indexing an ...
    Dec 5, 2007 at 3:23 am
    Dec 10, 2007 at 6:58 pm
  • I'm not even sure if it can be considered Named Entity Recognition, but what the hell... so here's my problem... I was asked to retrieve the named entities out of a collection of documents, and ...
    Dec 12, 2007 at 9:45 am
    Jan 9, 2008 at 4:35 pm
  • Hello, I'm building a ticketing system for my company and am using Lucene for some of the more complicated queries. I'd say my application differs from the typical lucene application in that my ...
    Bob Daha
    Dec 10, 2007 at 7:57 pm
    Dec 12, 2007 at 4:10 am
  • Hi all, I am using Hibernate Search (http://www.hibernate.org/410.html) which is a wrapper around Lucene for performing search over info stored in the DB. I have questions related to Lucene boosting ...
    Rakesh Shete
    Dec 21, 2007 at 3:51 am
    Dec 24, 2007 at 3:10 pm
  • Hi, I am trying to retrieve documents from an index. Each document has a date field and other fields. While making a search, I want to give some extra boost to the more recent items (as per the date ...
    Prabin meitei
    Dec 19, 2007 at 9:51 am
    Dec 20, 2007 at 1:57 pm
  • I have a few fields that use package names and class names and I've been looking for some suggestions for analyzing these fields. A few examples - Text (class name) - ...
    Dec 15, 2007 at 11:17 pm
    Dec 18, 2007 at 4:54 am
  • Hi, since I need highlighting, I need to 'rewrite' a query. Query.rewrite takes an object of type IndexReader. But what for? As I understand it, rewrite transforms a possibly complicated query into an ...
    Helmut Jarausch
    Dec 13, 2007 at 11:40 am
    Dec 17, 2007 at 3:09 pm
  • Hello, I'm looking for suggestions on how to deal with the following (simplified) scenario (Lucene 2.2.0): Documents in my index have some number of fields that are searched in various combinations ...
    Tom Emerson
    Dec 7, 2007 at 6:42 pm
    Dec 16, 2007 at 3:49 pm
  • Hi, It's been a while since I've written a custom TokenFilter, and I'm not having luck getting tokens out of the TokenStream using 2.3-dev. I'm hitting that default term buffer of the size 10 using ...
    Otis Gospodnetic
    Dec 8, 2007 at 10:44 pm
    Dec 12, 2007 at 4:07 am
  • I have developed a fuzzy search application over a database of books (titles, authors etc) and it works really well. (I use Lucene.Net but read the JavaDocs and forums for java Lucene) However I've ...
    Dec 7, 2007 at 11:34 am
    Dec 11, 2007 at 2:00 pm
  • Happy festivus everyone, So I have my fancy new stemmed synonym based Lucene index. Let's say I have the following synonym defined: radiation - radiotherapy (and the reverse) The search results rank ...
    Frank Schima
    Dec 27, 2007 at 9:19 pm
    Jan 3, 2008 at 9:56 pm
  • Hi, What is the most efficient way to do pagination in Lucene? I have always done the following because this "flavor" of the search call allows me to specify the top N hits (e.g. 1000) and a Sort ...
    Dragon Fly
    Dec 22, 2007 at 3:20 pm
    Dec 27, 2007 at 6:48 pm
  • Hello, I would like to search documents by "CUSTOMER". So I search on the field "CUSTOMER" using a KeywordAnalyzer. The CUSTOMER field is indexed with those params: Field.Index.UN_TOKENIZED ...
    Dec 27, 2007 at 2:34 pm
    Dec 27, 2007 at 3:40 pm
  • Do you guys have article links or other documents that describe the Lucene database, e.g. what it is composed of? -- Berlin Brown http://botspiritcompany.com/botlist/spring/help/about.html ...
    Berlin Brown
    Dec 23, 2007 at 2:11 am
    Dec 26, 2007 at 5:18 pm
  • Hi guys, I ran into some trouble optimizing the index. The index looks fine in Luke and I can carry out searches in the index. However, when I try to merge all these separate files into a complete ...
    Zhou Qi
    Dec 22, 2007 at 5:22 am
    Dec 26, 2007 at 1:17 pm
  • Hello, I am using Lucene to build an index from roughly 10 million documents. The documents are about 4 TB in total. After some trial runs, indexing a subset of the documents I am trying to ...
    V k
    Dec 18, 2007 at 5:03 am
    Dec 22, 2007 at 4:23 pm
  • Is this at least a semi-active list? James
    Hartrich, James CTR USTRANSCOM J6
    Dec 19, 2007 at 7:01 pm
    Dec 19, 2007 at 8:10 pm
  • Hi All, I am parsing this query: "Auto* machine"~4. Will it work? If yes then right now it's not working. Can anyone help on this? Thanks & Regards Shakti Sareen DISCLAIMER: This email (including any ...
    Dec 12, 2007 at 8:55 am
    Dec 18, 2007 at 9:22 am
  • I have an index that contains three sorts of documents: Car brand Tire brand Tire pressure (Please bear with me, the real index has nothing to do with cars. I just try to explain the problem in an ...
    Karl Wettin
    Dec 14, 2007 at 5:06 pm
    Dec 17, 2007 at 1:32 am
  • Hi, I know how to set DEFAULT_OPERATOR_AND for an individual QueryParser object (after creation). Since I always want this to be set, is there a means to set a (global) option such that any ...
    Helmut Jarausch
    Dec 11, 2007 at 2:45 pm
    Dec 12, 2007 at 6:35 am
  • Hi, I am trying to run some code from Lucene In Action, but it generates some errors. There is one warning at compilation time, and the errors are generated at run time. The code and errors are given below. ...
    Liaqat Ali
    Dec 6, 2007 at 9:49 am
    Dec 6, 2007 at 3:25 pm
  • Does anyone know why JVM heap use almost doubles at the very end when indexing in memory? around 9 megs @ 1:03 min into indexing - around 18 megs @ 1:05 min when indexing is complete - heap use jumps ...
    Dec 27, 2007 at 5:19 pm
    Dec 27, 2007 at 6:14 pm
  • Hello all, I'm trying to implement a synonym engine in Lucene 2.2 based on the code in the Lucene In Action book. However, I'm getting compile errors: My Synonym filter looks like this: import ...
    Frank Schima
    Dec 27, 2007 at 3:56 pm
    Dec 27, 2007 at 5:44 pm
  • I don't care about score, but I do care about the # of times a query was hit within a document? example: the quick brown fox jumped over the lazy dog the quick brown fox jumped over the lazy dog the ...
    Dec 20, 2007 at 7:54 pm
    Dec 21, 2007 at 3:38 am
  • Hi, please help I am totally puzzled. The same query, once with a direct call to FuzzyQuery succeeds while the same query with QueryParser fails. What am I missing? Sorry, I'm using pylucene (with ...
    Helmut Jarausch
    Dec 17, 2007 at 9:28 am
    Dec 17, 2007 at 6:01 pm
  • There's an interesting article on state-of-the-art setup with Mtron Solid State Drives at http://www.nextlevelhardware.com/storage/battleship/ The concise version is that Mtron flash drives puts all ...
    Toke Eskildsen
    Dec 14, 2007 at 10:59 am
    Dec 16, 2007 at 6:50 pm
  • Hello, I've got a quick question. I am handling huge CSV files. They start with a key in the first column and are followed by data. I need to randomly retrieve this data based on the key. So it is kind ...
    Tobias Rothe
    Dec 13, 2007 at 11:26 pm
    Dec 15, 2007 at 5:08 am
  • Hello, I am looking for some advice regarding which tools I might use to solve my problem. I apologize ahead of time for the long explanation. Problem Description: I would like to index a set of very ...
    Jose Luna
    Dec 11, 2007 at 6:30 pm
    Dec 12, 2007 at 4:16 pm
  • Hi all, I want to index an XML file containing 200 Urdu language (variant of Arabic and Persian) documents. This corpus is in CES format, consisting of information about author and many more, I just ...
    Liaqat Ali
    Dec 4, 2007 at 6:05 pm
    Dec 12, 2007 at 6:31 am
  • Here goes, I'm developing an application using lucene which will evaluate the representativeness of a list of keywords within a collection of documents. I'm doing this by indexing the documents and ...
    Dec 10, 2007 at 10:58 am
    Dec 10, 2007 at 11:43 am
  • Hello all, I’ve been looking into using the nice power of the SpanNearQuery instead of PhraseQuery, mostly because of the simplification of the slop factors. However, I’m wondering if the ...
    Arnone, Anthony
    Dec 4, 2007 at 9:42 pm
    Dec 5, 2007 at 12:31 pm
  • Hello All, I want to calculate the Precision and Recall of the current system, based on Lucene. What should the procedure be, and are there any tools available for this purpose? Kindly guide ...
    Liaqat Ali
    Dec 29, 2007 at 10:45 am
    Dec 30, 2007 at 7:46 am
  • Hi all, I encountered a strange problem. To improve performance, I open the IndexReader at start time and reuse it in later searches. I have another process running to do online indexing. The search ...
    Zhou Qi
    Dec 27, 2007 at 1:59 pm
    Dec 27, 2007 at 2:36 pm
  • Hello, I am trying to make an index of 191 documents stored in 191 text files. I developed a program which works well with files containing a single line, but files with multiple lines pose a ...
    Liaqat Ali
    Dec 25, 2007 at 9:03 pm
    Dec 26, 2007 at 6:14 am
  • I am getting the following exception when I run our indexer: Unsupported MIME type (text/html;charset=US-ASCII) type so ignoring: http://zfin.org/... It appears if a page Http header does not specify ...
    Christian Pich
    Dec 21, 2007 at 6:42 pm
    Dec 22, 2007 at 12:01 am
  • Hi, according to the LiA book the FuzzyQuery distance is computed as 1- distance / min(textlen,targetlen) Given def addDoc(text, writer): doc = Document() doc.add(Field("field", text, ...
    Helmut Jarausch
    Dec 17, 2007 at 8:43 am
    Dec 17, 2007 at 7:16 pm
  • Hi, We've got a requirement that we need to give our users the ability to search on exact phrases within a field, or, if they prefer, they can match on plurals (either via stems, or another plural ...
    Lucifer Hammer
    Dec 12, 2007 at 6:26 pm
    Dec 13, 2007 at 3:07 am
  • My application batch adds documents to the index using IndexWriter.addDocument. Another thread handles searchers, creating new ones as needed, based on a policy. These searchers open a new ...
    Antony Bowesman
    Dec 9, 2007 at 11:01 am
    Dec 9, 2007 at 11:33 am
  • Hi, With Lucene 1.4.3, we had used this constructor for Field. What is its equivalent in Lucene 2.2.0? /** Constructs a String-valued Field that is tokenized and indexed, and is stored in the index, ...
    Dec 6, 2007 at 8:23 pm
    Dec 7, 2007 at 6:28 pm
  • I did some searching on the lucene site and wiki, but didn't quite find what I was looking for in regards to a basic approach to how and when to reload index data. I have a long running process that ...
    Dec 6, 2007 at 5:44 pm
    Dec 6, 2007 at 7:47 pm
Group Overview
group: java-user @

103 users for December 2007

  • Erick Erickson: 33 posts
  • Grant Ingersoll: 25 posts
  • Doron Cohen: 16 posts
  • Liaqat Ali: 16 posts
  • Michael McCandless: 15 posts
  • Helmut Jarausch: 11 posts
  • Doron Cohen: 10 posts
  • Ruslan Sivak: 9 posts
  • Chris Hostetter: 8 posts
  • Karl Wettin: 8 posts
  • Zhou Qi: 8 posts
  • Otis Gospodnetic: 7 posts
  • Rakesh Shete: 7 posts
  • Sumittyagi: 7 posts
  • Tom: 6 posts
  • Daniel Naber: 6 posts
  • Erik Hatcher: 6 posts
  • Mark harwood: 6 posts
  • Mark Miller: 6 posts
  • Smokey: 6 posts