
153 discussions - 765 posts

  • I'm attempting to create a profanity filter. I thought to use a QueryFilter created with a Query of (-$#!+ AND -@#$% AND etc). The problem I have run into is that, as a pure negative query is not ...
    Greg Gershman
    Mar 7, 2007 at 3:08 pm
    Mar 14, 2007 at 3:01 am
  • Hello, I am planning to index Word 2003 files. I read I have to use Jakarta Apache POI, but I also read on the POI site that their work with doc's is in an early stage. Is POI advisable? Or are there ...
    E J W Vanbloem
    Mar 23, 2007 at 5:49 pm
    Mar 28, 2007 at 10:00 am
  • Hi, I am trying to index the content from XML files which are basically the metadata collected from a website which has a huge collection of documents. This metadata XML has control characters which ...
    Mar 17, 2007 at 4:58 am
    Mar 22, 2007 at 6:29 am
  • This is a request for an enhancement to FieldSortedHitQueue/PriorityQueue that would prevent duplicate documents from being inserted, or alternatively, allow the application to prevent this (reason ...
    Peter Keegan
    Mar 29, 2007 at 1:39 pm
    Mar 30, 2007 at 9:17 pm
  • I am using some JBoss products and they have a very nice and great forum, I am wondering why Apache still uses this old-fashioned mailing list?? -- Regards, Mohammad
    Mohammad Norouzi
    Mar 27, 2007 at 6:04 am
    Mar 29, 2007 at 10:40 am
  • Hello, I want to manage user subscriptions to specific documents. So I would like to store the subscription (query) into the lucene directory, and whenever I receive a new document, I will search all ...
    Melanie Langlois
    Mar 23, 2007 at 1:12 am
    Mar 28, 2007 at 1:27 pm
  • Hello, Please suggest what the query String should be for a phrase search. Thanks and Regards, Ruchi
    Ruchi thakur
    Mar 7, 2007 at 5:26 pm
    Mar 16, 2007 at 8:27 am
  • Hi, I want to make my index as small as possible. I noticed field.setOmitNorms(true); I read on the list that the difference is 1 byte per field per doc, not huge but hey... is the only effect the score ...
    Mar 14, 2007 at 10:03 am
    Jun 22, 2007 at 2:55 pm
  • Hi, can someone help me by giving any sample programs for indexing PDFs and .doc files? Thanks, regards, ashwin
    Ashwin kumar
    Mar 8, 2007 at 9:37 am
    Mar 9, 2007 at 1:11 pm
  • Hello, it's been a while that I have been using Lucene and, as most people seemingly do, I used to save only some important fields of a document in the index. But recently I thought why not store the whole ...
    Mar 19, 2007 at 12:00 am
    Apr 4, 2007 at 7:26 pm
  • Hi, Gurus, One thing I want to do is: one index has fields like [primary-key, not-so-frequently-updated-fields, large-content-fields,...], and another index has [primary-key, ...
    Chris Lu
    Mar 26, 2007 at 3:58 pm
    Mar 27, 2007 at 3:08 am
  • Hi there, Just like google: the more user clicks of search results, the higher rank they are. How to implement this in lucene? I've read the javadoc of org.apache.lucene.search package, but still ...
    Mar 15, 2007 at 11:30 am
    Mar 25, 2007 at 2:46 am
  • Hi all, I'm new to this group, I'm using lucene for indexing. I have a problem. Any help greatly appreciated. Please see the following code // three fields MultiFieldQueryParser parser = new ...
    Chaminda Amarasinghe
    Mar 10, 2007 at 11:49 am
    Mar 12, 2007 at 9:56 am
  • Hi to all members of the user group! Let me get to my problem. I use Lucene in two different parts of the application. One is the SearchService and one is an AOP interceptor that intercepts any ...
    MC Moisei
    Mar 4, 2007 at 9:25 pm
    Mar 6, 2007 at 12:07 am
  • hi all, How can I print the index content in order to use it for some application? I did use TermEnum terms=ir.terms(); while (terms.next()) { System.out.println(terms.term().text()); } I still ...
    Mar 3, 2007 at 12:55 pm
    Mar 3, 2007 at 11:00 pm
  • Hi, I have two separate indexes but there are some fields that are common between them. Now I want to search from one index and then apply the result to the second one. What solution do you suggest? ...
    Mohammad Norouzi
    Mar 26, 2007 at 6:18 am
    Mar 31, 2007 at 7:44 am
  • Nutch recently added a search query timeout (NUTCH-308). Are there any plans to add such functionality to the Lucene HitCollector directly? Or is there some reason that this is a bad idea? I'm using ...
    Sean Timm
    Mar 15, 2007 at 6:45 pm
    Mar 19, 2007 at 1:24 am
  • Have an interesting scenario I'd like to get your take on with respect to Lucene: A data provider (e.g. someone with a private website or corporately shared directory of proprietary documents) has ...
    Walt Stoneburner
    Mar 8, 2007 at 3:29 pm
    Mar 10, 2007 at 2:17 am
  • I want to search for the phrase „innere Organe" bezeichnet. I am using the query q1 = "„innere Organe\" bezeichnet". Is there any issue with q1? I am getting Exception in retrieveQuery().IndexDirec:Lexical error at ...
    Ruchi thakur
    Mar 12, 2007 at 6:44 am
    Mar 12, 2007 at 3:35 pm
  • Hi, I want to search through Lucene's index from a start date to an end date. When I pass this query it works, say, admitDate:1978/05/05. However, when I use a range syntax it returns no records: ...
    Mohammad Norouzi
    Mar 4, 2007 at 6:24 am
    Mar 6, 2007 at 4:03 am
  • Hello Luceners, I have a collection of vectors of terms (tokens) that I extracted from files. I am looking for ways to calculate TF/IDF of each term. I wanted to use Lucene to do this but Lucene is ...
    Sengly Heng
    Mar 28, 2007 at 8:38 am
    Mar 29, 2007 at 3:40 am
  • Hi, I've looked at the uses of MergeFactor and MaxBufferedDocs. If I set MergeFactor = 100 and MaxBufferedDocs = 250, then the first 100 segments will be merged in RAMDir when 100 docs have arrived. At the end of ...
    SK R
    Mar 23, 2007 at 6:52 am
    Mar 25, 2007 at 3:08 pm
  • Hi: First of all, apologies to those friends who follow the whole list. Oftentimes I work offline and I do not have any commit rights to any of the projects. All the modifications I make for various ...
    Mar 22, 2007 at 11:15 am
    Mar 22, 2007 at 2:08 pm
  • Someone noted that textmining.org gets hacked. There is test-mining.org which appears to be a commercial site. Can someone tell me where to get the download of the original GPL textmining.org ...
    Bill Taylor
    Mar 1, 2007 at 3:23 pm
    Mar 21, 2007 at 4:13 pm
  • There are (at least) two ways to generate a BitSet which can be used for filtering. Filter.bits() BitSet bits = new BitSet(reader.maxDoc()); TermDocs td = reader.termDocs(new Term("field", "text")); ...
    Antony Bowesman
    Mar 13, 2007 at 4:40 am
    Mar 15, 2007 at 3:27 pm
  • Hi, I have put this question as "urgent" because I notice I don't often get answers. If I'm asking the wrong way, please tell me... Before I delete a document I search it in the index to be sure ...
    Mar 13, 2007 at 4:24 pm
    Mar 14, 2007 at 9:27 pm
  • Hi all, The documentation for the above method mentions something called a vectorized field. Does anyone know what a vectorized field is? This email and any attached files are confidential and ...
    Kainth, Sachin
    Mar 13, 2007 at 5:25 pm
    Mar 14, 2007 at 4:00 pm
  • Not sure if I'm going about this the right way, but I want to use Query instances as a key to a HashMap to cache BitSet instances from filtering operations. They are all for the same reader. That ...
    Antony Bowesman
    Mar 6, 2007 at 7:04 am
    Mar 7, 2007 at 2:34 am
  • Is there an easy way to clear locks? If I redeploy my war file and it happens that there is an indexing happening the lock is not cleared. I know I can tell JVM to run the finalizers before it exits ...
    MC Moisei
    Mar 4, 2007 at 9:29 pm
    Mar 6, 2007 at 10:23 am
  • Ah, I once worked in a place where we did exactly that - recognition and extraction of useful nuggets from emails - dates, emails, URLs, attachments, people, places...see divmod.com for the next ...
    Otis Gospodnetic
    Mar 1, 2007 at 10:25 am
    Mar 5, 2007 at 8:53 pm
  • Hello folks, Maybe one of you can help me with this (sorry, long read). I have implemented a FuzzyPhraseQuery that works similar to Lucene's native PhraseQuery. I.e. it can retrieve phrases for a ...
    Philipp Nanz
    Mar 6, 2007 at 12:08 am
    Oct 18, 2007 at 1:59 am
  • Hi all, We have a web-based application that searches a large lucene index. This application only creates objects of type IndexSearcher (and no IndexWriters) for searching the index. After the ...
    Nilesh Bansal
    Mar 31, 2007 at 11:09 pm
    Apr 2, 2007 at 8:42 am
  • Hi, I have a requirement to sort search results in a round robin. Ex: sorting results by field "customer", suppose the following customers are found (number of results in brackets) and results are sorted ...
    Ramana Jelda
    Mar 2, 2007 at 10:45 am
    Mar 27, 2007 at 2:01 pm
  • I'm using contrib/benchmark to do some tests for my ApacheCon talk and have some questions. 1. In looking at micro-standard.alg, it seems like not all braces are closed. Is a line ending a separator ...
    Grant Ingersoll
    Mar 18, 2007 at 5:16 pm
    Mar 23, 2007 at 5:12 am
  • Hello Dear Lucene Users! Back in the old days (well, last year) the lucene/java/trunk subversion path was always stable enough for everyone to use in production code. Now, with the 2.0/2.1/2.2 ...
    Jean-Philippe Robichaud
    Mar 15, 2007 at 2:48 am
    Mar 15, 2007 at 6:02 pm
  • (Lucene 1.9.1) I have a "filename" field in Lucene that holds a value, like this: pagefile.sys If I run searches through QueryParser, and I do a search for: pagefile.sys pagefile pagefile. This all ...
    McGuigan, Colin
    Mar 9, 2007 at 9:11 pm
    Mar 12, 2007 at 8:53 pm
  • Hi all, I am able to convert a PDF into a text file using pdfbox, and this is the code that I used: import org.pdfbox.pdfparser.PDFParser; import org.pdfbox.pdmodel.PDDocument; import ...
    Ashwin kumar
    Mar 12, 2007 at 6:03 am
    Mar 12, 2007 at 2:49 pm
  • All, I'm evaluating Lucene as a full-text search engine for a project. I got one of the requirements as follows: 4) Plural Literal Search If you use the plural of a term such as bears the results ...
    Tony Qian
    Mar 8, 2007 at 4:52 pm
    Mar 9, 2007 at 8:33 am
  • I got the following exception this morning when running one last test on a data set that has been indexed many times before over the past few months. java.io.FileNotFoundException: ...
    Antony Bowesman
    Mar 31, 2007 at 8:03 am
    Apr 2, 2007 at 12:22 am
  • So I assumed a linear decay of performance as an index got bigger. For some reason, going from an index size of 1.89 to 1.95 gigs dramatically increased CPU across all of our servers. I was ...
    Scott Oshima
    Mar 28, 2007 at 6:06 pm
    Mar 29, 2007 at 1:51 am
  • As part of XTF, an open source publishing engine that uses Lucene, I developed a new spelling correction engine specifically to provide "Did you mean..." links for misspelled queries. I and a small ...
    Martin Haye
    Mar 20, 2007 at 10:25 pm
    Mar 22, 2007 at 12:51 pm
  • Hi, I saw that there are many posts on the mailing list about indexing in multiple languages, so I will try not to post a duplicate question. In my case, I want to index rss feeds, so one feed contains ...
    Melanie Langlois
    Mar 22, 2007 at 6:07 am
    Mar 22, 2007 at 8:30 am
  • ...to everyone who helps make Lucene and Solr such fantastic tools. I'm the Platform Architect for a leading online event ticket after-marketplace (think eBay for tickets), and we've just completed a ...
    Cass Costello
    Mar 20, 2007 at 7:58 pm
    Mar 21, 2007 at 8:59 pm
  • Hi, I can get the docFreq. of a single term like (f1:test) by using indexReader.docFreq(new Term("f1","test")). But I can't get the docFreq. of a phrase term like (f2:"test under") by the same method. Is anything ...
    SK R
    Mar 20, 2007 at 10:33 am
    Mar 20, 2007 at 11:44 am
  • Hi there, I'm using lucene to index and store entries from a database table for ultimate retrieval as search results. This works fine. But I find myself in the position of wanting to occasionally ...
    Thomas K. Burkholder
    Mar 14, 2007 at 10:37 pm
    Mar 19, 2007 at 6:09 pm
  • Is there a SpellChecker.jar compatible with Lucene 2.1? After updating to Lucene 2.1, I seem to have lost the ability to create a spell index using spellchecker-2.0-rc1-dev.jar. Any help would be ...
    Ryan O'Hara
    Mar 14, 2007 at 8:47 pm
    Mar 15, 2007 at 8:57 pm
  • I'd like to add a field to every document in an index... that I'd rather not rebuild from scratch (yet). This is behind Solr (so a ParallelReader won't work without core modifications, right?). Is ...
    Erik Hatcher
    Mar 14, 2007 at 3:49 am
    Mar 14, 2007 at 3:23 pm
  • Hi all, I am going to index our database. One approach is to join the tables and then index the fields, but the data is very large, say more than 3 million rows, so the SQL Server fails to select them. I ...
    Mohammad Norouzi
    Mar 31, 2007 at 7:30 am
    Mar 31, 2007 at 6:22 pm
  • I am trying to link the nutch index and the index generated from my database using Lucene. So at the time of indexing my database, I want to pull the indexes in from nutch and link the content from ...
    Mar 25, 2007 at 2:02 am
    Mar 26, 2007 at 6:48 pm
  • Hi, I know how to index terms in Lucene; now I want to see how I can index phrases like "information retrieval" in Lucene and calculate the number of times that phrase has appeared in the document. Is ...
    Mar 22, 2007 at 6:25 pm
    Mar 23, 2007 at 9:37 am
Group Overview
group: java-user @

168 users for March 2007

  • Erick Erickson: 61 posts
  • Chris Hostetter: 47 posts
  • Karl wettin: 37 posts
  • Doron Cohen: 32 posts
  • Antony Bowesman: 30 posts
  • Mohammad Norouzi: 22 posts
  • Grant Ingersoll: 19 posts
  • Michael McCandless: 18 posts
  • Daniel Noll: 17 posts
  • Mark harwood: 17 posts
  • Ruchi thakur: 15 posts
  • Ashwin kumar: 13 posts
  • Otis Gospodnetic: 13 posts
  • Chris Lu: 11 posts
  • Erik Hatcher: 11 posts
  • Kainth, Sachin: 10 posts
  • Melanie Langlois: 10 posts
  • DECAFFMEYER MATHIEU: 9 posts
  • Grant Ingersoll: 9 posts
  • Mark Miller: 9 posts