111 discussions - 422 posts

  • Hi, sorry, I already asked this a few days ago, but I got no reply and I really need some help on this. I'm running several queries against a doc collection. The queries are documents of the collection ...
    Patrick Diviacco
    Mar 28, 2011 at 7:44 am
    Mar 29, 2011 at 8:58 am
  • I am trying to index content within certain HTML tags. How do I index it? Which is the best parser/tokenizer available to do this?
    Mar 11, 2011 at 11:03 am
    Mar 15, 2011 at 4:56 am
  • I'm using the following code because I want to see the entire collection in my query results: //adding wildcards-term to see all results rest = new TermQuery(new Term("*","*")); ...
    Patrick Diviacco
    Mar 22, 2011 at 8:23 am
    Mar 23, 2011 at 8:18 am
  • Hi, I would like to announce Crawl Anywhere. Crawl-Anywhere is a Java Web Crawler. It includes : * a crawler * a document processing pipeline * a solr indexer The crawler has a web administration in ...
    Dominique Bejean
    Mar 2, 2011 at 12:26 am
    Jul 16, 2013 at 4:04 am
  • Catchall field has its own disadvantages of increasing index size. MultiFieldQueryParser has to know the field names before hand. How do I do a multi field search - without knowing fields in the ...
    Mar 15, 2011 at 10:46 am
    Mar 17, 2011 at 2:22 pm
  • Is there a way one could detect duplicates (say by using some unique hash of certain fields) and marking a document as a duplicate but not remove it. Here is an example: Doc 1) This is my test Doc 2) ...
    Mar 5, 2011 at 6:50 am
    Mar 10, 2011 at 4:12 pm
  • Hello all, Is there any issue with ParallelMultiSearcher in Lucene 3.0.2? When we search more frequently, an OutOfMemoryError is triggered or it throws "Not able to create native thread". It is ...
    Mar 17, 2011 at 11:48 am
    Mar 23, 2011 at 3:08 am
  • When I run my Lucene app and parse an XML file, I get the following error due to some characters such as "é" written in the text file. If I save the text file as UTF-8 with my text editor I don't have ...
    Patrick Diviacco
    Mar 28, 2011 at 7:04 am
    Apr 1, 2011 at 4:09 pm
  • Hi all. I'm trying to parallelise writing documents into an index. Let's set aside the fact that 3.1 is much better at this than 3.0.x... but I'm using 3.0.3. One of the things I need to know is the ...
    Mar 29, 2011 at 12:32 am
    Mar 31, 2011 at 7:10 pm
  • I'm new to Lucene. If I use description = new TermQuery(new Term("description", "my string")); I ask Lucene to consider "my string" as unique word, right ? I actually need to consider each word, ...
    Patrick Diviacco
    Mar 21, 2011 at 4:01 pm
    Mar 23, 2011 at 7:34 am
  • I am asking about partial updates in Lucene, where I want to update only a selected field out of all the fields in the document. Does Lucene provide any way to do this? How should I approach this?
    Mar 22, 2011 at 7:00 am
    Mar 22, 2011 at 10:39 am
  • Hi, I am facing a problem: the line in the loop runs very slowly, giving me a performance hit. for (int i = 0; i < hits.length; ++i) { int docId = hits[i].doc; Document d = searcher.doc(docId); ...
    Mar 10, 2011 at 9:36 am
    Mar 11, 2011 at 10:46 am
  • Hi, I have code that works fine with Lucene 3.2, where I used TermDocs to find the corpusTF. Here is the code: public void calculateCorpusTF(IndexReader reader) throws IOException { // TODO ...
    Mar 22, 2011 at 7:44 pm
    Mar 24, 2011 at 5:09 pm
  • I've downloaded Lucene nightly build because I need to customize the similarity *per field*. However I don't see the field parameter passed to the methods to compute the score such as "tf" and ...
    Patrick Diviacco
    Mar 3, 2011 at 3:26 pm
    Mar 5, 2011 at 9:47 am
  • Is there a Filter to get a limited number of random collection docs from the index which DO NOT contain a specific term ? i.e. term="pizza" I want to run the query against 10 random documents of the ...
    Patrick Diviacco
    Mar 29, 2011 at 6:40 pm
    Mar 30, 2011 at 9:45 am
  • I've downloaded the nightly build of Lucene (TRUNK) and I'm referring to the following documentation: https://hudson.apache.org/hudson/view/G-L/view/Lucene/job/Lucene-trunk/javadoc/all/index.html But ...
    Patrick Diviacco
    Mar 29, 2011 at 11:21 am
    Mar 29, 2011 at 7:19 pm
  • Hello everybody, I have an enquiry about StandardAnalyzer. Can I use it for other languages except from English? I give the right list of stop words at initialization. Is there anything else inside ...
    Vasiliki Gkouta
    Mar 13, 2011 at 11:24 pm
    Mar 14, 2011 at 10:33 pm
  • I've downloaded the Lucene nightly build and I've seen that WhitespaceAnalyzer.java is no longer there. Has this analyzer been removed from the library? What should I use instead? Thanks
    Patrick Diviacco
    Mar 2, 2011 at 10:33 pm
    Mar 4, 2011 at 3:14 pm
  • Hi, What are my options for distributing an application that uses Lucene? Our current application works against a database of INVENTORY. We schedule hourly checks for modified items ...
    Sol myr
    Mar 22, 2011 at 8:31 am
    Mar 31, 2011 at 9:20 am
  • Hi, Can someone help me with this problem please? I got these when running my program: java.io.FileNotFoundException: /Users/vonhutuan/Documents/workspace/InformationExtractor/index_wordlist/_i82.frq ...
    Vo Nhu Tuan
    Mar 23, 2011 at 9:49 am
    Mar 23, 2011 at 11:04 am
  • Hi, I would like to create an index with Lucene to a document collections of text files. The index should be created in such a way, that for the search I can enforce that query term A and query term ...
    Michael Wiegand
    Mar 4, 2011 at 7:06 am
    Mar 11, 2011 at 10:04 am
  • Dear Lucene/Solr user, It is possible you may not know of an Apache project called ManifoldCF, whose purpose is to provide content to Solr for index. If you have interest in this project, this is to ...
    Karl Wright
    Mar 2, 2011 at 7:21 am
    Mar 10, 2011 at 4:35 pm
  • Hi, I am performing multiple queries (stored in a 100MB XML file) against a collection (indexed with Lucene, and previously stored in a 100MB XML file). The process seems pretty long on my machine ...
    Patrick Diviacco
    Mar 29, 2011 at 9:22 am
    Mar 29, 2011 at 10:00 am
  • Hi, I would like to build a search system where a search for "Dan" would also search for "Daniel" and a search for "Will", "William" . Any ideas on how to go about implementing that? I can think of ...
    Deepak Konidena
    Mar 24, 2011 at 6:32 pm
    Mar 25, 2011 at 2:16 pm
  • Hello, I would index the same document with 2 different Analyzer. So I have to create 2 different index. How can I do that ? thank you for your help, Amel.
    Amel Fraisse
    Mar 25, 2011 at 12:00 pm
    Mar 25, 2011 at 1:48 pm
  • Hi, I need to search a Catalog. Most users search *this* year's catalog, but on rare occasions they may ask for old products (from previous years). I'm trying to select between 2 options: 1) Keep ...
    Sol myr
    Mar 24, 2011 at 2:01 pm
    Mar 24, 2011 at 2:45 pm
  • Is there a way to display Lucene scores per field instead of the global one ? Both my query and my docs have 3 fields. I would like to see the scores for each field in the results. Can I ? Or should ...
    Patrick Diviacco
    Mar 22, 2011 at 8:35 am
    Mar 23, 2011 at 5:29 am
  • I am trying to index in Lucene a field that could have label of concepts in different languages. Most of the approaches I have seen so far are: - Use a single index, where each document has a field ...
    Stephane Fellah
    Mar 11, 2011 at 3:30 am
    Mar 14, 2011 at 1:50 pm
  • What's the best way to replace WhitespaceAnalyzer in this line in Lucene nightly build 4.0 ? Is there a generic analyzer I can use ? writer = new IndexWriter(FSDirectory.open(INDEX_DIR), new ...
    Patrick Diviacco
    Mar 4, 2011 at 2:21 pm
    Mar 13, 2011 at 1:45 pm
  • hi it seems my mail is judged as spam. Technical details of permanent failure: Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the other ...
    Li Li
    Mar 11, 2011 at 9:35 am
    Mar 11, 2011 at 1:44 pm
  • Hi, I am developing a pdf search engine, just use in local computer to search massive pdf documents. I used pdfbox+lucene to index and search, and then I have to display the context to the user in ...
    Mar 6, 2011 at 1:31 pm
    Mar 7, 2011 at 9:28 pm
  • Hello! I am curious to know if the Lucene Project or other associated entity accepts paid technical support/subscriptions for high-priority technical or bug resolution type of support. Thanks! -David.
    Jarrin, David
    Mar 3, 2011 at 8:14 pm
    Mar 3, 2011 at 10:34 pm
  • Hi all, Is there a way to find the length of a field of a lucene index document? Thanks, Lahiru
    Lahiru Samarakoon
    Mar 1, 2011 at 5:35 am
    Mar 1, 2011 at 9:06 am
  • Hi, I'm trying to sort results by a NumericField but the results do not sort (still appear in default score order). The NumericField was indexed using the code below: NumericField field = new ...
    Azhar Jassal
    Mar 25, 2011 at 2:23 pm
    Mar 26, 2011 at 11:19 am
  • Hi Folks, Before I run off and reinvent the wheel here - has anyone done any form of result grouping with lucene? My use case looks something like this: Newspaper pages are stored as documents in the ...
    Dawn Zoë Raison
    Mar 22, 2011 at 10:44 am
    Mar 25, 2011 at 10:31 am
  • I've some issues to open my index with Luke. I get the following error message: Unknown format version: -12 I build the index using the following code: http://codepad.org/OxGRGTRb The index type is ...
    Patrick Diviacco
    Mar 23, 2011 at 7:57 am
    Mar 23, 2011 at 8:07 am
  • Hi, My highlight code is shown as following: QueryScorer scorer = new QueryScorer(query); Highlighter highlighter = new Highlighter(simpleHTMLFormatter, scorer); highlighter.setTextFragmenter(new ...
    Mar 15, 2011 at 8:48 pm
    Mar 17, 2011 at 8:26 am
  • Hi, I have two web applications that use Lucene 2.3.2. Both share the same index and can write or read. Writing is synchronized based on file system to allow only one IndexWriter to work at the ...
    Mar 9, 2011 at 7:45 pm
    Mar 10, 2011 at 8:37 pm
  • We are developing a large 4-tier multi-server app that will accept Questions and related Comments supplied by users. There will be 100K's of users that live in Shards. Also, ideally there would be no ...
    BrightMinds Dev
    Mar 4, 2011 at 6:00 pm
    Mar 9, 2011 at 7:17 pm
  • Hello list, Does this look correct? I am told it is not functioning, in that new entries to the index are not being picked-up? Thanks Lee try { if (! reader.isCurrent()){ IndexReader newReader = ...
    Lee Goddard
    Mar 4, 2011 at 1:20 pm
    Mar 4, 2011 at 5:32 pm
  • Hello all, Could anyone guide me on how to back up or do replication with Lucene? Regards, Ganesh
    Mar 1, 2011 at 6:36 am
    Mar 3, 2011 at 5:38 pm
  • Hi, OK so I will not bother using TieredMergePolicy for now. I will do some more tests with the contrib balanced merge policy, playing with the optimize(maxNumSegments) to try decreasing the optimize ...
    V Sevel
    Mar 1, 2011 at 8:18 am
    Mar 2, 2011 at 11:01 am
  • I need to define different similarity scores per document field. For example for field A I want to use Lucene tf.idf score, for the numerical field B I want to use a different metric (difference ...
    Patrick Diviacco
    Mar 1, 2011 at 7:42 pm
    Mar 1, 2011 at 10:47 pm
  • Is there a minimum string length requirement for proximity search? For example, would "a~" or "an~" trigger proximity search? The result would be horrible if there is no such requirement. Thanks, ...
    Andy Yang
    Mar 31, 2011 at 1:54 am
    Mar 31, 2011 at 2:01 am
  • Hello All Recently, I am trying to develop an automatic definition extraction system for Amharic Language - using machine learning technique (Version Space learning). Can anyone suggest me some java ...
    Henok sahilu
    Mar 29, 2011 at 7:12 am
    Mar 29, 2011 at 1:36 pm
  • Hi, I am using MultiFieldQueryParser with a custom analyzer for parsing search text. Now, when I say MultiFieldQueryParser qp = new MultiFieldQueryParser(Version, new String[] {"field1", "field2", ...
    Deepak Konidena
    Mar 24, 2011 at 5:49 pm
    Mar 25, 2011 at 2:19 pm
  • Is there some sort of default limit imposed on the Lucene indexes? I try to index 50k or 60k documents but when I use Luke to go inside the index and check the total # of entries indexed, it shows ...
    Pulkit Singhal
    Mar 24, 2011 at 10:07 pm
    Mar 25, 2011 at 9:56 am
  • Hi Luceners, this is my 1st experience with ARQ, LARQ & Lucene; everyth. went smooth so far, however the slope seems to be getting steeper suddenly. The initial problem was to develop a Java app to ...
    Fr Jurain
    Mar 22, 2011 at 3:16 pm
    Mar 24, 2011 at 9:47 am
  • I'm new to Lucene and I would like to know what's the difference (if there is any) between PhraseQuery.add(Term1) PhraseQuery.add(Term2) PhraseQuery.add(Term3) and term1 = new TermQuery(new ...
    Patrick Diviacco
    Mar 21, 2011 at 5:43 pm
    Mar 22, 2011 at 1:38 pm
  • I'm having a problem with the performance of lazily-loaded fields with lucene. The basic structure of the code is that I get a set of documents back from a query, then iterate through them, reading ...
    Brian Hurt
    Mar 21, 2011 at 6:16 pm
    Mar 22, 2011 at 1:23 pm
Group Overview
group: java-user

114 users for March 2011

Patrick Diviacco: 60 posts
Ian Lea: 40 posts
Erick Erickson: 22 posts
Uwe Schindler: 21 posts
Shrinath.m: 18 posts
Anshum: 12 posts
Li Li: 11 posts
Michael McCandless: 11 posts
Simon Willnauer: 8 posts
Ganesh: 7 posts
Suman.holani: 7 posts
Ahmet Arslan: 6 posts
Grant Ingersoll: 6 posts
Lahiru Samarakoon: 6 posts
Michael Wiegand: 6 posts
Paul Libbrecht: 6 posts
Karl Wright: 5 posts
Deepak Konidena: 5 posts
Gong Li: 5 posts
Robert Muir: 5 posts