77 discussions - 292 posts

  • Hi, I am trying to compile some arguments in favour of Lucene as management is deciding whether to standardize on Lucene or a competing commercial product (we have a couple of products, one using ...
    Jun 23, 2010 at 8:01 am
    Jul 21, 2010 at 9:25 am
  • Hi, I know the indexed content contains the following text: "This is a test". And the search phrase I used is "This is a formal test", and then I set the slop of the PhraseQuery as 2 with setSlop(2), ...
    A peng
    Jun 27, 2010 at 2:12 pm
    Jun 30, 2010 at 1:09 am
  • Hi, Formerly the HitCollector stored only docs with a score bigger than 0.0f. This check is not implemented in any Collector implementation, especially in the two implementations used by Solr: ...
    Jan Kurella
    Jun 2, 2010 at 9:12 am
    Jun 2, 2010 at 6:37 pm
  • Hello to all ! I have _0.cfs file of a lucene index directory but segments.gen and segments_2 are missing. Can I generate the segments.gen and segments_2 files without having to regenerate the _0.cfs ...
    Maryam ma'danipour
    Jun 9, 2010 at 1:34 pm
    Jun 21, 2010 at 2:38 pm
  • Hello, We are using lucene 2.9.0. and ran into OutOfMemory error when sorting on a highly unique field on a big index. After doing some research we learned that lucene will load the sort field value ...
    Jun 9, 2010 at 8:23 pm
    Jun 15, 2010 at 9:20 am
  • This is more of a unix related question than Lucene specific however because Lucene is being used, I'm asking here as perhaps other people have run into a similar issue. On an Amazon EC2 merge, read, ...
    Jason Rutherglen
    Jun 3, 2010 at 6:13 pm
    Jun 4, 2010 at 3:39 pm
  • Hey, I need to add a new field (a stored , not indexed field) for all documents present in an existing large index. Reindexing the whole index will be very costly. Is there a way to do this or any ...
    Naveen Kumar
    Jun 29, 2010 at 11:41 am
    Jul 7, 2010 at 1:09 pm
  • Hi all, We've been waiting for LUCENE-1879 and LUCENE-2425 and have written our own ParallelWriter class in the meantime. Apparently our indexes are falling out of sync (I suspect my colleague is ...
    Jun 23, 2010 at 10:44 pm
    Jun 24, 2010 at 9:24 pm
  • Hi, Is there a way for an application to index a document along with its "term weighted vector" (Lucene's TermFreqVector). I.e., override the term frequencies computed by Lucene, with an ...
    Naama Kraus
    Jun 23, 2010 at 10:39 am
    Jun 24, 2010 at 8:33 am
  • Hi, I have an IndexWriter singleton in my program, and an IndexSearcher singleton based on a readonly IndexReader singleton. When I use the IndexWriter to index a large document to lucene, and then, ...
    Jun 21, 2010 at 3:02 pm
    Jun 21, 2010 at 7:16 pm
  • Hello everybody, I am new to Apache Lucene and it seems to fit perfectly my needs for my application. However I'm a little concerned about something (pardon me if it's a recurrent question, I've ...
    Victor Kabdebon
    Jun 20, 2010 at 6:01 pm
    Jun 21, 2010 at 2:44 pm
  • Hi! I ran into a strange behaviour of the StandardTokenizer. Terms containing a '-' are tokenized differently depending on the context. For example, the term 'nl-lt' is split into 'nl' and 'lt'. The ...
    Anna Hunecke
    Jun 17, 2010 at 1:32 pm
    Jun 21, 2010 at 9:03 am
  • Hello, how can I find and save the position of search hits from Lucene? Like this: doc1: 1, doc2: 2, ... doc100: 100. I use Lucene 3.0. Thank you, Titus
    Jun 10, 2010 at 3:54 pm
    Jun 10, 2010 at 7:58 pm
  • Hi! We are working on an experimental code-search engine that helps users to find example code snippets based on what a developer already typed inside her editor. Our "homemade search engine" ...
    Marcel Bruch
    Jun 11, 2010 at 1:35 pm
    Jul 6, 2010 at 7:44 am
  • This is dumb, I know, but I don't find a MultipleTermPositions.java file under org.apache.lucene.index in the svn trunk. Will someone on the list have mercy on me and tell me what obvious thing I'm ...
    Peter Wilkins
    Jun 27, 2010 at 7:15 pm
    Jul 4, 2010 at 5:06 pm
  • Dear all, I have to solve the following problem but without success yet. We need to search for a content in a field 'name' that contains the wildcard symbol appearing somewhere in a string. E.g. ...
    Jun 25, 2010 at 10:44 am
    Jun 29, 2010 at 8:42 am
  • Hi, I am new to lucene and I am using Lucene 3.0.2. I am using Lucene to parse text which may contain URLs. I noticed the StandardTokenizer keeps the email addresses in one token, but not the URLs. I ...
    Sudha Verma
    Jun 23, 2010 at 6:07 pm
    Jun 26, 2010 at 3:35 am
  • Hi all, I'm new to Lucene, as well as Cassandra. I'm working on the Lucandra project to modify it to add some extra functionality. It hasn't been fully tested with range queries, so I've created ...
    Todd Nine
    Jun 23, 2010 at 5:54 am
    Jun 24, 2010 at 8:12 am
  • Currently I am on Lucene 2.2, migrating to 2.9 and eventually planning to move to 3.1. In Lucene 2.2, I have a custom hit collector that does both filtering and sorting of my search results. Let me put ...
    Sirish Vadala
    Jun 11, 2010 at 6:32 pm
    Jun 15, 2010 at 4:36 pm
  • Hello, I am sorry if this is posted somewhere else, but I think I sent it to the wrong list and I am trying again. Is there anywhere I can find specifications for StandardAnalyzer? I am looking for ...
    Jun 3, 2010 at 3:28 am
    Jun 3, 2010 at 4:35 pm
  • Hi, I have recently been in charge of converting code that was using pre-3.0 API to be compatible with 3.0 API. There was a piece of code which was storing a date field: String date = ...
    Mindaugas Žakšauskas
    Jun 1, 2010 at 11:54 am
    Jun 1, 2010 at 1:56 pm
  • While calling addindexes or addindexes with no optimize, can any guarantee be given about the document order in the new documents, given that the order of directories/indexreader is fixed? So is it that ...
    Apoorv Sharma
    Jun 30, 2010 at 3:13 am
    Jul 1, 2010 at 5:56 am
  • I was wondering if any of you know of any open-source solutions for general issues which arise in web crawling - how do you remove headers/footers/javascript and generally cleanup html of a web-page ...
    Boris Aleksandrovsky
    Jun 28, 2010 at 8:08 pm
    Jun 28, 2010 at 8:58 pm
  • Hello there! I've been using Lucene as a Full Text Search solution for some time. And although I'm familiar with Analyzers and Stemmers I never used them directly. I'm testing a few experiments on ...
    Vinicius Carvalho
    Jun 23, 2010 at 2:50 am
    Jun 23, 2010 at 5:57 pm
  • I am trying to run a search using search(query, filter, n, sort) method which return TopFieldDocs. The sort is defined like: sort = new Sort(new SortField("DATEISSUED", SortField.LONG, true)); and I ...
    Siraj Haider
    Jun 17, 2010 at 5:07 pm
    Jun 18, 2010 at 2:26 pm
  • Hi, I've heard about flexible indexing in the talk by Simon Willnauer and Uwe Schindler at BerlinBuzzwords.de last week. Now I'd like to get into this flexible indexing thing, but don't find any ...
    Thomas Koch
    Jun 14, 2010 at 2:35 pm
    Jun 15, 2010 at 10:15 am
  • Hi, I am using lucene 3.0.1. I use a MultiFieldQueryParser with a GermanAnalyzer. In my index are some values among others one document with the title "bauer". I append to every word in my query a ...
    Markus Mehrwald
    Jun 12, 2010 at 12:54 am
    Jun 12, 2010 at 3:47 pm
  • Hello, I'm using Lucene 2.9 and when reading java doc for the Sort class I noticed it says "The field must be indexed, but should not be tokenized". But I tried to sort on a tokenized field, it works ...
    Jun 9, 2010 at 3:36 pm
    Jun 10, 2010 at 1:08 am
  • Hi All, I am storing synonyms in an index, e.g. 'institute' as a synonym for 'organization'. Since I want to highlight the original term when showing the result I am storing a Payload with each ...
    Aad Nales
    Jun 8, 2010 at 7:20 am
    Jun 9, 2010 at 4:38 am
  • Hi, I need to index HTML documents and one of the requirements is to highlight documents while maintaining all of the original formatting. The documents are relatively simple HTML, meaning no ...
    Hans Merkl
    Jun 7, 2010 at 6:48 pm
    Jun 8, 2010 at 11:57 am
  • Hi, I have a question on how IndexWriter manages its memory when it comes to RawPostingList. It's pretty late, so sorry if the question is obvious, but the question is when does the RawPostingList ...
    Shay Banon
    Jun 5, 2010 at 2:11 am
    Jun 5, 2010 at 11:51 am
  • Hello, Does anyone have a recommendation for implementing the function previously done by the deprecated StandardTokenizer.next() method? and/or, can anyone point me to where I might find the reason ...
    Jun 3, 2010 at 7:52 pm
    Jun 4, 2010 at 12:58 am
  • such as the detailed process of storing data structures, indexing, searching and sorting, not just APIs. Thanks.
    Li Li
    Jun 3, 2010 at 12:54 am
    Jun 3, 2010 at 5:53 pm
  • Hi all, I made a mistake: I finished indexing my whole database (millions of documents), treating date fields as ordinary fields. Instead of doing: doc.add(Field.Keyword("indexDate", new Date())); I added ...
    Liat oren
    Jun 29, 2010 at 7:32 am
    Jun 29, 2010 at 9:53 am
  • Hi All, I wonder if it is possible to create Lucene indexes with a multiple level structure. For instance, a field named "institutions" with all institutions I've worked at and sub-fields to detail my ...
    Alexandre Leopoldo Gonçalves
    Jun 25, 2010 at 5:40 pm
    Jun 26, 2010 at 5:55 am
  • Hi, I have thousands of article titles in a Lucene index. For a query "Oil spill" I want to return all the article titles that start with "Oil spill". I do not want those titles which have this phrase but ...
    Rakesh rakesh
    Jun 17, 2010 at 7:24 pm
    Jun 19, 2010 at 11:55 pm
  • Hi, Using the following programme I was able to get the entire file path of indexed files which matched with the given queries. But my intention is to get only the file names even without .txt ...
    Manjula wijewickrema
    Jun 11, 2010 at 10:20 am
    Jun 15, 2010 at 7:35 am
  • A news item about the Google search index. The index system of Lucene can also support realtime search. Is there some difference between them? With Caffeine, we analyze the web in small portions and update our ...
    Jun 9, 2010 at 2:19 pm
    Jun 13, 2010 at 10:52 pm
  • Hi All, I wanted to ask regarding search results scores equality: In case two documents get an equal score - how does Lucene "break" equality ? I.e. by which criteria one document would be ranked ...
    Naama Kraus
    Jun 13, 2010 at 1:39 pm
    Jun 13, 2010 at 3:59 pm
  • Hi, Has anyone done a performance comparison for an index on a Solid State Drive (vs any other hard drive ... SATA/SCSI)? Thanks, Rob.
    Rob Bygrave
    Jun 13, 2010 at 12:02 am
    Jun 13, 2010 at 12:55 pm
  • Hi All, Years ago we implemented a Lucene solution which we are updating today, and I am a bit lost on the following. In Lucene 1.x and 2.x it was possible to add a token in a Filter simply by ...
    Aad Nales
    Jun 7, 2010 at 2:43 pm
    Jun 7, 2010 at 2:56 pm
  • Could anyone show me some detailed information about it? Thanks
    Li Li
    Jun 1, 2010 at 8:42 am
    Jun 3, 2010 at 10:52 am
  • I want to only support boolean OR queries (as many search engines do). But I want to boost documents whose terms are closer. E.g. the query terms are 'apache lucene' doc1 apache has many projects such as ...
    Li Li
    Jun 1, 2010 at 2:08 pm
    Jun 2, 2010 at 7:11 am
  • Hello All: Can anyone suggest the best way to get the number of occurrences of each word per document in Lucene? E.g.: Let the indexed text be: If you are posting a question, please try search first. ...
    Sirish Vadala
    Jun 1, 2010 at 8:54 pm
    Jun 2, 2010 at 6:28 am
  • Hi, I am kind of struggling to set up Solr to search PDF files. I am following documents from lucidimagination and the wiki. Can someone please point to a good Solr tutorial which involves step by step ...
    Jun 1, 2010 at 1:17 am
    Jun 2, 2010 at 1:10 am
  • Hi All, I am finally having some time to upgrade our lucene from the 2.4 series to the 2.9 series. And I am having a problem that while everything compiles great I am getting a new ...
    Jerven Bolleman
    Jun 29, 2010 at 9:25 am
    Jul 4, 2010 at 6:10 pm
  • Hi all, I'm looking for a functionality similar to IndexWriter.updateDocument() ...
    Pablo Mendes
    Jun 29, 2010 at 4:00 pm
    Jul 4, 2010 at 5:58 pm
  • Hi, I use an AnalyzingQueryParser with the StandardAnalyzer and German stopwords in Lucene 3.0.1. If I have a query with a stopword followed by a wildcard (e.g. das*) I get a ParseException: Cannot ...
    Markus Mehrwald
    Jun 30, 2010 at 9:17 pm
    Jul 1, 2010 at 6:00 pm
  • Can someone point me to a code example that demonstrates processing terms from a query result? I want to extract payloads from the terms that are selected by the query. I'm having difficulty getting ...
    Peter Wilkins
    Jun 29, 2010 at 5:10 pm
    Jun 30, 2010 at 7:52 pm
  • Can someone point me to a code example that demonstrates processing tokens from a query result? I want to iterate over TermPositions but can't find my way to an object that instantiates that ...
    Peter Wilkins
    Jun 29, 2010 at 8:41 pm
    Jun 30, 2010 at 3:32 am
Group Overview
Group: java-user

83 users for June 2010

  • Michael McCandless: 18 posts
  • Erick Erickson: 15 posts
  • Otis Gospodnetic: 13 posts
  • Uwe Schindler: 13 posts
  • Rebecca Watson: 11 posts
  • Li Li: 10 posts
  • Simon Willnauer: 9 posts
  • Ian Lea: 8 posts
  • Naama Kraus: 7 posts
  • Ahmet Arslan: 6 posts
  • Allasso: 6 posts
  • Itamar Syn-Hershko: 6 posts
  • Jm: 6 posts
  • Aad Nales: 5 posts
  • A peng: 5 posts
  • Fujian: 5 posts
  • Justin: 5 posts
  • Lance Norskog: 5 posts
  • Steven A Rowe: 5 posts
  • Tarun sapra: 5 posts