65 discussions - 223 posts

  • I'm building a distributed index (mostly as a research project for school) and I'm evaluating indexing the entire collection in memory (as Google, Facebook and others did years ago). The ...
    Emmanuel Espina
    Jun 28, 2013 at 9:29 pm
    Jul 6, 2013 at 8:00 pm
  • Hi guys, I am trying to figure out whether there is a query that would do matching based on token payloads. JUST on token payloads. I think there is no such query. Thus, I thought about ...
    Michal samek
    Jun 20, 2013 at 3:18 pm
    Jun 23, 2013 at 6:49 pm
  • Hi, My name is Rafaela and I am just starting to work with Lucene for a project that involves quite a few security aspects. I am working on an app that aims to manage data by using Lucene on a mobile ...
    Rafaela Voiculescu
    Jun 23, 2013 at 1:48 pm
    Jul 8, 2013 at 12:05 pm
  • Hello, I'm using Lucene 3.6.2, and my file, which I indexed, is something like this: FIELD-1 FIELD-2 FIELD-3 FIELD-4 DOC1: A 1111 2222 ABC DOC2: B 1111 KKKK ABC DOC3: C 1111 JJJJ ABC DOC4: D 1111 ...
    Neeraj shah
    Jun 22, 2013 at 6:08 am
    Jun 24, 2013 at 9:35 am
  • Hi, I'm using Lucene and Solr right now in a production environment with an index of about a million docs. I'm working on a recommender that basically would list the n most similar items to the user ...
    Luis Carlos Guerrero Covo
    Jun 28, 2013 at 5:18 pm
    Jun 30, 2013 at 12:51 am
  • Hi! I'm managing the development of LIRE (https://code.google.com/p/lire/), an image search toolbox based on Lucene. While optimizing different search routines for global image features I came around ...
    Mathias Lux
    Jun 23, 2013 at 5:27 pm
    Jun 25, 2013 at 8:51 am
  • Quick question on segments: For my use case of having all docs sorted by a static rank and being able to cut off retrieval after a certain number of docs, I have to sort all my docs using the static ...
    Sriram Sankar
    Jun 14, 2013 at 9:25 pm
    Jun 21, 2013 at 1:32 am
  • Can someone point me to the code that traverses the posting lists? I am trying to understand how it works. Thanks, Sriram
    Sriram Sankar
    Jun 13, 2013 at 2:57 am
    Jun 13, 2013 at 8:01 pm
  • I am working on an application that is using Tika to index text based documents and store the text results in Lucene. These documents can range anywhere from 1 page to thousands of pages. We are ...
    Todd Hunt
    Jun 27, 2013 at 5:15 pm
    Jul 6, 2013 at 12:55 pm
  • Hello, I'm trying to understand BlockGroupingCollector. I thought I would start by running the tests in the debugger. However the only test I can find is ...
    Tom Burton-West
    Jun 18, 2013 at 4:48 pm
    Jun 21, 2013 at 10:20 am
  • Hi, I have a requirement to perform a full-text search in a new application and I came across Lucene and I want to check if it helps our cause. Requirement: I have a SQL Server database table with ...
    Raghavendra K Rao
    Jun 17, 2013 at 9:03 pm
    Jun 19, 2013 at 1:21 pm
  • Hello, this is my first post in this group. I have been using Lucene recently. I have a question: where can I find a good explanation of indexes? Or rather, how indexing (not really the mathematical ...
    Nikhil desai
    Jun 10, 2013 at 5:25 pm
    Jun 11, 2013 at 12:45 am
  • I'm trying to build trunk and when I run "ant compile" the build hangs right after "Building replicator" at the line "common.resolve:". (see below for more context) I'm not familiar with Ivy so I'm ...
    Tom Burton-West
    Jun 20, 2013 at 4:00 pm
    Jun 22, 2013 at 4:48 pm
  • I have a Solr index built with Solr 1.4 a few years ago and later upgraded to Solr 3.6; the index now consists of 150 million documents. Now I want to read all values of a DateField from ...
    Mingfeng Yang
    Jun 14, 2013 at 10:54 pm
    Jun 15, 2013 at 12:22 am
  • Hi everyone, I am trying to add a CharFilter to my Analyzer. I started with a StandardAnalyzer wrapped with an ASCIIFoldingFilter. Then I realized that it does not handle searches for names that ...
    Steven Schlansker
    Jun 11, 2013 at 11:53 pm
    Jun 14, 2013 at 5:01 pm
  • Hi guys, I was trying to improve the filtering mechanism for my use case. When I use existing filters like FieldCacheTermsFilter and TermsFilter, I see that the first filtering takes up enough time may ...
    Arun Kumar K
    Jun 6, 2013 at 9:23 am
    Jun 6, 2013 at 10:46 am
  • I have an application that is indexing the text from various reports and forms that are generated from our core system. The reports will contain dollar amounts and various indexes that contain all ...
    Todd Hunt
    Jun 28, 2013 at 6:18 pm
    Jul 1, 2013 at 8:50 pm
  • Hello, is there any way to get all the search results? In Lucene we get the top documents by giving a limit like top 100, 1000, etc., but what if I want to get all results? How can I achieve that? Query qu ...
    Neeraj shah
    Jun 19, 2013 at 7:12 am
    Jun 19, 2013 at 4:27 pm
  • Dear, I built my own scoring class by extending the DefaultSimilarity. Three major methods from DefaultSimilarity were overridden, including: 1. public float lengthNorm(FieldInvertState state) 2 ...
    Oliver Xu
    Jun 12, 2013 at 1:03 pm
    Jun 13, 2013 at 12:40 pm
  • Hi, I had an index of precious files built with Lucene 2.4. I wanted to switch to Lucene 4.3 because the Lucene 2.4 index throws an illegal-arguments exception in the moreLikeThis function. I upgraded that index to ...
    Uzair Kamal
    Jun 11, 2013 at 6:07 am
    Jun 11, 2013 at 12:34 pm
  • Hi All, I recently looked at the settings for the TieredMergePolicy [1] and was puzzled by the note on the setSegmentsPerTier method indicating it should be equal to or larger than the MaxMergeAtOnce ...
    Boaz Leskes
    Jun 9, 2013 at 8:39 am
    Jun 9, 2013 at 12:33 pm
  • I have indexed documents in Lucene based on three fields: *title*, *address*, *city*. Now I want to build my query, say *C A B*, so that I can retrieve the documents as follows: *C* must be present ...
    Abhishek Mallik
    Jun 6, 2013 at 6:06 am
    Jun 7, 2013 at 3:29 pm
  • I have a DrillDownQuery for taxonomy search (category-path-based search). The following works beautifully: However, there is no constructor or method argument to allow sorting; so I did: ...which ...
    Arjun Dhar
    Jun 3, 2013 at 12:41 pm
    Jun 3, 2013 at 2:22 pm
  • This post was updated on Jun 03, 2013; 3:48am. If one refers to the JavaDoc for Sort, it states that INDEX and the field should NOT be TOKENIZED. It's a common use case for numbers to be sorted. Am ...
    Arjun Dhar
    Jun 3, 2013 at 8:02 am
    Jun 3, 2013 at 12:36 pm
  • After we use an IndexReader, do we always need to call decRef explicitly? What will happen if I don't call decRef? Thanks ...
    Yonghui Zhao
    Jun 1, 2013 at 12:13 am
    Jun 2, 2013 at 7:01 am
  • Hi, I have indexed a database table which has about 70 columns out of which 60 columns have been indexed and the rest have been stored. There are 70 million records in this table. This is a static ...
    Raghavendra K Rao
    Jun 26, 2013 at 8:37 pm
    Jun 27, 2013 at 5:58 pm
  • Hello, is there some kind of filter or component that I could use to filter out non-English text? I have a preprocessing step in which I only want to index English documents. Best, Gucko
    Hang Mang
    Jun 27, 2013 at 3:46 pm
    Jun 27, 2013 at 5:15 pm
  • I have a use case where I build my index only occasionally and am willing to pay the cost to build a read-only index that occupies as small a memory footprint as possible and also remains efficient ...
    Sriram Sankar
    Jun 25, 2013 at 4:49 pm
    Jun 25, 2013 at 5:22 pm
  • Hi guys, I am using Lucene 4.2. 1 For my use case I am doing a search, say name:xyz*, and then I need to do a grouping with (from query same as name:xyz* + Filter + GroupSort) may be in ...
    Arun Kumar K
    Jun 24, 2013 at 10:55 am
    Jun 24, 2013 at 11:28 am
  • Hi, I am migrating from Lucene 3.6.1 to 4.3.0. However, I am not sure how to migrate my custom collector below to 4.3.0 (this page http://lucene.apache.org/core/4_3_0/MIGRATE.html gives some hints but ...
    Peyman Faratin
    Jun 18, 2013 at 4:58 am
    Jun 18, 2013 at 1:05 pm
  • Hello All, I have a new requirement within my text search implementation to perform stemming. I have done some research and implemented Snowball, but the customers found it too aggressive and ...
    Sirish Vadala
    Jun 14, 2013 at 5:31 pm
    Jun 14, 2013 at 6:26 pm
  • Hello, I've just started using Lucene and I'm not sure which Query Classes I should use in my project. My goal is to compare paragraphs of text. Paragraph A is a query and paragraph B is a document ...
    Malgorzata Urbanska
    Jun 14, 2013 at 4:24 pm
    Jun 14, 2013 at 4:45 pm
  • Hello all, I'm trying the following code (trying to play with Tokenizers in order to create my own Analyzer) but I'm getting an exception: public class TokenizerTest { public static void ...
    Gucko Gucko
    Jun 12, 2013 at 5:48 pm
    Jun 12, 2013 at 6:09 pm
  • Hi guys, I am trying to get hands-on with Lucene 4.2 DocValues (RAM-based, which is the default). I have a 1 GB index with 540000 documents. When retrieving the DocVals for matched docs I am able to ...
    Arun Kumar K
    Jun 4, 2013 at 7:17 am
    Jun 4, 2013 at 10:06 am
  • What is a Lucene query that will find two words at the same term position? Is there a class that will do this? Is the feature available from the Lucene query syntax or any other syntax parsers? For ...
    Lance Norskog
    Jun 3, 2013 at 3:47 am
    Jun 3, 2013 at 6:06 pm
  • I am trying to implement Lucene at high volume. We implemented Lucene 4.1. Search with facets is very slow, consumes a large amount of RAM, and causes swapping. We have a 7 GB index; it includes 70 ...
    Oded Sofer
    Jun 2, 2013 at 11:00 pm
    Jun 3, 2013 at 10:32 am
  • I can't find the TermAttribute class, which has the term method, so I can't get it to work. How do I set an encoder on the class (TermAttribute)?
    Jun 25, 2013 at 12:57 pm
    Jun 25, 2013 at 2:32 pm
  • In Unicorn (Facebook's search backend), we used mmap'd indices. We could load them on a separate process - which meant that we could make scoring changes and test rapidly since we did not have to ...
    Sriram Sankar
    Jun 21, 2013 at 4:19 pm
    Jun 21, 2013 at 4:28 pm
  • I'm relatively new to Lucene and am in the process of upgrading from 4.0 to 4.3.1. I'm trying to figure out if I need to leave my version at LUCENE_40 or if it is safe to change it to LUCENE_43. Does ...
    Becker, Thomas
    Jun 20, 2013 at 11:56 am
    Jun 20, 2013 at 1:08 pm
  • Hi, I would like an expert opinion about how to optimally do concurrent searches on the same index (let's suppose there are several threads doing searches). Consider these options: a) one ...
    Roberto Ragusa
    Jun 19, 2013 at 10:58 am
    Jun 19, 2013 at 11:03 am
  • Hi, I have multiple indexes that I want to search against, so I am using MultiReader for that. Along with this, I also want all the matches for the query, so I am using the Collector class for this. The ...
    Amit nanda
    Jun 19, 2013 at 9:02 am
    Jun 19, 2013 at 9:30 am
  • I noticed that if I do the merging in the following way, IndexWriter.maybeMerge() is never triggered automatically by the merge scheduler. IndexWriter writer = ...; IndexReader[] readers = ... ...
    Jun 13, 2013 at 11:06 pm
    Jun 15, 2013 at 9:10 am
  • Hi, I have just started out with Lucene and am experimenting with some possibilities. My goal is to try to exploit an existing database index (which in my case is an inverted index) to serve as a Lucene ...
    Pradeep B
    Jun 15, 2013 at 4:56 am
    Jun 15, 2013 at 8:58 am
  • Hello all, is there a filter I can use to remove emails from a TokenStream? So far I'm using this to remove numbers and URLs, and I would like to remove emails too: Tokenizer tokenizer = new ...
    Gucko Gucko
    Jun 12, 2013 at 6:39 pm
    Jun 12, 2013 at 7:04 pm
  • Dear all, We recently migrated from Lucene 2.3.1 to Lucene 4.1. We have a custom facet implementation, which has also been migrated. We decided to stay with the same facet approach instead of moving ...
    Ramprakash Ramamoorthy
    Jun 10, 2013 at 12:36 pm
    Jun 10, 2013 at 1:53 pm
  • I'm responsible for the OpenNLP wiki page: https://wiki.apache.org/solr/OpenNLP Please add me to the list of editors.
    Lance Norskog
    Jun 10, 2013 at 2:56 am
    Jun 10, 2013 at 8:11 am
  • For those of you curious about Lucene's finite state transducers (FSTs)... I just built a simple web app that lets you enter input/output pairs and see the resulting FST: It's running here ...
    Michael McCandless
    Jun 9, 2013 at 3:09 pm
    Jun 9, 2013 at 7:39 pm
  • Hi, I just noticed that the HunspellStemmer outputs more than one token: the original word plus the stems, as far as I understood. This is not quite what I would expect and becomes tricky especially ...
    Luca Cavanna
    Jun 7, 2013 at 1:17 pm
    Jun 8, 2013 at 12:01 am
  • Hi folks, I'm looking for some advice on the following scenario: We have a large static index. Our application currently copies the index wholesale and writes new docs to it, but the existing docs ...
    Joel Barry
    Jun 5, 2013 at 6:13 pm
    Jun 6, 2013 at 12:04 am
  • Hello! I've implemented a SpanQuery class that acts like SpanPositionCheckQuery but also matches payloads. For example, here is the "gram" field in a single indexed document: "gram": N|1|1 sg|1|0 ...
    Igor Shalyminov
    Jun 3, 2013 at 5:15 pm
    Jun 5, 2013 at 1:47 pm
Group Overview
group: java-user @

82 users for June 2013

  • Adrien Grand: 17 posts
  • Michael McCandless: 15 posts
  • Uwe Schindler: 12 posts
  • Jack Krupansky: 10 posts
  • Sriram Sankar: 10 posts
  • Neeraj shah: 7 posts
  • Arjun Dhar: 6 posts
  • Arun Kumar K: 6 posts
  • Hang Mang: 6 posts
  • Raghavendra K Rao: 5 posts
  • Lance Norskog: 5 posts
  • Michal samek: 5 posts
  • Ian Lea: 4 posts
  • Mathias Lux: 4 posts
  • Michael Sokolov: 4 posts
  • Mingfeng Yang: 4 posts
  • Robert Muir: 4 posts
  • Roberto Ragusa: 4 posts
  • Shai Erera: 4 posts
  • Steven Schlansker: 4 posts