83 discussions - 299 posts

  • Hi, Apologies if this is the wrong place to post this, or if it has been answered somewhere (I have searched and failed to find anything matching my case exactly). We're using Lucene 2.3.2 (an old ...
    Regan Heath
    May 25, 2010 at 9:42 am
    Jul 4, 2010 at 5:30 pm
  • Hi, I wanted to verify if my understanding is correct. Assuming that I use NRT, and refresh, say, every 1 second, caching based on IndexReader, such as what is used in the CachingWrapperFilter, is ...
    Shay Banon
    May 17, 2010 at 9:01 pm
    May 18, 2010 at 1:33 am
  • Hi All, I was wondering if there is a way to retrieve the number of unique terms in Lucene (version 2.4.0) ... I am aware of the terms() and terms(Term) methods that return an enumeration ...
    Kannan chandrasekaran
    May 27, 2010 at 6:32 pm
    May 28, 2010 at 12:20 pm
  • Hi all, We realize that there is a bug in Lucene's ranking function. Most ranking functions use a non-linear method to saturate the computation of the frequencies. This is due to the fact that the ...
    José Ramón Pérez Agüera
    May 5, 2010 at 4:58 pm
    May 6, 2010 at 9:35 pm
  • The FAQ clearly states that document IDs will not be re-assigned unless something was deleted. http://wiki.apache.org/lucene-java/LuceneFAQ#When_is_it_possible_for_document_IDs_to_change.3F However, ...
    May 14, 2010 at 1:39 am
    May 18, 2010 at 9:41 am
  • Hi all, In a clustered environment I search the index from the web application. In the web application I am creating an IndexReader on each request. Is it expensive to do it like this? I read somewhere in ...
    Vijay Veeraraghavan
    May 3, 2010 at 10:21 am
    May 6, 2010 at 2:53 am
  • Hi, I am testing some ranking methods with the /contrib/benchmark/quality package, and I was wondering if there is a simple way of building a precision-recall graph with the info gathered (maybe ...
    May 26, 2010 at 9:32 am
    Jun 5, 2010 at 9:17 pm
  • Lucene, JSON is the format used for all the configuration and property files in the RIA application we are developing. Is Lucene able to create a document from a given JSON file and index it? Is ...
    Visual Logic
    May 30, 2010 at 5:35 pm
    Jun 1, 2010 at 2:36 am
  • Hello all, I'm considering Lucene for a specific application and am trying to ensure that it is the right tool for what I'm trying to accomplish. At a high level I have a list of restaurants in a ...
    Frank A
    May 31, 2010 at 11:21 pm
    Jun 1, 2010 at 12:10 am
  • Hi, when I am running Lucene on a 512 MB system, I am getting the following error: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at ...
    Saurabh Agarwal
    May 27, 2010 at 3:53 pm
    May 28, 2010 at 2:40 pm
  • Hello, Lucene core doesn't seem to use relative word positioning (?) for scoring. For example, indexing the phrase "a b c d e f g h i j k l m n o p q r s t u v w x y z", these queries give the same ...
    May 3, 2010 at 7:20 pm
    Aug 20, 2010 at 8:46 pm
  • Hi all, I'm new to Lucene but have used it successfully for a few simple tasks. I am experimenting with the vector space representation of documents and have managed to store and retrieve ...
    Dionisis Koumouras
    May 31, 2010 at 10:26 am
    Jun 8, 2010 at 10:50 am
  • There seems to be considerable buzz on the internets about document oriented dbs such as MongoDB, CouchDB etc. I am at a loss as to what are the principal differences between Lucene and the "DODBs". ...
    Shashi Kant
    May 31, 2010 at 4:21 pm
    Jun 1, 2010 at 10:56 am
  • I read in "Lucene in Action" that to save space, we can omit term frequency and position information. But as far as I know, Lucene's default scoring model is VSM, which needs tf(term,doc) to calculate the score. ...
    Li Li
    May 31, 2010 at 8:48 am
    May 31, 2010 at 7:32 pm
  • Hi, Why doesn't DuplicateFilter work together with other filters? For example, with a small modification of the test DuplicateFilterTest, one gets the impression that the filter is not applied to other filters ...
    Паша Минченков
    May 31, 2010 at 6:27 am
    May 31, 2010 at 8:01 am
  • Hi, I have a Lucene index that contains source code tags (a tag can be any named source code element - function, class, variable). Each document contains a field with the tag name and some additional ...
    Shlomy Reinstein
    May 23, 2010 at 2:04 pm
    May 24, 2010 at 1:54 pm
  • Is there a good way to combine the wildcard queries and stemming? As is, the field which is stemmed at index time, won't work with some wildcard queries. We were thinking to create two separate index ...
    Ivan Provalov
    May 20, 2010 at 8:17 pm
    May 21, 2010 at 1:19 pm
  • Second one should be ordered in a "levels" structure. Here is the example: Unsorted: DocId SortFieldA SortFieldB 1 101 A 2 102 B 3 102 A | Sorted: First, all results are ordered by SortFieldA in ...
    Dragan Jotanovic
    May 18, 2010 at 4:02 pm
    May 20, 2010 at 9:31 am
  • Hi, I wrote some code to display the indexed terms and get their term frequencies for a single document. Although it displays those terms in the index, it does not give the term frequencies. ...
    Manjula wijewickrema
    May 17, 2010 at 11:23 am
    May 20, 2010 at 8:59 am
  • Hi, I am using Lucene 3.0.0 to index (with the demo application IndexFiles) a 6 GB corpus which is on NFS; moreover, I am storing my index on NFS too. But when I run the program I get the following ...
    Saurabh Agarwal
    May 18, 2010 at 5:57 am
    May 18, 2010 at 8:43 am
  • Hello there, In Oracle Text search there is a feature for reverse search using ctxrule. What it does is, you create an index (ctxrule) on a column having your search criteria(s) and then throw a ...
    Siraj Haider
    May 17, 2010 at 8:39 pm
    May 17, 2010 at 10:09 pm
  • Hi, Is it possible to put the indexed terms into an array in Lucene? For example, imagine I have indexed a single document in Lucene and now I want to access those terms in the index. Is it possible ...
    Manjula wijewickrema
    May 14, 2010 at 9:36 am
    May 15, 2010 at 5:39 am
  • Dear All, I am trying to get the term frequencies (through TermFreqVector) of a document (using Lucene 2.9.1). In order to do that I have used the following code. But there is a compile time error in ...
    Manjula wijewickrema
    May 13, 2010 at 10:07 am
    May 14, 2010 at 9:30 am
  • Hi, I am new to Lucene. The constructors in Field show that I can give the value as byte[]. I wanted to ask: if I store an integer as a byte array, then how will it be searched during search, say file ...
    Saurabh Agarwal
    May 13, 2010 at 7:45 pm
    May 14, 2010 at 7:07 am
  • Hi, If I index a document (single document) in Lucene, then how can I get the term frequencies (even the first and second highest occurring terms) of that document? Is there any class/method to do ...
    Manjula wijewickrema
    May 10, 2010 at 12:11 pm
    May 13, 2010 at 9:50 am
  • Hi Guys, Can anybody tell me how to avoid sharing of docStore files (term vectors & stored fields)? I mean to avoid creation of cfx files. This is important for us because we support some operations ...
    Ivan Vasilev
    May 12, 2010 at 9:33 am
    May 12, 2010 at 2:16 pm
  • Hi all, I have an indexing task which will index thousands of records with Lucene 3.0.1. My confusion is that Lucene will always create a .cfx and a .cfs file in the file system, sometimes more, while I ...
    May 5, 2010 at 6:24 am
    May 6, 2010 at 1:42 am
  • Hello, Are Strings that are obtained via FieldCache.DEFAULT.getStrings( reader, field ) interned? Since I have a requirement for having FieldCaches of some fields in a 250M-doc index, I'd like to estimate ...
    Koji Sekiguchi
    May 1, 2010 at 3:18 am
    May 2, 2010 at 2:04 am
  • Hi, Dear colleagues! I have one question concerning IndexReader.getSequentialSubReaders() and its usage. Imagine there is a class extending DirectoryReader or MultiReader. Usually directory- or ...
    Nikolay Zamosenchuk
    May 27, 2010 at 9:00 am
    Jun 8, 2010 at 4:26 pm
  • I want to analyze a text twice so that I can get some statistical information from this text: TokenStream tokenStream=null; Analyzer wa=new WhitespaceAnalyzer(); try { tokenStream = ...
    Li Li
    May 28, 2010 at 4:52 am
    May 29, 2010 at 7:41 am
  • I'd like to have all my queries and terms run through Unicode Normalization prior to being executed/indexed. I've been using the StandardAnalyzer with pretty good luck for the past few years, so I ...
    May 26, 2010 at 10:29 pm
    May 27, 2010 at 9:38 pm
  • Hi Everyone, Thanks in advance for any help. I've been building a Lucene index on an MS Windows Server 2003 test environment with no problem. When attempting to build the same index onto a Windows ...
    Spencer Tickner
    May 25, 2010 at 10:43 pm
    May 27, 2010 at 4:09 pm
  • Hi: Can you give me some details about loading lazily? What happens when I load fields lazily? Thanks in advance Luo Lei
    May 24, 2010 at 1:40 am
    May 27, 2010 at 8:53 am
  • Hi all. We are seeing an exception like this: java.lang.NullPointerException at org.apache.lucene.search.CachingWrapperFilter.docIdSetToCache(CachingWrapperFilter.java:84) at ...
    Daniel Noll
    May 26, 2010 at 6:57 am
    May 26, 2010 at 11:03 pm
  • Hi, Right now I'm using Lucene with a basic Whitespace Analyzer but I'm having problems with stemming. Does anyone have a recommendation for other text analyzers that handle stemming and also keep ...
    Larry Hendrix
    May 18, 2010 at 6:05 pm
    May 20, 2010 at 2:03 am
  • Hi All, I've got a problem I've been trying to solve the whole day: Let's say I have an index with two fields, the first one is always filled and the second one only sometimes. Now I want to search ...
    comparis.ch - Roman Baeriswyl
    May 18, 2010 at 4:19 pm
    May 19, 2010 at 9:20 pm
  • Hi, if I want to store the Content field through the constructor Field(string,Reader). Is there any possible way of doing it?? Regards Saurabh Agarwal
    Saurabh Agarwal
    May 17, 2010 at 3:04 pm
    May 17, 2010 at 4:36 pm
  • I have a problem. I found that the stored field in a document is not consistent. Here is a small case from my program. Field A = new Field(Store.Yes,FieldAValue); FieldBValue.add(FieldAValue); // ...
    May 11, 2010 at 2:21 pm
    May 11, 2010 at 11:11 pm
  • Hi, I am using Lucene 2.9.1. I have downloaded and run the 'HelloLucene.java' class by modifying the input document and user query in various ways. Once I put the document sentences as 'Lucene in ...
    Manjula wijewickrema
    May 7, 2010 at 8:53 am
    May 10, 2010 at 11:33 am
  • Hi, I am new to Lucene. If I want to know the term or phrase frequency of an input document, will it be possible through Lucene? Thanks, Manjula
    Manjula wijewickrema
    May 6, 2010 at 10:40 am
    May 7, 2010 at 2:53 pm
  • Dear all, I am using lucene 3.0 to index the pdf reports that I generate dynamically. I index the pdf file name (without extension), file path and its absolute path as fields. I search with the file ...
    Vijay Veeraraghavan
    May 3, 2010 at 7:22 am
    May 3, 2010 at 9:34 am
  • Hi all, I read Lucene in Action, 2nd Ed. It says SimpleSpanFragmenter will "make fragments that always include the spans matching each document", and that a SpanScorer exists for this use. But I ...
    Li Li
    May 19, 2010 at 4:59 am
    Mar 9, 2011 at 6:41 am
  • Hi all, I hope someone can enlighten me. I am trying to figure out how spatial searches are to be implemented with Lucene. From walking through mailing lists and various web pages, looking at the ...
    Klaus Malorny
    May 11, 2010 at 1:18 pm
    Jun 2, 2010 at 9:30 am
  • Hi, I wrote a program to get the frequencies and terms of an indexed document. The output comes as follows: If I print : +tfv[0] Output: array terms are:{title: capabl/1, code/2, frequenc/1, lucen/4, ...
    Manjula wijewickrema
    May 20, 2010 at 9:15 am
    May 26, 2010 at 3:37 am
  • Hi, I would like to have more than 1 filter on a query - I have two range filters, and some filters on other fields. What is the best way to do it? Many thanks, liat
    Liat oren
    May 25, 2010 at 12:16 pm
    May 25, 2010 at 12:47 pm
  • Hi guys! Does there exist a way to define some threshold on the terms I want to store in the index (before they are indexed)? I need to store the terms with the highest frequencies. I did it with term ...
    May 24, 2010 at 11:26 am
    May 25, 2010 at 8:49 am
  • Hi, I am planning to use Apache Lucene in one of my projects. I want to index files based on the file properties (I won’t be indexing the data) and I want Lucene to query the index so that I can ...
    Vijay reddy
    May 18, 2010 at 6:57 am
    May 18, 2010 at 8:30 am
  • Hi, I am struggling with using the HighFreqTerms class to find high-frequency terms in my index. My target is to get the high-frequency terms in an indexed document (single document). To do ...
    Manjula wijewickrema
    May 15, 2010 at 5:50 am
    May 17, 2010 at 11:13 am
  • How easy is it to influence the score of search results in lucene 2.9? The situation is that we have a large number of dated documents that match the term "john" but we want to return the latest ...
    Gregory Tarr
    May 12, 2010 at 5:04 pm
    May 13, 2010 at 10:24 am
  • Hi, I have been using the FieldCache in lucene version 2.9 compared to that in 2.4. The load time is massively decreased, however I am not seeing any benefit in getting a field cache after re-open of ...
    Carl Austin
    May 11, 2010 at 1:28 pm
    May 11, 2010 at 1:47 pm
Group Overview
group: java-user

91 users for May 2010

  • Erick Erickson: 20 posts
  • Manjula wijewickrema: 20 posts
  • Ian Lea: 18 posts
  • Yonik Seeley: 14 posts
  • Uwe Schindler: 13 posts
  • Saurabh Agarwal: 11 posts
  • Grant Ingersoll: 10 posts
  • Michael McCandless: 10 posts
  • Shay Banon: 6 posts
  • Vijay Veeraraghavan: 6 posts
  • José Ramón Pérez Agüera: 5 posts
  • Koji Sekiguchi: 5 posts
  • Li Li: 5 posts
  • Robert Muir: 5 posts
  • Паша Минченков: 5 posts
  • Ahmet Arslan: 4 posts
  • Andrzej Bialecki: 4 posts
  • Daniel Noll: 4 posts
  • Kannan chandrasekaran: 4 posts
  • Nigel: 4 posts