66 discussions - 322 posts

  • Hi everybody, I really need some good advice! I need to index something like 1.4 billion documents in Lucene. I have experience with Lucene but I've never worked with such a large number of documents. ...
    Luca Rondanini
    Nov 21, 2010 at 11:33 pm
    Nov 29, 2010 at 11:06 am
  • Hello, One of our Lucene indexes has started misbehaving on indexWriter.close and I'm searching for ideas about what may have happened and how to fix it. Here's our scenario: - We have seven Lucene ...
    Mark Kristensson
    Nov 1, 2010 at 10:55 pm
    Feb 22, 2011 at 5:15 pm
  • I have an active lucene implementation that has been in place for a couple years and was recently upgraded to the 3.02 branch. We are now occasionally seeing documents returned from searches that ...
    David Fertig
    Nov 4, 2010 at 4:52 pm
    Nov 17, 2010 at 9:50 pm
  • I need to find the most frequent terms that appear together with a query. HighFreqTerms.java can be used only to obtain the high frequency terms in the whole index. I just need to find the high ...
    Nov 4, 2010 at 8:47 am
    Nov 6, 2010 at 12:50 pm
  • Hello all, I was wondering: if I want to measure precision and recall in Lucene, what's the best way to do it? Is there any sample source code that I can use? Thanks -- ...
    Nov 29, 2010 at 1:01 pm
    Dec 1, 2010 at 12:55 pm
  • I used KeywordAnalyzer and KeywordTokenizer as templates for a new analyzer. The analyzer works fine but the result never reaches the index. My analyzer is called in ...
    Bernd Fehling
    Nov 25, 2010 at 1:08 pm
    Nov 29, 2010 at 7:41 am
  • Hello, 1) On Windows, I often shut down my application server (which has active IndexWriters open) using the ctrl+c keys. 2) I inspect my directories on the file system I see that the write.lock file ...
    Pulkit Singhal
    Nov 10, 2010 at 2:38 pm
    Nov 11, 2010 at 6:43 pm
  • Hi All I was wondering whether I can use TermRangeQuery for my use case. I have a collection of ids (represented as XDF-123) and I would like to do a search for all the ids (might be in the range of ...
    Amin Mohammed-Coleman
    Nov 26, 2010 at 2:19 pm
    Nov 29, 2010 at 3:51 am
  • Hi, In my work, I am using Lucene and two java classes. In the first one, I index a document and in the second one, I try to search the most relevant document for the indexed document in the first ...
    Manjula wijewickrema
    Nov 29, 2010 at 9:32 am
    Dec 2, 2010 at 1:09 pm
  • Hi, I found a blog post from 2008 where it says, there will be additional custom attributes for tokens in the future, that will be searchable. What is the status of these? Jan
    Jan Kurella
    Nov 23, 2010 at 1:44 pm
    Nov 25, 2010 at 3:56 pm
  • Hi: I have two documents: title body Lucene In Action A high-performance, full-featured text search engine library. Lucene Practice Use lucene in your application Now, I search "lucene performance" ...
    Maven apache
    Nov 30, 2010 at 11:43 am
    Nov 30, 2010 at 1:14 pm
  • Hello, I use the searcherManager for LiveIndexing. With watch -n 60 "lsof | grep indexname | grep deleted | wc -l" I see the number of deleted file handles. The number of handles fluctuates during ...
    Thomas Rewig
    Nov 12, 2010 at 10:49 am
    Nov 19, 2010 at 10:46 am
  • Hi, I have an assignment in my Text Analytics class. I am supposed to create an index and search it. The corpus is a PubMed-like XML file. It is possible to query terms (programcall a few terms) and ...
    Nov 17, 2010 at 4:48 pm
    Nov 18, 2010 at 6:52 pm
  • Hello all, I would like to ask about the Lucene index. I created a simple program that creates Lucene indexes and stores them in a folder. I also used a diagnostic tool named Luke to be able to ...
    Nov 16, 2010 at 6:06 am
    Nov 17, 2010 at 8:39 pm
  • Hi. We have a large index (~ 28 GB) which is distributed in three different directories, each representing a country. Each of these country wise indexes is further distributed on the basis of last ...
    Samarendra Pratap
    Nov 3, 2010 at 10:06 am
    Nov 10, 2010 at 6:27 am
  • Hi, Now lucene uses integer as document id, so it means we cannot have more than 2^31-1 documents within one collection? Even if we use MultiSearcher the document id is still integer so it seems this ...
    Zhang, Lisheng
    Nov 1, 2010 at 9:16 pm
    Nov 3, 2010 at 3:56 pm
  • Hello, we are quite new to Lucene. At first we want to create a simple user search for our web application. My first thought was to map the 'display name' (= firstname + lastname) to a single field ...
    Dirk Reske
    Nov 2, 2010 at 2:12 pm
    Nov 2, 2010 at 3:10 pm
  • Hello, I'm trying to retrieve payloads from the terms highlighted by the Highlighter class. In my tests, all terms returned from Highlighter have null as payload. Example: Highlighter h = new ...
    Fabiano Nunes
    Nov 30, 2010 at 3:20 pm
    Dec 1, 2010 at 12:10 pm
  • What is the difference between the "AND" and "+" operator? Also, what is the difference between a query and a filter? For example String[] fields={"name","address","classId"}; If I want to search the ...
    yang Yang
    Nov 29, 2010 at 8:02 am
    Nov 30, 2010 at 9:40 am
  • I am doing some test about merge indexing and have a performance doubt I am doing merge in a simple way, something like: FSDirectory indexes[] = new FSDirectory[indexList.size()]; for (int i = 0; i < ...
    Marc Sturlese
    Nov 12, 2010 at 3:34 pm
    Nov 13, 2010 at 5:09 am
  • Hello list, I'm new to lucene, trying to find out if this is possible : In Luke, I can write a query that gets me the results I want, that is : +denominator:([10000 TO 10000] OR [20000 TO 20000]) I'd ...
    Alain Camus
    Nov 5, 2010 at 3:33 pm
    Nov 8, 2010 at 10:06 am
  • Hi guys, I have this problem: I'm using Lucene to create a search engine on people profiles. I have a set of hobbies (let's say {"reading" , "singing"} for example) and I want to find people who have ...
    Claudia Grieco
    Nov 25, 2010 at 11:48 am
    Nov 25, 2010 at 4:48 pm
  • I am using this code, with SnowBall and TopDocScore the code: http://pastebin.com/3X3gbpXE Example of Question: - What is the role of PrnP in mad cow disease? I am running in 11.638 documents and the ...
    Celso Fontes
    Nov 15, 2010 at 7:21 pm
    Nov 16, 2010 at 5:56 am
  • Hi All, I'm new to Lucene and have picked up the Lucene in Action book to get started. Really enjoying it but I have a small nagging question. Is the index stored in the same "physical document" as ...
    Farouk alhassan
    Nov 7, 2010 at 7:24 am
    Nov 7, 2010 at 10:16 am
  • Hi! I am a newbie in Lucene, and I have some problems creating simple code to query a text file collection. My code is this (http://pastebin.com/HqrbBPtp), but it does not work. What is wrong? ...
    Celso Fontes
    Nov 5, 2010 at 2:34 am
    Nov 5, 2010 at 4:08 pm
  • Hello List, Lucene 3.0.1 Windows Vista Premium Home Edition I am currently attempting to configure my IndexFiles.java file. My intention is to add the following functionality to the code as I require ...
    McGibbney, Lewis John
    Nov 25, 2010 at 12:37 pm
    Nov 29, 2010 at 10:04 am
  • Greetings! When using KeywordAnalyzer for indexing a field which has the Field.Index.ANALYZED option selected. Does the use of KeywordAnalyzer automatically mean that there is no point in trying to ...
    Pulkit Singhal
    Nov 18, 2010 at 12:09 am
    Nov 18, 2010 at 12:20 pm
  • We are using the hibernate search which is based on lucene as the search engine to build a full text search for our position-related data in the MYSQL db. This is the main structure of the table(it ...
    yang Yang
    Nov 18, 2010 at 2:08 am
    Nov 18, 2010 at 11:03 am
  • Hello everybody, I would like to implement the paper "Compact Full-Text Indexing of Versioned Document Collections" [1] from Torsten Suel for my diploma thesis in Lucene. The basic idea is to create ...
    Alex vB
    Nov 9, 2010 at 10:30 pm
    Nov 16, 2010 at 8:39 pm
  • Hello all, We have an extremely large number of terms in our indexes. I want to be able to extract a sample of the terms, say something like every 128th term. If I use code based on ...
    Burton-West, Tom
    Nov 10, 2010 at 9:03 pm
    Nov 11, 2010 at 5:12 am
  • It occurs in David's index and in my much simplified test/demo index. There is nothing special in mine so I'd guess the problem isn't really index or data related, but certainly can't vouch for that. ...
    Ian Lea
    Nov 8, 2010 at 12:23 pm
    Nov 8, 2010 at 6:18 pm
  • Hi, I use Lucene to index my documents and search. Actually I have 800k documents indexed in Lucene. Those documents have some fields: Id: is a Numeric field to index the documents Name: is a textual ...
    Iam Jabour
    Nov 1, 2010 at 7:25 pm
    Nov 1, 2010 at 9:56 pm
  • hello everyone I have this test code: IndexReader ir = getReader(); TermQuery q = new TermQuery(new Term("sub_id",NumericUtils.intToPrefixCoded(57))); Filter f = new QueryWrapperFilter(q); try { ...
    Nov 29, 2010 at 3:51 am
    Nov 29, 2010 at 10:10 am
  • Hi there, I was composing a Query like the Solr.DisMaxQueryHandler would do on my own as I needed a different Tokenizing strategy for non whitespace separated languages and more. The concept I took ...
    Jan Kurella
    Nov 26, 2010 at 1:40 pm
    Nov 26, 2010 at 2:04 pm
  • Hi, What I need is a Not TermQuery. I did not see one in the API, so I did the following: Query query = new BooleanQuery(new BooleanClause(new TermQuery(..), BooleanClause.Occur.MUST_NOT))); This did ...
    Nabib El-Rahman
    Nov 24, 2010 at 1:04 am
    Nov 24, 2010 at 7:56 am
  • Hi, if there is a Solr newsgroup better suited for my question, please point me there. Using the SearchHandler with the deftype="dismax" option enables the DisMaxQParserPlugin. From investigating it ...
    Jan Kurella
    Nov 22, 2010 at 9:57 am
    Nov 23, 2010 at 5:20 pm
  • Hello, I was wondering if there is any API call in Lucene that allows something like the following: Step 1: Take the user input "hello world" you are beautiful Step 2: QueryParser does its thing ...
    Pulkit Singhal
    Nov 18, 2010 at 3:36 pm
    Nov 19, 2010 at 7:09 pm
  • Hi guys, I just found out about Lucene; after reading the main pages on the wiki it seems to be a great tool, but I still haven't figured out how I can use it for my needs. What I want to do is a small tool ...
    Ciprian URSU
    Nov 13, 2010 at 2:23 pm
    Nov 14, 2010 at 1:16 am
  • Extract the high frequent terms in the search result set. I need to know how to extract the most frequent terms in the search result set after submitting the query. Here the class where you can use ...
    Nov 8, 2010 at 12:14 pm
    Nov 9, 2010 at 11:50 am
  • hello Lucene list, I have a question about a custom Analyzer we're trying to write. The intention is that it tokenizes on whitespace, and abstracts over upper/lowercase and accented characters. It is ...
    Nov 4, 2010 at 9:07 am
    Nov 5, 2010 at 10:11 am
  • Hi, I have a weird result: If I access the same document through the IndexReader or IndexSearcher, they are not equal and have different hash values: Document doc1 = indexSearcher.doc(i); Document ...
    Carmit Sahar
    Nov 4, 2010 at 8:47 am
    Nov 4, 2010 at 9:30 am
  • Hello, I would like to search several fields while applying different Filter's to the results of different fields. Is it possible to (efficiently) filter out results according to which fields they ...
    Francisco Borges
    Nov 1, 2010 at 5:32 pm
    Nov 2, 2010 at 1:45 pm
  • While implementing a solution for keeping warmed indexReaders around for our various indexes (so users don't have to wait while we open an indexReader for our one slow index), I've run into some ...
    Mark Kristensson
    Nov 30, 2010 at 1:58 am
    Dec 1, 2010 at 10:02 am
  • Hi, Could someone tell me the effect (if any) of having term vectors set to WITH_POSITIONS_OFFSETS vs YES in terms of search performance? I did some testing and the results were inconclusive. In one ...
    Maricris Villareal
    Nov 30, 2010 at 7:28 pm
    Nov 30, 2010 at 7:41 pm
  • Hello list, I am currently attempting to extract keywords from pdf documents, my aim is then to begin constructing a domain ontology using the words which are extracted. I do not need to index ...
    McGibbney, Lewis John
    Nov 30, 2010 at 5:09 pm
    Nov 30, 2010 at 5:45 pm
  • Hello, I'm trying to store some token attributes found in a XML document. More specifically, token coordinates for future highlighting. Example: I have a XML with this structure: <word ...
    Fabiano Nunes
    Nov 29, 2010 at 7:51 pm
    Nov 29, 2010 at 9:12 pm
  • Hi All, Is there a lucene plugin that allows indexing and searching of geospatial Data?
    Farouk alhassan
    Nov 23, 2010 at 6:42 pm
    Nov 23, 2010 at 7:00 pm
  • Hi all, I see in the javadoc for the ICUTokenizer that it has special handling for Lao, Myanmar, Khmer word breaking but no details in the javadoc about what it does with CJK, which for C and J ...
    Burton-West, Tom
    Nov 22, 2010 at 11:50 pm
    Nov 23, 2010 at 11:07 am
  • Hello, I'm just stuck with one problem and don't know how to figure it out. I'm working on the indexation of the objects that are in computer memory (they exist only in my java code). Don't have any ...
    Nov 22, 2010 at 4:20 pm
    Nov 23, 2010 at 10:06 am
  • Hello, I heard Yonik talk about a better dismax query parser for Solr so I was wondering if Lucene already has this functionality contributed to its contrib modules? - Pulkit ...
    Pulkit Singhal
    Nov 21, 2010 at 1:48 am
    Nov 21, 2010 at 5:49 am
Group Overview
group: java-user @

80 users for November 2010

  • Uwe Schindler: 23 posts
  • Erick Erickson: 22 posts
  • Ian Lea: 19 posts
  • Michael McCandless: 19 posts
  • Pulkit Singhal: 14 posts
  • Mark Kristensson: 13 posts
  • Robert Muir: 10 posts
  • David Fertig: 9 posts
  • Jan Kurella: 8 posts
  • Simon Willnauer: 8 posts
  • Starz10de: 8 posts
  • Anshum Gupta: 7 posts
  • Maven apache: 7 posts
  • Shai Erera: 7 posts
  • Bernd Fehling: 6 posts
  • Celso Fontes: 6 posts
  • Yonik Seeley: 6 posts
  • Amin Mohammed-Coleman: 5 posts
  • Lance Norskog: 5 posts
  • Luca Rondanini: 5 posts