127 discussions - 529 posts

  • Hi, I tried stressing Lucene in a controlled environment: one static IndexSearcher for an index that doesn't change, and in same process I create a number of Threads that call this Searcher ...
    Oren Shir
    Nov 21, 2005 at 3:23 pm
    Apr 5, 2006 at 9:52 pm
  • Hi. What is the expected memory usage of Lucene these days? I dug up an old email [1] from 2001 which gave the following summary of memory usage: An IndexReader requires: one byte per field per ...
    Daniel Noll
    Nov 10, 2005 at 12:43 am
    Nov 18, 2005 at 2:38 am
  • Has anyone implemented LSI in Lucene? Kindly let me know how hard/easy it is. thanks chandana
    Nov 30, 2005 at 11:42 pm
    Dec 20, 2005 at 9:20 am
  • Hello, My analyzer sometimes gives multiple terms for the same word. This makes them generated at the same position. When I use PhraseQuery to search for this term, it matches documents with all ...
    Ahmed El-dawy
    Nov 4, 2005 at 8:46 am
    Nov 7, 2005 at 9:08 pm
  • Hi, I'm using lucene (which rocks, btw ;) behind the scenes at www.last.fm for various things, and i've run into a situation that seems somewhat inelegant regarding populating fields which i already ...
    Richard Jones
    Nov 2, 2005 at 1:11 pm
    Nov 2, 2005 at 5:21 pm
  • I'm working on building a custom highlighter for a client, which may eventually be generalizable. In my work, I've come across some issues I'd like to discuss. One issue is of appended fields ...
    Erik Hatcher
    Nov 20, 2005 at 1:48 pm
    Nov 22, 2005 at 8:55 am
  • When I boost fields while indexing, the fields still have a boost of 1.0 during searching. When I view the values via Luke, it confirms the value of 1.0. Do I have to boost it again during search? I ...
    Daniel Clark
    Nov 17, 2005 at 10:39 am
    Nov 19, 2005 at 4:22 pm
  • Hi all, Can somebody please suggest a way/ways to optimize the execution time of the query below (or to use some of the Trunk BooleanScorers)... I am probably missing something obvious. Use Case: Here I have ...
    Eks dev
    Nov 9, 2005 at 5:34 pm
    Dec 23, 2005 at 12:04 pm
  • Hi. I was wondering if anyone else has seen this before. I'm using lucene 1.4.3 and have indexed about 3000 text documents using the statement: doc.add(Field.Text("contents", new FileReader(f), ...
    Nov 21, 2005 at 7:53 am
    Nov 21, 2005 at 8:44 pm
  • Dear all, I'd like to extract each term and its frequency in the index and each file in order to get the potential keywords of each file. Does Lucene provide any built-in method to do that? Thank you ...
    Supheakmungkol SARIN
    Nov 14, 2005 at 3:22 am
    Nov 15, 2005 at 4:55 pm
  • Hi, I have 1,00,000 documents in an index, but in the near future it will be 3 million and more. I am somewhat concerned about the searching process with this many documents. I am giving order ...
    Manoj Kr. Sheoran
    Nov 4, 2005 at 6:07 am
    Nov 5, 2005 at 4:10 am
  • I am a brand new newbie with respect to Lucene, and I am just figuring out how to include it into an application I am building. (personal blog) In essence I have a set of articles that reside in a ...
    Alan Chandler
    Nov 23, 2005 at 7:30 pm
    Jan 28, 2006 at 8:43 pm
  • Hi, I have a requirement to highlight search keywords in the results and display the matching fragment of the text with the results. I am using the Hits highlighting mentioned in Lucene in Action. ...
    Harini Raghavan
    Nov 30, 2005 at 12:47 pm
    Dec 5, 2005 at 12:35 pm
  • I'm trying to reverse sort a result set by its date field (YYYYMMDDhhmm). pseudocode: Boolean order = {false | true}; Sort sorter = new Sort(new SortField("date", order)); hits = ...
    Michael Pow
    Nov 29, 2005 at 1:59 am
    Dec 1, 2005 at 5:57 am
  • Hi, I am not that experienced with Java and am attempting to implement the commit method for the IndexReader for the application I'm developing. I am trying to extend the IndexReader class but it ...
    Malcolm Clark
    Nov 25, 2005 at 12:32 pm
    Nov 28, 2005 at 4:25 pm
  • Is using a QueryParser to parse a query using the same, single instance of Analyzer thread-safe? Or should I create a new Analyzer each time? ...
    Sharma, Siddharth
    Nov 1, 2005 at 3:46 pm
    Nov 1, 2005 at 8:54 pm
  • Hello, I use Lucene in a long-running server application on a Linux server, and the other day I got the "Too many open files" exception. I've increased the number of allowed file handles, but was ...
    Matt Magoffin
    Nov 18, 2005 at 1:08 am
    Nov 18, 2005 at 6:22 pm
  • I am indexing persons that have the usual fields: name, address, etc. I need to keep track of which names and addresses are active now and which ones are old. I do that by having two sets of fields ...
    Lasse L
    Nov 11, 2005 at 6:27 pm
    Nov 14, 2005 at 8:23 am
  • Hello all, I am wondering how many of you actually work with your own scoring mechanism (overriding Lucene's standard scoring) and how many of you work on how to normalise this score. I would like to ...
    Karl Koch
    Nov 5, 2005 at 8:26 pm
    Nov 7, 2005 at 6:43 pm
  • Hello group, the scoring formula for Lucene is well explained in "Lucene in Action". However, is this formula also valid for Lucene 1.2 (which I am using)? I need to know that for documentation ...
    Karl Koch
    Nov 4, 2005 at 7:52 pm
    Nov 6, 2005 at 1:35 am
  • Hi I am trying to download the source code for tm-extractors-0.4.jar from http://www.textmining.org/ Looks like the site has been hacked. Does anyone know the location of the CVS or SVN repository? ...
    Patrick Kimber
    Nov 24, 2005 at 4:47 pm
    Jan 5, 2006 at 10:55 am
  • Hi, I use Lucene to index stuff that is changed very often but doesn't need to be real-time to searchers, e.g. the search result can change a couple of times per minute, but I only need to show the ...
    Victor Lee
    Nov 24, 2005 at 10:51 pm
    Nov 25, 2005 at 9:31 pm
  • Hi, For final finish up on work for my project. We intend to do search clustering. Now I have already read that there is no clear cut way of doing that in lucene. Wondering, if anyone has tackled ...
    Supreet Sethi
    Nov 23, 2005 at 12:05 pm
    Nov 24, 2005 at 11:33 am
  • Hello, Luceners! What is "stemming"? I have Lucene in Action and found the following definitions on page 103: - reducing words to a root form (stemming) - changing words into the basic form ...
    Koji Sekiguchi
    Nov 20, 2005 at 3:48 pm
    Nov 21, 2005 at 6:34 am
  • I have indexed a set of documents that do not have fields. I want to use the getTermFreqVector method from IndexReader to get the frequencies. However when I do that as: TermFreqVector[] z = ...
    Anna Buczak
    Nov 18, 2005 at 3:37 am
    Nov 18, 2005 at 5:41 pm
  • Hi Folks. I downloaded the Lucene and tried to do an ant. It initially gave me the following error: BUILD FAILED file:/home/parikpol/downloads/lucene-1.4.3/build.xml:11: Unexpected element "tstamp" I ...
    Pol, Parikshit
    Nov 16, 2005 at 8:02 pm
    Nov 17, 2005 at 7:44 pm
  • Hi all. I have a question about sorting. Lucene in Action says: "For numeric types, each field being sorted for each document in the index requires that four bytes be cached. For String types, each ...
    Monsur Hossain
    Nov 10, 2005 at 1:36 am
    Nov 11, 2005 at 12:12 am
  • Hi, If I understand correctly, when sorting by Sort.INDEXORDER the oldest documents that were added to the index will be returned first. I want the reverse, because I'm more interested in newer ...
    Oren Shir
    Nov 3, 2005 at 2:46 pm
    Nov 3, 2005 at 4:31 pm
  • Hi all, I am interested in developing a system which will use Lucene to implement the search functionality. A key characteristic of this system is that certain information about the indexed documents ...
    Marios Skounakis
    Nov 16, 2005 at 9:47 am
    Jun 17, 2008 at 8:03 pm
  • This is another good reason for buying the book ultimately! Thx stefan -----Original Message----- From: Erik Hatcher Sent: Tuesday, November 29, 2005 15:57 To: java-user@lucene.apache.org ...
    Stefan Gusenbauer
    Nov 29, 2005 at 3:06 pm
    Nov 30, 2005 at 8:57 pm
  • Hello, I am searching over multiple indices using MultiSearcher. Thus I get hits from various indices. Is it possible to determine from which index a hit comes? The solution I found is to store the ...
    Nov 29, 2005 at 1:48 pm
    Nov 30, 2005 at 12:41 pm
  • I've read many comments from users on the list indicating that sorting may/will be performance-heavy. Is high CPU utilization with a sorted search one of the expected performance hits? In tests for ...
    Jeff Rodenburg
    Nov 20, 2005 at 9:28 pm
    Nov 21, 2005 at 7:04 am
  • I indexed dates using Field.Keyword(String,Date). The values seem to be encoded when I retrieve them via document.get("date"). Luke confirmed it. How do I decode the Date when retrieving from ...
    Daniel Clark
    Nov 17, 2005 at 10:43 am
    Nov 17, 2005 at 12:14 pm
  • Hi All, I'm using a bunch of SpanNearQueries combined in a SpanOrQuery to do a set of searches that match a phrase with a prefix search at the end, i.e. a "phrase with prefix s*" kind of thing that ...
    Greg K
    Nov 16, 2005 at 8:20 pm
    Nov 17, 2005 at 7:48 am
  • Howdy all, have a quick question for you... I am seeing quite a difference between optimized index and one that is not optimized. I have read a few papers that say that it shouldn't matter, but I am ...
    Aigner, Thomas
    Nov 16, 2005 at 7:23 pm
    Nov 16, 2005 at 10:28 pm
  • Dear all, I'd like to add some other stopwords to the StandardAnalyzer. How do I do this? Thanks a lot in advance, Mungkol
    Supheakmungkol SARIN
    Nov 14, 2005 at 6:02 am
    Nov 15, 2005 at 2:45 am
  • I have several indexes I want to search together. Which performs better: a single searcher on a multi reader, or a single multi searcher on multiple searchers (1 per index)? Thanks Mike
    Mike Streeton
    Nov 11, 2005 at 8:49 am
    Nov 14, 2005 at 8:54 am
  • Hi I'm using highlighter and have this problem: The query is over two or more fields, like: *body:home AND title:sale* I want to highlight over body field, but not highlight "sale" if "sale" is in ...
    Ernesto De Santis
    Nov 11, 2005 at 3:34 pm
    Nov 11, 2005 at 6:40 pm
  • I've used the example posted at http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-a801793d7479264e29157d92440199b35266dc18 to find all terms used in a complete index, but was wondering if there is ...
    Matt Magoffin
    Nov 9, 2005 at 11:50 pm
    Nov 11, 2005 at 12:37 am
  • Is there a way to limit the number of hits I want returned? Sometimes I just want one document.
    Daniel Clark
    Nov 10, 2005 at 12:55 am
    Nov 10, 2005 at 11:34 pm
  • If I have a Document object (doc), and I also have an IndexReader open, how can I find out IndexReader's docid corresponding to (doc)? IndexReader has a map from docid - Document, but I don't see the ...
    Nov 9, 2005 at 3:55 pm
    Nov 9, 2005 at 9:15 pm
  • Hi, I am looking at Lucene to index and search file metadata - filename, size, permissions, mtime, ctime, atime, etc. I do not need to index and search the contents of the file. I was wondering if ...
    Nov 2, 2005 at 10:33 pm
    Nov 3, 2005 at 9:22 am
  • I have a body of text which is being added to a document as unstored. All the words in the body text are coming through in the token stream for analyzing. For some reason I can search on some of ...
    Combs, Craig
    Nov 30, 2005 at 1:05 pm
    Nov 30, 2005 at 8:44 pm
  • Hi, I am stumped. I can't seem to get word docs indexed. I have tried both POI and textmining libraries to little or no real effect. I dump the doc files into a text file with the same variable I use ...
    Steven Bell
    Nov 27, 2005 at 5:49 am
    Nov 28, 2005 at 4:28 am
  • These are the results for the StandardTokenizer: input - output token - output type 1. 1.2 - 1.2 - <HOST> 2. 1.2. - 1.2 - <HOST> 3. a.b - a.b - <HOST> 4. a.b. - a.b. - <ACRONYM> 5. www.apache.org - ...
    Yahootintin 11533894
    Nov 22, 2005 at 12:40 am
    Nov 22, 2005 at 6:44 pm
  • Hi I have checked out the latest version of Lucene from CVS and have found a change in the results compared to version 1.4.3. The issue is with the deprecated API in the BooleanQuery class. The ...
    Patrick Kimber
    Nov 15, 2005 at 10:24 am
    Nov 18, 2005 at 11:09 am
  • Hi Our index contains articles with special characters. For instance, the string P&O is indexed as P&#38;O. The correct entity codes are indexed for all the special characters we use. My question is ...
    Lucene User
    Nov 15, 2005 at 4:00 pm
    Nov 16, 2005 at 5:05 pm
  • All, In the project I'm working on we have a separate index for each database. There are 12 databases now, but in the future there may be as many as 20. They all have their own release cycle so I ...
    Jason Calabrese
    Nov 14, 2005 at 10:52 pm
    Nov 15, 2005 at 11:16 pm
  • I got this nagging problem that I can't figure out in the source code of Lucene. In the file org/apache/lucene/queryParser/QueryParser.java, there's a method called parse that returns a Query (see ...
    Eugene Ezekiel
    Nov 13, 2005 at 6:39 pm
    Nov 13, 2005 at 9:56 pm
  • Hi, Was wondering if someone could help me out with a few things in Korean as related to Lucene: 1. Which Analyzer do you recommend? From the list, I see that some have had success with the ...
    Grant Ingersoll
    Nov 11, 2005 at 1:37 pm
    Nov 12, 2005 at 9:47 am
Group Overview
Group: java-user @
Period: Nov 2005

140 users for November 2005

  • Erik Hatcher: 65 posts
  • Chris Hostetter: 25 posts
  • Yonik Seeley: 25 posts
  • Otis Gospodnetic: 20 posts
  • Oren Shir: 15 posts
  • Paul Elschot: 13 posts
  • Daniel Noll: 12 posts
  • Karl Koch: 12 posts
  • MALCOLM CLARK: 11 posts
  • Daniel Clark: 9 posts
  • Manoj Kr. Sheoran: 9 posts
  • Michael D. Curtin: 9 posts
  • Victor Lee: 9 posts
  • Aigner, Thomas: 7 posts
  • Doug Cutting: 7 posts
  • Grant Ingersoll: 7 posts
  • John Powers: 7 posts
  • Marvin Humphrey: 7 posts
  • Cheolgoo Kang: 6 posts
  • Mark harwood: 6 posts