49 discussions - 229 posts

  • Hi, I understand that generally speaking you should use the same analyzer on querying as was used on indexing. In my code I am using the SnowballAnalyzer on index creation. However, on the query side ...
    Bill Chesky
    Aug 2, 2012 at 9:09 pm
    Aug 3, 2012 at 9:48 pm
  • Hi Everyone, I have the following task. I have a set of documents in multiple languages. I don't know what these languages are. Any given doc may contain text in several languages mixed up. So to me ...
    Ilya Zavorin
    Aug 24, 2012 at 7:49 pm
    Aug 26, 2012 at 7:20 pm
  • I've read the javadoc through a few times, but I confess that I'm still feeling dense. Are all tokenizers responsible for implementing some way of retaining the contents of their reader, so that a ...
    Benson Margulies
    Aug 29, 2012 at 7:29 pm
    Aug 29, 2012 at 8:52 pm
  • Is there an easy way to figure out the most common tokens and then remove those tokens from the documents? Use case: imagine one is indexing a mailing list (such as this java-user) and is extracting ...
    Shaya Potter
    Aug 15, 2012 at 5:47 pm
    Aug 20, 2012 at 2:04 am
  • Our Solr 3.x code used init(ResourceLoader) and then called the loader to read a file. What's the new approach to reading content from files in the 'usual place'?
    Benson Margulies
    Aug 29, 2012 at 2:11 pm
    Aug 29, 2012 at 2:46 pm
  • Hi, I have 5 documents. Each document has a field TEST. Total structure is looking like this: Doc 01: TEST: "test 1 string" Doc 02: TEST: "test 2 string" Doc 03: TEST: "test 3 string" Doc 04: TEST ...
    Jochen Hebbrecht
    Aug 20, 2012 at 12:06 pm
    Aug 20, 2012 at 9:10 pm
  • I'm trying to run CheckIndex as a separate tool on a large index to get nice infos about number of terms, number of tokens, ... but always get an OOM exception. Already have JAVA_OPTS -d64 -Xmx25g -Xms25g ...
    Bernd Fehling
    Aug 15, 2012 at 11:25 am
    Aug 15, 2012 at 2:28 pm
  • Hi, I have a big index, and when I searched it with a title string "Cla$$War", Lucene became very slow. It doesn't happen when I searched with other title string such as "Gone with Wind". Does the ...
    Aug 14, 2012 at 8:14 am
    Aug 14, 2012 at 2:40 pm
  • Hi: I don't claim to know anything about how Tomcat manages threads, but really you shouldn't have all these objects. In general snowball stemmers should be reused per-thread-per-field. But if you ...
    Robert Muir
    Aug 1, 2012 at 3:38 pm
    Aug 2, 2012 at 1:01 pm
  • Hello, I'm trying to construct a boolean query, but can't get it to return the results that I intend. Does anyone see what I'm doing wrong? The query is like The idea is that it should return ...
    Aug 23, 2012 at 12:10 pm
    Aug 23, 2012 at 1:36 pm
  • Sounds like some other analyzer can do the trick? Anyway, I don't want a slower Lucene, and I want to treat "Cla$$War" as a whole word. What is the solution left? Thanks. ------------------ Original ...
    Aug 14, 2012 at 2:43 pm
    Aug 14, 2012 at 4:53 pm
  • Hi Everyone, If there was a straightforward way to take a Boolean Query created by the Lucene Query Parser and convert it to a Span Query. Ideally I'd like to take any ANDed clauses and require them ...
    Dave Seltzer
    Aug 21, 2012 at 10:54 pm
    Aug 22, 2012 at 8:22 am
  • Dear list, I am trying to combine a WildcardQuery and a SpanQuery because I need to extract spans from the index for further processing. I realise that there have been a few public discussions about ...
    Carsten Schnober
    Aug 14, 2012 at 8:59 am
    Aug 14, 2012 at 1:40 pm
  • Hi, in my application I have to write tons of small documents to the index, but with a twist. Many of the documents are actually aggregations of pieces of information that appear in a data stream, ...
    Harald Kirsch
    Aug 6, 2012 at 11:23 am
    Aug 10, 2012 at 5:10 pm
  • I am trying to (mis)use Lucene a bit like a NoSQL database or, rather, a persistent map. I am entering 38000 documents at a rate of 1000/s to the index. Because each item add may be actually an ...
    Harald Kirsch
    Aug 3, 2012 at 1:42 pm
    Aug 5, 2012 at 8:10 am
  • Hi, I've just migrated my webapp from Lucene 3.6 to 4.0-BETA. My 2 indexes are updated every couple of minutes by a batch. The webapp searcher needs to get refreshed whenever this happens. In 3.6, I ...
    Mossaab Bagdouri
    Aug 26, 2012 at 6:41 pm
    Oct 10, 2012 at 12:17 am
  • Hi there, I had a question about migrating the coord value one level up. My current query structure has a root BooleanQuery with a bunch of nested BooleanQuery children: one of these looks for all ...
    Pranshu sharma
    Aug 28, 2012 at 3:13 pm
    Sep 14, 2012 at 2:14 am
  • Hello, I have two products that are using Lucene: The first product creates the Lucene indexes for some data using Lucene version 3.01. The second product utilizes the indexes created by the first ...
    Sitowitz, Paul
    Aug 27, 2012 at 5:05 pm
    Aug 28, 2012 at 12:56 pm
  • Hi there, I currently have a Lucene index based on version 3.5 made up of XML documents. I'd like to create smaller indexes from the main index: 1) an index based on a date range from the last week ...
    Grainne wallace
    Aug 20, 2012 at 9:50 am
    Aug 20, 2012 at 4:50 pm
  • Hi, I have a situation in which I have many short documents (30-400 chars). My goal is given a phrase, find an indexed document which is a prefix of the phrase. Is there a way to achieve this goal ...
    Aug 16, 2012 at 5:38 pm
    Aug 20, 2012 at 8:38 am
  • The query has been stuck for more than an hour. The total size is less than 1G, and the number of docs is around 100,000. Hardware is ok as it works well with other much more demanding projects ...
    Aug 16, 2012 at 2:10 am
    Aug 17, 2012 at 12:27 am
  • Hello, we are currently facing a problem in which updates for some documents may be lost during adding / committing. The infrastructure: we have a main Solr, which gets documents and distributes them to a lot of ...
    Ralf Heyde
    Aug 14, 2012 at 11:46 am
    Aug 14, 2012 at 12:16 pm
  • Hey, I have an index with documentation of our products. The document fields are: group, name, version, description. Because most of the documentation contains several sites, I create for each site one ...
    Stäbler, Christoph (IT/I4Z)
    Aug 30, 2012 at 10:59 am
    Aug 30, 2012 at 2:34 pm
  • I'm close to the bottom of my list here. I've got an Analyzer that, in 3.1, set up a CharFilter in the tokenStream method. So now I have to migrate that to createComponents. Can someone give me a ...
    Benson Margulies
    Aug 29, 2012 at 2:30 pm
    Aug 29, 2012 at 2:58 pm
  • Greetings subscribers to java-user@lucene. I've been offline for the past ~5 days, and when I looked at my email again this morning I found a message to java-user@lucene sitting in the moderator ...
    Chris Hostetter
    Aug 27, 2012 at 5:12 pm
    Aug 28, 2012 at 4:03 pm
  • Hi, I am new to Lucene. I have a complex query where I need to join more than two tables and have different filtering criteria on it. Is it possible to use Lucene for this? For example, my query is ...
    Trupti Ghuge
    Aug 27, 2012 at 5:06 pm
    Aug 28, 2012 at 7:32 am
  • Dear all, I am currently trying to implement a personalized ranking with Lucene 3.6 for the search in a (non-commercial) social bookmarking system. The ranking of the search results is supposed to ...
    Sebastian R.
    Aug 22, 2012 at 5:23 pm
    Aug 23, 2012 at 8:10 pm
  • Hello, I have a program which regularly creates a MemoryIndex to be searched against a list of queries. In order to transform the queries that I'm being sent I need to be able to call ...
    Dave Seltzer
    Aug 22, 2012 at 2:23 pm
    Aug 22, 2012 at 2:39 pm
  • I am using Lucene to produce several indexes from HTML sites. To work with them I convert the Lucene database into SQL via a small program. The main problem is that I take a small part of the ...
    Aug 15, 2012 at 5:40 pm
    Aug 16, 2012 at 6:11 pm
  • Hi, I have the string "$21 a Day Once a Month" to search on a large index. I escape the $ sign, and the query string looks like: +level:0 +(title:21 title:a title:day title:once title:a title:month) ...
    Aug 16, 2012 at 1:28 am
    Aug 16, 2012 at 3:10 am
  • Hi everyone, in Lucene 4.0 alpha, I found the DocValues are available and gave it a try. I am following the slides in ...
    Li Li
    Aug 6, 2012 at 9:34 am
    Aug 6, 2012 at 11:41 am
  • I'm failing to find advice in MIGRATE.txt on how to replace 'new Payload(...)' in migrating to 4.0. What am I missing?
    Benson Margulies
    Aug 29, 2012 at 1:47 pm
    Aug 29, 2012 at 1:50 pm
  • Hi, The context is that I've migrated from Lucene 3.6 to Lucene 4.0-BETA. Lucene 3.6 had the convenient method IndexSearcher.isCurrent() for any underlying IndexReader, including MultiReader. This is ...
    Mossaab Bagdouri
    Aug 27, 2012 at 5:37 pm
    Aug 28, 2012 at 8:09 am
  • Hello list I build my queries programmatically with Term, NumericTerm, FuzzyQuery, BooleanQuery etc. In particular, I do not use QueryParser to build my query from a string. Still, I would like to ...
    Damian Birchler
    Aug 27, 2012 at 2:30 pm
    Aug 27, 2012 at 2:59 pm
  • Hi to all, In the pruning package, for the pruneAllPositions(TermPositions termPositions, Term t) method it is said that: "termPositions - positioned term positions. Implementations MUST NOT advance this by ...
    Zeynep P.
    Aug 14, 2012 at 1:53 pm
    Aug 22, 2012 at 2:28 pm
  • I'm curious as to whether it's possible to abuse merged segment warmers to run some queries on all documents that have been newly added to an index. This would be run in the context of a large, ...
    Greg Steffensen
    Aug 16, 2012 at 11:03 pm
    Aug 17, 2012 at 11:51 am
  • I would like to know if anyone has ideas (or pointers to discussions) about good ways to support advanced search options, such as the various kinds of SpanQuery, in a search application user ...
    Mike O'Leary
    Aug 16, 2012 at 6:20 pm
    Aug 16, 2012 at 6:56 pm
  • We have recently moved to 3.6 from Lucene 2.2 and have seen that the way tokens get indexed is not the same. Although we are open to reindexing the data which was initially indexed with 2.2, I would ...
    sunil Kumar Verma
    Aug 14, 2012 at 9:59 pm
    Aug 16, 2012 at 9:14 am
  • I appreciate your input. However, my question is which analyzer and tokenizer to choose. ------------------ Original ------------------ From: "Uwe Schindler"<uwe@thetaphi.de ; Date: Wed, Aug 15, 2012 ...
    Aug 15, 2012 at 1:38 am
    Aug 15, 2012 at 1:55 am
  • Hi all. I tried posting this in the Solr users group but didn't get any replies so thought I would try the Lucene group. Hopefully someone is using the Lucene spatial toolkit aka LSP aka spatial4j, ...
    Aug 8, 2012 at 4:37 pm
    Aug 9, 2012 at 4:58 am
  • Hi We are using Solr 4 with a custom query tree. For boolean queries, the score should not just be the sum of all sub-scores, but instead it should be the mean value of all the sub-scores, which is ...
    Pascal Chollet
    Aug 8, 2012 at 12:57 pm
    Aug 8, 2012 at 1:44 pm
  • Hi, We are using lucene 2.3.2 on linux/ubuntu (we will upgrade lucene soon), recently we got exception: read past EOF #012java.io.IOException: read past EOF at ...
    Zhang, Lisheng
    Aug 2, 2012 at 5:56 pm
    Aug 7, 2012 at 12:35 am
  • Hello list I'm looking for something like Field.setBoost(float boost) that can be set at search time. The reason for this is that we would like to provide user (client-side) configurable search ...
    Damian Birchler
    Aug 28, 2012 at 6:47 am
    Aug 28, 2012 at 6:47 am
  • I have the following task that I need to implement in .NET. I get a block of text and need to assess whether this text is mostly readable or a bunch of unreadable garbage. This text is generated by ...
    Ilya Zavorin
    Aug 21, 2012 at 5:51 pm
    Aug 21, 2012 at 5:51 pm
  • Hello all, This is Atif, and I am new to Lucene. I am trying to install PyLucene; however, I am having trouble with the make operation. I would appreciate any help in this matter. I used svn for ...
    Muhammad Atif Qureshi
    Aug 17, 2012 at 3:40 am
    Aug 17, 2012 at 3:40 am
  • 14 August 2012, Apache Lucene 4.0-beta available The Lucene PMC is pleased to announce the release of Apache Lucene 4.0-beta Apache Lucene is a high-performance, full-featured text search engine ...
    Robert Muir
    Aug 14, 2012 at 11:35 am
    Aug 14, 2012 at 11:35 am
  • Dear All, I was wondering if the Open Relevance Project (ORP) is currently active and available for users. I just installed Lucene and was hoping to use the ORP to do some relevance testing and work ...
    Sachin Kulkarni
    Aug 8, 2012 at 9:21 pm
    Aug 8, 2012 at 9:21 pm
  • Hi, I've just read the following blog: http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html ...
    Johan Haest
    Aug 7, 2012 at 7:25 am
    Aug 7, 2012 at 7:25 am
  • ApacheCon Europe will be happening 5-8 November 2012 in Sinsheim, Germany at the Rhein-Neckar-Arena. Early bird tickets go on sale this Monday, 6 August. http://www.apachecon.eu/ The Lucene/Solr ...
    Chris Hostetter
    Aug 6, 2012 at 5:59 pm
    Aug 6, 2012 at 5:59 pm
Group: java-user
Period: Aug 2012

65 users for August 2012

  • Uwe Schindler: 19 posts
  • Jack Krupansky: 18 posts
  • Robert Muir: 17 posts
  • Ian Lea: 14 posts
  • Benson Margulies: 13 posts
  • Zhoucheng2008: 8 posts
  • Bill Chesky: 6 posts
  • Carsten Schnober: 6 posts
  • Harald Kirsch: 6 posts
  • Michael McCandless: 6 posts
  • Shaya Potter: 6 posts
  • Simon Willnauer: 6 posts
  • Dave Seltzer: 5 posts
  • Ilya Zavorin: 5 posts
  • Jochen Hebbrecht: 5 posts
  • Dawid Weiss: 4 posts
  • Dyzc: 4 posts
  • Li Li: 4 posts
  • Bernd Fehling: 3 posts
  • Chris Hostetter: 3 posts