62 discussions - 251 posts

  • Hi All, I'm giving a talk at ApacheCon titled "Bet you didn't know Lucene can..." (http://na11.apachecon.com/talks/18396). It's based on my observation, that over the years, a number of us in the ...
    Grant Ingersoll
    Oct 22, 2011 at 9:11 am
    Nov 1, 2011 at 12:33 am
  • I already have the term-frequency-count for all the terms in a document. Is there a way I can re-use that info while indexing? I would like to use Solr for this. ...
    Prasenjit mukherjee
    Oct 23, 2011 at 3:23 pm
    Oct 25, 2011 at 11:36 am
  • According to https://lucene.apache.org/java/3_4_0/api/core/org/apache/lucene/search/Similarity.html ...
    David Ryan
    Oct 20, 2011 at 7:11 pm
    Nov 3, 2011 at 4:25 am
  • We have a modified version of a Lucene StandardAnalyzer; we use it for tokenizing music metadata such as artist names & song titles, so typically only a few words. On tokenizing it usually it ...
    Paul Taylor
    Oct 17, 2011 at 12:13 pm
    Oct 20, 2011 at 8:00 am
  • Hello list, in what way does BooleanQuery calculate the number of its clauses? Is this number based on the analyzed query or based on the raw query-string? Imagine you got a StopFilter or a ...
    Oct 5, 2011 at 7:43 am
    Oct 10, 2011 at 6:19 pm
  • Hi, I'm happy to announce that Luke - The Lucene Index Toolbox for Lucene 3.4.0 is available now for download from the project page: http://code.google.com/p/luke Changes in version 3.4.0 (released ...
    Andrzej Bialecki
    Oct 3, 2011 at 9:53 am
    Oct 4, 2011 at 8:26 am
  • Hello all, Now, I find there is a "org.apache.lucene.search.join" function in Lucene 3.4 version. But I found no demo for "join" function in the source code package: "lucene-3.4.0-src.tar". Now I ...
    Mead Lai
    Oct 20, 2011 at 7:27 am
    Oct 21, 2011 at 2:47 pm
  • I have an application where I would like to pick one document from somewhere in the list of search results. For example, I would like to retrieve one of the results at rank 57, another at rank 1223, ...
    Herb Roitblat
    Oct 15, 2011 at 6:47 pm
    Oct 20, 2011 at 10:36 am
  • I am seeing this stack trace in my logs: org.apache.lucene.util.SetOnce$AlreadySetException: The object cannot be set twice! at org.apache.lucene.util.SetOnce.set(SetOnce.java:69) at ...
    Clemens Wyss
    Oct 24, 2011 at 2:47 pm
    Oct 25, 2011 at 5:12 am
  • Hello all, *Background:* There are *ONE MILLION* records in a table, and this table has 100 columns. The application needs to search the data in EVERY column with one 'keyword'. So, I try it in a ...
    Mead Lai
    Oct 11, 2011 at 7:13 am
    Oct 17, 2011 at 6:43 pm
  • Hi Group, I am indexing and searching a large corpus of news articles. The indexing process is very straightforward: I am using the StandardAnalyzer and analyzing the content of the news ...
    Oct 28, 2011 at 4:13 pm
    Nov 2, 2011 at 6:40 pm
  • We have a situation where a Lucene index is replicated over the network, and on that machine reader reopen doesn't make new documents visible to a search. As far as I know, the IndexReader.reopen() call does work ...
    Denis Bazhenov
    Oct 17, 2011 at 2:33 am
    Oct 31, 2011 at 1:06 pm
  • We are using Lucene 2.3.2 (yes, we should upgrade) and recently we had an exception when opening the index: java.io.IOException: read past EOF ...
    Zhang, Lisheng
    Oct 28, 2011 at 8:58 pm
    Oct 29, 2011 at 6:20 pm
  • Hi, We've noticed a Lucene performance phenomenon, and would appreciate an explanation from anyone familiar with Lucene internals (I know Lucene as a user, but haven't looked under its hood). We ...
    Sol myr
    Oct 23, 2011 at 1:06 pm
    Oct 27, 2011 at 7:32 pm
  • Hi all, I am planning to change my existing Lucene index to use the new facets introduced in Lucene 3.4.0. Unfortunately, I could not find an answer to my question in the documentation: I create a ...
    Christoph Kaser
    Oct 19, 2011 at 7:55 am
    Oct 24, 2011 at 9:47 am
  • Hi, I was given a task to investigate whether it is possible to return Lucene field name when a query is matched. At the moment our application returns the usual matched docs, but the new requirement ...
    Oct 20, 2011 at 12:14 pm
    Oct 21, 2011 at 9:01 am
  • Hi all, I am using Lucene to query Medline abstracts and as a result I get around 3 million hits. Each of the hits is processed and information from a certain field is used. After certain number of ...
    Tamara Bobic
    Oct 18, 2011 at 4:22 pm
    Oct 19, 2011 at 11:19 am
  • I'm running some performance tests on bulk indexing with Lucene 4.0 and I'm seeing weird results. I've read http://www.gossamer-threads.com/lists/lucene/java-dev/127190?do=post_view_threaded#127190 ...
    Marc Sturlese
    Oct 11, 2011 at 12:14 pm
    Oct 15, 2011 at 2:22 pm
  • Hi, I'm new to Lucene. I have records and I wanna sort them by fields. I've created indexes for those fields with 'not_analyzed'. The sort is case sensitive. In a sense, *A...* *X...* *b...* is the ...
    Senthil V S
    Oct 11, 2011 at 4:49 pm
    Oct 12, 2011 at 1:34 am
  • Hi, I use Lucene, but am not familiar with its internals. I'd appreciate help understanding whether Term Frequencies and Positions are stored per Document or per Field? On the one hand, I never ask ...
    Sol myr
    Oct 4, 2011 at 9:47 am
    Oct 10, 2011 at 10:10 am
  • Hi, I have an application that has an index with 30 million docs in it. Every day, I add around 1 million docs, and I remove the oldest 1 million, to keep it stable at 30 million. For the most part ...
    V Sevel
    Oct 27, 2011 at 11:45 am
    Oct 31, 2011 at 9:49 am
  • This is with Lucene 3.0.3 running on JDK 6u20 64-bit. I'm running into an issue where merges are looping seemingly randomly with the use of ConcurrentMergeScheduler. By "seemingly randomly", I mean that ...
    Oct 26, 2011 at 7:38 pm
    Oct 29, 2011 at 10:45 am
  • My use case is the following: given an n-dimensional vector (only +ve quadrants/points), find its closest neighbours. I would like to try this out with Lucene's default ranking. Here is how a typical ...
    Prasenjit mukherjee
    Oct 22, 2011 at 4:47 pm
    Oct 28, 2011 at 3:44 am
  • Hi all, I am unable to get the Lucene demo to run on my MacBook Pro. I downloaded the jars into my home directory and then I set the CLASSPATH variable to point to them. However, once I run the ...
    Daniel Quach
    Oct 25, 2011 at 4:09 am
    Oct 26, 2011 at 8:50 am
  • Hello, I am using the ReviewBoard software, which internally uses PyLucene for its search function. Almost every time I use the search functionality however, I get a segmentation fault, which gets ...
    Stein, Ruben
    Oct 24, 2011 at 3:07 pm
    Oct 25, 2011 at 7:36 am
  • Hi all, I usually use Nutch for this but, just for fun, I tried to create a language identifier based on Lucene only. I had a really small set of "training data": 10 files (roughly 2M each) for 10 ...
    Luca Rondanini
    Oct 22, 2011 at 12:50 am
    Oct 24, 2011 at 4:39 pm
  • Hi I am trying to understand why I am not able to retrieve docs I have indexed by a ShingleAnalyzer. The setup is as follows: During indexing I do the following: PerFieldAnalyzerWrapper wrapper = ...
    Peyman Faratin
    Oct 9, 2011 at 4:12 pm
    Oct 10, 2011 at 5:08 am
  • Tried a first install on Windows (7 64-bit, but installing as 32-bit) and didn't get very far. Next, at work where I have a Linux box, the install was pretty straightforward with the one wrinkle of ...
    Oct 29, 2011 at 6:25 pm
    Nov 3, 2011 at 5:58 pm
  • I've read in another thread (http://lucene.472066.n3.nabble.com/Indexing-slower-in-trunk-td3059836.html#a3062991) /Since Lucene 2.9, Lucene works on a per segment basis when searching. Since Lucene ...
    Marc Sturlese
    Oct 10, 2011 at 11:03 am
    Oct 27, 2011 at 6:54 pm
  • Hi all, I'm trying to provide a way to efficiently allow a client to page over all of the documents in multiple Lucene indexes that I'm querying with a MultiSearcher (~1-2 million docs). ...
    Alexander Devine
    Oct 25, 2011 at 7:25 pm
    Oct 25, 2011 at 7:47 pm
  • Hi folks, I'm hoping someone can shed some light on how filters and boolean queries work under the hood. As I understand it, the following two queries are functionally equivalent: boolean must, term ...
    Josh Devins
    Oct 23, 2011 at 3:40 pm
    Oct 23, 2011 at 8:38 pm
  • Hi Guys, Use Case: Field: Name Data: Jose , Jose Sam, jose, jose jacob, jose , joseph, josef , S. Jose, B. jose etc. There is a field (Name), I want to index this field. I will be searching this ...
    Jamir Shaikh
    Oct 15, 2011 at 12:22 am
    Oct 18, 2011 at 6:15 pm
  • Hi, I am new to Lucene. I am using Lucene 2.4.1 in my project to do a search in a text document. I need to perform a wildcard query. I am using the code given in Hrycon - blog. It is working fine ...
    Vidya Kanigiluppai Sivasubramanian
    Oct 18, 2011 at 7:18 am
    Oct 18, 2011 at 9:37 am
  • Hi I have the following shinglefilter (Lucene 3.2) public TokenStream tokenStream(String fieldName, Reader reader) { StandardTokenizer first = new StandardTokenizer(Version.LUCENE_32, reader); ...
    Peyman Faratin
    Oct 11, 2011 at 2:26 pm
    Oct 12, 2011 at 1:27 am
  • Hi, Does anyone have a modified scoring (Similarity) function they would care to share? I'm searching web page documents and find the default Similarity seems to assign too much weight to documents ...
    Joel Halbert
    Oct 8, 2011 at 7:37 am
    Oct 8, 2011 at 1:12 pm
  • I'm trying to understand the .fdt file format and seem to have run into some discrepancies between the documentation and the actual format. Near the start of the file, there are some bytes that don't ...
    Michael Ryan
    Oct 3, 2011 at 7:11 pm
    Oct 4, 2011 at 12:46 pm
  • Hello everyone! I'm struggling with my degree paper. My research project is to build a search engine for a language which has many affixes and prefixes. Many papers have been read, the common way is ...
    Shengtao Lei
    Oct 31, 2011 at 6:07 am
    Oct 31, 2011 at 9:40 am
  • Hi, I am using Lucene 2.4.1 in my project. I need to display the search results when searching for a particular term and, on selecting an item in the result page, I need to display the document where ...
    Vidya Kanigiluppai Sivasubramanian
    Oct 28, 2011 at 8:49 am
    Oct 28, 2011 at 2:46 pm
  • Hi, Could I please ask another question regarding Lucene "under the hood" / performance. I wondered how "AND" queries are implemented? Say we query for "+hello +world". Would Lucene simply find 2 ...
    Sol myr
    Oct 25, 2011 at 12:18 pm
    Oct 25, 2011 at 12:58 pm
  • How do I use the Lucene Benchmark to index a Wikipedia dump? I want to be able to execute phrase queries on the latest English Wikipedia page dump. I'm trying to look for example use cases but I ...
    Daniel Quach
    Oct 20, 2011 at 4:30 pm
    Oct 23, 2011 at 8:49 pm
  • Hi, I upgraded from 3.1 to 3.4, and now it is complaining about the deprecated method indexWriter.setMergeFactor(), saying it can only be used with the default LogMergePolicy, but I never set the merge policy so ...
    Paul Taylor
    Oct 21, 2011 at 4:56 pm
    Oct 22, 2011 at 2:19 pm
  • Hi, what can I do if I want to have "/" (slashes) as tokens to search? Thanks & Regards Michael
    Michael Szediwy
    Oct 21, 2011 at 9:00 am
    Oct 21, 2011 at 6:35 pm
  • Hi, I am having a weird experience. I made a few changes to the source code (Lucene 3.3). I created a basic application to test it. First, I added Lucene 3.3 project to basic project as "required ...
    Zeynep P.
    Oct 17, 2011 at 5:46 pm
    Oct 20, 2011 at 8:15 am
  • Hi Can someone answer my question please.... Regards, Vidya From: Vidya Kanigiluppai Sivasubramanian Sent: Wednesday, October 19, 2011 6:06 PM To: ''java-user@lucene.apache.org' Subject: FW: How to ...
    Vidya Kanigiluppai Sivasubramanian
    Oct 19, 2011 at 12:41 pm
    Oct 19, 2011 at 1:05 pm
  • Hi, I would like to read the term and its frequency or score out of indices. How can I do it using Java? Thanks!
    Oct 18, 2011 at 12:32 am
    Oct 18, 2011 at 10:02 am
  • Hello, I had some problem with this issue: https://issues.apache.org/jira/browse/LUCENE-2239 I was getting ClosedByInterruptException even though my application was not calling Thread.interrupt() ...
    Grzegorz Tańczyk
    Oct 16, 2011 at 10:57 am
    Oct 16, 2011 at 11:30 am
  • Hi, Question about Payload Query and Document Boosts. We are using Lucene 3.2 and Payload queries, with our own PayloadSimilarity class which overrides the scorePayload method like so: {code} ...
    Sujit Pal
    Oct 13, 2011 at 1:17 am
    Oct 13, 2011 at 10:52 pm
  • Hello, I'm trying to search for the following phrase: I'm searching all occurrences of: . "The Right Way" . "The Right Ways" Possible solutions could be something like this - combining a phrase & wildcard ...
    Ralf Heyde
    Oct 13, 2011 at 10:08 am
    Oct 13, 2011 at 10:24 am
  • Hello all, Well, I add some documents to the index with a "date" type: doc.add(new Field("datestamp", "20111012",Store.YES, Index.NOT_ANALYZED)); Then, I want to get the result ordered by "datestamp" ...
    Mead Lai
    Oct 13, 2011 at 3:56 am
    Oct 13, 2011 at 8:42 am
  • Hi, I noticed that the new Lucene 3.4 supports "BlockJoinQuery" (allowing for 'join' or 'relation' between documents). I understand the documented limitations on the feature (nowhere near the power ...
    Sol myr
    Oct 11, 2011 at 10:57 am
    Oct 11, 2011 at 12:31 pm
Group Overview
Group: java-user
Period: Oct 2011

82 users for October 2011

Ian Lea: 19 posts Mead Lai: 12 posts Michael McCandless: 11 posts Janwen: 10 posts Simon Willnauer: 10 posts Mihai Caraman: 9 posts Uwe Schindler: 9 posts Prasenjit mukherjee: 8 posts Shai Erera: 7 posts Sol myr: 7 posts Paul Taylor: 6 posts Vidya Kanigiluppai Sivasubramanian: 6 posts Dawid Weiss: 5 posts Em: 5 posts Sujit Pal: 5 posts Doron Cohen: 4 posts Erick Erickson: 4 posts Jithin: 4 posts Peyman Faratin: 4 posts Robert Muir: 4 posts