69 discussions - 227 posts

  • Dear list, a very basic question about lucene, which version of unicode can be handled (indexed and searched) with lucene? It looks like lucene can only handle the very old Unicode 2.0 but not the ...
    Bernd Fehling
    Feb 25, 2011 at 10:24 am
    Feb 27, 2011 at 8:43 pm
  • I'd like to suggest search terms to my users. My naïve approach would have been: After at least n characters have been typed (asynchronously) find terms in IndexReader.terms() which "match" Is there ...
    Clemens Wyss
    Feb 21, 2011 at 4:05 pm
    Feb 22, 2011 at 12:12 pm
  • This could be a rhetorical question. The way to find the last/max term that is a unique per document is to use TermsEnum to seek to the first term of a field, then call seek to the docFreq-1 for the ...
    Jason Rutherglen
    Feb 19, 2011 at 2:25 am
    Feb 21, 2011 at 7:22 pm
  • I know that the size of a Lucene index can double while optimization is underway, but it's supposed to eventually settle back down to the original size, correct? We have a Lucene index consisting of ...
    Phil Herold
    Feb 9, 2011 at 6:55 pm
    Feb 11, 2011 at 11:06 pm
  • Hi, I am using SpanQuery and SpanNearQuery to get phrase query like "Sql Server". In my text file in which I am searching, it is present like (sql. server) mean 'sql dot server' which is not a span ...
    Ranjit Kumar
    Feb 10, 2011 at 5:39 am
    Feb 11, 2011 at 1:15 pm
  • I'm curious if there's a new way (using flex or term states) to store IDs alongside a document and retrieve the IDs of the top N results? The goal would be to minimize HD seeks, and not use field ...
    Jason Rutherglen
    Feb 2, 2011 at 6:04 pm
    Feb 3, 2011 at 1:22 pm
  • Hi, I have two index files. I am searching id1 from Index A and id2 from Index B. By using id1 (Index A) results , I am searching id2 from Index B. I stored these two index files in local file ...
    Feb 14, 2011 at 6:17 pm
    Jun 5, 2011 at 8:34 am
  • Hello all, I am using Lucene for my project and we have new requirement to present data in the form of Analytics. Facet could be used for that but for this purpose i don't want to migrate to Solr. ...
    Feb 23, 2011 at 9:00 am
    Feb 24, 2011 at 6:21 am
  • hi all is there any limit to post email to this maillist now? thanks
    Li Li
    Feb 15, 2011 at 12:30 pm
    Feb 16, 2011 at 11:14 am
  • Hi All, I use Solr 3.x and put Excel documents into an index. I have my own query parser and use SpanQueries to provide a proximity search feature. It works really well. More often than not its ...
    Livia Hauser
    Feb 7, 2011 at 7:59 pm
    Feb 10, 2011 at 1:54 pm
  • Hello all, We are using the ICUTokenizer because we have documents in about 400 different languages. We are also setting autoGeneratePhraseQueries to false so that CJK and other languages that don't ...
    Burton-West, Tom
    Feb 4, 2011 at 5:47 pm
    Feb 4, 2011 at 8:34 pm
  • Is there a query syntax for specifying a numeric range for a field indexed as a NumericField. I've tried numericfield:[0 TO 10] But it is parsed as a TermRangeQuery and not a NumericRangeQuery. Many ...
    Anuj Shah
    Feb 3, 2011 at 3:50 pm
    Feb 4, 2011 at 3:39 pm
  • Hi, I would like to override default similarity's computeNorm to work with a different field, other than the query field. Here is the DefaultSimilarity implementation: @Override public float ...
    Tsvika Rabkin
    Feb 1, 2011 at 1:28 pm
    Feb 3, 2011 at 10:24 pm
  • Hi all, I am looking for some guidelines on directly converting an already existing index to a Lucene index. The index available to me is a set of <value1,value2> pairs, where each pair is: < word , ...
    Lokendra Singh
    Feb 25, 2011 at 6:27 am
    Feb 25, 2011 at 9:56 am
  • hi, I am using query like criteria = (sql OR sqlserver OR "sql server") AND java AND delphi . In this case when i am using default parser as code mention below: QueryParser parser = new ...
    Ranjit Kumar
    Feb 18, 2011 at 4:17 pm
    Feb 22, 2011 at 9:51 am
  • Hello all, I am using Keyword analyzer to index a field and while using queryparser, I am using the same analyzer. I am indexing the text Hello world and while searching using queryparser.parse it is ...
    Feb 18, 2011 at 10:39 am
    Feb 18, 2011 at 11:37 am
  • Hello, I am a little confused about the stored and indexed parts of Lucene. How does it actually store the indexed field and the stored field? Is it that for every field indexed, all the stored fields are added? I mean do ...
    Feb 17, 2011 at 10:44 am
    Feb 17, 2011 at 1:43 pm
  • Hi, I am tying WordNet synonyms into a SynonymAnalyzer, but I find an error in the search result, as follows: input keywords: *browned fox* query.toString(): (content:browned content:brown) ...
    Gong Li
    Feb 12, 2011 at 11:05 am
    Feb 12, 2011 at 5:35 pm
  • Guys, this is tiny and probably not relevant. But I'll bet a beer that at least a dozen people had to dirtymod this class while they could have run it from command line. A 15 min time save that took ...
    Pablo Mendes
    Feb 9, 2011 at 5:06 pm
    Feb 10, 2011 at 12:20 pm
  • Hello everybody, I am currently using Lucene 3.0.2 with payloads. I store extra information in the payloads about the term like frequencies and therefore I don't need frequencies and term positions ...
    Alex vB
    Feb 2, 2011 at 7:35 pm
    Feb 3, 2011 at 10:37 pm
  • Hi Guys, I've been using Lucene for more than 5 years and it is a great tool - great job! Thanks for everything... Lately I encountered the new payloads support and it looks its a great solution for ...
    Ophir Cohen
    Feb 1, 2011 at 10:19 am
    Feb 2, 2011 at 10:47 pm
  • Hi, Today I have noticed that sometimes lucene sort produced strange result in plain English names, like (String ASC) l yy liu yu I traced to lucene source code, it seems to be a java English ...
    Zhang, Lisheng
    Feb 27, 2011 at 1:00 am
    Mar 1, 2011 at 7:34 am
  • Hi there, I'd like to serialize some Lucene Documents I've built before. My goal is to send the documents over a http connection to a Solr server which then should add them to its index. I thought ...
    Erik Fäßler
    Feb 22, 2011 at 1:58 pm
    Feb 22, 2011 at 2:54 pm
  • Hi. I'd like to take advantage of the overrideable encodeNormValue method in Similarity. From revision history it looks like this change was checked in on 11/24/09, and that a Branch for 3.x was ...
    Kim Kokkonen
    Feb 21, 2011 at 12:05 am
    Feb 21, 2011 at 4:05 pm
  • Hi, I use PDFBOX to extract the text in the PDF and then use Lucene to index and search. Finally, I can find the context of the keyword but in String. Question: I need to create a new PDF which ...
    Gong Li
    Feb 19, 2011 at 1:45 pm
    Feb 20, 2011 at 8:06 am
  • Hi, I am developing a PDF search engine, locally. I have used API: pdfbox and lucene. I must show the user the PDF page containing the keywords(if highlight, it's great) and sort by relevance(default ...
    Gong Li
    Feb 18, 2011 at 6:30 pm
    Feb 20, 2011 at 6:36 am
  • I'm trying to set different boost values for different fields. Before adding the document to the index every value is fine. But when I run a search in the explanation every boost is 1 and the final ...
    Akos Tajti
    Feb 16, 2011 at 3:37 pm
    Feb 17, 2011 at 12:08 pm
  • Hi All, I have a Java application that uses Lucene 3.0.3 and runs fine. I wanted to use a servlet to turn this application into a web application. However, I got this error: java.lang.NoSuchMethodError: ...
    Feb 16, 2011 at 9:23 am
    Feb 16, 2011 at 9:54 am
  • Hi all, I am trying to index documents by phrases (multiple words) in the text, and want to get around the StandardAnalyzer for this field. (however, I will still use standardAnalyzer for the other ...
    Yuhan Zhang
    Feb 14, 2011 at 8:49 pm
    Feb 15, 2011 at 9:39 am
  • Hey All, I try to construct a boolean query that has to run on 3 different set of indexes: in two of them, it should query a field name "contents" and in one of them, it should query a field named ...
    Liat oren
    Feb 14, 2011 at 6:23 pm
    Feb 14, 2011 at 6:46 pm
  • Hi All, i am trying to get a sample index to which i can perform queries. can anyone point to a location where i can download such a index. for example index of wikipedia docs or any other such large ...
    Narayan bhati
    Feb 13, 2011 at 12:10 pm
    Feb 13, 2011 at 1:04 pm
  • Hi, I want to iterate over all documents in a given index. I've found the following piece of code [1]: IndexReader reader = // create IndexReader for (int i=0; i<reader.maxDoc(); i++) { if ...
    Georger Araujo
    Feb 12, 2011 at 2:08 pm
    Feb 12, 2011 at 5:55 pm
  • Hi, I'm building my own BooleanQuery (rather than using Query Parser). That's because I need different defaults from my users: If a user types: java program I need to run the query: +java* +program* ...
    Sol myr
    Feb 9, 2011 at 8:44 am
    Feb 10, 2011 at 8:35 am
  • Hi, I am coding a *local pdf search engine* in Java.(If someone did it before, could you please give some tips?) So I need query parse. Assume I want to search for "hello user" in the document. *Q1*. ...
    Gong Li
    Feb 8, 2011 at 4:37 pm
    Feb 9, 2011 at 10:22 am
  • Hi, I started using Lucene a few weeks ago, and I must say I'm amazed. Hats off to the developers and the community! I'd like to write a custom analyzer whose only difference to ...
    Georger Araujo
    Feb 6, 2011 at 8:29 pm
    Feb 8, 2011 at 3:58 pm
  • Hi, Do you know any good open source tool to extract text from MS outlook MSG files? 1) Apache Tika seems not to support *.msg yet. 2) Apache POI recently started to support *.msg (3.7 10/2010), but ...
    Zhang, Lisheng
    Feb 4, 2011 at 12:04 am
    Feb 6, 2011 at 7:25 am
  • Hi, I am developing an advanced pdf search engine in java by using pdfbox and lucene. And I must display the context of each keyword in the user interface, but i cannot find a method to do so. Most ...
    Gong Li
    Feb 3, 2011 at 5:32 pm
    Feb 3, 2011 at 5:51 pm
  • We are having issues with FileChannelClosed and are NOT calling Thread.interrupt. We also start to see AlreadyClosedException on Reader. * * we are running the latest 3.0.3 We have code in my lucene ...
    Jason Tesser
    Feb 25, 2011 at 2:23 pm
    Feb 25, 2011 at 6:31 pm
  • Hi all, in our project we're using lucene in tomcat. To avoid some overhead we have a shared IndexSearcher instance. In the past we had too many open files errors many times. To prevent this the ...
    Akos Tajti
    Feb 25, 2011 at 1:12 pm
    Feb 25, 2011 at 6:15 pm
  • Hi All, I need to create a boolean query with MUST clause as well as SHOULD Clause. The result I get is the one which has MUST Clause but there are no SHOULD Clause present. I need something like the ...
    Vaijanath Rao
    Feb 25, 2011 at 5:17 am
    Feb 25, 2011 at 7:49 am
  • Hi, I'm using lucene 3.0.3 on ubuntu and always getting ClosedChannelException: java.nio.channels.ClosedChannelException at sun.nio.ch.FileChannelImpl.ensureOpen(Unknown Source) at ...
    Akos Tajti
    Feb 23, 2011 at 11:25 am
    Feb 23, 2011 at 11:35 am
  • Hi I am trying to implement an Expectation Maximization algorithm for document clustering. I am planning to use Lucene Term Vectors for finding similarity between 2 documents. There are 2 kinds of EM ...
    Ajay Anandan
    Feb 21, 2011 at 7:57 pm
    Feb 22, 2011 at 8:02 am
  • Hello everybody, I was wondering, if someone could point me to what I need to be aware of, using a ParallelReader. My intention is to modify Nutch (http://nutch.apache.org/) in a way, that in the ...
    David Saile
    Feb 21, 2011 at 8:39 am
    Feb 21, 2011 at 8:53 am
  • Hey all, I'm somewhat new to Lucene. Meaning I used it some time ago for a parser we wrote to tokenize a document into word grams. the approach I took was simple as follows: 1. extended the lucene ...
    CassUser CassUser
    Feb 17, 2011 at 7:06 pm
    Feb 19, 2011 at 11:24 pm
  • Hello, I have a problem with documents that match the same query. I do not index anything that can clearly identify my documents (like an id). That's why I want to add a document that is already indexed ...
    Feb 16, 2011 at 10:40 am
    Feb 16, 2011 at 12:40 pm
  • Dear all, For anyone wanting to add some NLP abilities to Lucene, I've released a small library at https://github.com/larsmans/lucene-stanford-lemmatizer . This library performs part-of-speech ...
    Lars Buitinck
    Feb 8, 2011 at 4:51 pm
    Feb 14, 2011 at 6:36 pm
  • Hi, I need to generate a *single *executable JAR. In my code, it needs the wordnet index directory. So When I run the JAR, it needs local directory in my computer. And other computer can't run. Is ...
    Gong Li
    Feb 14, 2011 at 4:35 pm
    Feb 14, 2011 at 5:43 pm
  • Hi there, I am using Lucene for an actually quite simple search. I am not indexing long texts, but instead each document only has a couple of fields with texts from one word to a very short sentence ...
    Schmidt, Dennis
    Feb 13, 2011 at 11:51 pm
    Feb 14, 2011 at 6:14 am
  • Hi, I need to generate executable JAR. In my code, it has some lines as following: String path = "d:\\project\\"; File f = new File(path); Directory dir = FSDirectory.open(f); In the path, there is a ...
    Gong Li
    Feb 13, 2011 at 3:28 pm
    Feb 13, 2011 at 8:01 pm
  • Hi, I use standardAnalyzer, queryParser, highlighter in my program, but they lowercase the keywords. Now i need to search the keywords CASE SENSITIVE. Is there any methods to achieve this and also ...
    Gong Li
    Feb 10, 2011 at 6:14 pm
    Feb 10, 2011 at 10:01 pm
Group Overview
group: java-user @

82 users for February 2011

Simon Willnauer: 16 posts Ian Lea: 13 posts Gong Li: 11 posts Robert Muir: 10 posts Uwe Schindler: 9 posts Bernd Fehling: 8 posts Michael McCandless: 8 posts Jason Rutherglen: 7 posts Erick Erickson: 6 posts Ranjit Kumar: 6 posts Yonik Seeley: 6 posts Anshum: 5 posts Ganesh: 5 posts Zhang, Lisheng: 5 posts Akos Tajti: 4 posts Anuj Shah: 4 posts Findbestopensource: 4 posts Georger Araujo: 4 posts Phil Herold: 4 posts Burton-West, Tom: 3 posts