69 discussions - 256 posts

  • Hi all! Is there any separate parser for JSP files? Any option other than modifying the indexHTML.java class is appreciated. I already tried modifying this class; HTML parsing is fine, but jsp ...
    Pinky Iyer
    Apr 2, 2003 at 10:13 pm
    Apr 4, 2003 at 2:40 pm
  • If I use a StandardAnalyzer for indexing, is it important to provide it with the same stop words list for searching, as I used for indexing? Thanks, Dan ...
    Armbrust, Daniel C.
    Apr 11, 2003 at 2:34 pm
    Apr 15, 2003 at 10:08 pm
  • This is an exception that we've got: 03-04-30 03:00:12,806 ERROR [ThreadPoolThread 1] /usr/local/tomcat_4.1.24/myapps/cl/WEB-INF/search/_1t.f5 (Too many open files) java.io.FileNotFoundException: ...
    Andrey Akselrod
    Apr 30, 2003 at 3:09 pm
    Apr 30, 2003 at 4:38 pm
  • The example code that ships with Lucene includes the following snippet in HTMLDocument.java: // Add the last modified date of the file a field named "modified". Use a // Keyword field, so that it's ...
    Simon Lieschke
    Apr 7, 2003 at 4:44 am
    Apr 10, 2003 at 8:22 am
  • Hi, I have written a search engine for our intranet. I'm storing the file path as a KEYWORD field. While searching if I want to restrict my search results to a particular file path I'd use a query ...
    Biswas, Goutam_Kumar
    Apr 8, 2003 at 4:08 pm
    Apr 15, 2003 at 2:04 pm
  • replaceAll() is only available in JDK 1.4 and above. -----Original Message----- From: keelkerr@hotmail.com Sent: Thu 4/10/2003 12:09 AM To: Lucene Users List Cc: Subject: a simple highlight resolvent hi ...
    Hui Ouyang
    Apr 10, 2003 at 1:17 pm
    Apr 10, 2003 at 6:25 pm
  • Hi, I have been using queries like: filename:(txt) AND path:(/u/biswasg/Install*) with Lucene 1.2 which gave me correct results. I moved to Lucene 1.3 a while ago and find that these queries no ...
    Biswas, Goutam_Kumar
    Apr 27, 2003 at 7:02 am
    Apr 28, 2003 at 6:09 pm
  • Hi there, I apologise if this is obvious. I have read as much as I can in the FAQs and API docs but cannot see a solution for my problem. I would like to be able to search the entire index and use ...
    Ryan Moffat
    Apr 15, 2003 at 2:17 pm
    Apr 16, 2003 at 8:49 am
  • Hello, I need to perform the following search: (search my full-text fields using an OR condition) AND (search my category fields using an OR condition). This is what I tried to do: BooleanQuery bQuery = new ...
    Andrey Akselrod
    Apr 8, 2003 at 5:29 pm
    Apr 8, 2003 at 9:13 pm
  • Hi. I would like to write $SUBJ (HCDC), because LARM does not offer many options which are required by web/http crawling IMHO. Here is my list: 1. I would like to manage the decision what will be ...
    Leo Galambos
    Apr 22, 2003 at 7:56 am
    Jun 10, 2003 at 12:37 pm
  • I am proud to announce the latest version of PDFBox. This version comes with some exciting changes to the project. Bob Dickinson joins the PDFBox development team and brings 20 years of programming ...
    Ben Litchfield
    Apr 21, 2003 at 3:12 am
    May 3, 2003 at 4:36 pm
  • I would like to be able to search against roughly 100,000 documents, with each document containing roughly 10 fields, only 1 or 2 of which will be blocks of plain English text (the rest will be set ...
    Chris Miller
    Apr 30, 2003 at 5:54 pm
    Apr 30, 2003 at 11:27 pm
  • Hi All, I have submitted a bug report on this issue to Bugzilla, but since I haven't got any response to it, I am asking you guys. When RangeQuery.toString() produces its output, the result is something like this: ...
    Aviran Mordo
    Apr 28, 2003 at 2:44 pm
    Apr 29, 2003 at 1:07 am
  • Otis, Your suggestion worked. Thanks. However there is one more problem. If the path contains a '-' I do not get the results, even if I escape the '-'. For example: ...
    Biswas, Goutam_Kumar
    Apr 27, 2003 at 6:27 pm
    Apr 28, 2003 at 2:53 pm
  • Lucene's QueryParser seems to drop stop/key words even if they are enclosed in double quotes. For example: apple for tomato -- +apple +tomato Which is what I expected, however "apple for tomato" -- ...
    Victor Hadianto
    Apr 13, 2003 at 7:14 am
    Apr 14, 2003 at 1:10 pm
  • Hello, is there a simple way to query an index for documents which don't have a specific value (a keyword field in this instance)? For example, I have documents with a category field and I would like ...
    Apr 11, 2003 at 11:57 am
    Apr 11, 2003 at 1:27 pm
  • Hi, I am trying to append one index to another. How should I do it? Let me explain my problem; perhaps people can suggest a better way. I have indexed a set of PDF documents. These documents are ...
    Subhrajyoti Moitra
    Apr 7, 2003 at 5:54 am
    Apr 10, 2003 at 4:10 am
  • Hello everyone, I am trying to incrementally update an index, following the idea of deleting the modified file first and then re-adding it. Here is the source. But when I execute it, the index directory creates a ...
    Apr 3, 2003 at 3:22 am
    Apr 5, 2003 at 4:10 am
  • Hi everybody, I am trying to index documents using Lucene, generating about 30 MB of index (optimized), which can be raised to about 100 MB or more (but that would be on a high-end server machine). ...
    Amit Kapur
    Apr 2, 2003 at 9:44 am
    Apr 3, 2003 at 4:00 pm
  • If I wanted to build an index where all of the words were tagged with part-of-speech information, it seems that the type field of the Token would be the place to put this. But, as I understand it, ...
    Armbrust, Daniel C.
    Apr 24, 2003 at 5:09 pm
    Apr 30, 2003 at 11:20 pm
  • Hi all, Well, I got around my previous problem by switching to a different HTML parser. Now I have an even more subtle and frustrating problem! :( I'm using Che Dong's CJKTokenizer/CJKAnalyzer to do ...
    Apr 21, 2003 at 8:53 pm
    Apr 23, 2003 at 3:21 pm
  • Hello all, My project is a socket-based application. Whenever I receive a request, I have to index the document sent by the client using Lucene. Algorithm: 1. Create index reader object 2. Delete the ...
    Apr 18, 2003 at 5:34 am
    Apr 22, 2003 at 4:34 pm
  • Hi, I'm starting a new project and I have some questions about Lucene. The idea is to build a daemon to hold several indexes. The indexes will be created at start, and then updated throughout the day, with ...
    Jose Galiana
    Apr 22, 2003 at 7:59 am
    Apr 22, 2003 at 3:28 pm
  • Maybe I'm missing something, but doesn't it seem wrong that the IndexReader and Searcher have no method to identify when their underlying directory has been modified? Instead, you have to go all the ...
    Armbrust, Daniel C.
    Apr 16, 2003 at 4:05 pm
    Apr 17, 2003 at 6:55 pm
  • Hi! I am using Lucene 1.2 in a file sharing system. My average file count is about 400, totalling about 50 MB (small). When run on Linux with JDK 1.4.1 it is fine; however, using JDK 1.4.1 on Windows I ...
    Eoghan S
    Apr 2, 2003 at 7:29 pm
    Apr 2, 2003 at 8:36 pm
  • I was trying to implement the default boolean operator AND in the demo code SearchFiles.java. I did the following: QueryParser qp = new QueryParser("contents", analyzer); System.out.println("Value for ...
    Ramrakhiani, Vikas
    Apr 16, 2003 at 6:07 am
    May 15, 2003 at 1:42 pm
  • Is it possible to index several documents with different structures? For example, some documents have a title field, and other documents don't. Some documents have a keywords field and others don't. Are ...
    Jose Galiana
    Apr 30, 2003 at 2:34 pm
    Apr 30, 2003 at 2:52 pm
  • I was bored this week so I've been exploring how Lucene can be used for Role-Based Access Control. If you're interested, I have a blog entry at ...
    David Medinets
    Apr 28, 2003 at 11:00 pm
    Apr 29, 2003 at 5:04 pm
  • Hi, I have an index which is about 4 GB in size. Now if I optimize this index, it seems to take extra space of about 2-3 GB during the optimization step. Once this is done, the size comes down again. ...
    Harpreet S Walia
    Apr 29, 2003 at 12:26 pm
    Apr 29, 2003 at 4:54 pm
  • Hello everyone, I've got a document that I run through an information extraction engine that returns a list of concepts associated to a document with an appropriate relevancy factor (for example, ...
    Stephane Vaucher
    Apr 28, 2003 at 6:23 pm
    Apr 28, 2003 at 9:09 pm
  • First I created a RAMDirectory and then searched it when it was empty. And I got a NullPointerException. I looked in the API for a way to determine if the directory is empty so that I could add code ...
    David Medinets
    Apr 28, 2003 at 12:39 pm
    Apr 28, 2003 at 3:21 pm
  • I don't think so. However, this sounds like a scenario in which you should make your 2 threads communicate. Either through JVM or even something as simple as "I'm done building the index" file. Otis ...
    Otis Gospodnetic
    Apr 16, 2003 at 8:28 pm
    Apr 20, 2003 at 5:54 pm
  • I was wondering if there are any open source projects attempting to create a C++ implementation similar to Lucene. I use Lucene for a project which has modest text indexing needs for XML data, but I ...
    Marc Dumontier
    Apr 15, 2003 at 11:53 pm
    Apr 16, 2003 at 1:21 am
  • Is there any way to generate an enumeration of all searchable fields in an index? Other than doing a search that matches every document, and then examining every single document one by one? Thanks, Dan ...
    Armbrust, Daniel C.
    Apr 14, 2003 at 6:26 pm
    Apr 14, 2003 at 7:19 pm
  • Hi, I need some help. After doing a search (with my search engine), I generate a JSP that sends the browser the link where I found the string or word that I searched for (the string is inside an excel ...
    Apr 10, 2003 at 7:19 pm
    Apr 11, 2003 at 8:11 am
  • When I try to index Japanese HTML files using HTMLParser, I just get "lexical errors" in every file: Parse Aborted: Lexical error at line 12, column 28. Encountered: "\u2030" (8240), after : "" Is ...
    Apr 8, 2003 at 4:53 pm
    Apr 8, 2003 at 6:02 pm
  • The TestPhrasePrefixQuery looks like it is searching for "blueberry pi*" and it even seems to work at first glance. However, the test data is not extensive enough to show what is really happening. ...
    David Medinets
    Apr 22, 2003 at 6:14 pm
    May 1, 2003 at 12:39 am
  • Hi everybody, I'm new to PDFBox, which I recently downloaded (latest version: 0.6.1). It seems easy to use, but I'm having trouble using it: the following sample code I wrote compiles but raises ...
    Apr 30, 2003 at 7:05 pm
    Apr 30, 2003 at 7:11 pm
  • I've written an analyzer which uses a filter which I wrote which invokes LVG's (http://umlslex.nlm.nih.gov/lvg/2003/index.html) norm function on each token, and then, if there is more than one result ...
    Armbrust, Daniel C.
    Apr 29, 2003 at 9:49 pm
    Apr 30, 2003 at 1:35 pm
  • Hi, I have indexed a database table with Lucene. It works fine. I have a column "description" in the table. This field can contain values like "SI 6303 G", etc. I make a query like this: queryStr = ...
    Test2 Schwab
    Apr 30, 2003 at 7:09 am
    Apr 30, 2003 at 10:40 am
  • The code for this class can be found at http://affy.blogspot.com/2003_04_20_affy_archive.html#200194642. As I was working with Lucene the other day, I envisioned looking for a document about a ...
    David Medinets
    Apr 24, 2003 at 6:29 pm
    Apr 25, 2003 at 9:02 am
  • Hi, I am using Lucene 1.2 in a web application. Most queries work fine. However, when I tried to use a phrase query like the following: "basic service" The first result page is good (url like ...
    Apr 24, 2003 at 5:13 pm
    Apr 24, 2003 at 5:27 pm
  • Sorry to keep spamming the list like this, but I think I figured out the problem I posted about earlier today. The CJKTokenizer was returning an extra, empty token at the end of each run. Hence the ...
    Apr 21, 2003 at 10:56 pm
    Apr 22, 2003 at 2:16 am
  • Hi all, What I'm trying to do is exactly the same as in this discussion: http://nagoya.apache.org/eyebrowse/ReadMsg?listName=&msgId=117093 Basically I want to know from my search terms which one that ...
    Victor Hadianto
    Apr 14, 2003 at 6:38 am
    Apr 17, 2003 at 1:42 am
  • Hi, I was looking into the following query parser constructor: QueryParser public QueryParser(String f, Analyzer a) Constructs a query parser. I want to know what is the parameter f being passed to ...
    Ramrakhiani, Vikas
    Apr 16, 2003 at 5:57 am
    Apr 16, 2003 at 6:18 am
  • When I create an index, and then read back the available fields on my index using IndexReader.getFieldNames, the collection it returns always has one more field than I added. In the simplest case, ...
    Armbrust, Daniel C.
    Apr 15, 2003 at 3:41 pm
    Apr 16, 2003 at 3:37 am
  • I've noticed that whenever I add a document to the index, the doc.id() gets the next number. I use this behavior in order to sort the results by sorting the doc.id(). Let me explain: If I insert the ...
    Aviran Mordo
    Apr 14, 2003 at 7:42 pm
    Apr 15, 2003 at 2:59 am
  • Hi, I am using PDFBox to create LucenePDFDocument (org.pdfbox.searchengine.lucene.LucenePDFDocument). I want to know which field the content of the PDF document is stored in. Thanks, Vikas. ...
    Ramrakhiani, Vikas
    Apr 14, 2003 at 4:44 pm
    Apr 14, 2003 at 4:51 pm
  • OK, I have searched till I am blue in the face. I want to compile Lucene, so I downloaded the JavaCC2_1 compiler and tried to install it on a Windows 2000 machine. I typed c:\install java ...
    Apr 9, 2003 at 1:59 pm
    Apr 9, 2003 at 2:17 pm
  • Hi, I haven't touched Lucene for ages and am now revisiting it, so please bear with me. The following is my problem. I am indexing a bunch of emails and part of the field that I index in the ...
    Victor Hadianto
    Apr 9, 2003 at 1:57 am
    Apr 9, 2003 at 8:01 am
Group Overview
group: java-user @

75 users for April 2003

  • Otis Gospodnetic: 35 posts
  • Rob Outar: 23 posts
  • David Medinets: 12 posts
  • Armbrust, Daniel C.: 11 posts
  • Victor Hadianto: 9 posts
  • Aviran Mordo: 8 posts
  • Eric Isakson: 8 posts
  • Kerr: 8 posts
  • Andrey Akselrod: 6 posts
  • Biswas, Goutam_Kumar: 6 posts
  • Mchaput: 6 posts
  • Mmachado: 5 posts
  • Alex Murzaku: 5 posts
  • Mganesh: 5 posts
  • Ben Litchfield: 4 posts
  • Hui Ouyang: 4 posts
  • Amit Kapur: 3 posts
  • Blaplante: 3 posts
  • Che Dong: 3 posts
  • Chris Miller: 3 posts