FAQ
Hi Steve,

I went through the article. Thanks for the link. The span query mentions the
position i.e n positions from the terms. The problem was like this :

Lucene was <some more text> made by Doug <some more text> cutting

If Doug is found between the words Lucene and cutting then it is a hit.
[there can be any number of positions which is unknown]. If the number of
positions are known then the spans could be used.

I came across qsol where in the paragraphseperator and sentence seperator
can be specified and string can be searched within the paragraph.

Can you give your comments.

- Rgds
Ba3





Steven A Rowe wrote:
Hi ba3,

Did you read the Lucid Imagination article I linked to?:

http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/


It has examples, including specifying the term indicating the end of the
span.

If the article doesn't do it for you, I need more information to be able
to help. Can you give an example of what you want to do?

Thanks,
Steve
-----Original Message-----
From: ba3
Sent: Tuesday, July 28, 2009 10:39 PM
To: java-user@lucene.apache.org
Subject: RE: Multiline Regex with Lucene


Hi Steve,

In case of span queries, the span first query can specify the start of
the
span, is it possible to specify the term [not the position] indicating
the
end of the span ?

-- Regards
Ba3


Steven A Rowe wrote:
Hi ba3,

Check out the list of "Direct Known Subclasses" from the SpanQuery
javadocs to see what's available:
http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/spans/
SpanQuery.html
SpanRegexQuery may be what you're looking for:
http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/regex/
SpanRegexQuery.html

Steve
-----Original Message-----
From: ba3
Sent: Tuesday, July 28, 2009 12:53 PM
To: java-user@lucene.apache.org
Subject: Re: Multiline Regex with Lucene


Hi,

Thanks for the pointers. I will try the span queries.
But can span query support regexp as a term ?

Also for more details in the problem :
The problem is like this:
find a search string inside a block of statements.
The block starts with a string and ends with a character.

-- Regards
Ba3



Erick Erickson wrote:
I doubt you're thinking in terms of tokens. Your inputstream is
broken
up
into tokens (think of them as words,
depending upon the analyzer) and regex searchers are
confined to those *tokens*. So the concept of a multi-line
regex in a search is kind of ...odd...

You could possibly index your input as UN_TOKENIZED, but
I really have no clue what Lucene would do with that. I think
you're off in uncharted territory here.

Perhaps a better thing would be for you to explain *why* you
want to do this and perhaps folks can come up with some
suggestions, I suspect this may be an XY problem, see
http://www.perlmonks.org/index.pl?node_id=542341

Best
Erick

On Sun, Jul 26, 2009 at 9:52 AM, ba3 <sbadhrinath@gmail.com>
wrote:
I was trying to do a regex search with the lucene and
JavaUtilRegexCapabilities.
The code used is :
RegexQuery query = new RegexQuery(new
Term("contents","(?m)hello.*(\r[^#]*)This is to be
searched.*(\r[^#]*)#"));
query.setRegexImplementation(new JavaUtilRegexCapabilities());

I verified the regex in : http://www.gskinner.com/RegExr/ [with
the
multi
line checked]
In lucene though there are no hits. Can you please point me in
the
right
direction

-- Rgds
Ba3

Regex :
hello.*(\r[^#]*)This is to be searched.*(\r[^#]*)#

Content :
hello world
This is to be searched
#
Test line should not be selected
hello
This should not work
some other lines
#
Not to be selected
hello world
Some lines
This is to be searched
Some lines
#
hello earth
some lines
#
--
View this message in context:
http://www.nabble.com/Multiline-Regex-with-Lucene-
tp24667109p24667109.html
Sent from the Lucene - Java Users mailing list archive at
Nabble.com.

-----------------------------------------------------------------
----
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
--
View this message in context: http://www.nabble.com/Multiline-Regex-
with-
Lucene-tp24667109p24703547.html
Sent from the Lucene - Java Users mailing list archive at
Nabble.com.

--------------------------------------------------------------------
-
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

--
View this message in context: http://www.nabble.com/Multiline-Regex-
with-Lucene-tp24667109p24711445.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

--
View this message in context: http://www.nabble.com/Multiline-Regex-with-Lucene-tp24667109p24723404.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search Discussions

Discussion Posts

Previous

Follow ups

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 11 of 13 | next ›
Discussion Overview
groupjava-user @
categorieslucene
postedJul 26, '09 at 1:52p
activeJul 29, '09 at 5:14p
posts13
users5
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase