[ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662733#action_12662733 ]

Shai Erera commented on LUCENE-1482:

I think using a logger to replace the infostream stuff is probably acceptable. What I personally don't want to see happen is instrumentation creep/bloat, where debugging statements slowly make their way all throughout Lucene.

Grant wrote a few posts back:
"I also think it is important to address Yonik's point about "inappropriate" places. In other words, we need guidelines about where and when to using logging and committers need to be on the lookout for logging uses. I realize that is as much a community policing problem as a patch problem, but, we should address them before we adopt logging."

IMO, adding logging messages to outer classes, like QueryParser, is unnecessary since the application can achieve the same thing by itself (logging the input query text, used Analyzer and the output Query object). But logging internal places, like merging, is very important, because you usually can't reproduce it in your dev env. (it requires the exact settings to IndexWriter, the exact stream of documents and the exact operations (add/remove)).
Like I said, logging in Lucene is mostly important when you're trying to debug an application which is out of your hands. Customers are rarely willing to share their content. Also, community-wise, being able to ask someone to drop a log of operations that has happened and caused a certain problem is valuable. Today you can ask it only on IndexWriter output, which may not be enough.

I've tried to explain - these calls can be costly if used in the wrong place, esp on the wrong processor architectures. What appears in inner loop will vary widely by application, and there are a ton of lucene users out there using it in all sorts of ways we can't imagine. For example, I'd rather not see debugging in Query/Weight/Scorer classes - for most applications, query and weight construction won't be a bottleneck, but there are some where it could be (running thousands of stored queries against each incoming document via memoryindex for example).

I'm sorry, but I don't buy this (or I'm still missing something). What's the difference between logger.isDebugEnabled to indexOutput.writeInt? Both are method calls on a different object. Why is the latter acceptable and the former not?
I'm not saying that we should drop any OO design and programming, but just pointing out that Lucene's code is already filled with many method calls on different objects, inside as well as outside of loops.
The only way I think you could claim the two are different is because indexOutput.writeInt is essential for Lucene's operation, while logger.isDebugEnabled is not. But I believe logging in Lucene is as much important (and valuable) as encoding its data structures.
Replace infoSteram by a logging framework (SLF4J)

Key: LUCENE-1482
URL: https://issues.apache.org/jira/browse/LUCENE-1482
Project: Lucene - Java
Issue Type: Improvement
Components: Index
Reporter: Shai Erera
Fix For: 2.4.1, 2.9

Attachments: LUCENE-1482-2.patch, LUCENE-1482.patch, slf4j-api-1.5.6.jar, slf4j-nop-1.5.6.jar

Lucene makes use of infoStream to output messages in its indexing code only. For debugging purposes, when the search application is run on the customer side, getting messages from other code flows, like search, query parsing, analysis etc can be extremely useful.
There are two main problems with infoStream today:
1. It is owned by IndexWriter, so if I want to add logging capabilities to other classes I need to either expose an API or propagate infoStream to all classes (see for example DocumentsWriter, which receives its infoStream instance from IndexWriter).
2. I can either turn debugging on or off, for the entire code.
Introducing a logging framework can allow each class to control its logging independently, and more importantly, allows the application to turn on logging for only specific areas in the code (i.e., org.apache.lucene.index.*).
I've investigated SLF4J (stands for Simple Logging Facade for Java) which is, as it names states, a facade over different logging frameworks. As such, you can include the slf4j.jar in your application, and it recognizes at deploy time what is the actual logging framework you'd like to use. SLF4J comes with several adapters for Java logging, Log4j and others. If you know your application uses Java logging, simply drop slf4j.jar and slf4j-jdk14.jar in your classpath, and your logging statements will use Java logging underneath the covers.
This makes the logging code very simple. For a class A the logger will be instantiated like this:
public class A {
private static final logger = LoggerFactory.getLogger(A.class);
And will later be used like this:
public class A {
private static final logger = LoggerFactory.getLogger(A.class);
public void foo() {
if (logger.isDebugEnabled()) {
That's all !
Checking for isDebugEnabled is very quick, at least using the JDK14 adapter (but I assume it's fast also over other logging frameworks).
The important thing is, every class controls its own logger. Not all classes have to output logging messages, and we can improve Lucene's logging gradually, w/o changing the API, by adding more logging messages to interesting classes.
I will submit a patch shortly
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Search Discussions

Discussion Posts


Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 10 of 10 | next ›
Discussion Overview
groupjava-dev @
postedJan 8, '09 at 5:03p
activeJan 11, '09 at 6:52a



site design / logo © 2021 Grokbase