[ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662039#action_12662039 ]

Shai Erera commented on LUCENE-1482:
------------------------------------

Grant, given what I wrote below about having Lucene use the NOP adapter, are you still worried about the performance implications?

If there is a general reluctance to add a dependency on SLF4J, can we review the other option I suggested: turning infoStream into a class with static methods? That would at least allow adding more prints from other classes, w/o changing their API.

I prefer SLF4J because IMO logging is important, but having infoStream as a service class is better than what exists today (and I don't believe anyone can argue that calling a static method has any significant performance implications, if any at all).

If the committers want to drop that issue, please let me know and I'll close it. I don't like to nag :-)
Replace infoStream by a logging framework (SLF4J)
-------------------------------------------------

Key: LUCENE-1482
URL: https://issues.apache.org/jira/browse/LUCENE-1482
Project: Lucene - Java
Issue Type: Improvement
Components: Index
Reporter: Shai Erera
Fix For: 2.4.1, 2.9

Attachments: LUCENE-1482-2.patch, LUCENE-1482.patch, slf4j-api-1.5.6.jar, slf4j-nop-1.5.6.jar


Lucene uses infoStream to output messages in its indexing code only. For debugging purposes, when the search application runs on the customer side, getting messages from other code flows, such as search, query parsing and analysis, can be extremely useful.
There are two main problems with infoStream today:
1. It is owned by IndexWriter, so if I want to add logging capabilities to other classes I need to either expose an API or propagate infoStream to all classes (see for example DocumentsWriter, which receives its infoStream instance from IndexWriter).
2. Debugging can only be turned on or off for the entire code.
Introducing a logging framework allows each class to control its logging independently and, more importantly, allows the application to turn on logging for specific areas of the code only (e.g., org.apache.lucene.index.*).
I've investigated SLF4J (Simple Logging Facade for Java), which is, as its name states, a facade over different logging frameworks. As such, you include slf4j.jar in your application, and it recognizes at deploy time which logging framework you'd actually like to use. SLF4J comes with several adapters, for Java logging, Log4j and others. If you know your application uses Java logging, simply drop slf4j.jar and slf4j-jdk14.jar in your classpath, and your logging statements will use Java logging under the covers.
This makes the logging code very simple. For a class A the logger will be instantiated like this:
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class A {
      private static final Logger logger = LoggerFactory.getLogger(A.class);
    }
And will later be used like this:
    public class A {
      private static final Logger logger = LoggerFactory.getLogger(A.class);

      public void foo() {
        // Skip the call entirely when debug output is disabled.
        if (logger.isDebugEnabled()) {
          logger.debug("message");
        }
      }
    }
That's all!
Checking for isDebugEnabled is very quick, at least using the JDK14 adapter (but I assume it's fast also over other logging frameworks).
The important thing is, every class controls its own logger. Not all classes have to output logging messages, and we can improve Lucene's logging gradually, w/o changing the API, by adding more logging messages to interesting classes.
I will submit a patch shortly
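
To illustrate the "turn on logging for only specific areas" point, here is a minimal sketch that uses java.util.logging directly (the framework the slf4j-jdk14 adapter delegates to) rather than SLF4J itself, so that it runs without extra jars. The class name PackageLoggingSketch and the two class-level logger names are hypothetical, and the mapping of SLF4J's debug level to Level.FINE is my understanding of the JDK14 adapter rather than something stated in this issue.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class PackageLoggingSketch {
    // Hypothetical logger names that mirror the package layout mentioned above.
    // They are held in static fields so the loggers are not garbage collected.
    static final Logger INDEX_PKG = Logger.getLogger("org.apache.lucene.index");
    static final Logger INDEX_CLASS = Logger.getLogger("org.apache.lucene.index.SomeIndexClass");
    static final Logger SEARCH_CLASS = Logger.getLogger("org.apache.lucene.search.SomeSearchClass");

    public static void main(String[] args) {
        // Turn on debug-level checks for the index package only.
        INDEX_PKG.setLevel(Level.FINE);

        // This logger has no level of its own, so it inherits FINE from its parent.
        System.out.println("index debug enabled:  " + INDEX_CLASS.isLoggable(Level.FINE));
        // This logger falls back to the root level (INFO in the default JDK configuration).
        System.out.println("search debug enabled: " + SEARCH_CLASS.isLoggable(Level.FINE));
    }
}
```

Under the default JDK logging configuration the first check should come out enabled and the second disabled. Actually printing FINE messages would also require a handler whose level allows them, which is left out here.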
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


  • Yonik Seeley (JIRA) at Jan 8, 2009 at 5:21 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662044#action_12662044 ]

    Yonik Seeley commented on LUCENE-1482:
    --------------------------------------

    It seems we should take into consideration the performance of a real logger (not the NOP logger) because real applications that already use SLF4J can't use NOP adapter. Solr just switched to SLF4J for example.
  • Shai Erera (JIRA) at Jan 9, 2009 at 5:56 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662448#action_12662448 ]

    Shai Erera commented on LUCENE-1482:
    ------------------------------------

    As I wrote before, I believe that if someone uses a *real* logger, it's most probably because their application already uses such a logger in other places of the code, not necessarily just the search parts. Therefore, the performance implications of using a logger are not important, IMO.
    For the sake of argument, what if someone writes their own adapter, which performs really badly on isDebugEnabled() (for example) - is that the concern of the Lucene community?
    The way I view it, this patch gives those who want to control Lucene logging better control of it. The fact that Lucene ships with the NOP adapter means it will not be affected by the logger's isDebugEnabled() calls. If you want to always output the log messages, drop in an adapter which always returns true.

    I wonder if there is a general reluctance to use SLF4J at all, and that's why you continue to raise the performance implications. Because I seriously don't understand why you think that checking if debug is enabled can pose any performance hit, even when used with a *real* logger.
    If performance measurement is what's keeping this patch from being committed, I'll run one of the indexing algorithms w/ and w/o the patch. I'll use the NOP adapter and the Java logger adapter, so we'll have 3 measurements.
    However, if performance is not what's blocking this issue, please let me know now, so I won't spend test cycles for nothing.

    And ... I also proposed another alternative, which is not as good as logging IMO, but still better than what we have today: offer an InfoStream class with static methods verbose() and message(). It can be used by all Lucene classes w/o changing their API, and thus allows adding more messages gradually w/o being concerned w/ API backward compatibility.
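
A rough sketch of what such a static service class could look like follows. Only the method names verbose() and message() come from the proposal above; the class name InfoStreamSketch, the setStream() method, the component tag parameter and the output format are assumptions made for illustration, not what the patch contains.

```java
import java.io.PrintStream;

public final class InfoStreamSketch {
    // Destination for messages; null means verbose output is switched off.
    private static volatile PrintStream out = null;

    private InfoStreamSketch() {}

    // Hypothetical setter: the application installs (or clears) the stream once.
    public static void setStream(PrintStream stream) {
        out = stream;
    }

    // Cheap check that callers can use to avoid building messages needlessly.
    public static boolean verbose() {
        return out != null;
    }

    // Writes one line, prefixed with a short component tag chosen by the caller.
    public static void message(String component, String text) {
        PrintStream s = out; // read once, so a concurrent setStream(null) cannot cause an NPE
        if (s != null) {
            s.println(component + ": " + text);
        }
    }

    public static void main(String[] args) {
        setStream(System.out);
        if (verbose()) {
            message("IW", "flush started");
        }
    }
}
```

Any class could call InfoStreamSketch.message(...) without receiving a stream through its constructor, which is the API-compatibility benefit described above. The cost is that the switch is global rather than per class.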

    I prefer SLF4J, but if the committers are against it, then this one should be considered also.

    Shai
  • Yonik Seeley (JIRA) at Jan 9, 2009 at 7:06 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662476#action_12662476 ]

    Yonik Seeley commented on LUCENE-1482:
    --------------------------------------

    I'm not arguing for or against SLF4J at this point, but simply pointing out that I didn't think it was appropriate to base any analysis on the NOP adapter, which can't be used for any project already using SLF4J.

    I think using a logger to replace the infostream stuff is probably acceptable. What I personally don't want to see happen is instrumentation creep/bloat, where debugging statements slowly make their way all throughout Lucene.

    bq. Because I seriously don't understand why you think that checking if debug is enabled can pose any performance hit, even when used with a real logger.

    I've tried to explain - these calls can be costly if used in the wrong place, especially on the wrong processor architectures. What appears in an inner loop will vary widely by application, and there are a *ton* of Lucene users out there using it in all sorts of ways we can't imagine. For example, I'd rather not see debugging in Query/Weight/Scorer classes - for most applications, query and weight construction won't be a bottleneck, but there are some where it could be (running thousands of stored queries against each incoming document via MemoryIndex, for example).
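
One common way to limit the per-iteration cost described above is to evaluate the level check once, outside the loop. The sketch below is only an illustration under stated assumptions: it uses java.util.logging so that it runs standalone, and the class HoistedCheckSketch and method sumScores are hypothetical, not Lucene code.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class HoistedCheckSketch {
    private static final Logger LOG = Logger.getLogger(HoistedCheckSketch.class.getName());

    // Hypothetical hot loop: sums the scores and optionally logs each one.
    static long sumScores(int[] scores) {
        // Evaluate the logger check once, instead of once per iteration.
        final boolean debug = LOG.isLoggable(Level.FINE);
        long total = 0;
        for (int s : scores) {
            total += s;
            if (debug) {
                LOG.fine("score=" + s);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumScores(new int[] {1, 2, 3}));
    }
}
```

The trade-off is that a level change made while the loop is running is not picked up until the next call, and the local boolean test still remains inside the loop.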

  • Robert engels at Jan 9, 2009 at 7:11 pm
    This is not really true these days. Dynamic class instrumentation/
    byte modification can remove the calls entirely (for loggers not
    enabled). They can be enabled during startup (or a reload from a
    different class loader).

    See the paper at http://www.springerlink.com/content/ur00014m03275421
  • Robert engels at Jan 9, 2009 at 7:44 pm
    Also, see this http://venkatesans.com for an implementation (Frog)
    which injects logging at runtime.

    This is not really what I propose though. I think it is better to
    code the logging statements, and have them removed at runtime. It
    allows for more context sensitive logging statements.

    Normally this is done like this:

        if (Logger.isEnabled(loggername)) {
            Logger.log(loggername, xxx);
        }

    The runtime loader can detect the Logger.isEnabled() byte code, and
    remove the entire if statement during class loading.


  • Shalin Shekhar Mangar at Jan 9, 2009 at 8:31 pm

    On Sat, Jan 10, 2009 at 12:41 AM, robert engels wrote:

    > This is not really true these days. Dynamic class instrumentation/byte
    > modification can remove the calls entirely (for loggers not enabled). They
    > can be enabled during startup (or a reload from a different class loader).
    >
    > See the paper at http://www.springerlink.com/content/ur00014m03275421

    1. A user will need to write code using the technique in that paper to
    remove logging statements from Lucene if he does not want them?
    2. Or, Lucene can create two distributions, one with logging and the other
    with logging removed through bytecode modification.

    I think it is unfair to expect users to do #1. Is there an existing open
    source project which Lucene can add to the build process to do #2?

    If we forget the bytecode modification for a moment, how much cost does this
    add to Lucene when used by a real application with slf4j logging? (e.g. Solr
    uses the jdk adapter and no-op adapter cannot be used)

    --
    Regards,
    Shalin Shekhar Mangar.
  • Yonik Seeley at Jan 9, 2009 at 8:40 pm

    On Fri, Jan 9, 2009 at 3:31 PM, Shalin Shekhar Mangar wrote:
    > If we forget the bytecode modification for a moment, how much cost does this
    > add to Lucene when used by a real application with slf4j logging? (e.g. Solr
    > uses the jdk adapter and no-op adapter cannot be used)

    AFAIK, the infostream stuff is only in IndexWriter. There shouldn't
    be a measurable performance impact.

    My concerns had more to do with future patches and the places some
    users may want to start adding logging (potentially anywhere they have
    ever had a problem). I'm communicating those concerns now to get
    people to think twice before peppering Lucene full of logging
    statements. Sometimes a debugger is the right tool for the job
    instead.

    -Yonik

  • Robert engels at Jan 9, 2009 at 9:18 pm
    You only write one version - the one with logging statements.

    They will be removed at RUNTIME - given the proper class loader.

    The Frog library I referenced allows a degree of logging without
    writing any logging code - it is injected at runtime.
  • Shai Erera (JIRA) at Jan 11, 2009 at 6:52 am
    [ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662733#action_12662733 ]

    Shai Erera commented on LUCENE-1482:
    ------------------------------------

    "I think using a logger to replace the infostream stuff is probably acceptable. What I personally don't want to see happen is instrumentation creep/bloat, where debugging statements slowly make their way all throughout Lucene."

    Grant wrote a few posts back:
    "I also think it is important to address Yonik's point about 'inappropriate' places. In other words, we need guidelines about where and when to use logging, and committers need to be on the lookout for logging uses. I realize that is as much a community policing problem as a patch problem, but we should address them before we adopt logging."

    IMO, adding logging messages to outer classes, like QueryParser, is unnecessary since the application can achieve the same thing by itself (logging the input query text, the Analyzer used and the output Query object). But logging in internal places, like merging, is very important, because you usually can't reproduce the problem in your dev env. (it requires the exact settings of IndexWriter, the exact stream of documents and the exact operations (add/remove)).
    Like I said, logging in Lucene is mostly important when you're trying to debug an application which is out of your hands. Customers are rarely willing to share their content. Also, community-wise, being able to ask someone to provide a log of the operations that led to a certain problem is valuable. Today you can ask for that only for IndexWriter output, which may not be enough.

    "I've tried to explain - these calls can be costly if used in the wrong place, especially on the wrong processor architectures. What appears in an inner loop will vary widely by application, and there are a ton of Lucene users out there using it in all sorts of ways we can't imagine. For example, I'd rather not see debugging in Query/Weight/Scorer classes - for most applications, query and weight construction won't be a bottleneck, but there are some where it could be (running thousands of stored queries against each incoming document via MemoryIndex, for example)."

    I'm sorry, but I don't buy this (or I'm still missing something). What's the difference between logger.isDebugEnabled and indexOutput.writeInt? Both are method calls on a different object. Why is the latter acceptable and the former not?
    I'm not saying that we should drop OO design and programming, just pointing out that Lucene's code is already filled with method calls on different objects, inside as well as outside of loops.
    The only way I think you could claim the two are different is that indexOutput.writeInt is essential for Lucene's operation, while logger.isDebugEnabled is not. But I believe logging in Lucene is as important (and valuable) as encoding its data structures.
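
    To make the cost argument concrete, here is a minimal sketch of a guarded debug call inside a loop. SketchLogger is a hypothetical stand-in that only mimics the shape of isDebugEnabled()/debug(); it is not SLF4J, and processDocs and describe are made-up names. It only shows that, with debugging off, each iteration pays for the guard call and never builds the message. It does not measure how cheap that call is on any given architecture.

    ```java
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical stand-in for a logging facade (NOT the SLF4J API).
    class SketchLogger {
        private final boolean debugEnabled;
        final StringBuilder out = new StringBuilder();

        SketchLogger(boolean debugEnabled) {
            this.debugEnabled = debugEnabled;
        }

        boolean isDebugEnabled() {
            return debugEnabled;
        }

        void debug(String message) {
            if (debugEnabled) {
                out.append(message).append('\n');
            }
        }
    }

    class GuardedLoggingSketch {
        // Counts how many times a log message was actually built.
        static final AtomicInteger messagesBuilt = new AtomicInteger();

        // Stands in for potentially costly message construction.
        static String describe(int docId) {
            messagesBuilt.incrementAndGet();
            return "processed doc " + docId;
        }

        // Hypothetical inner loop with a guarded debug statement.
        static void processDocs(SketchLogger logger, int numDocs) {
            for (int docId = 0; docId < numDocs; docId++) {
                if (logger.isDebugEnabled()) {
                    logger.debug(describe(docId));
                }
            }
        }

        public static void main(String[] args) {
            processDocs(new SketchLogger(false), 1000);
            System.out.println("messages built with debug off: " + messagesBuilt.get());
            processDocs(new SketchLogger(true), 3);
            System.out.println("messages built with debug on: " + messagesBuilt.get());
        }
    }
    ```

    Whether the remaining guard call matters in a hot loop is exactly the point under dispute above; the sketch does not settle that.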
    Replace infoSteram by a logging framework (SLF4J)
    -------------------------------------------------

    Key: LUCENE-1482
    URL: https://issues.apache.org/jira/browse/LUCENE-1482
    Project: Lucene - Java
    Issue Type: Improvement
    Components: Index
    Reporter: Shai Erera
    Fix For: 2.4.1, 2.9

    Attachments: LUCENE-1482-2.patch, LUCENE-1482.patch, slf4j-api-1.5.6.jar, slf4j-nop-1.5.6.jar


    Lucene makes use of infoStream to output messages in its indexing code only. For debugging purposes, when the search application runs on the customer side, getting messages from other code flows, like search, query parsing, analysis, etc., can be extremely useful.
    There are two main problems with infoStream today:
    1. It is owned by IndexWriter, so if I want to add logging capabilities to other classes I need to either expose an API or propagate infoStream to all classes (see for example DocumentsWriter, which receives its infoStream instance from IndexWriter).
    2. Debugging can only be turned on or off for the entire code.
    Introducing a logging framework allows each class to control its logging independently and, more importantly, allows the application to turn on logging for only specific areas of the code (e.g., org.apache.lucene.index.*).
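
    As an illustration of that per-area control, here is a minimal sketch that uses java.util.logging directly (the JDK's built-in logging, chosen only so the example runs without extra jars). The logger names HypotheticalWriter and HypotheticalSearcher are made up for the example, and Level.FINE is used as a rough counterpart of a debug level. It assumes the default JDK logging configuration, where the root logger is at INFO.

    ```java
    import java.util.logging.Level;
    import java.util.logging.Logger;

    class PerPackageLoggingSketch {
        // Kept in static fields because java.util.logging holds loggers weakly.
        static final Logger INDEX_PKG =
            Logger.getLogger("org.apache.lucene.index");
        // Hypothetical per-class loggers in two different areas of the code.
        static final Logger WRITER =
            Logger.getLogger("org.apache.lucene.index.HypotheticalWriter");
        static final Logger SEARCHER =
            Logger.getLogger("org.apache.lucene.search.HypotheticalSearcher");

        // Turns on debug-level output for the index package only; loggers
        // below it inherit the level, loggers elsewhere keep the default.
        static void enableIndexDebugging() {
            INDEX_PKG.setLevel(Level.FINE);
        }

        public static void main(String[] args) {
            enableIndexDebugging();
            System.out.println("index debug enabled:  " + WRITER.isLoggable(Level.FINE));
            System.out.println("search debug enabled: " + SEARCHER.isLoggable(Level.FINE));
        }
    }
    ```

    In a deployed application the same effect would more likely come from a logging configuration file than from code, so that it can be changed on the customer side without a rebuild.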
    I've investigated SLF4J (Simple Logging Facade for Java), which is, as its name states, a facade over different logging frameworks. As such, you can include slf4j.jar in your application, and it recognizes at deploy time which logging framework you'd like to use. SLF4J comes with several adapters, for Java logging, Log4j and others. If you know your application uses Java logging, simply drop slf4j.jar and slf4j-jdk14.jar in your classpath, and your logging statements will use Java logging under the covers.
    This makes the logging code very simple. For a class A, the logger is instantiated like this:
        import org.slf4j.Logger;
        import org.slf4j.LoggerFactory;

        public class A {
            private static final Logger logger = LoggerFactory.getLogger(A.class);
        }
    And it is later used like this:
        public class A {
            private static final Logger logger = LoggerFactory.getLogger(A.class);

            public void foo() {
                // Guard the call so a disabled logger costs only this check.
                if (logger.isDebugEnabled()) {
                    logger.debug("message");
                }
            }
        }
    That's all!
    Checking isDebugEnabled is very quick, at least with the JDK14 adapter (but I assume it's also fast over other logging frameworks).
    The important thing is that every class controls its own logger. Not all classes have to output logging messages, and we can improve Lucene's logging gradually, w/o changing the API, by adding more logging messages to interesting classes.
    I will submit a patch shortly.

Discussion Overview
group: java-dev
categories: lucene
posted: Jan 8, '09 at 5:03p
active: Jan 11, '09 at 6:52a
posts: 10
users: 4
website: lucene.apache.org
