Integrate Solr benchmarking support into the Benchmark module
-------------------------------------------------------------

Key: SOLR-2646
URL: https://issues.apache.org/jira/browse/SOLR-2646
Project: Solr
Issue Type: New Feature
Reporter: Mark Miller
Assignee: Mark Miller
Fix For: 4.0


As part of my Buzzwords Solr perf talk, I did some work to allow some Solr benchmarking with the benchmark module.

I'll attach a patch with the current work I've done soon - there is still a fair amount to clean up and fix - a couple hacks or three - but it's already fairly useful.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


  • Mark Miller (JIRA) at Jul 18, 2011 at 10:59 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mark Miller updated SOLR-2646:
    ------------------------------

    Attachment: SOLR-2646.patch

    Still some to do here, but here is what I have at the moment. Larger issues that are left are:

    * cleanly integrate into the build (hack integration now)

    * improve error handling and reporting so that it's easier to create working algorithms.
  • Mark Miller (JIRA) at Jul 18, 2011 at 11:03 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mark Miller updated SOLR-2646:
    ------------------------------

    Attachment: Dev-SolrBenchmarkModule.pdf

    Attached is a brief rough guide to getting started writing or running an algorithm. Thanks to Martijn Koster for contributing improvements and additional info for it.
  • Mark Miller (JIRA) at Jul 18, 2011 at 11:11 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067377#comment-13067377 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    Also, as a reminder to myself - the SolrSearchTask is a bit of a hack right now - Query#toString police alert ;)
  • Mark Miller (JIRA) at Jul 18, 2011 at 11:21 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067386#comment-13067386 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    Some of the available settings (top of the alg file) that can be varied per round:
    {code}
    solr.server=(fully qualified classname)
    solr.streaming.server.queue.size=(int)
    solr.streaming.server.threadcount=(int)

    solr.internal.server.xmx=(eg 1000M)

    solr.configs.home=(path to config files to use)
    solr.schema=(schema.xml filename in solr.configs.home)
    solr.config=(solrconfig.xml filename in solr.configs.home)

    solr.field.mappings=(map benchmark field names to Solr schema names eg doctitle>title,docid>id,docdate>date)
    {code}
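The `solr.field.mappings` value above is a comma-separated list of `benchmarkField>solrField` pairs. As a minimal sketch of how such a value could be read, assuming nothing about how the patch itself parses it, the following Java helper turns the string into an ordered map. The class name `FieldMappings` and its `parse` method are hypothetical and are not part of the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not from the patch): parses a solr.field.mappings value. */
final class FieldMappings {

    private FieldMappings() {}

    /** Returns benchmark field name -> Solr schema field name, in declaration order. */
    static Map<String, String> parse(String value) {
        Map<String, String> mappings = new LinkedHashMap<>();
        if (value == null || value.trim().isEmpty()) {
            return mappings;
        }
        for (String pair : value.split(",")) {
            // Each entry is expected to look like "doctitle>title".
            String[] parts = pair.trim().split(">");
            if (parts.length != 2 || parts[0].trim().isEmpty() || parts[1].trim().isEmpty()) {
                throw new IllegalArgumentException("Expected 'from>to' but got: " + pair);
            }
            mappings.put(parts[0].trim(), parts[1].trim());
        }
        return mappings;
    }

    public static void main(String[] args) {
        // prints {doctitle=title, docid=id, docdate=date}
        System.out.println(parse("doctitle>title,docid>id,docdate>date"));
    }
}
```

A `LinkedHashMap` keeps the pairs in the order they were written, which makes the parsed result easy to compare against the alg file by eye.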
  • Michael McCandless (JIRA) at Jul 19, 2011 at 10:22 am
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067604#comment-13067604 ]

    Michael McCandless commented on SOLR-2646:
    ------------------------------------------

    This is awesome Mark! We badly need to be able to easily benchmark Solr.
  • Mark Miller (JIRA) at Jul 27, 2011 at 9:43 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mark Miller updated SOLR-2646:
    ------------------------------

    Attachment: SOLR-2646.patch

    New patch -

    * A variety of little improvements in error handling and messages. Slightly better handling of starting/stopping solr internally (a lot I'd like to improve still though).

    * Also adds the log param to StartSolrServer so that you can use StartSolrServer(log) to pump the Solr logs to the console. Very useful when developing an algorithm and to be sure it's doing what you think it is.

    * Also now actually points to the correct configs folder in the internal example algs, and doesn't silently use the example config (or last used) when it cannot find the specified config file.
  • Mark Miller (Updated) (JIRA) at Feb 11, 2012 at 7:27 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mark Miller updated SOLR-2646:
    ------------------------------

    Attachment: SOLR-2646.patch

    to trunk
  • Robert Muir (Commented) (JIRA) at Feb 13, 2012 at 5:19 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206990#comment-13206990 ]

    Robert Muir commented on SOLR-2646:
    -----------------------------------

    What else do you need to get this in... cleaner integration into the build?
  • Mark Miller (Commented) (JIRA) at Feb 13, 2012 at 6:33 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207043#comment-13207043 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    Yeah - I guess that is my biggest problem - for example, I hack into the benchmark module to find the Solr jars - which is why you have to run ant dist first (and it uses the Solr example, so you have to run example).

    {noformat}
    + <!-- used to run solr benchmarks -->
    + <pathelement path="../../solr/dist/apache-solr-solrj-4.0-SNAPSHOT.jar" />
    + <fileset dir="../../solr/dist/solrj-lib">
    + <include name="**/*.jar" />
    + </fileset>
    {noformat}

    It is even hardcoded for 4.0-SNAPSHOT at the moment - that can be wild-carded, but it's still a little nasty.

    There are certainly plenty of other rough edges, but that is the largest hack issue probably.
  • Mark Miller (JIRA) at Jul 9, 2012 at 12:48 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mark Miller updated SOLR-2646:
    ------------------------------

    Attachment: SOLR-2646.patch

    A patch taking things to trunk.
  • Mark Miller (JIRA) at Jul 9, 2012 at 12:50 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mark Miller updated SOLR-2646:
    ------------------------------

    Attachment: SolrIndexingPerfHistory.pdf

    I took a little time and tested Solr indexing performance on trunk over about the past year and a half. I also added some numbers from 3.6 for comparison.

    This benchmark tests both a single indexing thread, as well as 4 threads with the concurrent solr server.

    I test indexing 10,000 Wikipedia docs and do 4 runs (serial, concurrent, serial, concurrent). I toss the first 2 runs and record the second 2 runs. I do this once at the end of each month.
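The run pattern described above (four runs, with the first two treated as warm-up and discarded) can be sketched in Java as follows. This is only an illustration of the bookkeeping, not code from the patch: the class `WarmupRuns`, the method `recorded`, and the run labels are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration: keep only the runs that follow the warm-up runs. */
final class WarmupRuns {

    private WarmupRuns() {}

    /** Returns a view of the runs that remain after dropping the first warmupCount runs. */
    static <T> List<T> recorded(List<T> runs, int warmupCount) {
        if (warmupCount < 0 || warmupCount > runs.size()) {
            throw new IllegalArgumentException("warmupCount out of range: " + warmupCount);
        }
        return runs.subList(warmupCount, runs.size());
    }

    public static void main(String[] args) {
        // Labels stand in for the serial/concurrent run order; they are made up.
        List<String> runs = Arrays.asList("serial-1", "concurrent-1", "serial-2", "concurrent-2");
        System.out.println(recorded(runs, 2)); // prints [serial-2, concurrent-2]
    }
}
```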
  • David Smiley (JIRA) at Jul 9, 2012 at 3:27 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409566#comment-13409566 ]

    David Smiley commented on SOLR-2646:
    ------------------------------------

    According to the note at the bottom of SolrIndexingPerfHistory.pdf, it appears trunk is slower than 3.6 -- how could that be?
  • Mark Miller (JIRA) at Jul 9, 2012 at 3:42 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409577#comment-13409577 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    Could not tell you. Could be a large variety of things.

    This is a test using the current example configs shipped at each date - which means it's not always apples to apples if the default config changes. Analysis could have changed for our default English text. New defaults for features or ease of use may have been enabled.

    For example, I believe the update log is on by default now for durability and realtime GET, etc.

    Also, some code paths have changed to support various new features.

    Also, Lucene is changing underneath us, so we should probably compare to some similar benchmark there (I know Mike publishes quite a few that could be looked at).

    It's not so easy to dig in after the fact with month resolution.

    At some point, it would be nice to have this automated and published as Lucene is - then we could run it nightly.

    There is some work to do to get there though (don't know that I'll have time for it in the near future), and we would need a good consistent machine to run it on (I could probably run it at night or something).

    I have not attempted to track anything down other than the broad numbers right now.

    This is simply to start a record that can help as we move forward in evaluating how changes impact performance.

    Obviously the single threaded path has not been affected - so whatever has changed, it's likely mostly around concurrency.
  • Mark Miller (JIRA) at Jul 15, 2012 at 10:57 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414803#comment-13414803 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    I've got a fair amount of this automated now. It's still somewhat hacky though.

    Because you need to apply the benchmark patch to get things working, I count on that checkout existing and being patched in a specific location. It drives the benchmark, but talks to a running Solr that is started from a checkout. I use git so that it's really cheap to flip through revs and run benchmarks.

    The main driver is an ugly .sh script - it accepts a few params (name of the chart, where to write result files, location of alg file, date range of checkouts to run the alg against, and the interval to try between days).

    For instance, you might say, run the indexing benchmark over the period of 2012-01-04 to 2012-07-15 and do it once for every 5 days.
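The date stepping in that example can be sketched in Java. The actual driver is a shell script that has not been posted, so this is only an assumption about the date arithmetic it needs; the class `CheckoutDates` and the method `every` are hypothetical, and the git checkout and benchmark invocation for each date are left out.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: list the checkout dates to benchmark between two dates. */
final class CheckoutDates {

    private CheckoutDates() {}

    /** Returns start, start + interval, ... up to and including endInclusive. */
    static List<LocalDate> every(LocalDate start, LocalDate endInclusive, int intervalDays) {
        if (intervalDays <= 0) {
            throw new IllegalArgumentException("intervalDays must be positive");
        }
        List<LocalDate> dates = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(endInclusive); d = d.plusDays(intervalDays)) {
            dates.add(d);
        }
        return dates;
    }

    public static void main(String[] args) {
        List<LocalDate> dates = every(LocalDate.of(2012, 1, 4), LocalDate.of(2012, 7, 15), 5);
        System.out.println(dates.size() + " checkouts, from " + dates.get(0)
                + " to " + dates.get(dates.size() - 1));
    }
}
```

For the period 2012-01-04 to 2012-07-15 at a 5-day interval this yields 39 dates, the last being 2012-07-12, since the next step would pass the end date.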

    The script runs each benchmark and dumps the output into a folder.

    Then I have a simple Java command-line app that will process the results folder. It takes a chart name, the location of the results folder, and a list of named regexes - each regex pointing to the pertinent data to pull from the results file. The Java app pulls out all the data, writes a CSV file, and outputs a simple line chart.
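The named-regex extraction step can be sketched as below. This is not the unposted app: the class `ResultExtractor`, its methods, the two regexes, and the sample result text are all made up for illustration, and the real benchmark output format may look quite different. Charting is omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: pull named values out of a result file and emit CSV rows. */
final class ResultExtractor {

    private ResultExtractor() {}

    /**
     * Applies each named regex to the text. The first capture group of the first
     * match is the value; an empty string is used when nothing matches.
     * Every pattern must define at least one capture group.
     */
    static List<String> extract(Map<String, Pattern> namedRegexes, String resultText) {
        List<String> values = new ArrayList<>();
        for (Pattern pattern : namedRegexes.values()) {
            Matcher m = pattern.matcher(resultText);
            values.add(m.find() ? m.group(1) : "");
        }
        return values;
    }

    /** Joins a row label (for example the checkout date) and the values into one CSV line. */
    static String csvRow(String label, List<String> values) {
        return label + "," + String.join(",", values);
    }

    public static void main(String[] args) {
        // Names and patterns are assumptions; insertion order fixes the column order.
        Map<String, Pattern> regexes = new LinkedHashMap<>();
        regexes.put("serial", Pattern.compile("serial docs/sec: ([0-9.]+)"));
        regexes.put("concurrent", Pattern.compile("concurrent docs/sec: ([0-9.]+)"));

        // Made-up sample text standing in for one benchmark result file.
        String sample = "serial docs/sec: 100.0\nconcurrent docs/sec: 250.5\n";

        System.out.println("date," + String.join(",", regexes.keySet()));
        System.out.println(csvRow("2012-01-04", extract(regexes, sample)));
    }
}
```

Keeping the regexes in a `LinkedHashMap` means the CSV header and the value columns stay in the same order without extra bookkeeping.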

    I don't know how cleaned up this will get, so I won't post any of it for now - but I may get to the point of running some stuff locally automatically and pushing to a webserver with the charts etc., à la Lucene.
  • Mark Miller (JIRA) at Jul 15, 2012 at 10:57 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mark Miller updated SOLR-2646:
    ------------------------------

    Attachment: chart.jpg
  • Mark Miller (JIRA) at Jul 15, 2012 at 10:59 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414804#comment-13414804 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    Attached an example generated chart. Would probably end up embedding that in HTML. The Lucene stuff uses a JavaScript charting lib, but I don't really want to deal with JavaScript - would rather stick to Java when I can.
  • Erick Erickson (JIRA) at Jul 17, 2012 at 1:37 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416176#comment-13416176 ]

    Erick Erickson commented on SOLR-2646:
    --------------------------------------

    Way cool!

    Is there any chance that we could report MB/sec in addition to, or instead of, docs/sec? I suspect that's a more meaningful number for comparisons. Or perhaps just count the bytes sent to Solr and post that as a footnote? Yeah, yeah, yeah, the analysis chain will change things... but a "doc" is an even more variable thing...

    Actually, I guess that this number could be counted once since the data set doesn't change that rapidly.

    FWIW
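Because the data set is fixed, the conversion suggested here only needs the total byte count measured once. A rough Java sketch of the arithmetic follows; the class `Throughput`, the method name, and the 20 MiB total are assumptions for illustration (only the 10,000-doc count comes from this thread), and "MB" is taken to mean 1024 * 1024 bytes.

```java
/** Hypothetical sketch: convert a docs/sec figure into MB/sec for a fixed data set. */
final class Throughput {

    private Throughput() {}

    /**
     * totalBytes and docCount describe the whole (constant) data set, so the
     * average document size is computed once and reused for every run.
     */
    static double megabytesPerSecond(long totalBytes, long docCount, double docsPerSecond) {
        if (docCount <= 0) {
            throw new IllegalArgumentException("docCount must be positive");
        }
        double bytesPerDoc = (double) totalBytes / docCount;
        return docsPerSecond * bytesPerDoc / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long totalBytes = 20L * 1024 * 1024; // made-up size of the data set
        long docCount = 10_000;              // doc count mentioned in this thread
        double docsPerSecond = 500.0;        // made-up measurement
        // With these inputs the result is approximately 1.0 MB/sec.
        System.out.println(megabytesPerSecond(totalBytes, docCount, docsPerSecond));
    }
}
```

This only rescales docs/sec by the average document size, so it inherits any noise in the docs/sec measurement.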
  • Mark Miller (JIRA) at Jul 17, 2012 at 2:23 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416237#comment-13416237 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    It's a constant data set that the test runs on - simply a static dump of Wikipedia articles (a one-doc-per-line file).

    Every checkout the benchmark runs against uses exactly the same Wikipedia docs.

    You can currently compare with Lucene to some degree by looking at change over time, since both indicate indexing speed.

    I'm sure that we can figure MB/s the same way the Lucene stuff does - but it might be a hack unless you can do it purely in the benchmark package. My current system just extracts info from benchmark result files - so it can extract the result of any benchmark you can make - if that's an MB/s result, that's no problem. I think perhaps, though, the Python-driven Lucene stuff might even do some external stuff on its own? I don't know for sure.
  • Lance Norskog (JIRA) at Jul 17, 2012 at 3:01 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416261#comment-13416261 ]

    Lance Norskog commented on SOLR-2646:
    -------------------------------------

    Are there strategies to keep the disk cache consistent across runs? Linux has a feature to clear it (poke a 0 somewhere in /proc).
  • Robert Muir (JIRA) at Jul 17, 2012 at 3:05 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416267#comment-13416267 ]

    Robert Muir commented on SOLR-2646:
    -----------------------------------

    The Python script does this on Linux:

    {noformat}
    echo 3 > /proc/sys/vm/drop_caches
    {noformat}

    and this on Windows:
    {noformat}
    for /R %I in (*) do fsutil file setvaliddata %I %~zI
    {noformat}
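
For a harness written in Python, the Linux command above might be wrapped along these lines. This is only a sketch under stated assumptions: the helper names are hypothetical, the write needs root, and a sync is issued first because dirty pages are not dropped. It does not attempt the Windows variant.

```python
import os
import sys

# Linux control file from the comment above; writing "3" asks the kernel
# to drop the page cache plus dentries and inodes.
DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"


def can_drop_caches(platform=sys.platform, euid=None):
    """Return True only on Linux when running as root."""
    if not platform.startswith("linux"):
        return False
    if euid is None:
        euid = os.geteuid()
    return euid == 0


def drop_caches(path=DROP_CACHES_PATH):
    """Flush dirty pages, then ask the kernel to drop clean caches.

    The path is a parameter only so the function can be exercised
    against an ordinary file without root.
    """
    os.sync()  # dirty pages cannot be dropped, so write them out first
    with open(path, "w") as f:
        f.write("3\n")
```

A benchmark driver could call `drop_caches()` before each timed run when `can_drop_caches()` returns true, and otherwise note in its report that the cache state was not controlled.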
  • Mark Miller (JIRA) at Jul 17, 2012 at 3:30 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416287#comment-13416287 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    bq. Are there strategies to keep the disk cache consistent across runs?

    I have a warm phase that basically runs a slightly shorter version of the bench to try to be fair here. I was tossing the first round (there are 2) and the warm phase so that things were on a more even playing field.
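
The approach of discarding the warm phase and the first round could be sketched like this. It is an illustration only: the run labels (`warm`, `round0`, `round1`) and the sample rates are hypothetical, not taken from the actual result files.

```python
from statistics import mean


def steady_state_rate(runs, discard=("warm", "round0")):
    """Average docs/sec over the runs whose label is not discarded.

    `runs` is a list of (label, docs_per_sec) pairs in execution order.
    """
    kept = [rate for label, rate in runs if label not in discard]
    if not kept:
        raise ValueError("no runs left after discarding warm-up runs")
    return mean(kept)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    runs = [("warm", 900.0), ("round0", 1400.0), ("round1", 1500.0)]
    print(steady_state_rate(runs))
```

With only two rounds, this leaves a single measurement per checkout, so any noise in that one round goes straight into the reported number.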
  • Mark Miller (JIRA) at Jul 17, 2012 at 3:32 pm
    [ https://issues.apache.org/jira/browse/SOLR-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416288#comment-13416288 ]

    Mark Miller commented on SOLR-2646:
    -----------------------------------

    bq. The Python script does this on Linux:

    Great! I'll add this to my sh script.

Discussion Overview
group: dev @
category: lucene
posted: Jul 10, '11 at 2:30p
active: Jul 17, '12 at 3:32p
posts: 23
users: 1
website: lucene.apache.org
