On the IRC channel, I ran into somebody who was having problems with
optimizes on their Solr indexes taking a really long time. When
investigating, they found that during the optimize, *reads* were
happening on their SSD disk at over 800MB/s, but *writes* were
proceeding at only 20 MB/s.

Looking into ConcurrentMergeScheduler, I discovered that it does indeed
have a default write throttle of only 20 MB/s. I saw code that would
sometimes set the speed to unlimited, but had a hard time figuring out
what circumstances will result in the different settings, so based on
the user experience, I assume that the 20MB/s throttle must be applied
for Solr optimizes.

From what I can see in the code, there's currently no way in
solrconfig.xml to configure scheduler options like the maximum write
speed. Before I open an issue to add additional configuration
options for the merge scheduler, I thought it might be a good idea to
just double-check with everyone here to see whether there's something I
missed.

This is likely even affecting people who are not using SSD storage.
Most modern magnetic disks can easily exceed 20MB/s on both reads and
writes. Some RAID arrays can write REALLY fast.

Thanks,
Shawn
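
To put the 20 MB/s figure in perspective, here is a minimal sketch of how a fixed write cap behaves. It is not Lucene's code: the class and method names (WriteThrottleSketch, minSeconds, pauseNanos) are hypothetical, the 10 GB index size is made up for illustration, and 1 MB is assumed to be 1024 * 1024 bytes.

```java
public class WriteThrottleSketch {
    // Assumption: 1 MB = 1024 * 1024 bytes.
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    /** Shortest time, in seconds, in which `bytes` can be written under a cap of `mbPerSec`. */
    public static double minSeconds(long bytes, double mbPerSec) {
        return (bytes / BYTES_PER_MB) / mbPerSec;
    }

    /**
     * Nanoseconds a writer would have to sleep so that its average rate
     * since the start of the write stays at or below the cap.
     */
    public static long pauseNanos(long bytesWritten, double mbPerSec, long elapsedNanos) {
        long targetNanos = (long) (minSeconds(bytesWritten, mbPerSec) * 1_000_000_000L);
        return Math.max(0L, targetNanos - elapsedNanos);
    }

    public static void main(String[] args) {
        long tenGb = 10L * 1024 * 1024 * 1024; // hypothetical index size
        System.out.println("10 GB at 20 MB/s, seconds: " + minSeconds(tenGb, 20.0));
        System.out.println("10 GB at 500 MB/s, seconds: " + minSeconds(tenGb, 500.0));
    }
}
```

Under these assumptions, writing 10 GB takes at least 512 seconds at 20 MB/s, against roughly 20 seconds at 500 MB/s, which is why a cap of this size would dominate optimize time on fast storage if it were actually being applied.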

  • Michael McCandless at Jun 16, 2016 at 8:36 am
    Hmm, merging can't read at 800 MB/sec and only write at 20 MB/sec for very
    long ... unless there is a huge percentage of deletes.

    Also, by default CMS doesn't throttle forced merges (see
    CMS.get/setForceMergeMBPerSec).

    Maybe capture IndexWriter.setInfoStream output?

    Mike McCandless

    http://blog.mikemccandless.com
  • Shawn Heisey at Jun 16, 2016 at 8:04 pm

    On 6/16/2016 2:35 AM, Michael McCandless wrote:
    > Hmm, merging can't read at 800 MB/sec and only write at 20 MB/sec for
    > very long ... unless there is a huge percentage of deletes. Also, by
    > default CMS doesn't throttle forced merges (see
    > CMS.get/setForceMergeMBPerSec). Maybe capture
    > IndexWriter.setInfoStream output?

    I can see the problem myself. I have a RAID10 array with six SATA
    disks. When I click the Optimize button for a core that's several
    gigabytes, iotop shows me reads happening at about 100MB/s for several
    seconds, then writes clocking no more than 25 MB/s, and usually a lot
    less. The last several gigabytes that were written were happening at
    less than 5 MB/s. This is VERY slow, and does affect my nightly
    indexing processes.

    Asking the shell to copy a 5GB file revealed sustained write rates of
    over 500MB/s, so the hardware can definitely go faster.

    I patched in an option for solrconfig.xml where I could force it to call
    disableAutoIOThrottle(). I included logging in my patch to make
    absolutely sure that the new code was used. This option made no
    difference in the write speed. I also enabled infoStream, but either I
    configured it wrong or I do not know where to look for the messages. I
    was modifying and compiling branch_5_5.

    This is the patch that I applied:

    http://apaste.info/wKG

    I did see the expected log entries in solr.log when I restarted with the
    patch and the new option in solrconfig.xml.

    What else can I look at?

    Thanks,
    Shawn
  • Michael McCandless at Jun 17, 2016 at 2:31 pm
    Really we need the infoStream output, to see what IW is doing, to take so
    long merging.

    Likely only one merge thread is running (CMS tries to detect if your IO
    system "spins" and if so, uses 1 merge thread) ... maybe try configuring
    this to something higher since your RAID array can probably handle it?

    It's good that disabling auto IO throttling didn't change things ... that's
    what I expected (since forced merges are not throttled by default).

    Maybe capture all thread stacks and post back here?

    Mike McCandless

    http://blog.mikemccandless.com
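
The request to capture all thread stacks is usually handled from outside the process with the JDK's jstack tool (jstack followed by the Solr process ID). For cases where the dump has to be taken from inside the JVM, the following is a minimal sketch using only the standard library. The class name MergeThreadDump is hypothetical, the code has to run in the same JVM as Solr, and filtering on "Lucene Merge Thread" assumes that merge threads carry that name prefix.

```java
import java.util.Map;

public class MergeThreadDump {
    /** Returns the stack traces of all live threads whose name contains nameFilter. */
    public static String dump(String nameFilter) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().contains(nameFilter)) {
                continue; // not a thread we are interested in
            }
            out.append('"').append(t.getName()).append("\" state=")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // An empty filter matches every thread, which gives a full dump.
        System.out.print(dump(args.length > 0 ? args[0] : ""));
    }
}
```

Taking a few dumps several seconds apart during the optimize shows how many merge threads are alive and whether they are blocked, waiting, or actively doing I/O.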

Discussion Overview
group: solr-user @
categories: lucene
posted: Jun 16, '16 at 1:12a
active: Jun 17, '16 at 2:31p
posts: 4
users: 2
website: lucene.apache.org...