Recently some large, badly behaved image-scraping sites started mangling
their requests for the images on my site. I get thousands of requests for
URLs like:

    "/blah/blah/images/Image-5.jpg" width="128" height="49"
    alt="image"/></a> </div> <div class="c0 r"><a href="/m/imgres?..."

This is the request URI verbatim, not just a snippet of the HTML on the
client side.

As a result, Google Webmaster Tools has been reporting tons of Soft 404
errors on my sites.

The problem is that I have mod_speling enabled, and I use it only to
correct the case of the URL:

          CheckSpelling On
          CheckCaseOnly On

For some reason the module doesn't behave correctly and returns a 300
response with the following body:

Multiple Choices
The document name you requested (/blah/blah/images/Image-5.jpg"
width="128" height="49" alt="image"/></a> </div> <div class="c0 r"><a
href="/m/imgres) could not be found on this server. However, we found
documents with names similar to the one you requested.

Available documents:

      /blah/blah/images/Image-5.jpg/></a> </div> <div class="c0 r"><a
href="/m/imgres?q=foobar (common basename)

The image /blah/blah/images/Image-5.jpg exists on the filesystem.
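
For anyone who wants to reproduce this, here is a minimal sketch in Python. It
takes the document name exactly as it appears in the error page above and
percent-encodes it so that it can be sent on a request line. The host name
`localhost`, the port, and the helper names `encode_path` and `fetch_status`
are assumptions made for illustration. I am also assuming that sending the
path percent-encoded triggers the same behavior, since the server decodes the
path before looking it up.

```python
import http.client
from urllib.parse import quote

# Document name copied from the 300 error page above (query string omitted).
MANGLED_PATH = (
    '/blah/blah/images/Image-5.jpg" width="128" height="49" '
    'alt="image"/></a> </div> <div class="c0 r"><a href="/m/imgres'
)


def encode_path(raw_path: str) -> str:
    """Percent-encode everything except '/' and unreserved characters."""
    return quote(raw_path, safe="/")


def fetch_status(host: str, raw_path: str, port: int = 80) -> int:
    """Send a GET for the encoded path and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", encode_path(raw_path))
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical host: replace with the server under test.
    print(fetch_status("localhost", MANGLED_PATH))
```

With CheckSpelling On, I would expect this to print 300 on an affected
server; with CheckSpelling Off, 404.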

There are two problems here:

1. Why does it return a 300 response?
2. Why is the offered "available document" bogus?

Turning CheckSpelling off makes the server correctly report this as a 404.
But then I lose the ability to correct the case of misspelled URLs, which
is another huge problem, as many clients don't respect the case
sensitivity of URLs.

What is the right course of action in this case? Is it a bug in
mod_speling, or am I missing some other configuration?

Server version: Apache/2.2.23 (Unix)

I looked through the svn history. There haven't been any changes in the
last 5 years, so it's probably pointless to try the very latest version.

(The same problem exists on Apache/1.3.)

Thank you.

Stas Bekman http://stasosphere.com
http://stason.org http://chestofbooks.com
http://vitalitylink.com http://healingcloud.com


Posted Jun 18, '13 at 9:53p