Hello,

I ran into a problem: Gwenview moves photos off the camera's memory and
renames them, and later I forgot which ones had been moved.
I then thought about writing a small script in Python, but I stumbled
over my ignorance of how to go about it.

Perhaps PIL can find similar pictures. I was thinking of converting the
photos to grayscale and resizing them to the same size; which algorithm
should I apply after that? Is PIL able to compare two images?

--
goto /dev/null


  • Billy Mays at Jul 8, 2011 at 12:37 pm

    On 07/08/2011 07:29 AM, TheSaint wrote:
    > Hello,
    >
    > I ran into a problem: Gwenview moves photos off the camera's memory and
    > renames them, and later I forgot which ones had been moved.
    > I then thought about writing a small script in Python, but I stumbled
    > over my ignorance of how to go about it.
    >
    > Perhaps PIL can find similar pictures. I was thinking of converting the
    > photos to grayscale and resizing them to the same size; which algorithm
    > should I apply after that? Is PIL able to compare two images?

    I recently wrote a program after reading an article (
    http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
    ) using the DCT method its author proposes. It worked surprisingly well,
    even with just the 64-bit hash it produces.

    --
    Bill
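
Once two pictures have each been reduced to a 64-bit hash, comparing them usually comes down to counting the bits that differ. A minimal sketch, assuming the hashes are plain Python integers; the names `hamming_distance` and `looks_similar` and the default threshold are illustrative choices, not taken from the program mentioned above:

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions in which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def looks_similar(hash_a: int, hash_b: int, max_bits: int = 10) -> bool:
    """Treat two hashes as 'probably the same picture' if few bits differ.

    max_bits is an assumed threshold; tune it on your own photos.
    """
    return hamming_distance(hash_a, hash_b) <= max_bits
```

A distance of 0 means the hashes are identical; what counts as "close enough" is something to find out by experiment.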
  • TheSaint at Jul 8, 2011 at 2:14 pm

    Billy Mays wrote:

    > It worked surprisingly well, even
    > with just the 64-bit hash it produces.

    I'd say that comparing two images reduced to as little as 32x32 pixels
    seems too coarse to tell whether one of two portraits has a smile that
    the other lacks.
    I think your suggestion and my idea are similar, but I'd like
    to scale the pictures to no less than 256x256 pixels.
    I'd also like to cover the wider case where the comparison involves a
    rotated image.

    --
    goto /dev/null
  • Billy Mays at Jul 8, 2011 at 2:32 pm

    On 07/08/2011 10:14 AM, TheSaint wrote:
    > Billy Mays wrote:
    >> It worked surprisingly well, even
    >> with just the 64-bit hash it produces.
    > I'd say that comparing two images reduced to as little as 32x32 pixels
    > seems too coarse to tell whether one of two portraits has a smile that
    > the other lacks.
    > I think your suggestion and my idea are similar, but I'd like
    > to scale the pictures to no less than 256x256 pixels.
    > I'd also like to cover the wider case where the comparison involves a
    > rotated image.

    Originally I thought the same thing. It turns out that doing a DCT on
    an image typically moves the more important data to the top left corner
    of the output. This means that most of the other data in the output can
    be thrown away, since most of it doesn't significantly affect the image.
    The 32x32 is an arbitrary size; you can make it any square block that
    you want.
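
The idea above can be sketched in plain Python. This is not the program from this post, only an illustration under stated assumptions: the image is already a square grid of grayscale values, the DCT is the unnormalized DCT-II, and the bits are set by comparing each low-frequency coefficient with the mean of those coefficients (leaving the first term out of the mean, as I read the linked article). The names `dct_2d_top_left`, `dct_hash` and `load_gray_grid` are made up for this sketch, and `load_gray_grid` needs Pillow installed.

```python
import math


def dct_2d_top_left(grid, keep=8):
    """Return the keep x keep low-frequency corner of the 2-D DCT-II of a square grid."""
    n = len(grid)
    # basis[u][x] = cos((2x + 1) * u * pi / (2n)), only for the frequencies we keep.
    basis = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
             for u in range(keep)]
    # Transform every row, keeping only the first `keep` horizontal frequencies.
    rows = [[sum(row[x] * basis[u][x] for x in range(n)) for u in range(keep)]
            for row in grid]
    # Then transform the columns of that intermediate result.
    return [[sum(rows[y][u] * basis[v][y] for y in range(n)) for u in range(keep)]
            for v in range(keep)]


def dct_hash(grid, keep=8):
    """Build a keep*keep-bit integer hash from the top-left DCT block."""
    flat = [c for row in dct_2d_top_left(grid, keep) for c in row]
    # Skip the first (DC) term when averaging; it mostly reflects overall brightness.
    mean = sum(flat[1:]) / (len(flat) - 1)
    bits = 0
    for coefficient in flat:
        bits = (bits << 1) | (1 if coefficient > mean else 0)
    return bits


def load_gray_grid(path, size=32):
    """Load an image as a size x size grayscale grid (requires Pillow)."""
    from PIL import Image
    with Image.open(path) as img:
        data = img.convert("L").resize((size, size)).tobytes()
    return [list(data[y * size:(y + 1) * size]) for y in range(size)]
```

With Pillow available, `dct_hash(load_gray_grid("photo.jpg"))` would give a 64-bit integer for a file; the file name here is only a placeholder.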

    Rotation is harder to detect. You can always take a brute-force approach:
    simply rotate the image a few times and run the
    algorithm on each of the rotated pictures. Image matching is a difficult
    problem.

    --
    Bill
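
The brute-force rotation check could look roughly like this. To keep the sketch short it uses a simple average hash (one bit per pixel, set when the pixel is above the mean) instead of the DCT hash discussed above, and it only tries the four 90-degree rotations; both are simplifications chosen for illustration, and all function names are hypothetical.

```python
def average_hash(grid):
    """Hash a small grayscale grid: one bit per pixel, set if above the mean."""
    flat = [value for row in grid for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def rotate_90(grid):
    """Rotate a square grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def min_distance_over_rotations(grid_a, grid_b):
    """Smallest Hamming distance between grid_b and any 90-degree rotation of grid_a."""
    target = average_hash(grid_b)
    best = None
    current = grid_a
    for _ in range(4):  # 0, 90, 180 and 270 degrees
        distance = bin(average_hash(current) ^ target).count("1")
        best = distance if best is None else min(best, distance)
        current = rotate_90(current)
    return best
```

Arbitrary angles would need real image rotation (and cropping), which this sketch does not attempt.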
  • Kevin Zhang at Jul 10, 2011 at 1:20 pm

    On Fri, Jul 8, 2011 at 8:37 PM, Billy Mays wrote:

    > I recently wrote a program after reading an article (
    > http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
    > ) using the DCT method its author proposes. It worked surprisingly well,
    > even with just the 64-bit hash it produces.

    The link you provided was great.
    It mentions an implementation of the hash algorithm in Python, although
    it is invalid, so I spent some time writing my own version.
    It works really well and more or less solved a problem I recently came
    across: finding duplicate pictures to delete.

    Thanks, Billy!

    P.S.
    If anyone's interested, please check out the source code in the attachment;
    any advice is welcome.
    -------------- next part --------------
    A non-text attachment was scrubbed...
    Name: pic_seeker.py
    Type: application/octet-stream
    Size: 3991 bytes
    Desc: not available
    URL: <http://mail.python.org/pipermail/python-list/attachments/20110710/de0a7013/attachment.obj>
  • Fulvio at Jul 11, 2011 at 2:50 pm

    Kevin Zhang wrote:

    > If anyone's interested, please check out the source code in the
    > attachment; any advice is welcome.

    I found that it isn't Python 3 code :(

    So the code should go into some other program to allow actions on those
    pictures which match each other. Am I right?
  • Kevin Zhang at Jul 12, 2011 at 1:30 am

    On Mon, Jul 11, 2011 at 10:50 PM, Fulvio wrote:

    > I found that it isn't Python 3 code :(

    It's written in Python 2.6.

    > So the code should go into some other program to allow actions on those
    > pictures which match each other. Am I right?

    The code leverages PIL to get the job done.
    PIL's performance is quite poor: although I did not measure it precisely,
    most of the time was spent on resizing pictures with PIL.
  • Thomas Jollans at Jul 8, 2011 at 3:16 pm

    On 07/08/2011 01:29 PM, TheSaint wrote:
    > Hello,
    >
    > I ran into a problem: Gwenview moves photos off the camera's memory and
    > renames them, and later I forgot which ones had been moved.
    > I then thought about writing a small script in Python, but I stumbled
    > over my ignorance of how to go about it.
    >
    > Perhaps PIL can find similar pictures. I was thinking of converting the
    > photos to grayscale and resizing them to the same size; which algorithm
    > should I apply after that? Is PIL able to compare two images?

    If Gwenview simply moves/renames the images, is it not enough to compare
    the actual files, byte by byte?
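
For a pair of files, the standard library already covers a byte-by-byte comparison. A minimal sketch; the wrapper name `same_bytes` is just for illustration:

```python
import filecmp


def same_bytes(path_a, path_b):
    """Return True if both files have exactly the same content."""
    # shallow=False forces a content comparison instead of trusting
    # the size and modification time reported by os.stat().
    return filecmp.cmp(path_a, path_b, shallow=False)
```

This only answers "are these two files identical?"; checking every photo against every other one this way gets slow for large collections.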
  • Dan Stromberg at Jul 8, 2011 at 3:29 pm

    On Fri, Jul 8, 2011 at 8:16 AM, Thomas Jollans wrote:
    > On 07/08/2011 01:29 PM, TheSaint wrote:
    >> Hello,
    >>
    >> I ran into a problem: Gwenview moves photos off the camera's memory and
    >> renames them, and later I forgot which ones had been moved.
    >> I then thought about writing a small script in Python, but I stumbled
    >> over my ignorance of how to go about it.
    >>
    >> Perhaps PIL can find similar pictures. I was thinking of converting the
    >> photos to grayscale and resizing them to the same size; which algorithm
    >> should I apply after that? Is PIL able to compare two images?
    > If Gwenview simply moves/renames the images, is it not enough to compare
    > the actual files, byte by byte?

    This'll detect duplicates pretty fast; it often doesn't even need to read a
    whole file once, while not sacrificing accuracy:

    http://stromberg.dnsalias.org/~dstromberg/equivalence-classes.html
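
The following is not the linked program, only a simpler sketch of the same general idea: avoid reading files that cannot possibly have a duplicate. It groups files by size first and hashes only those whose size collides; unlike the tool described above, it does read colliding files in full. The names `file_digest` and `find_duplicates` are illustrative.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large photos don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return lists of paths whose contents are identical."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate, so never read it
        by_digest = defaultdict(list)
        for path in same_size:
            by_digest[file_digest(path)].append(path)
        groups.extend(group for group in by_digest.values() if len(group) > 1)
    return groups
```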
  • Dave Angel at Jul 8, 2011 at 4:13 pm

    TheSaint wrote:
    > Hello,
    >
    > I ran into a problem: Gwenview moves photos off the camera's memory and
    > renames them, and later I forgot which ones had been moved.
    > I then thought about writing a small script in Python, but I stumbled
    > over my ignorance of how to go about it.
    >
    > Perhaps PIL can find similar pictures. I was thinking of converting the
    > photos to grayscale and resizing them to the same size; which algorithm
    > should I apply after that? Is PIL able to compare two images?

    If your real problem is identifying a renamed file amongst thousands of
    others, why not just compare the metadata? It'll be much faster.

    For example, if you only have one camera, the timestamp stored in the
    EXIF data would get you pretty close. Some cameras also store their "shutter
    release number" in the metadata, which would be even better.
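
A sketch of the timestamp idea, assuming a reasonably recent Pillow (one that provides `Image.getexif()` and `Exif.get_ifd()`). The tag numbers are the standard EXIF ones; the function names and the fallback from DateTimeOriginal to DateTime are choices made for this example. The shutter-release counter is left out because, as far as I know, it is stored in vendor-specific fields.

```python
from collections import defaultdict

EXIF_IFD_POINTER = 0x8769   # tag that points at the Exif sub-IFD
DATETIME_ORIGINAL = 0x9003  # DateTimeOriginal, stored in the Exif sub-IFD
DATETIME = 0x0132           # DateTime, stored in the main IFD


def exif_timestamp(path):
    """Return the capture timestamp string of an image, or None (requires Pillow)."""
    from PIL import Image
    with Image.open(path) as img:
        exif = img.getexif()
        return exif.get_ifd(EXIF_IFD_POINTER).get(DATETIME_ORIGINAL) or exif.get(DATETIME)


def group_by_timestamp(paths, read_timestamp=exif_timestamp):
    """Group files that share a timestamp; files without one are skipped."""
    groups = defaultdict(list)
    for path in paths:
        stamp = read_timestamp(path)
        if stamp is not None:
            groups[stamp].append(path)
    # Only timestamps seen more than once point at a possible renamed copy.
    return {stamp: found for stamp, found in groups.items() if len(found) > 1}
```

EXIF timestamps have one-second resolution, so burst shots can share a timestamp; a match is a candidate, not proof.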

    One concern is whether Gwenview or any other of your utilities discards
    the metadata. That would be a big mistake.

    Also, if Gwenview has no other features you're counting on, perhaps you
    should write your own "move the files from camera to computer" utility.
    That's what I did, and it renames and reorganises the files as it goes,
    according to my conventions, not someone else's. One reason for the
    renaming is that my cameras only use 4-digit numbers, and these recycle
    every 10000 images.
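
Such a utility can be quite small. The sketch below is not the script described in this post; its naming convention (file modification time as a prefix, original name kept as a suffix) and the name `import_photos` are assumptions made for illustration. It returns the old and new names so there is a record of what was moved.

```python
import shutil
from datetime import datetime
from pathlib import Path


def import_photos(source_dir, target_dir):
    """Move files from source_dir to target_dir, prefixing each name with a timestamp.

    Returns a list of (old_name, new_name) pairs for everything that was moved.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for photo in sorted(Path(source_dir).iterdir()):
        if not photo.is_file():
            continue
        # Assumption: the file's modification time stands in for the capture time.
        taken = datetime.fromtimestamp(photo.stat().st_mtime)
        new_name = f"{taken:%Y%m%d_%H%M%S}_{photo.name}"
        destination = target / new_name
        if destination.exists():
            continue  # never overwrite an earlier import
        shutil.move(str(photo), str(destination))
        moved.append((photo.name, new_name))
    return moved
```

Because the timestamp is part of the new name, a camera counter that wraps around after 10000 images no longer produces clashing file names.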

    DaveA
  • Fulvio at Jul 11, 2011 at 2:42 pm

    Thomas Jollans wrote:

    > If Gwenview simply moves/renames the images, is it not enough to compare
    > the actual files, byte by byte?

    For the job at hand I found Geeqie, which does it right. On the other hand,
    learning some PIL functions is one of my interests.
  • Fulvio at Jul 11, 2011 at 2:54 pm

    Dave Angel wrote:

    > If your real problem is identifying a renamed file amongst thousands of
    > others, why not just compare the metadata? It'll be much faster.

    That was the original situation; then, to dig deeper, I thought of
    something more sophisticated.
    Some years back there was a program that was brilliant and fast at finding
    similar pictures among several thousand.
    Now I can't recall the program's name, and it would be very interesting
    for some of my own experiments.

Discussion Overview
group: python-list @
categories: python
posted: Jul 8, '11 at 11:29a
active: Jul 12, '11 at 1:30a
posts: 12
users: 7
website: python.org
