I'm writing a tool to do some binary file comparisons.
I'm opening the file using

fd=open(filename,'rb')

# Need to seek to 0x80 (hex 80th) location

fd.seek(0x80)

# Need to read just 8 bytes and get the result back in hex format.
x=fd.read(8)
print x

This prints out garbage. I would like to know what I am missing here.
Basically, I am trying to read
8 bytes from location 0x80 of a binary file called "filename".

Any tips/inputs are welcome.

thanks!

  • Chris Rebert at Mar 4, 2009 at 11:15 pm

    On Wed, Mar 4, 2009 at 2:58 PM, vibgyorbits wrote:
    > I'm writing a tool to do some binary file comparisons.
    > I'm opening the file using
    >
    > fd=open(filename,'rb')
    >
    > # Need to seek to 0x80 (hex 80th) location
    >
    > fd.seek(0x80)
    >
    > # Need to read just 8 bytes and get the result back in hex format.
    > x=fd.read(8)
    > print x
    >
    > This prints out garbage. I would like to know what I am missing here.

    `print x` outputs the raw bytes in the bytestring `x` you just read,
    which yes, generally looks like gibberish.
    It doesn't magically convert it to hex format. Remember that
    bytestrings could just as well contain ASCII in other situations,
    which you certainly wouldn't want to see as hex. You'll have to
    explicitly/manually do the conversion from bytes->hex.

    Cheers,
    Chris

    --
    I have a blog:
    http://blog.rebertia.com
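
A minimal sketch of the point above, assuming Python 3 (the thread itself uses Python 2). The sample values are the eight bytes shown in the `foo.xls` session further down the thread; the variable names are only illustrative.

```python
# Python 3 sketch: the same bytes viewed as "text" and as hex.
raw = bytes([0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1])

# Forcing the bytes to be read as characters (Latin-1 here) yields gibberish.
as_text = raw.decode("latin-1")

# The explicit bytes -> hex conversion.
as_hex = raw.hex()
print(as_hex)  # d0cf11e0a1b11ae1

# Bytes that happen to be ASCII read fine as text, so hex is not always wanted.
ascii_bytes = b"plain text"
print(ascii_bytes.decode("ascii"))  # plain text
```

Note that Python 3's `print(raw)` shows an escaped form such as `b'\xd0\xcf...'` rather than raw terminal garbage, but the conversion to hex still has to be requested explicitly.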
  • Rhodri James at Mar 4, 2009 at 11:19 pm

    On Wed, 04 Mar 2009 22:58:38 -0000, vibgyorbits wrote:

    > I'm writing a tool to do some binary file comparisons.
    > I'm opening the file using
    >
    > fd=open(filename,'rb')
    >
    > # Need to seek to 0x80 (hex 80th) location
    >
    > fd.seek(0x80)
    >
    > # Need to read just 8 bytes and get the result back in hex format.
    > x=fd.read(8)
    > print x
    >
    > This prints out garbage. I would like to know what I am missing here.

    Your bytes are being interpreted as characters when you print the buffer,
    and the chance of them being meaningful text is probably small. Try the
    following:

    for b in x:
        print hex(ord(b))

    Does that look more like what you were expecting?

    --
    Rhodri James *-* Wildebeeste Herder to the Masses
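
For comparison, a sketch of the same per-byte loop under Python 3 (an assumption; the post above is Python 2). Iterating over a `bytes` object there yields integers, so `ord()` is not needed. The sample data stands in for `fd.read(8)`.

```python
# Python 3 sketch: each item of a bytes object is already an int.
x = bytes([0xD0, 0xCF, 0x11, 0xE0])  # stand-in for fd.read(8)

lines = [hex(b) for b in x]              # '0xd0', '0xcf', ...
padded = [format(b, "#04x") for b in x]  # always two hex digits after '0x'

for line in lines:
    print(line)
```

`hex()` drops leading zeros (`hex(15)` is `'0xf'`), which is why the zero-padded `format(b, "#04x")` variant is shown alongside it.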
  • Tino Wildenhain at Mar 4, 2009 at 11:28 pm

    Rhodri James wrote:
    > On Wed, 04 Mar 2009 22:58:38 -0000, vibgyorbits wrote:
    >
    > > I'm writing a tool to do some binary file comparisons.
    > > I'm opening the file using
    > >
    > > fd=open(filename,'rb')
    > >
    > > # Need to seek to 0x80 (hex 80th) location
    > >
    > > fd.seek(0x80)
    > >
    > > # Need to read just 8 bytes and get the result back in hex format.
    > > x=fd.read(8)
    > > print x
    > >
    > > This prints out garbage. I would like to know what I am missing here.
    >
    > Your bytes are being interpreted as characters when you print the
    > buffer, and the chance of them being meaningful text is probably small.
    > Try the following:
    >
    > for b in x:
    >     print hex(ord(b))

    better:

    print x.encode("hex")

    Cheers
    Tino
  • Rhodri James at Mar 4, 2009 at 11:51 pm

    On Wed, 04 Mar 2009 23:28:32 -0000, Tino Wildenhain wrote:

    > Rhodri James wrote:
    > > On Wed, 04 Mar 2009 22:58:38 -0000, vibgyorbits <bkajey at gmail.com>
    > > wrote:
    > > > I'm writing a tool to do some binary file comparisons.
    > > > I'm opening the file using
    > > >
    > > > fd=open(filename,'rb')
    > > >
    > > > # Need to seek to 0x80 (hex 80th) location
    > > >
    > > > fd.seek(0x80)
    > > >
    > > > # Need to read just 8 bytes and get the result back in hex format.
    > > > x=fd.read(8)
    > > > print x
    > > >
    > > > This prints out garbage. I would like to know what I am missing here.
    > > Your bytes are being interpreted as characters when you print the
    > > buffer, and the chance of them being meaningful text is probably small.
    > > Try the following:
    > >
    > > for b in x:
    > >     print hex(ord(b))
    >
    > better:
    >
    > print x.encode("hex")

    Encodings make my head hurt :-) While there are programmatic purposes
    I'd leap at the "hex" encoder for, it doesn't make for the most
    human-readable output. I'll stick with the for loop, if you don't mind.


    --
    Rhodri James *-* Wildebeeste Herder to the Masses
  • John Machin at Mar 5, 2009 at 12:37 am

    On Mar 5, 10:51 am, "Rhodri James" wrote:
    > On Wed, 04 Mar 2009 23:28:32 -0000, Tino Wildenhain <t... at wildenhain.de>
    > wrote:
    >
    > > Rhodri James wrote:
    > > > On Wed, 04 Mar 2009 22:58:38 -0000, vibgyorbits <bka... at gmail.com>
    > > > wrote:
    > > > > I'm writing a tool to do some binary file comparisons.
    > > > > I'm opening the file using
    > > > > fd=open(filename,'rb')
    > > > > # Need to seek to 0x80 (hex 80th) location
    > > > > fd.seek(0x80)
    > > > > # Need to read just 8 bytes and get the result back in hex format.
    > > > > x=fd.read(8)
    > > > > print x
    > > > > This prints out garbage. I would like to know what I am missing here.
    > > > Your bytes are being interpreted as characters when you print the
    > > > buffer, and the chance of them being meaningful text is probably small.
    > > > Try the following:
    > > > for b in x:
    > > >     print hex(ord(b))
    > > better:
    > > print x.encode("hex")
    > Encodings make my head hurt :-) While there are programmatic purposes
    > I'd leap at the "hex" encoder for, it doesn't make for the most
    > human-readable output. I'll stick with the for loop, if you don't mind.

    One byte per line??

    >>> x = open('foo.xls', 'rb').read(8)
    >>> ' '.join(z.encode('hex') for z in x)
    'd0 cf 11 e0 a1 b1 1a e1'
    >>> ' '.join(z.encode('hex') for z in x).upper()
    'D0 CF 11 E0 A1 B1 1A E1'
    >>>
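
The space-separated layout above can be sketched in Python 3 as follows (an assumption; the session above is Python 2, where `z.encode('hex')` still works). The literal bytes replace the read from `foo.xls` so the snippet runs on its own.

```python
import sys

# Same eight bytes as in the session above, written out literally.
data = bytes([0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1])

# One two-digit hex field per byte, joined with spaces.
spaced = " ".join(f"{b:02x}" for b in data)
print(spaced)          # d0 cf 11 e0 a1 b1 1a e1
print(spaced.upper())  # D0 CF 11 E0 A1 B1 1A E1

# Python 3.8 added an optional separator argument to bytes.hex().
if sys.version_info >= (3, 8):
    same = data.hex(" ")
else:
    same = spaced
```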
  • Benjamin Peterson at Mar 5, 2009 at 1:13 am

    Tino Wildenhain <tino <at> wildenhain.de> writes:
    > Rhodri James wrote:
    > > for b in x:
    > >     print hex(ord(b))
    >
    > better:
    >
    > print x.encode("hex")

    even better:

    import binascii
    print binascii.hexlify(some_bytes)
  • John Machin at Mar 5, 2009 at 1:25 am

    On Mar 5, 12:13 pm, Benjamin Peterson wrote:
    > Tino Wildenhain <tino <at> wildenhain.de> writes:
    > > Rhodri James wrote:
    > > > for b in x:
    > > >     print hex(ord(b))
    > > better:
    > > print x.encode("hex")
    >
    > even better:
    >
    > import binascii
    > print binascii.hexlify(some_bytes)

    AFAICT binascii.hexlify(some_bytes) gives the SAME result as
    some_bytes.encode("hex") for much more typing -- I see no "better"
    here.
  • Benjamin Peterson at Mar 5, 2009 at 4:13 am

    John Machin <sjmachin <at> lexicon.net> writes:
    > On Mar 5, 12:13 pm, Benjamin Peterson wrote:
    > >
    > > import binascii
    > > print binascii.hexlify(some_bytes)
    >
    > AFAICT binascii.hexlify(some_bytes) gives the SAME result as
    > some_bytes.encode("hex") for much more typing -- I see no
    > "better" here.

    So-called encodings like "hex" and "rot13" are an abuse of the
    encode() method. encode() should translate
    between byte strings and unicode, not perform
    transformations like that. This has been removed
    in 3.x, so you should use binascii.
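
A short Python 3 sketch of the `binascii` route, with `some_bytes` filled in with illustrative data. It also shows, as a side note not raised in the thread, that the hex codec remains reachable through `codecs.encode()` even though `bytes` objects have no `encode()` method in 3.x.

```python
import binascii
import codecs

some_bytes = b"\xd0\xcf\x11\xe0"  # illustrative data

# hexlify returns bytes in Python 3; decode if a str is wanted.
hexed = binascii.hexlify(some_bytes)
hexed_text = hexed.decode("ascii")
print(hexed_text)  # d0cf11e0

# unhexlify is the inverse operation.
round_trip = binascii.unhexlify(hexed)

# bytes has no encode() method in Python 3 ...
has_encode = hasattr(some_bytes, "encode")

# ... but the bytes-to-bytes hex codec is still available via the codecs module.
via_codecs = codecs.encode(some_bytes, "hex")
```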





    From: Paul Rubin
    Date: 04 Mar 2009 20:15:42 -0800
    Subject: Inverse of dict(zip(x,y))

    Steven D'Aprano <steven at REMOVE.THIS.cybersource.com.au> writes:
    > Sure, but if you want two lists, as the OP asked for, then you have to
    > iterate over it twice either way:
    >
    > # method 1:
    > keys = dict.keys()
    > values = dict.values()
    >
    > # method 2:
    > keys, values = zip(*dict.items())
    >
    > First you iterate over the dict to get the items, then you iterate over
    > the items to split into two lists. Anyone want to take bets on which is
    > faster?

    The first way involves iterating over the dict items twice. The
    second way iterates over the dict items just once, copying them to
    another place; it then iterates over the copy.
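
A small Python 3 sketch of the two methods being compared, using a made-up dict named `d` (the quoted code calls its variable `dict`, which shadows the built-in). It only checks that both methods produce matching results; it makes no claim about which is faster.

```python
d = {"a": 1, "b": 2, "c": 3}  # hypothetical sample data

# Method 1: two separate passes over the dict.
keys1 = list(d.keys())
values1 = list(d.values())

# Method 2: one pass to build the item pairs, then zip(*...) splits them.
# This yields tuples, and the unpacking fails on an empty dict.
keys2, values2 = zip(*d.items())

print(keys1, values1)
print(keys2, values2)
```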
  • Tino Wildenhain at Mar 5, 2009 at 7:30 am

    Benjamin Peterson wrote:
    > John Machin <sjmachin <at> lexicon.net> writes:
    > > On Mar 5, 12:13 pm, Benjamin Peterson wrote:
    > > > import binascii
    > > > print binascii.hexlify(some_bytes)
    > > AFAICT binascii.hexlify(some_bytes) gives the SAME result as
    > > some_bytes.encode("hex") for much more typing -- I see no
    > > "better" here.
    >
    > So-called encodings like "hex" and "rot13" are an abuse of the
    > encode() method. encode() should translate
    > between byte strings and unicode, not perform
    > transformations like that. This has been removed
    > in 3.x, so you should use binascii.

    That's actually not what I understand of the encoding/decoding
    methods (which are very handy, beside the pure charset
    conversions). That is, they translate between multiple (e.g.
    1-n) byte strings to a single byte encoding (for encode),
    and the other way round for decode.

    Charset mapping is surely the original purpose, but I see no
    reason why all the "pseudo" encodings are bad, since after
    all they are encodings (base64, hex, ... even gzip).
    What is missing at the moment would be urlencoding.

    Cheers
    Tino
  • Hendrik van Rooyen at Mar 5, 2009 at 7:37 am
    "Benjamin Peterson" wrote:

    > So-called encodings like "hex" and "rot13" are an abuse of the
    > encode() method. encode() should translate
    > between byte strings and unicode, not perform
    > transformations like that. This has been removed
    > in 3.x, so you should use binascii.

    When all else fails, and just for fun, go to first principles:

    >>> hextab = ['0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F']
    >>> s = 'the quick brown fox jums OVER the lazy dog'
    >>> h = []
    >>> for x in s:
    ...     h.append(hextab[ord(x)>>4])
    ...     h.append(hextab[ord(x)&15])
    ...
    >>> print ''.join(h)
    74686520717569636B2062726F776E20666F78206A756D73204F56455220746865206C617A7920646F67
    >>>

    - Hendrik
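
The same first-principles lookup, sketched for Python 3 (an assumption; the session above is Python 2). The function name `to_hex` and the constant `HEXTAB` are illustrative, and `binascii` is imported only to cross-check the result.

```python
import binascii

HEXTAB = "0123456789ABCDEF"  # index = nibble value, character = hex digit


def to_hex(data: bytes) -> str:
    """Convert bytes to uppercase hex by looking up each 4-bit nibble."""
    out = []
    for byte in data:                    # each item is an int in 0..255
        out.append(HEXTAB[byte >> 4])    # high nibble
        out.append(HEXTAB[byte & 0x0F])  # low nibble
    return "".join(out)


sample = b"the quick brown fox"
print(to_hex(sample))

# Cross-check against the library routine (which emits lowercase).
matches = to_hex(sample) == binascii.hexlify(sample).decode("ascii").upper()
```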
  • Vibgyorbits at Mar 5, 2009 at 3:17 pm
    OK, thanks all.

    import string
    l=map(lambda x: '%02x' %ord(x),d)
    s=string.join(l,sep='')

    PS. Ended up learning a little bit about lambda functions. :-)

    Scott David Daniels: thanks for your wisdom about the "spaces".
    It's a 3-liner code snippet!
  • Marco Mariani at Mar 5, 2009 at 3:24 pm

    vibgyorbits wrote:

    > l=map(lambda x: '%02x' %ord(x),d)
    > s=string.join(l,sep='')
    >
    > PS. Ended up learning a little bit about lambda functions. :-)

    That's so 2007...

    The 2.5-esque way to write that is

    s = ''.join('%02x' % ord(x) for x in d)
  • Vibgyorbits at Mar 5, 2009 at 3:27 pm

    On Mar 5, 9:24 am, Marco Mariani wrote:
    > vibgyorbits wrote:
    > > l=map(lambda x: '%02x' %ord(x),d)
    > > s=string.join(l,sep='')
    > > PS. Ended up learning a little bit about lambda functions. :-)
    >
    > That's so 2007...
    >
    > The 2.5-esque way to write that is
    >
    > s = ''.join('%02x' % ord(x) for x in d)

    Yes. :-) I totally agree. Still learning some new stuff. Some of you
    folks are really good.
    Thanks again. The next step would be to really get into the
    algorithmic complexity, but right now
    let me finish my tool and then dig into the finer details.
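
Since the stated goal is a binary comparison tool, here is one way the comparison step might look in Python 3. This is only a sketch: the thread never shows the tool itself, so the function name `diff_region`, the report format, and the sample data are all hypothetical.

```python
def diff_region(a, b, base=0x80):
    """Return (offset, byte_a, byte_b) for each position where a and b differ.

    `base` is the file offset the regions were read from. zip() stops at the
    shorter input, so a length mismatch has to be checked separately.
    """
    return [(base + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


region_a = b"ABCDEFGH"  # stand-in for 8 bytes read from the first file
region_b = b"ABCDEFGX"  # stand-in for 8 bytes read from the second file

report = ["%#x: %02x != %02x" % d for d in diff_region(region_a, region_b)]
for line in report:
    print(line)  # 0x87: 48 != 58
```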
  • Ben Finney at Mar 5, 2009 at 12:03 am

    vibgyorbits <bkajey at gmail.com> writes:

    > I'm writing a tool to do some binary file comparisons.
    > I'm opening the file using
    >
    > fd=open(filename,'rb')
    >
    > # Need to seek to 0x80 (hex 80th) location
    >
    > fd.seek(0x80)
    >
    > # Need to read just 8 bytes and get the result back in hex format.
    > x=fd.read(8)
    > print x
    >
    > This prints out garbage. I would like to know what I am missing here.

    Are you missing anything? Perhaps those bytes, when printed in your
    terminal's encoding, *are* garbage. Are you expecting them to be
    encoded text, or something else?

    > Basically, I am trying to read 8 bytes from location 0x80 from a
    > binary file called "filename"

    You have successfully done that. What else do you want to do with
    those bytes once read?

    Perhaps a better question: What goal are you trying to accomplish?

    --
    \ "Pinky, are you pondering what I'm pondering?" "Wuh, I think |
    `\ so, Brain, but wouldn't anything lose its flavor on the bedpost |
    _o__) overnight?" -- _Pinky and The Brain_ |
    Ben Finney
  • Ben Finney at Mar 5, 2009 at 12:10 am
    I just found a well-hidden part of the behaviour you expected.

    vibgyorbits <bkajey at gmail.com> writes:
    > # Need to read just 8 bytes and get the result back in hex format.
    > x=fd.read(8)
    > print x

    Why would this print the bytes in hex format? "Convert to hexadecimal"
    is not the default text encoding for "print".

    Instead, use the built-in "hex" function to create a new string with
    the value of the bytes:

    bytes = fd.read(8)
    for byte in bytes:
        # In Python 2 each item is a 1-character str, so ord() is needed first.
        print hex(ord(byte))

    If that's not sufficient, you'll need to be more explicit in
    describing what it is you want to do.

    --
    \ "Pinky, are you pondering what I'm pondering?" "I think so, |
    `\ Brain, but if it was only supposed to be a three hour tour, why |
    _o__) did the Howells bring all their money?" -- _Pinky and The Brain_ |
    Ben Finney
  • Scott David Daniels at Mar 5, 2009 at 1:25 am

    vibgyorbits wrote:
    > I'm writing a tool to do some binary file comparisons.
    > I'm opening the file using
    > fd=open(filename,'rb')
    > # Need to seek to 0x80 (hex 80th) location
    > fd.seek(0x80)
    > # Need to read just 8 bytes and get the result back in hex format.
    > x=fd.read(8)
    > print x
    > This prints out garbage. I would like to know what I am missing here.
    > Basically, I am trying to read
    > 8 bytes from location 0x80 from a binary file called "filename"
    > Any tips/inputs are welcome.

    (1) Put some air into those assignments; spaces are free.
    (2) You probably want something like this:

    import binascii
    fd = open(filename, 'rb')
    fd.seek(0x80)
    x = fd.read(8)
    print binascii.hexlify(x)
    fd.close()

    --Scott David Daniels
    Scott.Daniels at Acm.Org
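
A Python 3 variant of the snippet above, offered as a sketch rather than a drop-in replacement. The `read_hex` name and its default arguments are illustrative, and a temporary file with known contents stands in for the poster's "filename" so the example runs on its own.

```python
import binascii
import os
import tempfile


def read_hex(path, offset=0x80, length=8):
    """Return `length` bytes starting at `offset` as a hex string."""
    with open(path, "rb") as fd:   # the with block closes the file for us
        fd.seek(offset)
        chunk = fd.read(length)    # may be shorter than `length` near EOF
    return binascii.hexlify(chunk).decode("ascii")


# Demo: a temporary file holding the byte values 0x00 .. 0x8f in order.
handle, demo_path = tempfile.mkstemp()
with os.fdopen(handle, "wb") as f:
    f.write(bytes(range(0x90)))

try:
    result = read_hex(demo_path)
finally:
    os.remove(demo_path)

print(result)  # 8081828384858687
```

If the file is shorter than `offset + length`, `read()` simply returns fewer bytes (possibly none), so a real tool would probably want to check the length of the chunk before comparing.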

Discussion Overview
group: python-list
categories: python
posted: Mar 4, '09 at 10:58p
active: Mar 5, '09 at 3:27p
posts: 17
users: 10
website: python.org
