Hi,

I am trying to open a binary file and find a string that begins:

<x:xapmeta xmlns:x='adobe:ns:meta/

I then want all the data, up to and including the end tag which looks
like this:

</x:xapmeta>


It is easy enough to find the tag I am after

use strict;
use warnings;

my $file = 'test2.tif';

open(FH, $file) or die "Can't open $file: $!\n";
binmode(FH);
while (<FH>) {
    if ($_ =~ "<x:xapmeta xmlns:x='adobe:ns:meta/") {
        print "Found $_\n";
        last;
    }
}
close(FH);

But once I have found my tag I would like to use sysseek and sysread
to slurp up some data. Is there some way I can find out where my
position in the file is once $_ has matched?

Perhaps there is another way to do what I want but my experiments
with sysread have worked and been very fast.

Any ideas?
TIA.
Dp.


  • Tom Phoenix at Aug 11, 2006 at 4:28 pm

    On 8/11/06, Beginner wrote:

    > But once I have found my tag I would like to use sysseek and sysread
    > to slurp up some data. Is there some way I can find out where my
    > position in the file is once $_ has matched?

    You probably want seek() and read(), instead of sysseek() and
    sysread(). (The "sys" variants are very low-level.) Then you would
    want tell() to identify the position in the file. You may need to
    subtract a few bytes if you need to locate the position of a string
    that has already been read. It's not the way most Perl programmers
    would solve the problem, but it may work for you. Good luck with it!

    --Tom Phoenix
    Stonehenge Perl Training
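
A minimal sketch of the tell() approach described above, assuming the marker sits entirely on one line of the file. The sub name find_offset and the scratch file name are illustrative and not from the thread. Because tell() reports the position after the line that was just read, the sketch steps back by the length of that line and then forward to where the marker starts.

```perl
use strict;
use warnings;

# Return the byte offset at which $marker first appears in $file,
# or undef if no line of the file contains it.
sub find_offset {
    my ($file, $marker) = @_;
    open(my $fh, '<:raw', $file) or die "Can't open $file: $!\n";
    my $offset;
    while (my $line = <$fh>) {
        my $pos = index($line, $marker);   # position within this line
        next if $pos < 0;
        # tell() is now just past this line: step back by the line
        # length, then forward to the start of the marker.
        $offset = tell($fh) - length($line) + $pos;
        last;
    }
    close($fh);
    return $offset;
}

# Demo on a small scratch file (hypothetical name) so the sketch runs as-is.
my $demo = 'offset_demo.bin';
open(my $out, '>:raw', $demo) or die "Can't write $demo: $!\n";
print {$out} "\x00\x01junk\nmore<x:xapmeta xmlns:x='adobe:ns:meta/'>\n";
close($out);

print find_offset($demo, "<x:xapmeta"), "\n";   # prints 11
unlink $demo;
```

The returned offset can be passed straight to seek(), so no hand-counted byte adjustment is needed.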
  • Beginner at Aug 11, 2006 at 4:45 pm

    On 11 Aug 2006 at 9:28, Tom Phoenix wrote:
    [snip]

    Thanx Tom,

    I had just found tell (honest) in the opentut. You are of course
    right: I have to step back a couple of bytes to get to the beginning
    of the string I want, but WHOOPIE, it works.

    I can quickly retrieve all the XML/XMP from an image file (similar
    to, but nowhere near as well as, the excellent JPEG::MetaData
    module). $d is now XML and ready for parsing.

    I would be interested to know how I can improve this, or what a real
    programmer would do differently. Any tips are much appreciated.

    Thanx.
    Dp.


    ================ What I have so far =========

    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;

    my $file = 'test2.tif';
    my ($d, $start,$end);

    open(FH, $file) or die "Can't open $file: $!\n";

    binmode(FH);
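    while ( <FH> ) {
        if ($_ =~ "<x:xapmeta xmlns:x='adobe:ns:meta/") {
            $start = tell FH;   # position just after the line holding the opening tag
        }
        if ($_ =~ "</x:xapmeta>") {
            $end = tell FH;     # position just after the line holding the closing tag
            last;
        }
    }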

    $start -= 84; # Length of string above.
    my $amount = ($end - $start);

    print "Start=$start, END=$end, $amount\n";
    seek(FH,$start,0);
    read(FH,$d, $amount);

    close(FH);
    print Dumper($d);
    ================
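
The hard-coded 84 in the script above only works when the opening tag happens to sit exactly 84 bytes before the end of its line. A hedged sketch of one way to compute both offsets instead follows; the sub name extract_between and its parameters are illustrative, and it assumes each tag lies wholly within one line.

```perl
use strict;
use warnings;

# Return everything from $open_tag through $close_tag (inclusive) using
# tell/seek/read, or undef if either tag is missing.
sub extract_between {
    my ($file, $open_tag, $close_tag) = @_;
    open(my $fh, '<:raw', $file) or die "Can't open $file: $!\n";

    my ($start, $end);
    while (my $line = <$fh>) {
        my $line_start = tell($fh) - length($line);   # offset of this line
        my $from = 0;                                  # where to look for the close tag
        if (!defined $start) {
            my $open_pos = index($line, $open_tag);
            next if $open_pos < 0;
            $start = $line_start + $open_pos;
            $from  = $open_pos + length($open_tag);
        }
        my $close_pos = index($line, $close_tag, $from);
        if ($close_pos >= 0) {
            $end = $line_start + $close_pos + length($close_tag);
            last;
        }
    }

    my $data;
    if (defined $start && defined $end) {
        my $amount = $end - $start;
        seek($fh, $start, 0) or die "Can't seek in $file: $!\n";
        my $got = read($fh, $data, $amount);
        die "Short read on $file\n" unless defined $got && $got == $amount;
    }
    close($fh);
    return $data;
}
```

For the file in this thread the call would look like extract_between('test2.tif', "<x:xapmeta xmlns:x='adobe:ns:meta/", '</x:xapmeta>').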
  • John W. Krahn at Aug 11, 2006 at 9:45 pm

    Beginner wrote:

    > I had just found tell (honest) in the opentut. You are of course
    > right: I have to step back a couple of bytes to get to the beginning
    > of the string I want, but WHOOPIE, it works.
    >
    > I can quickly retrieve all the XML/XMP from an image file (similar
    > to, but nowhere near as well as, the excellent JPEG::MetaData
    > module). $d is now XML and ready for parsing.
    >
    > I would be interested to know how I can improve this, or what a real
    > programmer would do differently. Any tips are much appreciated.

    Okey doke!

    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;

    my $file = 'test2.tif';

    open my $FH, '<:raw', $file or die "Can't open $file: $!\n";

    my $data;
    while ( <$FH> ) {
        next unless s!.*?<x:xapmeta xmlns:x='adobe:ns:meta/!!;
        $data = $_;

        $data .= <$FH> until $data =~ s!</x:xapmeta>.*!!s;
        last;
    }

    close $FH;
    print Dumper $data;





    John
    --
    use Perl;
    program
    fulfillment
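
For comparison, a hedged sketch of a different technique, not the one posted above: slurp the whole file and capture the packet with a single match. It trades memory for simplicity, since the entire image is held in memory, and it keeps both tags in the result. The sub name slurp_packet is illustrative.

```perl
use strict;
use warnings;

# Read the whole file in raw mode and return the XMP packet, tags
# included, or undef if no complete packet is found.
sub slurp_packet {
    my ($file) = @_;
    open(my $fh, '<:raw', $file) or die "Can't open $file: $!\n";
    my $content = do { local $/; <$fh> };   # undefined $/ reads to end of file
    close($fh);
    return undef unless defined $content;

    # /s lets '.' match newlines; '.*?' stops at the first closing tag.
    if ($content =~ m{(<x:xapmeta xmlns:x='adobe:ns:meta/.*?</x:xapmeta>)}s) {
        return $1;
    }
    return undef;
}
```

Unlike a loop that keeps appending lines until the closing tag appears, this version simply returns undef when the closing tag is absent.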
  • Beginner at Aug 14, 2006 at 7:50 am

    On 11 Aug 2006 at 14:45, John W. Krahn wrote:

    [snip]

    That's interesting, thanx John.

    It is leaner. You have eliminated all the seek/read stuff, nice.

    You haven't specified binmode, is it implied by the '<:raw' notation?

    What is "s!." in line 12, "next unless s!...."

    I haven't seen a filehandle made into a variable, is there some more
    reading I could be doing, perlI/O perhaps?

    Here is the output I get:

    $VAR1 = '\' x:xaptk=\'XMP toolkit 2.8.2-33, framework 1.5\'>

    I am missing a couple of characters at the beginning.

    Your help is much appreciated. I am trying to learn more about the
    perl mind-set and how to program correctly. Thanx again.
    Dp.
  • John W. Krahn at Aug 14, 2006 at 10:52 am

    Beginner wrote:

    > That's interesting, thanx John.
    >
    > It is leaner. You have eliminated all the seek/read stuff, nice.
    >
    > You haven't specified binmode, is it implied by the '<:raw' notation?

    Yes.

    perldoc PerlIO
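
A small sketch of the two spellings side by side, on the understanding from perldoc PerlIO that the :raw layer is equivalent to calling binmode on the handle. The sub names are illustrative.

```perl
use strict;
use warnings;

# Read all bytes of $file, asking for the raw layer in the open mode.
sub read_with_raw_layer {
    my ($file) = @_;
    open(my $fh, '<:raw', $file) or die "Can't open $file: $!\n";
    my $bytes = do { local $/; <$fh> };
    close($fh);
    return $bytes;
}

# Read all bytes of $file, opening normally and then calling binmode.
sub read_with_binmode {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Can't open $file: $!\n";
    binmode($fh);
    my $bytes = do { local $/; <$fh> };
    close($fh);
    return $bytes;
}
```

Both subs should hand back the same bytes for any file, including ones containing CR/LF pairs or other bytes that a text-mode read might alter on some platforms.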

    > What is "s!." in line 12, "next unless s!...."

    s/// is the substitution operator.

    perldoc perlop

    Because the pattern contains the '/' character I used the '!' character to
    delimit it instead of using '/'.
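
To illustrate the delimiter point, here is a short sketch on a made-up sample line (the string in $line is an assumption, not data from the thread). Both substitutions do the same thing; only the delimiter differs.

```perl
use strict;
use warnings;

my $line = "junk<x:xapmeta xmlns:x='adobe:ns:meta/' rest";

# With '/' as the delimiter, the '/' inside the pattern needs a backslash.
(my $slashes = $line) =~ s/.*?<x:xapmeta xmlns:x='adobe:ns:meta\///;

# With '!' as the delimiter, the pattern can be written as-is.
(my $bangs = $line) =~ s!.*?<x:xapmeta xmlns:x='adobe:ns:meta/!!;

print "$bangs\n";   # prints: ' rest
```

Note that the substitution removes the matched tag text along with everything before it, so what remains starts with the character that followed the pattern.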

    > I haven't seen a filehandle made into a variable, is there some more
    > reading I could be doing, perlI/O perhaps?

    perldoc -f open
    [snip]
    If FILEHANDLE is an undefined scalar variable (or array or hash
    element) the variable is assigned a reference to a new anonymous
    filehandle,

    perldoc perlopentut
    [snip]
    Indirect Filehandles

    "open"’s first argument can be a reference to a filehandle. As of perl
    5.6.0, if the argument is uninitialized, Perl will automatically create a
    filehandle and put a reference to it in the first argument, like so:
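
The quoted excerpt stops before its example. As a stand-in, here is a minimal sketch of an indirect (lexical) filehandle; it is not the example from perlopentut, and the sub name count_lines is illustrative.

```perl
use strict;
use warnings;

# Count the lines in $file using a lexical filehandle.
sub count_lines {
    my ($file) = @_;
    # $fh is undefined at this point, so open() creates a new
    # filehandle and stores a reference to it in $fh.
    open(my $fh, '<', $file) or die "Can't open $file: $!\n";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    close($fh);
    return $count;
}
```

Because $fh is an ordinary scalar, it can be passed to other subs and goes out of scope like any other lexical variable.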

    > Here the output I get:
    >
    > $VAR1 = '\' x:xaptk=\'XMP toolkit 2.8.2-33, framework 1.5\'>
    >
    > I am missing a couple of characters at the beginning.

    I am just going by the example you provided, I don't have the actual data to
    test it on.


    John
    --
    use Perl;
    program
    fulfillment
  • Beginner at Aug 14, 2006 at 11:15 am
    On 14 Aug 2006 at 3:51, John W. Krahn wrote:

    One last question (honest).

    >> What is "s!." in line 12, "next unless s!...."
    > s/// is the substitution operator.

    Why are you substituting here? Isn't a match good enough or is it
    necessary for some other reason?

    >> Here the output I get:
    >>
    >> $VAR1 = '\' x:xaptk=\'XMP toolkit 2.8.2-33, framework 1.5\'>
    >>
    >> I am missing a couple of characters at the beginning.
    > I am just going by the example you provided, I don't have the actual data to
    > test it on.

    Very true.

    Thanx again.
    Dp.
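
The question about substitution goes unanswered in the thread. One plausible reading is that the s/// both tests for the tag and trims the unwanted bytes around it in a single step; the side effect is that the tag text itself is removed, which would explain the characters missing from the output. A hedged sketch of a variant that keeps both tags by capturing them and putting them back follows. The sub name read_packet is illustrative, and the end-of-file guard is an addition not present in the posted script.

```perl
use strict;
use warnings;

# Return the XMP packet with both tags kept, or undef if it is
# missing or never closed.
sub read_packet {
    my ($file) = @_;
    my $open_tag  = "<x:xapmeta xmlns:x='adobe:ns:meta/";
    my $close_tag = '</x:xapmeta>';

    open(my $fh, '<:raw', $file) or die "Can't open $file: $!\n";
    my $data;
    while (my $line = <$fh>) {
        # Replace "anything, then the tag" with just the tag: trims the
        # leading bytes but keeps the tag itself.
        next unless $line =~ s/.*?(\Q$open_tag\E)/$1/;
        $data = $line;

        # Append lines until the closing tag shows up or the file ends.
        until ($data =~ s/(\Q$close_tag\E).*/$1/s) {
            my $more = <$fh>;
            unless (defined $more) {   # hit end of file: no closing tag
                $data = undef;
                last;
            }
            $data .= $more;
        }
        last;
    }
    close($fh);
    return $data;
}
```

A plain match would also find the line, but the bytes before the opening tag and after the closing tag would then have to be trimmed in a separate step.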

Discussion Overview
group: beginners @
categories: perl
posted: Aug 11, '06 at 3:36p
active: Aug 14, '06 at 11:15a
posts: 7
users: 3
website: perl.org
