I have Perl 5.6.1 on Sun Solaris.

I am processing a text file which will be imported into our
typesetting software. In our typesetting software I want to make
sure a number does not separate from its unit of measure. So I want
to keep "21 cm" together by changing it to "<bx;1>21 cm<ba>".

My problem is that my Perl variable contains multiple matches for the
pattern "\d+ cm", and I want to surround each match with <bx;1> and
<ba>. Here is an example line before the change:

54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2``
(62 cm)<l>Chair height: 30-3/4 (78 cm)<l>

(Don't worry about special strings like <l>; they are used by our
typesetting software.) Notice that 78 cm appears twice; both occurrences
should have <bx;1><ba> around them.

The line should end up like this:
54 x 34 x 30-3/4 H<l>137 x 86 x <bx;1>78 cm<ba><l>Kneehole Height:
24-1/2`` (<bx;1>62 cm<ba>)<l>Chair height: 30-3/4 (<bx;1>78 cm<ba>)<l>

I have tried the following code to loop through the matches but Perl
always finds the first instance only. Can anyone help me? I have
already done a Google search, only to find thousands of sites that
are of absolutely no use to me. And I have read some of the Perl
docs, and those did not help me either.

sub fixdesc2
{
    my($l)=@_; # Pass in string to process.
    my($s,$old,$new);

    $s=$l;
    while ($s=~m/\d+ +cm/g)
    {
        $old=$&; # Save current match.
        $new=$old;
        $s=~s/$old/<bx;1>$new<ba>/;
    } # while

    return $s;
} # fixdesc2

Thank you.

--
__________
Stop paying for shipping and dead animals. Trade locally. Free
registration. No fees. Just friendly local people.
Kentwood Aquarium Trade: http://www.network54.com/Forum/436090/
(This is not an auction site, it is a community service forum.)


  • D. Bolliger at Oct 25, 2006 at 5:45 pm

    C. Roberts wrote on Wednesday, 25 October 2006 at 19:32:
    > I have tried the following code to loop through the matches but Perl
    > always finds the first instance only. Can anyone help me?
    >
    > while ($s=~m/\d+ +cm/g)
    > {
    > $old=$&; # Save current match.
    > $new=$old;
    > $s=~s/$old/<bx;1>$new<ba>/;
    > } # while

    What about:

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (<DATA>) {
        s/(\d+\s+cm)/<bx;1>$1<ba>/g;
        print;
    }

    __DATA__
    54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2`` (62 cm)<l>Chair height: 30-3/4 (78 cm)<l>
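
    The same substitution can also be applied to a scalar instead of lines read from DATA. A minimal sketch, assuming the sub keeps the name fixdesc2 from the first post (the body below is a hypothetical rewrite, not the original poster's code):

```perl
use strict;
use warnings;

# Hypothetical rewrite of the fixdesc2 sub from the first post:
# wrap every "<digits> cm" in <bx;1> ... <ba> with a single s///g.
sub fixdesc2 {
    my ($s) = @_;                          # work on a copy of the argument
    $s =~ s/(\d+\s+cm)/<bx;1>$1<ba>/g;     # /g replaces all matches, not just the first
    return $s;
}

my $line = '54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2`` (62 cm)<l>Chair height: 30-3/4 (78 cm)<l>';
print fixdesc2($line), "\n";
```

    Because the sub returns a modified copy, the caller decides whether to overwrite the original string.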
  • C . R . at Oct 25, 2006 at 6:38 pm
    Well, that kinda worked. I had to change it to work on a scalar so this
    is what I wrote:
    $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g;

    Input string: 144 cm
    Output string: <bx;1>14<bx;1>4 cm<ba><ba>

    Why did I get duplicate <bx;1> and <ba> strings?
    Is the \G operator here and does v5.6.1 have it?

    Chuck
  • D. Bolliger at Oct 25, 2006 at 9:06 pm

    C.R. wrote on Wednesday, 25 October 2006 at 20:38:
    > Well, that kinda worked. I had to change it to work on a scalar so this
    > is what I wrote:
    > $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g;
    >
    > Input string: 144 cm
    > Output string: <bx;1>14<bx;1>4 cm<ba><ba>
    >
    > Why did I get duplicate <bx;1> and <ba> strings?

    Hm, I can't reproduce this (perl 5.8.8):

    $ perl -le 'my $s=q(144 cm); $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g; print $s;'
    <bx;1>144 cm<ba>

    What exactly did you do?

    > Is the \G operator here and does v5.6.1 have it?

    Don't remember, sorry.
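
    For reference, perlre describes \G as matching only at pos(), that is, where the previous m//g on the same string left off. It should also be available in 5.6.1, but that is worth checking against that version's perlre. A minimal sketch of how it could be used here, assuming a hypothetical sub name wrap_with_G; the result is built in a second string, so the string being scanned is never modified:

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every "<digits> cm" using \G and pos()
# instead of s///g. The input string is only read, never modified.
sub wrap_with_G {
    my ($s) = @_;
    my $out = '';
    pos($s) = 0;                            # start scanning at the beginning
    # \G anchors each attempt at pos($s); /c keeps pos() when the match finally fails.
    while ($s =~ /\G(.*?)(\d+ +cm)/gcs) {
        $out .= "$1<bx;1>$2<ba>";           # text before the match, then the wrapped match
    }
    $out .= substr($s, pos($s));            # whatever follows the last match
    return $out;
}

print wrap_with_G('137 x 86 x 78 cm<l>(62 cm)'), "\n";
```

    For this task the plain s///g is shorter; \G is mostly useful when the scan has to be resumed step by step.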

    Dani
  • C . R . at Oct 27, 2006 at 4:27 pm
    In article <200610252306.50496.info@dbolliger.ch>, info@dbolliger.ch
    says...
    > Hm, I can't reproduce this (perl 5.8.8):
    >
    > $ perl -le 'my $s=q(144 cm); $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g; print $s;'
    > <bx;1>144 cm<ba>

    My program (perl 5.6.1 on Solaris) picks the data out of an array like
    this:
    $s=$a[2];

    Then I attempt to process $s like this:
    $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g;

    It's not real complicated. Later I write out $s to a file. $s does not
    contain any line feeds or carriage returns.

    In the debugger I display $s just before the substitution executes, step
    past the line, then I display $s in the debugger where I can see too
    many <bx;1>s and <ba>s.
  • Rob Dixon at Oct 27, 2006 at 7:09 pm

    C.R. wrote:
    > My program (perl 5.6.1 on Solaris) picks the data out of an array like
    > this:
    > $s=$a[2];
    >
    > Then I attempt to process $s like this:
    > $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g;
    >
    > It's not real complicated. Later I write out $s to a file. $s does not
    > contain any line feeds or carriage returns.
    >
    > In the debugger I display $s just before the substitution executes, step
    > past the line, then I display $s in the debugger where I can see too
    > many <bx;1>s and <ba>s.

    You need to show us your code Chuck. Perl doesn't do that, in any situation that
    I can think of. Try running this on its own:

    my $s = '144 cm';
    $s =~ s/(\d+ +cm)/<bx;1>$1<ba>/g;
    print $s;

    I get

    <bx;1>144 cm<ba>

    what do you get?

    That may help on its own. If not, like I said, post the relevant part of your
    code.

    Rob
  • C . R . at Oct 31, 2006 at 4:21 pm
    In article <45425A1D.60401@350.com>, rob.dixon@350.com says...
    > You need to show us your code Chuck. Perl doesn't do that, in any situation that
    > I can think of. Try running this on its own:
    >
    > my $s = '144 cm';
    > $s =~ s/(\d+ +cm)/<bx;1>$1<ba>/g;
    > print $s;
    >
    > I get
    >
    > <bx;1>144 cm<ba>
    >
    > what do you get?

    Your example above is extremely simple, and simply does not apply to my
    situation. But yes, that code above will work on my version of Perl,
    because Perl is only replacing one instance of /\d+ cm/. My situation is
    more complicated where I need to replace MULTIPLE instances of /\d+ cm/
    in a single string.

    My first post in this thread shows example data as it is stored in a
    scalar variable. It also shows what the string SHOULD look like after
    the substitution.

    Or maybe Perl simply is not able to replace multiple instances of a
    regular expression in a single scalar/string variable.

    $s="54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2``
    (62 cm)<l>Chair height: 30-3/4 (78 cm)<l>";

    (Don't worry about special strings like <l>, they are used by our
    typesetting software.) Notice that 78 cm appears twice, both should
    have <bx;1><ba> around them.

    $s should end up like this:
    54 x 34 x 30-3/4 H<l>137 x 86 x <bx;1>78 cm<ba><l>Kneehole Height:
    24-1/2`` (<bx;1>62 cm<ba>)<l>Chair height: 30-3/4 (<bx;1>78 cm<ba>)<l>

    Notice the insertion of <bx;1> and <ba> around strings that match
    /\d+ cm/.
  • Rob Dixon at Oct 31, 2006 at 9:40 pm
    Chuck Roberts wrote:
    > Your example above is extremely simple, and simply does not apply to my
    > situation. But yes, that code above will work on my version of Perl,
    > because Perl is only replacing one instance of /\d+ cm/. My situation is
    > more complicated where I need to replace MULTIPLE instances of /\d+ cm/
    > in a single string.

    Sure, but my intention was to reproduce what you were seeing, and you posted
    this:

    > Well, that kinda worked. I had to change it to work on a scalar so this
    > is what I wrote:
    >
    > $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g;
    >
    > Input string: 144 cm
    > Output string: <bx;1>14<bx;1>4 cm<ba><ba>

    So I coded exactly that (except that I replaced the unlimited spaces with
    unlimited whitespace) and got a different result.

    > My first post in this thread shows example data as it is stored in a
    > scalar variable. It also shows what the string SHOULD look like after
    > the substitution.
    >
    > Or maybe, perl simply is not able to replace multiple instances of a
    > regex expression in a single scalar/string variable.
    >
    > $s="54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2``
    > (62 cm)<l>Chair height: 30-3/4 (78 cm)<l>";

    Fine. Let's try again:

    my $s="54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2`` (62 cm)<l>Chair height: 30-3/4 (78 cm)<l>";
    $s =~ s/(\d+\s+cm)/<bx;1>$1<ba>/g;
    print $s, "\n";

    **OUTPUT**

    54 x 34 x 30-3/4 H<l>137 x 86 x <bx;1>78 cm<ba><l>Kneehole Height: 24-1/2``
    (<bx;1>62 cm<ba>)<l>Chair height: 30-3/4 (<bx;1>78 cm<ba>)<l>

    which is exactly what I expected and exactly what you say you want, but somehow
    it's not what you're getting. Something else is happening somewhere and we can't
    be sure what it is unless you show us your code as I asked.

    But I can try to guess. It does look very much to me as if you're trying to make
    the substitution twice, using two different methods. Look:

    my $s = '144 cm';
    $s =~ s/(\d+\s+cm)/<bx;1>$1<ba>/g;
    $s =~ s/(\d +cm)/<bx;1>$1<ba>/g;

    print $s, "\n";

    **OUTPUT**

    <bx;1>14<bx;1>4 cm<ba><ba>

    Which is exactly what you're seeing. Check your code for something like that.
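
    One way to make a stray second pass harmless is to have the pattern refuse to wrap a number that is already wrapped. A minimal sketch, assuming a hypothetical helper name wrap_cm and using two fixed-width negative lookbehinds (this is an illustration, not code from the program under discussion):

```perl
use strict;
use warnings;

# Hypothetical helper: wrap "<digits> cm" unless it is already wrapped.
#   (?<!\d)      the match must start at the first digit of the number
#   (?<!<bx;1>)  and must not directly follow an opening <bx;1>
sub wrap_cm {
    my ($s) = @_;
    $s =~ s/(?<!\d)(?<!<bx;1>)(\d+\s+cm)/<bx;1>$1<ba>/g;
    return $s;
}

my $once  = wrap_cm('144 cm');
my $twice = wrap_cm($once);      # a second pass leaves the string alone
print "$once\n$twice\n";
```

    The (?<!\d) part also stops the match from starting in the middle of a number, which is what the second substitution above does with "4 cm".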

    And if I'm wrong, then please post your code.

    Rob
  • C . R . at Nov 6, 2006 at 7:30 pm
    In article <4547C2C2.90808@350.com>, rob.dixon@350.com says...
    Thanks. But what version of Perl are you using Rob? I'm using 5.6.1 on
    Solaris. Perhaps my version is buggy in this situation.

    Chuck
  • Mumia W. at Oct 31, 2006 at 10:09 pm

    On 10/31/2006 10:20 AM, C.R. wrote:
    > My situation is more complicated where I need to replace MULTIPLE
    > instances of /\d+ cm/ in a single string.
    >
    > $s="54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2``
    > (62 cm)<l>Chair height: 30-3/4 (78 cm)<l>";

    Did you try the code that Rob Dixon posted? It works for me with your
    longer example string.
  • D. Bolliger at Nov 1, 2006 at 9:56 am

    C.R. wrote on Tuesday, 31 October 2006 at 17:20:
    > Or maybe, perl simply is not able to replace multiple instances of a
    > regex expression in a single scalar/string variable.

    Hello Chuck (again)

    Have a look at the code of your first posting:

    [Chuck:]
    > while ($s=~m/\d+ +cm/g)
    > {
    > $old=$&; # Save current match.
    > $new=$old;
    > $s=~s/$old/<bx;1>$new<ba>/;
    > } # while

    That's too complicated and thus also error-prone. You don't need a loop to
    replace all occurrences in a string; the /g modifier is there to do that.

    Simply replace /all/ of the above lines with:

    $s=~s/(\d+ +cm)/<bx;1>$1<ba>/g;

    (or any of the variants presented by others)
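
    If the loop was there because each match needs some extra processing, the /e modifier may be a closer fit: it evaluates the replacement as Perl code for every match. A minimal sketch, assuming a hypothetical helper box() and working on a copy so the original string is kept:

```perl
use strict;
use warnings;

# Hypothetical per-match hook: receives one match, returns its replacement.
sub box {
    my ($match) = @_;
    $match =~ s/ +/ /;                  # example of extra work: collapse repeated spaces
    return "<bx;1>$match<ba>";
}

my $s = '137 x 86 x 78  cm<l>(62 cm)';
(my $fixed = $s) =~ s/(\d+ +cm)/box($1)/ge;   # /g: every match, /e: replacement is code
print "$fixed\n";
```

    The string is still scanned only once, so there is no loop that modifies the string it is matching against.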

    I hope this helps.

    Dani
  • Dr.Ruud at Oct 25, 2006 at 8:33 pm

    "D. Bolliger" wrote:

    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    >
    > while (<DATA>) {
    > s/(\d+\s+cm)/<bx;1>$1<ba>/g;
    > print;
    > }
    >
    > __DATA__
    > 54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24-1/2`` (62
    > cm)<l>Chair height: 30-3/4 (78 cm)<l>

    If "cm" can be wrapped to the next line, either slurp or use paragraph
    mode:


    #!/usr/bin/perl
    use strict ;
    use warnings ;

    {
        local $/ = ''; # paragraph mode
        while (<DATA>)
        {
            s/([0-9]+\s+cm)/<bx;1>$1<ba>/g ;
            print ;
        }
    }

    __DATA__
    54 x 34 x 30-3/4 H<l>137 x 86 x 78 cm<l>Kneehole Height: 24.5" (62
    cm)<l>Chair height: 30-3/4 (78 cm)<l>
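
    The slurp variant could look like the sketch below. It assumes a hypothetical sub name wrap_all and reads from an in-memory filehandle (which needs Perl 5.8 or later) only to keep the example self-contained; with a real file, open it by name instead:

```perl
use strict;
use warnings;

# Hypothetical helper: read everything from a filehandle into one string,
# then wrap every "<digits> cm", including ones split across a line break.
sub wrap_all {
    my ($fh) = @_;
    my $text = do { local $/; <$fh> };           # undefined $/ means slurp mode
    $text = '' unless defined $text;             # empty input
    $text =~ s/([0-9]+\s+cm)/<bx;1>$1<ba>/g;     # \s+ also matches the newline
    return $text;
}

my $data = "Kneehole Height: 24.5\" (62\ncm)<l>Chair height: 30-3/4 (78 cm)<l>\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
print wrap_all($fh);
close $fh;
```

    Note that a match split across lines keeps its line break inside the markers; whether the typesetting software accepts that is not stated in the thread.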


    Latin-1 (ISO-8859-1) also has a no-break space (NBSP, 0xA0) available.
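
    If the typesetting software honours the Latin-1 no-break space (an assumption; the thread does not say), the blank between the number and the unit could be replaced with it instead of adding markup. A minimal sketch with a hypothetical sub name nbsp_cm:

```perl
use strict;
use warnings;

my $NBSP = chr(0xA0);    # no-break space in ISO-8859-1

# Hypothetical helper: glue number and unit together with a no-break space.
sub nbsp_cm {
    my ($s) = @_;
    $s =~ s/([0-9]+)\s+cm/$1${NBSP}cm/g;
    return $s;
}

my $line = nbsp_cm('Chair height: 30-3/4 (78 cm)<l>');
print $line eq 'Chair height: 30-3/4 (78' . $NBSP . 'cm)<l>' ? "ok\n" : "not ok\n";
```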

    --
    Affijn, Ruud

    "Gewoon is een tijger."

Discussion Overview
group: beginners @
categories: perl
posted: Oct 25, '06 at 5:32p
active: Nov 6, '06 at 7:30p
posts: 12
users: 6
website: perl.org
