Hey all,

I'm kind of new to Perl and I've come across a problem:

I have a large text file with fields separated by pipes which should appear as:

ABIT|GC|MX200|nVIDIA GeForce 2 MX200 32MB AGP Card|75.00|74.00|73.00|10%|99.00

However, a lot of entries are split over two lines like this:

ABIT|MB|BD7-RAID|S478/I845D/ATX/AGP/6PCI/AC97/2 DDR/400FSB/1CNR/133

RAID*2|239.00|CALL|CALL|10%|299.00

I wrote a small Perl script to split each line on the pipe character into an array and, if the array size was less than a certain value, join it with the next line.
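The script itself was not posted, so the following is only a minimal sketch of the approach described above, not the original code. It assumes a complete record has 9 pipe-separated fields (as in the first sample line) and that continuation lines are glued on with a single space; the names `join_split_records` and `$EXPECTED_FIELDS` are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: a complete record has 9 pipe-separated fields,
# as in the first sample line above.
my $EXPECTED_FIELDS = 9;

# Hypothetical helper: takes raw lines, returns complete records.
sub join_split_records {
    my @lines = @_;
    my @records;
    my $pending = '';

    for my $line (@lines) {
        $line =~ s/\r?\n\z//;        # strip an LF or CRLF line ending
        next if $line =~ /^\s*$/;    # skip blank lines

        # Assumption: a continuation is glued on with a single space.
        $pending = length($pending) ? "$pending $line" : $line;

        # The -1 limit keeps trailing empty fields in the count.
        my @fields = split /\|/, $pending, -1;
        if (@fields >= $EXPECTED_FIELDS) {
            push @records, $pending;
            $pending = '';
        }
    }

    # Keep a trailing partial record rather than silently dropping it.
    push @records, $pending if length($pending);
    return @records;
}

my @records = join_split_records(
    "ABIT|GC|MX200|nVIDIA GeForce 2 MX200 32MB AGP Card|75.00|74.00|73.00|10%|99.00\n",
    "ABIT|MB|BD7-RAID|S478/I845D/ATX/AGP/6PCI/AC97/2 DDR/400FSB/1CNR/133\n",
    "\n",
    "RAID*2|239.00|CALL|CALL|10%|299.00\n",
);
print "$_\n" for @records;
```

Counting fields on the accumulated record, rather than on each line alone, means a record split over more than two lines would also be reassembled under the same assumptions.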

To test the script I copied a few paragraphs from the big text file. When I ran the script on the small file, it sorted everything perfectly. But whenever I ran the script on the full file it always came out wrong. After a few hours of frustration, I ended up copying the entire file and pasting it into a new file. After running the script on the new file, it sorted it fine??

The original file was saved on a Windows machine, and I copy/pasted it through a Linux text editor. I noticed that when I edit the original file using vi, down the bottom next to the file name is [dos]. I'm thinking this has something to do with it.



Thanks in advance for any help or ideas that some of you may have,



Jo


  • Timothy Johnson at Jul 3, 2002 at 12:18 am
    Microsoft operating systems use different characters for line endings than
    Unix boxes. I believe DOS/Windows files end each line with \r\n (carriage
    return plus line feed), where Unix uses just \n.

    -----Original Message-----
    From: Jo
    Sent: Tuesday, July 02, 2002 4:43 PM
    To: PerlBeginners
    Subject: Processing text


  • Elaine -HFB- Ashton at Jul 3, 2002 at 3:14 pm
    Jo [joe@oshima.com.au] quoth:
    *>
    *>The original file was saved on a Windows machine, and I copy/pasted it through a Linux text editor. I noticed that when I edit the original file using vi, down the bottom next to the file name is [dos]. I'm thinking this has something to do with it.

    http://history.perl.org/oneliners/filters/dos2unix.html

    You can also simply use the unix utility 'dos2unix' which comes with most
    flavours of unix these days.

    e.
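To make the line-ending point in the replies concrete, here is a small Perl sketch, offered as an illustration rather than a diagnosis of the original script (which was not posted). It shows that on a Unix box `chomp` removes only the "\n", so a line from a DOS-format file keeps a stray "\r" on its last field. The helper name `strip_eol` is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a trailing LF or CRLF from one line.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# A line as it would be read from a DOS-format file on Unix.
my $dos_line = "RAID*2|239.00|CALL|CALL|10%|299.00\r\n";

# chomp removes only the "\n" (the default value of $/),
# so the carriage return stays attached to the last field.
my $chomped = $dos_line;
chomp $chomped;
my $last_after_chomp = (split /\|/, $chomped)[-1];

# Stripping "\r\n" as a unit leaves the field clean.
my $last_after_strip = (split /\|/, strip_eol($dos_line))[-1];

printf "after chomp:     %d characters\n", length $last_after_chomp;
printf "after strip_eol: %d characters\n", length $last_after_strip;
```

A leftover "\r" is invisible in most editors but sends the cursor back to the start of the line when printed to a terminal, which is one plausible reason joined lines could look wrong. For converting a whole file, `dos2unix` as suggested above does the job; a common Perl equivalent on Unix is `perl -pi -e 's/\r$//' file`.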

Discussion Overview
group: beginners @
categories: perl
posted: Jul 2, '02 at 11:43p
active: Jul 3, '02 at 3:14p
posts: 3
users: 3
website: perl.org
