Hi,

I am analyzing some network log files. There are around 200-300 files and
each file has more than 2 million entries in it.

Currently my script is reading each file line by line, so it will take
a lot of time to process all the files.

Is there any more efficient way to do it?

Maybe multiprocessing or multitasking?


Thanks.

  • Mr. Shawn H. Corey at Dec 12, 2008 at 6:29 pm

    On Thu, 2008-12-11 at 12:28 -0800, friend.05@gmail.com wrote:
    > Hi,
    >
    > I am analyzing some network log files. There are around 200-300 files and
    > each file has more than 2 million entries in it.
    >
    > Currently my script is reading each file line by line, so it will take
    > a lot of time to process all the files.
    >
    > Is there any more efficient way to do it?
    >
    > Maybe multiprocessing or multitasking?

    Are all these files on the same disk? Are they in the same partition?
    What's going to slow down processing the most is disk I/O. Speeding
    that up will give quicker results than multiprocessing, multitasking or
    threading.


    --
    Just my 0.00000002 million dollars worth,
    Shawn

    The key to success is being too stupid to realize you can fail.
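
To answer the "same disk, same partition?" question above, here is a minimal sketch, assuming a Unix-like system where the first field returned by Perl's built-in stat is the device number of the filesystem holding the file. The sub name group_by_device and the script name same_device.pl are made up for illustration. Note that this only tells you whether files share a filesystem; two partitions on one physical disk will still show different device numbers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group file paths by the device number that stat() reports for them.
# Files that share a device number live on the same filesystem.
# (group_by_device is a hypothetical helper, not part of any module.)
sub group_by_device {
    my @files = @_;
    my %by_dev;
    for my $file (@files) {
        my @st = stat($file);
        unless (@st) {
            warn "Cannot stat $file: $!\n";
            next;
        }
        # $st[0] is the device number of the filesystem
        push @{ $by_dev{ $st[0] } }, $file;
    }
    return \%by_dev;
}

# Usage (hypothetical script name): perl same_device.pl /path/to/logs/*.log
my $groups = group_by_device(@ARGV);
for my $dev (sort { $a <=> $b } keys %$groups) {
    printf "device %s: %d file(s)\n", $dev, scalar @{ $groups->{$dev} };
}
```

If everything lands in one group, all the files are competing for the same filesystem, and running several readers in parallel is less likely to help.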
  • Rob Dixon at Dec 12, 2008 at 8:39 pm

    friend.05@gmail.com wrote:
    > I am analyzing some network log files. There are around 200-300 files and
    > each file has more than 2 million entries in it.
    >
    > Currently my script is reading each file line by line, so it will take
    > a lot of time to process all the files.
    >
    > Is there any more efficient way to do it?
    >
    > Maybe multiprocessing or multitasking?

    Reading about 40GB of data line by line is going to take a while, I'm
    afraid, but what are you doing with the lines as you read them? You may be able
    to speed things up a little using techniques appropriate to your application.

    I suggest you start by writing a benchmark program that just reads through all
    of the files without processing the data and see what your baseline speed is.
    Then see how much overhead the processing adds so that you know which code to
    optimise.

    Rob
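
The baseline benchmark suggested above could look something like the following. This is only a sketch: the sub name time_pass is made up, and the per-line callback (splitting on whitespace and counting fields) is a stand-in for whatever the real script does with each line, which the thread does not say. It uses the core Time::HiRes module for sub-second timing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Read every line of every file, optionally passing each line to a callback.
# Returns (number of lines read, elapsed wall-clock seconds).
# (time_pass is a hypothetical helper name.)
sub time_pass {
    my ($files, $per_line) = @_;
    my $lines = 0;
    my $start = time();
    for my $file (@$files) {
        my $fh;
        unless (open $fh, '<', $file) {
            warn "Cannot open $file: $!\n";
            next;
        }
        while (my $line = <$fh>) {
            $lines++;
            $per_line->($line) if $per_line;
        }
        close $fh;
    }
    return ($lines, time() - $start);
}

# Placeholder per-line work standing in for the real analysis:
# split each line on whitespace and count the fields.
my $fields  = 0;
my $process = sub { my @f = split ' ', $_[0]; $fields += @f; };

my @files = @ARGV;
my ($n, $base)    = time_pass(\@files);            # read only
my (undef, $full) = time_pass(\@files, $process);  # read + process
printf "%d lines: read only %.2fs, read + process %.2fs\n", $n, $base, $full;
```

The difference between the two timings is roughly the processing overhead. One caveat: on a small sample the second pass may be served from the operating system's file cache, which makes it look faster than it would be on the full data set, so it is worth trying on a sample larger than available memory or running each pass separately.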

Discussion Overview
group: beginners @
categories: perl
posted: Dec 11, '08 at 8:29p
active: Dec 12, '08 at 8:39p
posts: 3
users: 3
website: perl.org
