On Mon, 27 Feb 2006 10:49:35 -0600, Daniel.Gladstone@acxiom.com ("Gladstone Daniel - dglads") wrote:

I have a text file with around 1 million lines and I need to do a search
and replace on over 9,000 words. I am currently reading a line, running a
hash table against it, and for any matches replacing the word in the
string. It is running really slow. Any thoughts on how to improve it?
Look on CPAN for Regexp::Optimizer.

It will help optimize the regexp for your list. If you think about
it, 9000 words will have a lot in common, so you should be able
to find patterns in those sets of words and build a regexp for them.
There is no need to look for each word.
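The thread is about Perl, but the single-pass idea is language-independent, so here is a rough Python sketch (the word list and replacements are made up for illustration): join all the search words into one alternation, scan each line once, and pick the replacement from a hash on each match instead of testing 9000 words individually.

```python
import re

# Hypothetical replacement table: in practice this would hold all ~9000 pairs.
replacements = {
    "colour": "color",
    "optimise": "optimize",
    "behaviour": "behavior",
}

# Build one alternation from every search word, longest first so a longer
# word wins over any shorter prefix, with \b anchors so only whole words match.
words = sorted(replacements, key=len, reverse=True)
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, words)) + r")\b")

def replace_line(line):
    # One scan of the line; the hash lookup supplies the replacement text.
    return pattern.sub(lambda m: replacements[m.group(0)], line)

print(replace_line("the colour of its behaviour"))
```

In Perl the equivalent would be a single `s/$big_pattern/$replacements{$&}/g` per line, with Regexp::Optimizer (or a trie-building module) shrinking the alternation.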

You might also experiment with breaking your 9000-word list into
smaller lists, using the optimizer on those smaller lists, and running
each one against each line separately.
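Sketching that chunking idea in the same illustrative Python (again with a made-up word list): compile one alternation per chunk of the word list, then run each compiled pattern over the line in turn.

```python
import re

def chunked_patterns(replacements, chunk_size):
    """Split the word list into chunks and compile one alternation per chunk."""
    words = sorted(replacements, key=len, reverse=True)
    for i in range(0, len(words), chunk_size):
        chunk = words[i:i + chunk_size]
        yield re.compile(r"\b(?:" + "|".join(map(re.escape, chunk)) + r")\b")

replacements = {"colour": "color", "optimise": "optimize", "behaviour": "behavior"}
patterns = list(chunked_patterns(replacements, 2))

line = "the colour of its behaviour"
for pat in patterns:
    # Each chunk's pattern is run against the line separately.
    line = pat.sub(lambda m: replacements[m.group(0)], line)
print(line)
```

Whether chunking beats one big optimized pattern depends on the regexp engine, so it is worth benchmarking both on a sample of the file.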

I'm not really a human, but I play one on earth.

Group: beginners · Posted: Feb 27, '06 at 4:49p · Active: Feb 28, '06 at 2:15p


