On Mon, 27 Feb 2006 10:49:35 -0600, Daniel.Gladstone@acxiom.com ("Gladstone Daniel - dglads") wrote:

I have a text file with around 1 million lines and I need to do a search
and replace on over 9,000 words. I am currently reading a line, running a
hash table against it, and for any matches replacing the word in the
string. It is running really slow. Any thoughts on how to improve it?
Look on CPAN for Regexp::Optimizer.

It will help optimize the regexp for your list. If you think about
it, 9000 words will have a lot in common, so you should be able
to find patterns in those sets of words and build a regexp for them.
There is no need to look for each word.
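The thread is about Perl, but the single-pass idea is language-independent, so here is a rough Python sketch (the word list and replacements are made up for illustration): join all the search words into one alternation, scan each line once, and pick the replacement from a hash on each match instead of testing 9000 words individually.

```python
import re

# Hypothetical replacement table: in practice this would hold all ~9000 pairs.
replacements = {
    "colour": "color",
    "optimise": "optimize",
    "behaviour": "behavior",
}

# Build one alternation from every search word, longest first so a longer
# word wins over any shorter prefix, with \b anchors so only whole words match.
words = sorted(replacements, key=len, reverse=True)
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, words)) + r")\b")

def replace_line(line):
    # One scan of the line; the hash lookup supplies the replacement text.
    return pattern.sub(lambda m: replacements[m.group(0)], line)

print(replace_line("the colour of its behaviour"))
```

In Perl the equivalent would be a single `s/$big_pattern/$replacements{$&}/g` per line, with Regexp::Optimizer (or a trie-building module) shrinking the alternation.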

You might also experiment with breaking your 9000-word list into
smaller lists, using the optimizer on those smaller lists, and running
each one against each line separately.
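Sketching that chunking idea in the same illustrative Python (again with a made-up word list): compile one alternation per chunk of the word list, then run each compiled pattern over the line in turn.

```python
import re

def chunked_patterns(replacements, chunk_size):
    """Split the word list into chunks and compile one alternation per chunk."""
    words = sorted(replacements, key=len, reverse=True)
    for i in range(0, len(words), chunk_size):
        chunk = words[i:i + chunk_size]
        yield re.compile(r"\b(?:" + "|".join(map(re.escape, chunk)) + r")\b")

replacements = {"colour": "color", "optimise": "optimize", "behaviour": "behavior"}
patterns = list(chunked_patterns(replacements, 2))

line = "the colour of its behaviour"
for pat in patterns:
    # Each chunk's pattern is run against the line separately.
    line = pat.sub(lambda m: replacements[m.group(0)], line)
print(line)
```

Whether chunking beats one big optimized pattern depends on the regexp engine, so it is worth benchmarking both on a sample of the file.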

I'm not really a human, but I play one on earth.

Group: beginners · Posted: Feb 27, '06 at 4:49p · Active: Feb 28, '06 at 2:15p


