Shridhar Daithankar
Recently, I ran a huge update on an integer column affecting 100 million
rows in my database. What happened was that my disk usage grew and my I/O
load was very high. It appears that MVCC rewrites each row (each row was
about 5kB due to a bytea column). In addition, VACUUM needs to run to
recover the space, eating up even more I/O bandwidth.
I am sure others have already commented on the approach you suggested, so
let me offer a workaround for your problem.

You could run the following in a transaction (a minimal SQL sketch follows
the step list):

- begin
- create another table with the exact same structure
- write a procedure that reads from the original table and applies the
  update as the rows are copied across
- drop the original table
- rename the new table to the old name
- commit
- analyze the new table
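
Something like this, as a minimal sketch. The table name old_tab and the
columns pk, payload, and val are made up for illustration, and note that
CREATE TABLE AS does not copy indexes or constraints, so those would have
to be recreated before the rename:

    BEGIN;
    -- copy every row, applying the change on the way across
    CREATE TABLE new_tab AS
        SELECT pk, payload, val * 2 AS val
        FROM old_tab;
    DROP TABLE old_tab;
    ALTER TABLE new_tab RENAME TO old_tab;
    COMMIT;
    -- refresh planner statistics for the replacement table
    ANALYZE old_tab;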

Except for the temporarily increased disk space, this approach has all the
good things PostgreSQL offers; transactional DDL in particular is a huge
benefit here. You certainly save on VACUUM.

If the entire table is updated, then you can almost certainly get things
done faster this way.
I second this approach: space usage is similar, and it is faster.

For the procedure step, you can use an INSERT ... SELECT with a CASE
expression instead of the UPDATE (a sketch follows):
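
A minimal sketch of that, reusing the made-up names from above; here
new_tab is assumed to already exist with the same structure as old_tab:

    -- the CASE expression carries the per-row update logic,
    -- so no separate UPDATE pass is needed
    INSERT INTO new_tab
    SELECT pk,
           payload,
           CASE WHEN val < 0 THEN 0
                ELSE val * 2
           END AS val
    FROM old_tab;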

This type of approach is in use in many other DBMSs. I would vote strongly
(how is that possible?) against doing anything at all to MVCC. It works,
it's brilliant, and it doesn't fall foul of anybody else's patent rights.
It would be very hard to rewrite anything as good, and the time spent
would double our other requirements. MySQL has multiple storage managers,
and that must slow them down enormously when trying to test or even
fine-tune things.

Best Regards, Simon Riggs
