I am working on a possible extension of PostgreSQL's MVCC to support very timely failure masking in the context of three-tier applications, so I am currently studying PostgreSQL internals...
I am wondering why the MultiXactIds and the corresponding OFFSETs and MEMBERs are all currently persisted.
The header comment at the top of multixact.c contains the following statement:
"...This allows us to completely rebuild the data entered since the last checkpoint during XLOG replay..."
I can see the need to persist (not eagerly) MultiXactIds to avoid wraparound. Essentially, mass storage is used to extend the limited capacity of the SLRU data structures in shared memory.
The point I am missing is the need to be able to completely recover the multixact offsets and members data. These carry information about the current transactions holding shared locks on tuples, which should not be essential for recovery purposes. After a crash you want to recover the content of your data, not the presence of shared locks on any tuple. AFAICS, this holds both for committed/aborted transactions (which, being concluded, no longer care whether they held a shared lock) and for prepared transactions (which only need to recover their exclusive locks).
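To make clear what I mean by "offsets and members", here is a toy C model of the mapping as I understand it. This is not PostgreSQL code: every name (ToyMultiStore, toy_create, toy_members) is hypothetical, the arrays are tiny fixed-size stand-ins for the SLRU pages, and I am assuming that a multixact's member count is derived from the start offset of the next multixact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;      /* stand-in for TransactionId (assumption) */
typedef uint32_t ToyMultiId;  /* stand-in for MultiXactId (assumption) */

#define TOY_MAX_MULTIS  16
#define TOY_MAX_MEMBERS 64

/* Toy equivalent of the two areas: "offsets" maps a multi id to the
 * index of its first member, "members" holds the lockers' xids. */
typedef struct ToyMultiStore
{
    uint32_t   offsets[TOY_MAX_MULTIS + 1];
    ToyXid     members[TOY_MAX_MEMBERS];
    ToyMultiId next_multi;   /* next id to hand out (toy ids start at 0) */
    uint32_t   next_offset;  /* next free slot in members[] */
} ToyMultiStore;

static void
toy_init(ToyMultiStore *s)
{
    s->next_multi = 0;
    s->next_offset = 0;
    s->offsets[0] = 0;
}

/* Register a new set of shared lockers and return its toy multi id. */
static ToyMultiId
toy_create(ToyMultiStore *s, const ToyXid *xids, size_t n)
{
    assert(s->next_multi < TOY_MAX_MULTIS);
    assert(s->next_offset + n <= TOY_MAX_MEMBERS);

    ToyMultiId id = s->next_multi++;

    s->offsets[id] = s->next_offset;
    for (size_t i = 0; i < n; i++)
        s->members[s->next_offset++] = xids[i];
    /* The end of this multi is the start of the next one. */
    s->offsets[id + 1] = s->next_offset;
    return id;
}

/* Look up the lockers of a multi; returns the member count. */
static size_t
toy_members(const ToyMultiStore *s, ToyMultiId id, const ToyXid **out)
{
    assert(id < s->next_multi);
    *out = &s->members[s->offsets[id]];
    return s->offsets[id + 1] - s->offsets[id];
}
```

In this picture, my question is why the contents of both arrays must be rebuildable after a crash, rather than only the counter that hands out the ids.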
I have dug through the comments in the main multixact.c functions and came across this one (in CreateMultiXactId()):
"...The only way for the MXID to be referenced from any data page is for heap_lock_tuple() to have put it there, and heap_lock_tuple() generates an XLOG record that must follow ours... "
But I still cannot see the need to recover the complete shared-lock information (i.e. not only the MultiXactIds but also the TransactionIds registered as holding the lock)...
Maybe this is needed to support savepoints/subtransactions? Or is there something else that I am missing?
Thanks for your precious help!