In this patch I’ve changed these procedures to the following:
* on prepare, the backend writes data only to xlog and stores a pointer to the start of the xlog record
* if commit occurs before a checkpoint, the backend reads the data from xlog via this pointer
* on checkpoint, 2pc data is copied to files and fsynced
* if commit happens after a checkpoint, the backend reads the files
* in case of a crash, replay will move data from xlog to files (as it was before the patch)
This looks sound to me.
I think we could do better still, but this looks like the easiest 80% and actually removes code.
The lack of substantial comments on the patch is a problem though - the details above should go in the patch. I'll have a go at reworking this for you, this time.
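To make the flow in the patch description concrete, here is a rough in-memory sketch of the five steps. It is only an illustration, not the patch's code: the class and method names (TwoPhaseSim, PreparedXact, prepare, checkpoint, commit_prepared, recover) are hypothetical, the xlog is modelled as a Python list, and the state files as a dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical record layout: (kind, gid, data). Not PostgreSQL's WAL format.
XLogRecord = Tuple[str, str, bytes]


@dataclass
class PreparedXact:
    gid: str
    xlog_ptr: int          # position of the prepare record in the simulated xlog
    on_disk: bool = False  # True once the data has been copied to a state file


@dataclass
class TwoPhaseSim:
    xlog: List[XLogRecord] = field(default_factory=list)
    state_files: Dict[str, bytes] = field(default_factory=dict)
    prepared: Dict[str, PreparedXact] = field(default_factory=dict)

    def prepare(self, gid: str, data: bytes) -> None:
        # Step 1: write only to xlog and remember where the record starts.
        self.xlog.append(("PREPARE", gid, data))
        self.prepared[gid] = PreparedXact(gid, len(self.xlog) - 1)

    def checkpoint(self) -> None:
        # Step 3: copy still-pending 2pc data to files (fsync omitted here).
        for xact in self.prepared.values():
            if not xact.on_disk:
                self.state_files[gid_of(xact)] = self.xlog[xact.xlog_ptr][2]
                xact.on_disk = True

    def commit_prepared(self, gid: str) -> Tuple[bytes, str]:
        xact = self.prepared.pop(gid)
        if xact.on_disk:
            # Step 4: commit after a checkpoint reads the state file.
            data, source = self.state_files.pop(gid), "file"
        else:
            # Step 2: commit before a checkpoint reads xlog via the pointer.
            data, source = self.xlog[xact.xlog_ptr][2], "xlog"
        self.xlog.append(("COMMIT", gid, b""))
        return data, source

    @classmethod
    def recover(cls, xlog: List[XLogRecord]) -> "TwoPhaseSim":
        # Step 5: crash replay moves prepare data from xlog into files.
        sim = cls(xlog=list(xlog))
        for ptr, (kind, gid, data) in enumerate(sim.xlog):
            if kind == "PREPARE":
                sim.state_files[gid] = data
                sim.prepared[gid] = PreparedXact(gid, ptr, on_disk=True)
            elif kind == "COMMIT":
                sim.state_files.pop(gid, None)
                sim.prepared.pop(gid, None)
        return sim


def gid_of(xact: PreparedXact) -> str:
    # Small helper so the checkpoint loop reads clearly.
    return xact.gid
```

In this sketch a transaction that is prepared and committed between two checkpoints never touches a state file, which is where the saving in the description above would come from.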
--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services