Thread: pg_dump and not MVCC-safe commands
How pg_dump interact with not MVCC-safe commands? As I understand, pg_dump first take snapshot and then lock all tables it intended to dump. What happens if not MVCC-safe command committed after snapshot but before lock? From comment to pg_dump.c I understand that it may fail with 'cache lookup failed' error. But, can it happen, that pg_dump not fail, but instead capture inconsistent dump? For example TRUNCATE committed after snapshot and pg_dump will see result of TRUNCATE but not result of other commands in TRUNCATE transaction?
Hi,
Le lun. 20 mai 2024 à 11:27, PetSerAl <petseral@gmail.com> a écrit :
How pg_dump interact with not MVCC-safe commands?
As I understand, pg_dump first take snapshot and then lock all tables
it intended to dump. What happens if not MVCC-safe command committed
after snapshot but before lock? From comment to pg_dump.c I understand
that it may fail with 'cache lookup failed' error. But, can it happen,
that pg_dump not fail, but instead capture inconsistent dump? For
example TRUNCATE committed after snapshot and pg_dump will see result
of TRUNCATE but not result of other commands in TRUNCATE transaction?
You can't truncate an already existing table while pg_dump is running. TRUNCATE needs an exclusive lock, and pg_dump already has a lock on all tables of the database it's dumping. So TRUNCATE will be blocked until pg_dump finishes all its work.
(The same will happen for VACUUM FULL, CLUSTER and some (all?) ALTER TABLE commands.)
--
Guillaume.
My question: What happens if not MVCC-safe command committed after snapshot but before lock? On Mon, May 20, 2024 at 12:33 PM Guillaume Lelarge <guillaume@lelarge.info> wrote: > > Hi, > > Le lun. 20 mai 2024 à 11:27, PetSerAl <petseral@gmail.com> a écrit : >> >> How pg_dump interact with not MVCC-safe commands? >> >> As I understand, pg_dump first take snapshot and then lock all tables >> it intended to dump. What happens if not MVCC-safe command committed >> after snapshot but before lock? From comment to pg_dump.c I understand >> that it may fail with 'cache lookup failed' error. But, can it happen, >> that pg_dump not fail, but instead capture inconsistent dump? For >> example TRUNCATE committed after snapshot and pg_dump will see result >> of TRUNCATE but not result of other commands in TRUNCATE transaction? >> >> > > You can't truncate an already existing table while pg_dump is running. TRUNCATE needs an exclusive lock, and pg_dump alreadyhas a lock on all tables of the database it's dumping. So TRUNCATE will be blocked until pg_dump finishes all itswork. > > (The same will happen for VACUUM FULL, CLUSTER and some (all?) ALTER TABLE commands.) > > > -- > Guillaume.
PetSerAl <petseral@gmail.com> writes: > My question: > What happens if not MVCC-safe command committed after snapshot but before lock? Then you'd get a dump that's not exactly consistent with the state at the time of the snapshot. However, it would be the very same database state that any other query would see at that time. So if it's functionally inconsistent for your purposes, then you brought that on your own head by using TRUNCATE concurrently with other operations. The other thing I can think of that's a hazard in this area is that pg_dump will record current values of sequence objects that may be advanced beyond where the sequence was at the instant of the snapshot, since nextval() isn't MVCC-safe. There are a lot of other ways (rollbacks and crashes) in which a sequence can get advanced beyond the last derived value you can find in the database, so an application that finds this to be unacceptable should probably not be using sequences. regards, tom lane
> However, it would be the very same > database state that any other query would see at that time. Other queries can use different techniques to prevent this. For example, lock tables before snapshot, as recommended in documentation. I understand, that pg_dump can not use that, as it need snapshot to enumerate tables to lock. > then you brought > that on your own head by using TRUNCATE concurrently with other > operations. People, who admins database servers, are not always the same people, who write application/queries. That means, if you know nothing about application, you can not guarantee, if live database backup would be consistent. Basically, you need application cooperation to make consistent live database backup.
> On May 20, 2024, at 08:49, PetSerAl <petseral@gmail.com> wrote: > Basically, you need application cooperation to make > consistent live database backup. If it is critical that you have a completely consistent backup as of a particular point in time, and you are not concernedabout restoring to a different processor architecture, pg_basebackup is a superior solution to pg_dump.
On Mon, May 20, 2024 at 11:54 AM Christophe Pettus <xof@thebuild.com> wrote:
> On May 20, 2024, at 08:49, PetSerAl <petseral@gmail.com> wrote:
> Basically, you need application cooperation to make
> consistent live database backup.
If it is critical that you have a completely consistent backup as of a particular point in time, and you are not concerned about restoring to a different processor architecture, pg_basebackup is a superior solution to pg_dump.
Single-threaded, and thus dreadfully slow. I'll stick with PgBackRest.