Thread: Unsynchronized parallel dumps from 13.3 replica produced by pg_dump
Hi,
Any ideas would be much appreciated.
Thanks,
Chris
Chris Williams <cswilliams@gmail.com> writes:
> We have a script that runs a pg_dump off of an RDS PG13.3 replica several
> times per day. We then load this dump using pg_restore into another
> postgres RDS db in another AWS account, scrub some of the data, and then
> take a snapshot of it.

Hmm ... I'm fairly sure that RDS Postgres is not Postgres at this level
of detail. The info I've been able to find about their replication
mechanism talks about things like "eventually consistent reads", which
is not something community Postgres deals in.

In particular, what I'd expect from the community code is that a replica
could see a sequence as being *ahead* of the value that you might expect
from looking at related tables; but never behind. (Also, that statement
is true regardless of whether you are doing parallel dump.) And
non-sequence tables should always be consistent, period.

So I'm suspicious that this is an RDS-specific effect, and thus that
you should consult Amazon support first. If they say "no, it's Postgres
all the way down", then we need to look closer.

regards, tom lane
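The expectation stated above (a replica may see a sequence ahead of the related table's values, but never behind) can be turned into a simple check on a restored dump. The following is only a rough sketch, not anything from pg_dump itself: the `SequenceCheck` class, the `find_lagging_sequences` helper, and the table names are hypothetical, and it assumes each table's ids come only from one ascending sequence. The two numbers per table would be gathered separately, for example with `SELECT last_value FROM <sequence>` and `SELECT max(id) FROM <table>`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SequenceCheck:
    """One sequence/table pair observed in a restored dump (hypothetical helper)."""
    table: str
    sequence_last_value: int
    table_max_id: Optional[int]  # None when the table has no rows

    @property
    def status(self) -> str:
        # Assumption: ids are assigned only from this ascending sequence,
        # so the sequence being ahead of (or equal to) max(id) is expected.
        if self.table_max_id is None:
            return "ok"
        if self.sequence_last_value >= self.table_max_id:
            return "ok"
        # The sequence is behind the table: the case that should not
        # happen with community Postgres, per the message above.
        return "behind"


def find_lagging_sequences(checks: List[SequenceCheck]) -> List[SequenceCheck]:
    """Return the pairs where the sequence is behind the table's max id."""
    return [c for c in checks if c.status == "behind"]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [
        SequenceCheck("orders", 105, 100),
        SequenceCheck("users", 40, 42),
        SequenceCheck("audit_log", 1, None),
    ]
    for check in find_lagging_sequences(sample):
        print(f"{check.table}: sequence at {check.sequence_last_value}, "
              f"max id {check.table_max_id}")
```

A non-empty result for a dump taken from a community Postgres replica would be the surprising case worth reporting; rows inserted with explicit ids would break the assumption and produce false positives.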
Thanks Tom. It's a strange one for sure. Hopefully AWS support will shed some light on it. I will clarify too that this is the regular RDS Postgres version and not their other Aurora Postgres service. I suspect the Aurora Postgres probably differs from the community version by quite a bit, but I'm unsure how much their regular Postgres offering differs, if at all.
Thanks,
Chris
On Mon, Oct 18, 2021 at 8:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Chris Williams <cswilliams@gmail.com> writes:
> > We have a script that runs a pg_dump off of an RDS PG13.3 replica several
> > times per day. We then load this dump using pg_restore into another
> > postgres RDS db in another AWS account, scrub some of the data, and then
> > take a snapshot of it.
> Hmm ... I'm fairly sure that RDS Postgres is not Postgres at this level
> of detail. The info I've been able to find about their replication
> mechanism talks about things like "eventually consistent reads", which
> is not something community Postgres deals in.
> In particular, what I'd expect from the community code is that a replica
> could see a sequence as being *ahead* of the value that you might expect
> from looking at related tables; but never behind. (Also, that statement
> is true regardless of whether you are doing parallel dump.) And
> non-sequence tables should always be consistent, period.
> So I'm suspicious that this is an RDS-specific effect, and thus that
> you should consult Amazon support first. If they say "no, it's Postgres
> all the way down", then we need to look closer.
> regards, tom lane
Chris Williams <cswilliams@gmail.com> writes:
> Thanks Tom. It's a strange one for sure. Hopefully AWS support will shed
> some light on it. I will clarify too that this is the regular RDS Postgres
> version and not their other Aurora Postgres service. I suspect the Aurora
> Postgres probably differs from the community version by quite a bit, but
> I'm unsure how much their regular Postgres offering differs, if at all.

Yeah, Aurora is definitely a different beast at the storage level.
I'm not entirely sure about RDS.

regards, tom lane