On 1/11/23 21:58, Andres Freund wrote:
> Hi,
>
> On 2023-01-11 15:41:45 -0500, Robert Haas wrote:
>> I wonder, then, what happens if somebody wants to do parallel apply. That
>> would seem to require some relaxation of this rule, but then doesn't that
>> break what this patch wants to do?
>
> I don't think it'd pose a direct problem - presumably you'd only parallelize
> applying changes, not committing the transactions containing them. You'd get a
> lot of inconsistencies otherwise.
>
Right. It's the commit order that matters - as long as that's
maintained, the result should be consistent etc.
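
To make the commit-order point a bit more concrete, here is a toy sketch in Python (not PostgreSQL code). It assumes each transaction is handed to its own worker thread, and that a hypothetical CommitSequencer only lets a worker commit once everything earlier in the upstream commit order has committed. All names here are made up for illustration.

```python
import threading


class CommitSequencer:
    """Lets workers apply changes concurrently but commit strictly in order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next = 0        # position in commit order allowed to commit next
        self.committed = []   # transaction ids, in the order they committed

    def commit(self, position, xid):
        with self._cond:
            # Block until every transaction earlier in commit order is done.
            self._cond.wait_for(lambda: self._next == position)
            self.committed.append(xid)
            self._next += 1
            self._cond.notify_all()


def apply_in_parallel(transactions):
    """transactions: list of (xid, changes) in upstream commit order."""
    sequencer = CommitSequencer()
    applied = {}  # xid -> number of changes applied (stand-in for real work)

    def worker(position, xid, changes):
        applied[xid] = len(changes)      # "apply" phase: may run in any order
        sequencer.commit(position, xid)  # "commit" phase: serialized

    threads = [
        threading.Thread(target=worker, args=(pos, xid, changes))
        for pos, (xid, changes) in enumerate(transactions)
    ]
    # Start workers in reverse to show that start order does not matter.
    for t in reversed(threads):
        t.start()
    for t in threads:
        t.join()
    return sequencer.committed, applied
```

The toy deliberately ignores locks taken during the apply phase, which is exactly where the problems described next come from.
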
There are plenty of other hard problems, though - for example it's trivial
for the apply workers to apply the changes in the incorrect order
(contradicting commit order) and then hit a deadlock. And the deadlock
detector may easily keep aborting the incorrect worker (the oldest one),
so that replication grinds to a halt.
I was wondering recently how far we would get by just doing prefetch for
logical apply - instead of applying the changes, just try doing a lookup
on the replica identity values, and then do a simple serial apply.
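
A rough sketch of that prefetch idea, again as a toy Python model and not anything resembling the actual apply worker. ToyTable, its cache set and the cold_reads counter are assumptions invented for the example; the point is only that the lookups can run concurrently in any order, while the apply itself stays serial and in commit order.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyTable:
    """Stand-in for a subscriber table with a buffer cache in front of it."""

    def __init__(self, rows):
        self._storage = dict(rows)  # replica identity key -> row value
        self._lock = threading.Lock()
        self.cache = set()          # keys whose "pages" are already warm
        self.cold_reads = 0         # apply-time lookups that missed the cache

    def prefetch(self, key):
        # Lookup only: warm the cache, change nothing.
        with self._lock:
            self.cache.add(key)

    def apply(self, key, value):
        with self._lock:
            if key not in self.cache:
                self.cold_reads += 1
                self.cache.add(key)
            self._storage[key] = value

    def snapshot(self):
        return dict(self._storage)


def prefetch_then_apply(table, changes, workers=4):
    """changes: list of (key, new_value) in commit order."""
    keys = {key for key, _ in changes}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: lookups may run concurrently and in any order.
        list(pool.map(table.prefetch, keys))
    # Phase 2: plain serial apply, so ordering is never in question.
    for key, value in changes:
        table.apply(key, value)
    return table
```

Since the prefetch phase never modifies anything, it can't introduce the ordering or deadlock issues mentioned above; whether it helps enough in practice is the open question.
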
> If you're thinking of decoding changes in parallel (rather than streaming out
> large changes before commit when possible), you'd only be able to do that in
> cases when transaction haven't performed catalog changes, I think. In which
> case there'd also be no issue wrt transactional sequence changes.
>
Perhaps, although it's not clear to me how you would know that in
advance? I mean, you could start decoding changes in parallel, and then
find that one of the earlier transactions touched a catalog.

But maybe I misunderstand what "decoding" refers to - don't we need the
snapshot only in reorderbuffer? In which case all the other stuff could
be parallelized (not sure if that's really expensive).
Anyway, all of this is far outside the scope of this patch.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company