Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers

From Alexey Kondratov
Subject Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Date
Msg-id 6dcf26af-2ebc-4149-8c60-070ec6214954@postgrespro.ru
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Hi Tomas,

On 14.01.2019 21:23, Tomas Vondra wrote:
> Attached is an updated patch series, merging fixes and changes to TAP
> tests proposed by Alexey. I've merged the fixes into the appropriate
> patches, and I've kept the TAP changes / new tests as separate patches
> towards the end of the series.

I had problems applying this patch along with 2pc streaming one to the 
current master, but everything applied well on 97c39498e5. Regression 
tests pass. What I personally do not like in the current TAP tests set 
is that you have added "WITH (streaming=on)" to all tests including old 
non-streaming ones. It seems unclear, which mechanism is tested there: 
streaming, but those transactions probably do not hit memory limit, so 
it depends on default server parameters; or non-streaming, but then what 
is the need for (streaming=on)? I would prefer to add (streaming=on) 
only to the new tests, where it is clearly necessary.

> I'm a bit unhappy with two aspects of the current patch series:
>
> 1) We now track schema changes in two ways - using the pre-existing
> schema_sent flag in RelationSyncEntry, and the (newly added) flag in
> ReorderBuffer. While those options are used for regular vs. streamed
> transactions, fundamentally it's the same thing and so having two
> competing ways seems like a bad idea. Not sure what's the best way to
> resolve this, though.

Yes, sure, when I have found problems with streaming of extensive DDL, I 
added new flag in the simplest way, and it worked. Now, old schema_sent 
flag is per relation based, while the new one - is_schema_sent - is per 
top-level transaction based. If I get it correctly, the former seems to 
be more thrifty, since new schema is sent only if we are streaming 
change for relation, whose schema is outdated. In contrast, in the 
latter case we will send new schema even if there will be no new changes 
which belong to this relation.

I guess, it would be better to stick to the old behavior. I will try to 
investigate how to better use it in the streaming mode as well.

> 2) We've removed quite a few asserts, particularly ensuring sanity of
> cmin/cmax values. To some extent that's expected, because by allowing
> decoding of in-progress transactions relaxes some of those rules. But
> I'd be much happier if some of those asserts could be reinstated, even
> if only in a weaker form.


Asserts have been removed from two places: (1) 
HeapTupleSatisfiesHistoricMVCC, which seems inevitable, since we are 
touching the essence of the MVCC visibility rules, when trying to decode 
an in-progress transaction, and (2) ReorderBufferBuildTupleCidHash, 
which is probably not related directly to the topic of the ongoing 
patch, since Arseny Sher faced the same issue with simple repetitive DDL 
decoding [1] recently.

Not many, but I agree, that replacing them with some softer asserts 
would be better, than just removing, especially point 1).


[1] https://www.postgresql.org/message-id/flat/874l9p8hyw.fsf%40ars-thinkpad


Regards

-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY
Next
From: Andreas Karlsson
Date:
Subject: Re: Feature: temporary materialized views