Re: POC and rebased patch for CSN based snapshots - Mailing list pgsql-hackers

From movead.li@highgo.ca
Subject Re: POC and rebased patch for CSN based snapshots
Date
Msg-id 2020070422561540463621@highgo.ca
In response to POC and rebased patch for CSN based snapshots  (Movead Li <movead.li@highgo.ca>)
Responses Re: POC and rebased patch for CSN based snapshots
List pgsql-hackers
Hello Andrey

>> I have studied your patch, which is great. In the patch, only data outside
>> the 'global_snapshot_defer_time' window can be vacuumed, and it keeps dead
>> tuples even if no snapshot is imported at all, right?
>>
>> I am thinking about whether we can start retaining dead tuples only just
>> before we import a CSN snapshot.
>>
>> Based on the Clock-SI paper, we should get the local CSN and then send it to
>> the shard nodes. Because we do not know whether a shard node's CSN is bigger
>> or smaller than the master node's, we have to keep some dead tuples all the
>> time to support snapshot import at any time.
>>
>> If we make a small change to the Clock-SI model, we do not use the
>> local CSN when the transaction starts. Instead we contact every shard node
>> to request its CSN, the shard nodes start keeping dead tuples, and the
>> master node chooses the biggest CSN to send to the shard nodes.
>>
>> This way, we do not need to keep dead tuples all the time and do
>> not need to manage a ring buffer; we can hand that off to the 'snapshot too
>> old' feature. The trade-off is that almost all shard nodes need to wait.
>> I will send a more detailed explanation in a few days.
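
For reference, the modified scheme quoted above (ask every shard for its CSN, let shards start retaining dead tuples at that moment, then broadcast the largest CSN) can be sketched roughly as follows. This is only a toy model, not code from the patch: the `Shard` class, `report_csn`, `import_snapshot`, and `begin_global_snapshot` are hypothetical names, CSNs are plain integers, and "waiting" is simulated by advancing the shard's local CSN.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Shard:
    """Toy shard node; all names here are hypothetical, not from the patch."""
    name: str
    csn: int                           # current local CSN (stand-in for the node clock)
    retain_from: Optional[int] = None  # CSN from which dead tuples are retained

    def report_csn(self) -> int:
        # The shard begins retaining dead tuples only when it is first asked,
        # instead of keeping them all the time.
        if self.retain_from is None:
            self.retain_from = self.csn
        return self.csn

    def import_snapshot(self, snapshot_csn: int) -> int:
        # A shard whose local CSN is behind the chosen one has to wait until
        # it catches up. The wait is simulated by advancing the local CSN.
        wait = max(0, snapshot_csn - self.csn)
        self.csn = max(self.csn, snapshot_csn)
        return wait


def begin_global_snapshot(shards: List[Shard]) -> Tuple[int, Dict[str, int]]:
    # Round 1: contact every shard and collect its current CSN.
    reported = {s.name: s.report_csn() for s in shards}
    # The master picks the biggest CSN as the global snapshot CSN.
    snapshot_csn = max(reported.values())
    # Round 2: send the chosen CSN to every shard and record how long each waits.
    waits = {s.name: s.import_snapshot(snapshot_csn) for s in shards}
    return snapshot_csn, waits


shards = [Shard("shard1", 100), Shard("shard2", 120), Shard("shard3", 90)]
snapshot_csn, waits = begin_global_snapshot(shards)
print(snapshot_csn, waits)
```

In this toy run every shard except the one that reported the largest CSN has a non-zero wait, which is the trade-off mentioned in the quoted text, and the extra round trip to every shard is visible as the first loop.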
>I think that in a distributed system with many servers it can become a
>bottleneck.
>The main idea of "deferred time" is to reduce interference between DML
>queries under an intensive OLTP workload. This time can be reduced
>if the bloating of a database prevails over the frequency of
>transaction aborts.
OK, there may be a performance issue, and I have another question about Clock-SI.

For example, we have three nodes, shard1 (as master), shard2, and shard3, where
(time of node1) > (time of node2) > (time of node3), as shown in this picture:
http://movead.gitee.io/picture/blog_img_bad/csn/clock_si_question.png

As far as I understand Clock-SI, the part to the left of the blue line will be set
up as the snapshot if the master requests a snapshot at time t1. But in fact data A
should be in the snapshot but is not, and data B should be outside the snapshot but
is included.

Can this scenario occur with your original patch? Or is my understanding of
Clock-SI wrong?

