Thread: Few observations in replication slots related code

Few observations in replication slots related code

From
Amit Kapila
Date:
Few observations in Replication slots related code:

1. In function StartupReplicationSlots(XLogRecPtr checkPointRedo),
parameter checkPointRedo is not used.


2. Few check are in different order in functions
pg_create_physical_replication_slot() and 
pg_create_logical_replication_slot().

if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
elog(ERROR, "return type must be a row type");
check_permissions();

CheckLogicalDecodingRequirements();

I don't think it makes any difference, however having checks in similar
order seems better unless there is a reason for not doing so.

3. Currently there is any Assert (Assert(!MyReplicationSlot);) in
pg_create_logical_replication_slot(), is there a need for similar
Assert in pg_create_physical_replication_slot()?

4.
StartupDecodingContext()
{
..
context = AllocSetContextCreate(CurrentMemoryContext,
"Changeset Extraction Context",
}

To be consistent, shall we name this context as
logical | logical decoding?

5.
pg_create_logical_replication_slot()
{
..
CreateInitDecodingContext()

..
ReplicationSlotPersist()
}

Function pg_create_logical_replication_slot() is trying to
save slot twice once during CreateInitDecodingContext() and
then in ReplicationSlotPersist(), isn't it better if we can make
it do it just once? 

6.
elog(ERROR, "cannot handle changeset extraction yet");

Shouldn't it be better to use logical replication instead
of changeset extraction?

7.
+ * XXX: It might make sense to introduce ephemeral slots and always use
+ * the slot mechanism.

Already there are RS_EPHEMERAL slots, might be this
comment needs bit of rephrasing.

8. Is there a chance of inconsistency, if pg_restxlog resets the
xlog and after restart, one of the slots contains xlog position
older than what is resetted by pg_resetxlog?

9.
void
LogicalIncreaseXminForSlot(XLogRecPtr current_lsn, TransactionId xmin)
{
..
/*
* don't overwrite if we already have a newer xmin. This can happen if we
* restart decoding in a slot.
*/
if (TransactionIdPrecedesOrEquals(xmin, slot->data.catalog_xmin))
{
}
..
}

Should we just release spinlock in this loop and just return,
rather than keeping no action loop?

10.
* Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a suitable
 * index. Otherwise, it should be InvalidOid.
 */
static void
relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
  bool is_internal)

typo - *Iff*

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Few observations in replication slots related code

From
Andres Freund
Date:
Hi Amit,

On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
> 1. In function StartupReplicationSlots(XLogRecPtr checkPointRedo),
> parameter checkPointRedo is not used.

It used to be included in a debug message. Apparently the message was
removed at some point (don't remember it, but I have a memory like a
sieve).

> 2. Few check are in different order in functions
> pg_create_physical_replication_slot() and
> pg_create_logical_replication_slot().
> 
> if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> elog(ERROR, "return type must be a row type");
> check_permissions();
> 
> CheckLogicalDecodingRequirements();
> 
> I don't think it makes any difference, however having checks in similar
> order seems better unless there is a reason for not doing so.

Can change it.

> 3. Currently there is any Assert (Assert(!MyReplicationSlot);) in
> pg_create_logical_replication_slot(), is there a need for similar
> Assert in pg_create_physical_replication_slot()?

Hm. Shouldn't really matter either way these days, but I guess it
doesn't hurt to add one.

> 4.
> StartupDecodingContext()
> {
> ..
> context = AllocSetContextCreate(CurrentMemoryContext,
> "Changeset Extraction Context",
> }
> 
> To be consistent, shall we name this context as
> logical | logical decoding?

Heh. That apparently escaped when things were renamed. Yes.

> 5.
> pg_create_logical_replication_slot()
> {
> ..
> CreateInitDecodingContext()
> 
> ..
> ReplicationSlotPersist()
> }
> 
> Function pg_create_logical_replication_slot() is trying to
> save slot twice once during CreateInitDecodingContext() and
> then in ReplicationSlotPersist(), isn't it better if we can make
> it do it just once?

Doesn't work here. In the first save it's not yet marked as persistent -
but we still need to safely reserve the xmin. It's also not something
that should happen very frequently, so I'm not worried about the
overhead.

> 6.
> elog(ERROR, "cannot handle changeset extraction yet");
> 
> Shouldn't it be better to use logical replication instead
> of changeset extraction?

Will change.

> 7.
> + * XXX: It might make sense to introduce ephemeral slots and always use
> + * the slot mechanism.
> 
> Already there are RS_EPHEMERAL slots, might be this
> comment needs bit of rephrasing.

> 8. Is there a chance of inconsistency, if pg_restxlog resets the
> xlog and after restart, one of the slots contains xlog position
> older than what is resetted by pg_resetxlog?

Yes. There's lots of ways to screw over your database by using
pg_resetxlog. I personally think there should be a required
--yes-i-am-breaking-my-database parameter for pg_resetxlog.

> 9.
> void
> LogicalIncreaseXminForSlot(XLogRecPtr current_lsn, TransactionId xmin)
> {
> ..
> /*
>  * don't overwrite if we already have a newer xmin. This can happen if we
>  * restart decoding in a slot.
>  */
> if (TransactionIdPrecedesOrEquals(xmin, slot->data.catalog_xmin))
> {
> }
> ..
> }
> 
> Should we just release spinlock in this loop and just return,
> rather than keeping no action loop?

Don't think that'd make it any faster/simpler. We'd be stuck with
duplicate codepaths.

> 10.
> * Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a
> suitable
>  * index. Otherwise, it should be InvalidOid.
>  */
> static void
> relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
>    bool is_internal)
> 
> typo - *Iff*

iff := if and only if.

Thanks for the look.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Few observations in replication slots related code

From
Andres Freund
Date:
Hi,

On 2014-06-12 09:15:08 +0200, Andres Freund wrote:
> > 6.
> > elog(ERROR, "cannot handle changeset extraction yet");
> > 
> > Shouldn't it be better to use logical replication instead
> > of changeset extraction?
> 
> Will change.

I don't see that message anywhere in current code? All of those were
rephrased in b89e151054a05f0f6d356ca52e3b725dd0505e53 (where Robert
changed the term to logical decoding) and then implemented in
5a991ef8692ed0d170b44958a81a6bd70e90585c.

Pushed a fix for the rest of the ones I commented on earlier. Thanks!

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Few observations in replication slots related code

From
Amit Kapila
Date:
On Thu, Jun 12, 2014 at 5:02 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-06-12 09:15:08 +0200, Andres Freund wrote:
> > > 6.
> > > elog(ERROR, "cannot handle changeset extraction yet");
> > >
> > > Shouldn't it be better to use logical replication instead
> > > of changeset extraction?
> >
> > Will change.
>
> I don't see that message anywhere in current code? 

Right, actually I was reading code from Git History and also
referring latest code, so it seems I have seen that in original
commit and missed to verify it in latest code.

While checking latest code, I got usage of *changeset extraction*
in below comment:

/*

..

*

 * This is useful to initialize the cutoff xid after which a new changeset

 * extraction replication slot can start decoding changes.

 *

..

*/

  

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Few observations in replication slots related code

From
Amit Kapila
Date:
On Thu, Jun 12, 2014 at 12:45 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
> > Function pg_create_logical_replication_slot() is trying to
> > save slot twice once during CreateInitDecodingContext() and
> > then in ReplicationSlotPersist(), isn't it better if we can make
> > it do it just once?
>
> Doesn't work here. In the first save it's not yet marked as persistent -
> but we still need to safely reserve the xmin.

Okay, but if it crashes before saving the persistency to permanent
file and there remains a .tmp for this replication slot which it created
during save of this persistency information, then also xmin will get
lost, because during startup it will not consider such a slot.

> It's also not something
> that should happen very frequently, so I'm not worried about the
> overhead.

Right, just looking from the purpose of need for same.

> > 8. Is there a chance of inconsistency, if pg_restxlog resets the
> > xlog and after restart, one of the slots contains xlog position
> > older than what is resetted by pg_resetxlog?
>
> Yes. There's lots of ways to screw over your database by using
> pg_resetxlog.

Thats right, trying to think if there could be any thing which
won't even allow the server to get started due to replication slots
after pg_resetxlog.  As per my current understanding, I don't think
there is any such problem.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Few observations in replication slots related code

From
Andres Freund
Date:
On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
> On Thu, Jun 12, 2014 at 12:45 PM, Andres Freund <andres@2ndquadrant.com>
> wrote:
> > On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
> > > Function pg_create_logical_replication_slot() is trying to
> > > save slot twice once during CreateInitDecodingContext() and
> > > then in ReplicationSlotPersist(), isn't it better if we can make
> > > it do it just once?
> >
> > Doesn't work here. In the first save it's not yet marked as persistent -
> > but we still need to safely reserve the xmin.
> 
> Okay, but if it crashes before saving the persistency to permanent
> file and there remains a .tmp for this replication slot which it created
> during save of this persistency information, then also xmin will get
> lost, because during startup it will not consider such a slot.

I can't follow here. If it crashes before it's marked persistent it'll
get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
cleaned up at restart.
It's fine if the entire slot is lost if the server crashes too early -
we haven't yet returned sucess...

> > Yes. There's lots of ways to screw over your database by using
> > pg_resetxlog.
> 
> Thats right, trying to think if there could be any thing which
> won't even allow the server to get started due to replication slots
> after pg_resetxlog.  As per my current understanding, I don't think
> there is any such problem.

I'm not aware of any.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Few observations in replication slots related code

From
Amit Kapila
Date:
On Fri, Jun 13, 2014 at 1:45 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
> > Okay, but if it crashes before saving the persistency to permanent
> > file and there remains a .tmp for this replication slot which it created
> > during save of this persistency information, then also xmin will get
> > lost, because during startup it will not consider such a slot.
>
> I can't follow here. If it crashes before it's marked persistent it'll
> get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
> cleaned up at restart.

Yes that's fine, the point I wanted to say here is that the flush
of slot in CreateInitDecodingContext() wouldn't save xmin in all
cases which is I think what it want to achieve.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Few observations in replication slots related code

From
Andres Freund
Date:
On 2014-06-13 14:21:33 +0530, Amit Kapila wrote:
> On Fri, Jun 13, 2014 at 1:45 PM, Andres Freund <andres@2ndquadrant.com>
> wrote:
> > On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
> > > Okay, but if it crashes before saving the persistency to permanent
> > > file and there remains a .tmp for this replication slot which it created
> > > during save of this persistency information, then also xmin will get
> > > lost, because during startup it will not consider such a slot.
> >
> > I can't follow here. If it crashes before it's marked persistent it'll
> > get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
> > cleaned up at restart.
> 
> Yes that's fine, the point I wanted to say here is that the flush
> of slot in CreateInitDecodingContext() wouldn't save xmin in all
> cases which is I think what it want to achieve.

It'll savely stored until a xact abort (will be deleted during cleanup),
PANIC crash (deleted during startup) or until the slot is marked as
persistent. Which is what we need.

Maybe this helps:/* * For logical decoding, it's extremely important that we never remove any * data that's still
neededfor decoding purposes, even after a crash; * otherwise, decoding will produce wrong answers.  Ordinary streaming
*replication also needs to prevent old row versions from being removed * too soon, but the worst consequence we might
encounterthere is * unwanted query cancellations on the standby.  Thus, for logical * decoding, this value represents
thelatest xmin that has actually been * written to disk, whereas for streaming replication, it's just the same * as the
persistentvalue (data.xmin). */TransactionId effective_xmin;TransactionId effective_catalog_xmin;
 


Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services