Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Date
Msg-id 67772.1772462341@localhost
Whole thread Raw
In response to Re: Adding REPACK [concurrently]  (Mihail Nikalayeu <mihailnikalayeu@gmail.com>)
Responses Re: Adding REPACK [concurrently]
List pgsql-hackers
Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:

> Some review comments:

Thanks again!

> ------------
> 
> > attrs = palloc0_array(Datum, desc->natts);
> > isnull = palloc0_array(bool, desc->natts);
> 
> It looks like there is a memory leak with those arrays.

I suppose you mean store_change(). Yes, I tried to free the individual chunks
and forgot these. The next version uses a new, per-change memory context.

> > ident_idx = RelationGetReplicaIndex(rel);
> > if (!OidIsValid(ident_idx) && OidIsValid(rel->rd_pkindex))
> 
> check_repack_concurrently_requirements uses rd_pkindex as fallback.
> 
> But rebuild_relation_finish_concurrent does not contain such logic:
> 
> > ident_idx_old = RelationGetReplicaIndex(OldHeap);

Good point. I added a new argument to rebuild_relation_finish_concurrent() so
that the identity index is only determined once.

> ------------
> 
> > >
> > > > ConditionVariablePrepareToSleep(&shared->cv);
> > > >   for (;;)
> > > >   {
> > > >      bool      initialized;
> > > >
> > > >      SpinLockAcquire(&shared->mutex);
> > > >      initialized = shared->initialized;
> > > >      SpinLockRelease(&shared->mutex);
> > > src/backend/commands/cluster.c:3663
> > >
> > > I think we should check GetBackgroundWorkerPid for worker status, to
> > > not wait forever in case of some issue with the worker.
> 
> > ConditionVariableSleep() calls CHECK_FOR_INTERRUPTS(). That should process
> > error messages from the worker.
> 
> Hm, yes, and RepackWorkerShutdown will detach the queue. But
> ProcessRepackMessages does not react somehow to SHM_MQ_DETACHED - just
> ignores. Or am I missing something?

On ERROR / FATAL, RepackWorkerShutdown() should send the message before
detaching. elog.c does it via send_message_to_frontend(), due to the previous
call of pq_redirect_to_shm_mq() in RepackWorkerMain(). ProcessRepackMessages()
then only needs to care about the message, not about the worker's detaching.

> ------------
> 
> > build_identity_key
> > ....
> > n = ident_idx->indnatts;
> 
> Should we use indnkeyatts here?

Definitely. I missed the addition of the INCLUDE columns feature during
maintenance of pg_squeeze, and copied the bug to REPACK. Fixed.

> ------------
> 
> > build_identity_key
> > ....
> > entry->sk_collation = att->attcollation;
> 
> Should we use index collation (not heap) here?
> entry->sk_collation =  ident_idx_rel->rd_indcollation[i];

AFAIC they should be equal, but what you propose simplifies the code a
bit. Done.

> ------------
> 
> > SnapBuildInitialSnapshotForRepack
> 
> What is about to add defensive checks like SnapBuildInitialSnapshot does?
> 
> > if (!builder->committed.includes_all_transactions)
> >    elog(ERROR, "cannot build an initial slot snapshot, not all transactions are monitored anymore");

Initially I added a header comment (XXX) to
SnapBuildInitialSnapshotForRepack() saying that some of the checks, including
this one, could be adopted. The checks were problematic in the backend
executing REPACK. However, they appear to be fine if the code is executed by
the logical decoding worker. So what I'm trying now is to add a new argument
"repack" to SnapBuildInitialSnapshot() and remove the original 0003 diff
("Move conversion of a historic to MVCC snapshot to a separate function.")
from the series altogether.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Attachment

pgsql-hackers by date:

Previous
From: Madyshev Egor
Date:
Subject: RE: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
Next
From: Fujii Masao
Date:
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication