Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Date
Msg-id 8029.1770024929@localhost
Whole thread Raw
In response to Re: Adding REPACK [concurrently]  (Mihail Nikalayeu <mihailnikalayeu@gmail.com>)
Responses Re: Adding REPACK [concurrently]
List pgsql-hackers
Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:

> PART 1:
>
> --------------
>
> Something still wrong with 0006, check:
>
> 'pgbench: error: client 12 script 0 aborted in command 2 query 0: ERROR:  attempted to overwrite invisible tuple
> https://cirrus-ci.com/task/6385612527239168?logs=test_world#L300
>
> But it is hard to reproduce - happened once.
>
> --------------
>
> Also, once I got
> [16:25:18.641] #   at /tmp/cirrus-ci-build/contrib/amcheck/t/007_repack_concurrently.pl line 57.
> [16:25:18.641] #                   'pgbench: error: client 6 script 0 aborted in command 2 query 0: ERROR:  relation
21856deleted while still in use 
> https://cirrus-ci.com/task/4686014242881536?logs=test_world#L384
>
> It was the PROC_IN_REPACK version (see below), but I think it is not related to it. But I'm not 100% sure.

I think it *is* related. My earlier patch version, which used the
PROC_IN_VACUUM flag improperly [1] was also causing visibility issues.  Please
let me know if you manage to reproduce the issue with v32.

> PART 2:
>
> > I'm considering a special kind of relation whose catalog entries remain in the
> >  catalog cache and are never written to the catalog tables. (Unlike temporary
> >  relation, it'd be WAL logged so that REPACK can be replayed on standby.)
>
> I think it is too complicated, especially including replication logic.

I'm confused by hearing a complaint about complexity of code that I haven't
posted yet. And I don't understand the relationship to "replication logic":
REPACK (CONCURRENTLY) tries to avoid decoding of data changes in the *new*
(transient) relation anyway.

> Essentially we have two issues:
> 1) make sure catalog entities are not dropped because the vacuum
> 2) make sure data in new table is not vacuumed also

3) XID assigned early due to creation of catalog entries for the new table -
that XID prevents the VACUUM xmin horizon from advancing till the end of the
transaction, i.e. till the end of REPACK execution.

> Also, I am still not sure if MVCC-safe implementation is worth its complexity compared with "relcheckxmin"approach
[0].

IMO it's better for users to see the correct data than ERROR. But it still
needs work.

[1] https://www.postgresql.org/message-id/88003.1769511456%40localhost

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



pgsql-hackers by date:

Previous
From: David Geier
Date:
Subject: Re: Hash-based MCV matching for large IN-lists
Next
From: Andrei Lepikhov
Date:
Subject: Re: Is there value in having optimizer stats for joins/foreignkeys?