Greetings,
* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
> On Tue, Jan 28, 2020 at 02:34:07PM -0500, Stephen Frost wrote:
> >We certainly can't run external commands during transaction COMMIT, so
> >this can't be part of a regular DROP TABLE.
>
> IMO the best solution would be that the DROP TABLE does everything as
> usual, but instead of deleting the relfilenode it moves it to some sort
> of queue. And then a background worker would "erase" these relfilenodes
> outside the COMMIT.
That sounds interesting, though I'm a bit worried that it's going to
lead to the same kind of complications and difficulty that we have with
deleted columns: anything that's working with the system tables will
need to see this new "dropped but pending delete" flag. Would we also
rename the table when this happens? Or change the schema it's in?
Otherwise, your typical DROP IF EXISTS / CREATE could end up breaking.
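
To make that concern concrete, here's a rough sketch in Python. It is a toy
model, not PostgreSQL code: the `Catalog` class, the `rename_on_drop` switch,
the `pg_pending_delete_` prefix and the starting relfilenode value are all
made up for illustration. It only shows that if a dropped table keeps its
name while it waits for the background erase, a following CREATE with the
same name collides, whereas renaming it at drop time frees the name.

```python
from collections import deque


class Catalog:
    """Toy stand-in for the system catalog (hypothetical, not PostgreSQL's)."""

    def __init__(self, rename_on_drop):
        self.rename_on_drop = rename_on_drop
        self.tables = {}            # name -> {"relfilenode", "pending_delete"}
        self.erase_queue = deque()  # relfilenodes waiting for the worker
        self.next_relfilenode = 16384  # arbitrary starting value

    def create_table(self, name):
        if name in self.tables:
            raise ValueError(f'relation "{name}" already exists')
        self.tables[name] = {
            "relfilenode": self.next_relfilenode,
            "pending_delete": False,
        }
        self.next_relfilenode += 1

    def drop_table_if_exists(self, name):
        entry = self.tables.get(name)
        if entry is None or entry["pending_delete"]:
            return  # nothing visible to drop
        # Do not delete the file here; just flag it and queue it.
        entry["pending_delete"] = True
        self.erase_queue.append(entry["relfilenode"])
        if self.rename_on_drop:
            # Move the entry out of the way so the name can be reused.
            del self.tables[name]
            self.tables[f"pg_pending_delete_{entry['relfilenode']}"] = entry

    def background_erase(self):
        """Runs outside COMMIT; a real worker would erase the file here."""
        erased = []
        while self.erase_queue:
            relfilenode = self.erase_queue.popleft()
            for name, entry in list(self.tables.items()):
                if entry["relfilenode"] == relfilenode:
                    del self.tables[name]
            erased.append(relfilenode)
        return erased


def drop_and_recreate(catalog, name):
    """The usual DROP IF EXISTS / CREATE pattern; True if CREATE succeeded."""
    catalog.drop_table_if_exists(name)
    try:
        catalog.create_table(name)
        return True
    except ValueError:
        return False
```

With `rename_on_drop=False` the CREATE fails until the worker has run; with
`rename_on_drop=True` it succeeds right away, at the cost of every catalog
reader having to know about the renamed, pending-delete entries.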
> And yes, we need to do this in a way that works with replicas, i.e. we
> need to WAL-log it somehow. And it should be done in a way that works
> when the replica is on a different type of filesystem.
I agree it should go through WAL somehow (ideally without needing an
actual zero'd or whatever page for every page in the relation), but why
do we care about the filesystem on the replica? We don't have anything
that's really filesystem specific in WAL replay today and I don't see
this as needing to change that.
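
As a rough illustration of what I mean by not needing a page image per page,
here's a Python sketch. Everything in it is an assumption for the sake of
the example (the `EraseRelationRecord` type, `erase_file`, `replay`, and the
one-file-per-relfilenode layout); it is not how WAL records or replay are
actually implemented. The record only names the relation, and each node
performs the erase locally when it replays the record.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EraseRelationRecord:
    """Hypothetical WAL record: one per relation, carrying no page contents."""
    relfilenode: int


def erase_file(path, chunk_size=8192):
    """Overwrite the file with zeros, flush it to disk, then remove it."""
    remaining = os.path.getsize(path)
    with open(path, "r+b") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(b"\0" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())
    os.remove(path)


def replay(record, data_dir):
    """Apply the record on whichever node is replaying it."""
    path = os.path.join(data_dir, str(record.relfilenode))
    # Replay may see the same record again after a crash, so a file that
    # is already gone is not an error.
    if os.path.exists(path):
        erase_file(path)
```

The same record can be replayed on the primary and on a replica, and the
record's size does not depend on the size of the relation.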
Thanks,
Stephen