Thread: removal of dangling temp tables
We recently ran into a funny situation where autovacuum would not remove very old temp tables. The relfrozenxid of those tables was about to reach the max freeze age, so monitoring started to complain. It turned out that autovacuum saw that the backendID was in use by a live backend ... but that session was in reality not using those temp tables, and had not used any temp table at all. They were leftovers from a crash months ago, and since the session was not using temp tables, they had not been removed. DISCARD ALL may have run, but had no effect.

I think the best way to fix this is to call RemoveTempRelations() unconditionally at session start (without doing the rest of the temp table setup, just the removal.)

In versions earlier than pg11, related issues occur if you have a crash with autovacuum off and/or workers disabled, temp tables are leaked in backendID 1 or 2, and you then restart with normal values. In that case, those backendIDs are used by the logical replication launcher and the autovacuum launcher, so autovacuum does not remove them either. This was fixed in PG11 inadvertently by this commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=943576bddcb52971041d9f5f806789921fa107ee

The reason the commit fixes it is that the databaseId of the PGPROC entry is now compared to the temp table's database; for those worker processes, the databaseId is InvalidOid, so it all works out.

This isn't a terribly interesting bug, as restarting with changed worker/autovacuum options after a crash that happens to leak temp tables should be quite rare. But anyway, we can fix this second issue in prior branches by adding a comparison to databaseId to the existing 'if' test in autovacuum; easy enough, no compatibility concerns. This should also cover the case where one session crashes leaking temp tables, and then the same session connects to a different database for a long time. (I didn't verify this last point to be a real problem.)
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Dec 14, 2018 at 11:29 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I think the best way to fix this is to call RemoveTempRelations()
> unconditionally at session start (without doing the rest of the temp
> table setup, just the removal.)

That would certainly simplify things. I think I thought about that as far back as a734fd5d1c309cc553b7c8c79fba96218af090f7, but it seemed like a significant behavior change and I wasn't sure that everyone would like it. In particular, it adds overhead to backend startup that, in the case of a large temp schema, could be fairly long. Nevertheless, I tentatively think that a change like this is a good idea. I wouldn't back-patch it, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2018-Dec-14, Robert Haas wrote:
> On Fri, Dec 14, 2018 at 11:29 AM Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > I think the best way to fix this is to call RemoveTempRelations()
> > unconditionally at session start (without doing the rest of the temp
> > table setup, just the removal.)
>
> That would certainly simplify things. I think I thought about that as
> far back as a734fd5d1c309cc553b7c8c79fba96218af090f7 but it seemed
> like a significant behavior change and I wasn't sure that everyone
> would like it. In particular, it adds overhead to backend startup
> that, in the case of a large temp schema, could be fairly long.

Hmm, I think in the case covered by your commit (that is, a session that crashes with a few thousand temp tables), this new patch might cause a failure to open a new session altogether. Maybe it'd be better to change temp table removal to always drop max_locks_per_transaction objects at a time (ie. commit/start a new transaction every so many objects).

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Dec 14, 2018 at 12:27 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Hmm, I think in the case covered by your commit, that is a session that
> crashes with a few thousands of temp tables, this new patch might cause
> a failure to open a new session altogether.

Oh, good point. Or if the catalog is corrupted.

> Maybe it'd be better to change temp table removal to always drop
> max_locks_per_transaction objects at a time (ie. commit/start a new
> transaction every so many objects).

We're basically just doing DROP SCHEMA ... CASCADE, so I'm not sure how we'd implement that, but I agree it would be significantly better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Dec 14, 2018 at 11:29 AM Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> I think the best way to fix this is to call RemoveTempRelations()
>> unconditionally at session start (without doing the rest of the temp
>> table setup, just the removal.)

> That would certainly simplify things. I think I thought about that as
> far back as a734fd5d1c309cc553b7c8c79fba96218af090f7 but it seemed
> like a significant behavior change and I wasn't sure that everyone
> would like it. In particular, it adds overhead to backend startup
> that, in the case of a large temp schema, could be fairly long.
> Nevertheless, I tentatively think that a change like this is a good
> idea. I wouldn't back-patch it, though.

I seem to recall discussions about having crash recovery go around and clean out temp tables. That seems like a better plan than penalizing every session start with this.

regards, tom lane
On Fri, Dec 14, 2018 at 12:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I seem to recall discussions about having crash recovery go around
> and clean out temp tables. That seems like a better plan than
> penalizing every session start with this.

Well, crash recovery already removes the files, but it can't really remove the catalog entries, can it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Dec 14, 2018 at 12:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I seem to recall discussions about having crash recovery go around
>> and clean out temp tables. That seems like a better plan than
>> penalizing every session start with this.

> Well, crash recovery already removes the files, but it can't really
> remove the catalog entries, can it?

Hm. It *could*, if we wanted it to run some transactions after finishing recovery. But I guess I wonder why bother; if the disk space is gone then there's not that much reason to be in a hurry to get rid of the catalog entries. The particular problem Alvaro is complaining of might be better handled by ignoring temp tables while computing max freeze age etc. I have some recollection that we'd intentionally included them, but I wonder why really; it's not like autovacuum is going to be able to do anything about an ancient temp table.

Alternatively, maybe we could have backends flag whether they've taken ownership of their temp schemas or not, and let autovacuum flush old temp tables if not?

regards, tom lane
Hi,

On 2018-12-14 13:35:50 -0500, Tom Lane wrote:
> Hm. It *could*, if we wanted it to run some transactions after
> finishing recovery.

How? We don't have infrastructure for changing databases yet.

> But I guess I wonder why bother; if the disk
> space is gone then there's not that much reason to be in a hurry
> to get rid of the catalog entries. The particular problem Alvaro
> is complaining of might be better handled by ignoring temp tables
> while computing max freeze age etc. I have some recollection that
> we'd intentionally included them, but I wonder why really; it's
> not like autovacuum is going to be able to do anything about an
> ancient temp table.

We can't truncate the clog, advance the xid horizon, etc. if there are any temp tables. Otherwise you'd get failures when reading from one, no?

Greetings,

Andres Freund
On Fri, Dec 14, 2018 at 6:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hm. It *could*, if we wanted it to run some transactions after
> finishing recovery.

It'd have to launch a separate process per database. That would be useful infrastructure for other things, too, like automatic catalog upgrades in minor releases, but I'm not volunteering to write that infrastructure right now.

> Alternatively, maybe we could have backends flag whether they've
> taken ownership of their temp schemas or not, and let autovacuum
> flush old temp tables if not?

Yes, that seems like a possibly promising approach.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2018-Dec-14, Robert Haas wrote:
> On Fri, Dec 14, 2018 at 12:27 PM Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Maybe it'd be better to change temp table removal to always drop
> > max_locks_per_transaction objects at a time (ie. commit/start a new
> > transaction every so many objects).
>
> We're basically just doing DROP SCHEMA ... CASCADE, so I'm not sure
> how we'd implement that, but I agree it would be significantly better.

(Minor nit: even currently, we don't drop the schema itself, only the objects it contains.)

I was thinking we could scan pg_depend for objects depending on the schema, add them to an ObjectAddresses array, and do performMultipleDeletions once every max_locks_per_transaction objects. But in order for this to have any useful effect we'd have to commit the transaction and start another one; maybe that's too onerous. Maybe we could offer such a behavior as a special case to be used only in case the regular mechanism fails. So add a PG_TRY which, in case of failure, sends a hint to do the cleanup. Not sure this is worthwhile.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Dec-14, Robert Haas wrote:
> On Fri, Dec 14, 2018 at 6:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Hm. It *could*, if we wanted it to run some transactions after
> > finishing recovery.
>
> It'd have to launch a separate process per database. That would be
> useful infrastructure for other things, too, like automatic catalog
> upgrades in minor releases, but I'm not volunteering to write that
> infrastructure right now.

This looks like a major project.

> > Alternatively, maybe we could have backends flag whether they've
> > taken ownership of their temp schemas or not, and let autovacuum
> > flush old temp tables if not?
>
> Yes, that seems like a possibly promising approach.

I did propose in my OP the idea of a PGPROC boolean flag that indicates whether the temp namespace has been set up. If not, have autovac remove those tables. I like this option better, but I fear it adds more ProcArrayLock contention. Maybe I can just use a new LWLock to coordinate that particular member of the ProcGlobal array ... (but then it can no longer be a boolean.)

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Dec 14, 2018 at 11:06:32PM -0300, Alvaro Herrera wrote:
> I did propose in my OP the idea of a PGPROC boolean flag that indicates
> whether the temp namespace has been set up. If not, have autovac remove
> those tables. I like this option better, but I fear it adds more
> ProcArrayLock contention. Maybe I can just use a new LWLock to
> coordinate that particular member of the ProcGlobal array ... (but then
> it can no longer be a boolean.)

Isn't that what tempNamespaceId can be used for in PGPROC now? The flag would be set only when a backend creates a new temporary schema, so that the backend can be tracked as the owner of the schema.

--
Michael
On 2018-Dec-15, Michael Paquier wrote:
> On Fri, Dec 14, 2018 at 11:06:32PM -0300, Alvaro Herrera wrote:
> > I did propose in my OP the idea of a PGPROC boolean flag that indicates
> > whether the temp namespace has been set up. If not, have autovac remove
> > those tables. I like this option better, but I fear it adds more
> > ProcArrayLock contention. Maybe I can just use a new LWLock to
> > coordinate that particular member of the ProcGlobal array ... (but then
> > it can no longer be a boolean.)
>
> Isn't that what tempNamespaceId can be used for in PGPROC now? The flag
> would be set only when a backend creates a new temporary schema so as it
> can be tracked as the owner of the schema.

Oh, we already have it! Sorry, I overlooked it. With that, it seems the patch is fairly simple ... I wonder about the locking implications in autovacuum, though -- the value is set in backends without acquiring a lock. I wonder if we could use memory barriers, so it'd incur little cost.

I wonder how this thing works in parallel query workers.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2018-Dec-15, Michael Paquier wrote:
>> Isn't that what tempNamespaceId can be used for in PGPROC now? The flag
>> would be set only when a backend creates a new temporary schema so as it
>> can be tracked as the owner of the schema.

> Oh, we already have it! Sorry, I overlooked it. With that, it seems
> the patch is fairly simple ... I wonder about the locking implications
> in autovacuum, though -- the value is set in backends without acquiring
> a lock.

I was wondering about that too. But I think it's probably OK. If autovacuum observes that (a) a table is old enough to pose a wraparound hazard and (b) its putatively owning backend hasn't yet set tempNamespaceId, then I think it's safe to conclude that that table is removable, despite the theoretical race condition.

Autovacuum would need to acquire a deletion lock and then check that the table is still there, to avoid race conditions if the backend starts to clean out the schema immediately after it looks. But I think those race conditions exist anyway (consider a fresh backend that starts cleaning out its temp schema immediately), so if we have a problem with concurrent deletion attempts then that problem exists already.

> I wonder how this thing works in parallel query workers.

Surely workers are not allowed to create or delete temp tables.

regards, tom lane
On Sat, Dec 15, 2018 at 09:51:31AM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Oh, we already have it! Sorry, I overlooked it. With that, it seems
>> the patch is fairly simple ... I wonder about the locking implications
>> in autovacuum, though -- the value is set in backends without acquiring
>> a lock.
>
> I was wondering about that too. But I think it's probably OK. If
> autovacuum observes that (a) a table is old enough to pose a wraparound
> hazard and (b) its putatively owning backend hasn't yet set
> tempNamespaceId, then I think it's safe to conclude that that table is
> removable, despite the theoretical race condition.

This relies on the fact that the flag gets set by a backend within a transaction context, so autovacuum would not yet see any temp relations associated with it when scanning pg_class, if the backend has not yet committed its namespace creation (which happens via the creation of the first temp table it uses).

> Autovacuum would need to acquire a deletion lock and then check that the
> table is still there, to avoid race conditions if the backend starts to
> clean out the schema immediately after it looks. But I think those race
> conditions exist anyway (consider a fresh backend that starts cleaning out
> its temp schema immediately), so if we have a problem with concurrent
> deletion attempts then that problem exists already.
>
>> I wonder how this thing works in parallel query workers.
>
> Surely workers are not allowed to create or delete temp tables.

Yes, InitTempTableNamespace prevents their creation, and InitLocalBuffers prevents their access, as buffers of temp tables are local to a backend and cannot be shared across multiple workers. Amit Kapila has been working on this problem lately, for example.

--
Michael
On 2018-Dec-16, Michael Paquier wrote:
> On Sat, Dec 15, 2018 at 09:51:31AM -0500, Tom Lane wrote:
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> >> Oh, we already have it! Sorry, I overlooked it. With that, it seems
> >> the patch is fairly simple ... I wonder about the locking implications
> >> in autovacuum, though -- the value is set in backends without acquiring
> >> a lock.
> >
> > I was wondering about that too. But I think it's probably OK. If
> > autovacuum observes that (a) a table is old enough to pose a wraparound
> > hazard and (b) its putatively owning backend hasn't yet set
> > tempNamespaceId, then I think it's safe to conclude that that table is
> > removable, despite the theoretical race condition.
>
> This relies on the fact that the flag gets set by a backend within a
> transaction context, and autovacuum would not see yet temp relations
> associated to it at the moment of the scan of pg_class if the backend
> has not committed yet its namespace creation via the creation of the
> first temp table it uses.

Makes sense, thanks.

I think there are two possible ways forward. The conservative one is to just apply the attached patch to branches 9.4-10. That will let autovacuum drop temp tables when the owning backend is connected to another database. It may not solve the problem for the bunch of users that have only one database that takes the majority of connections, but I think it's worth doing nonetheless. I tested the 9.4 instance and it works fine; tables are deleted as soon as I make the session connect to another database.

The more aggressive action is to backpatch 943576bddcb5 ("Make autovacuum more aggressive to remove orphaned temp tables"), which is currently only in pg11. We would put the new PGPROC member at the end of the struct, to avoid ABI incompatibilities, but it'd bring trouble for extensions that put PGPROC in arrays.
I checked the code of some known extensions; I found that pglogical uses PGPROC, but only pointers to it, so it wouldn't be damaged by the proposed change AFAICS.

Another possibly useful change is to make DISCARD ALL and DISCARD TEMP delete everything in what would be the backend's temp namespace, even if it hasn't been initialized yet. This would cover the case where a connection pooler keeps the connection open for a very long time, which I think is a common case.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
> The more aggressive action is to backpatch 943576bddcb5 ("Make autovacuum
> more aggressive to remove orphaned temp tables") which is currently only
> in pg11. We would put the new PGPROC member at the end of the struct, to
> avoid ABI incompatibilities, but it'd bring trouble for extensions that
> put PGPROC in arrays. I checked the code of some known extensions; found
> that pglogical uses PGPROC, but only pointers to it, so it wouldn't be
> damaged by the proposed change AFAICS.

+1. I think this is a bug from a user's perspective, in that garbage is left behind. I want to believe that fixing bugs of PostgreSQL itself is prioritized over ABI compatibility for extensions, if we have to choose one of them.

> Another possibly useful change is to make DISCARD ALL and DISCARD TEMP delete
> everything in what would be the backend's temp namespace, even if it hasn't
> been initialized yet. This would cover the case where a connection pooler
> keeps the connection open for a very long time, which I think is a common
> case.

That sounds good.

Regards
Takayuki Tsunakawa
On 2018-Dec-26, Tsunakawa, Takayuki wrote:
> From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
> > The more aggressive action is to backpatch 943576bddcb5 ("Make autovacuum
> > more aggressive to remove orphaned temp tables") which is currently only
> > in pg11. We would put the new PGPROC member at the end of the struct, to
> > avoid ABI incompatibilities, but it'd bring trouble for extensions that
> > put PGPROC in arrays. I checked the code of some known extensions; found
> > that pglogical uses PGPROC, but only pointers to it, so it wouldn't be
> > damaged by the proposed change AFAICS.
>
> +1
> I think this is a bug from a user's perspective that garbage is left.
> I want to believe that fixing bugs of PostgreSQL itself are
> prioritized over the ABI compatibility for extensions, if we have to
> choose one of them.

Having been victim of ABI incompatibility myself, I loathe giving too much priority to other issues that can be resolved in other ways, so I don't necessarily support your view on bugs.

That said, I think in this case it shouldn't be a problem, so I'm going to work on that next. I haven't got around to creating the abidiff reporting system yet ...

> > Another possibly useful change is to make DISCARD ALL and DISCARD TEMP delete
> > everything in what would be the backend's temp namespace, even if it hasn't
> > been initialized yet. This would cover the case where a connection pooler
> > keeps the connection open for a very long time, which I think is a common
> > case.
>
> That sounds good.

Thanks. I just tested that the attached patch does the intended. (I named it "autovac", but obviously the DISCARD part is not about autovacuum.) I intend to push this patch first, and later backpatch the other one.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
> Having been victim of ABI incompatibility myself, I loathe giving too much
> priority to other issues that can be resolved in other ways, so I don't
> necessarily support your view on bugs.
> That said, I think in this case it shouldn't be a problem, so I'm going
> to work on that next.

I understand your sad experience...

> Thanks. I just tested that the attached patch does the intended. (I named
> it "autovac" but obviously the DISCARD part is not about autovacuum). I
> intend to push this patch first, and later backpatch the other one.

The patch looks good. I think this can be committed. No need to reply to this mail.

Regards
Takayuki Tsunakawa
On Wed, Dec 26, 2018 at 08:51:56PM -0300, Alvaro Herrera wrote:
> Having been victim of ABI incompatibility myself, I loathe giving too
> much priority to other issues that can be resolved in other ways, so I
> don't necessarily support your view on bugs.
> That said, I think in this case it shouldn't be a problem, so I'm going
> to work on that next.

And it is even better if bugs can be fixed, even partially, without any ABI breakage. Anyway, not breaking the ABI of PGPROC is why 246a6c8 was not back-patched to begin with: we have no idea how PGPROC is being used, and its interface is public. So if the intent is to apply 246a6c8 to v10 and down, this gets a -1 from me. Back-patching what you sent in
https://www.postgresql.org/message-id/20181226190834.wsk2wzott5yzrjiq@alvherre.pgsql
is fine for me.

>>> Another possibly useful change is to make DISCARD ALL and DISCARD TEMP delete
>>> everything in what would be the backend's temp namespace, even if it hasn't
>>> been initialized yet. This would cover the case where a connection pooler
>>> keeps the connection open for a very long time, which I think is a common
>>> case.
>>
>> That sounds good.
>
> Thanks. I just tested that the attached patch does the intended. (I
> named it "autovac" but obviously the DISCARD part is not about
> autovacuum). I intend to push this patch first, and later backpatch the
> other one.

+ snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d",
+          MyBackendId);
+ namespaceId = get_namespace_oid(namespaceName, true);

So this is the reverse engineering of GetTempNamespaceBackendId(). Perhaps a dedicated API in namespace.c would be interesting to have?

I would recommend keeping the changes for DISCARD and autovacuum in separate commits.

--
Michael
On 2018-Dec-27, Michael Paquier wrote:
> On Wed, Dec 26, 2018 at 08:51:56PM -0300, Alvaro Herrera wrote:
> > Having been victim of ABI incompatibility myself, I loathe giving too
> > much priority to other issues that can be resolved in other ways, so I
> > don't necessarily support your view on bugs.
> > That said, I think in this case it shouldn't be a problem, so I'm going
> > to work on that next.
>
> And it is even better if bugs can be fixed, even partially without any
> ABI breakages. Anyway, not breaking the ABI of PGPROC is why 246a6c8
> has not been back-patched to begin with, because we have no idea how
> PGPROC is being used and because its interface is public, so if the
> intent is to apply 246a6c8 to v10 and down this gets a -1 from me.

We allow structs to receive new members at the end of the struct, since this doesn't affect the offset of existing members; thus code already compiled with the previous struct definition does not break. AFAICS there is no danger in backpatching that, moving that struct member to the end of the struct.

> Back-patching what you sent in
> https://www.postgresql.org/message-id/20181226190834.wsk2wzott5yzrjiq@alvherre.pgsql
> is fine for me.

Done.

> + snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d",
> +          MyBackendId);
> + namespaceId = get_namespace_oid(namespaceName, true);
>
> So this is the reverse engineering of GetTempNamespaceBackendId().
> Perhaps a dedicated API in namespace.c would be interesting to have?

Since this code doesn't appear in branches 11 and later, I'm not sure creating such an API has any value.

> I would recommend to keep the changes for DISCARD and autovacuum into
> separate commits.

Yeah, done like that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Dec 27, 2018 at 04:30:21PM -0300, Alvaro Herrera wrote:
> We allow structs to receive new members at the end of the struct, since
> this doesn't affect the offset of existing members; thus code already
> compiled with the previous struct definition does not break. AFAICS
> there is no danger in backpatching that, moving that struct member at
> the end of the struct.

Sure. But this concerns PGPROC, which I am not sure we can say is never manipulated as an array.

--
Michael
On 2018-Dec-28, Michael Paquier wrote:
> On Thu, Dec 27, 2018 at 04:30:21PM -0300, Alvaro Herrera wrote:
> > We allow structs to receive new members at the end of the struct, since
> > this doesn't affect the offset of existing members; thus code already
> > compiled with the previous struct definition does not break. AFAICS
> > there is no danger in backpatching that, moving that struct member at
> > the end of the struct.
>
> Sure. Now this comes to PGPROC, which I am not sure we can say is
> never manipulated as an array.

The server code allocates arrays, but that's fine because that code is recompiled. Extensions only pass pointers around -- they don't create any additional arrays.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Dec 28, 2018 at 12:05:34AM -0300, Alvaro Herrera wrote:
> The server code allocates arrays, but that's fine because that code is
> recompiled. Extensions only pass pointers around -- they don't create
> any additional arrays.

There are many exotic extensions which could be using sizeof(PGPROC), as that's a popular structure, so I am glad that 246a6c8f did not find its way down. So thanks for what you have done!

--
Michael
On 2018-Dec-28, Michael Paquier wrote:
> On Fri, Dec 28, 2018 at 12:05:34AM -0300, Alvaro Herrera wrote:
> > The server code allocates arrays, but that's fine because that code is
> > recompiled. Extensions only pass pointers around -- they don't create
> > any additional arrays.
>
> There are many exotic extensions which could be using sizeof(PGPROC)
> as that's a popular structure,

Can you show one instance of this?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Dec-28, Alvaro Herrera wrote:
> On 2018-Dec-28, Michael Paquier wrote:
> > On Fri, Dec 28, 2018 at 12:05:34AM -0300, Alvaro Herrera wrote:
> > > The server code allocates arrays, but that's fine because that code is
> > > recompiled. Extensions only pass pointers around -- they don't create
> > > any additional arrays.
> >
> > There are many exotic extensions which could be using sizeof(PGPROC)
> > as that's a popular structure,
>
> Can you show one instance of this?

I looked at
https://github.com/postgrespro/pg_wait_sampling/blob/master/pg_wait_sampling.c
https://github.com/citusdata/citus/search?q=pgproc&unscoped_q=pgproc
and skimmed a few others; I can't find any instance where the full struct is used, as opposed to just a pointer to it.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2018-Dec-28, Alvaro Herrera wrote:
>> On 2018-Dec-28, Michael Paquier wrote:
>>> There are many exotic extensions which could be using sizeof(PGPROC)
>>> as that's a popular structure,

>> Can you show one instance of this?

> I looked at
> https://github.com/postgrespro/pg_wait_sampling/blob/master/pg_wait_sampling.c
> https://github.com/citusdata/citus/search?q=pgproc&unscoped_q=pgproc
> and skimmed a few others can't find any instance where the full struct
> is used, as opposed to just a pointer to it.

No, the point Michael is making is that the array stride in the ProcArray is part of our ABI. For example, accessing a PGPROC from its pgprocno using the GetPGProcByNumber macro will be broken if we change the struct size. I do not think you can assume that no extension does that.

regards, tom lane
I wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> I looked at
>> https://github.com/postgrespro/pg_wait_sampling/blob/master/pg_wait_sampling.c
>> https://github.com/citusdata/citus/search?q=pgproc&unscoped_q=pgproc
>> and skimmed a few others can't find any instance where the full struct
>> is used, as opposed to just a pointer to it.

> No, the point Michael is making is that the array stride in the ProcArray
> is part of our ABI. For example, accessing a PGPROC from its pgprocno
> using the GetPGProcByNumber macro will be broken if we change the
> struct size. I do not think you can assume that no extension does that.

In fact, there's a counterexample right here in pg_wait_sampling:

https://github.com/postgrespro/pg_wait_sampling/blob/master/pg_wait_sampling.c#L343

regards, tom lane
On 2018-Dec-28, Tom Lane wrote:
> I wrote:
> > No, the point Michael is making is that the array stride in the ProcArray
> > is part of our ABI. For example, accessing a PGPROC from its pgprocno
> > using the GetPGProcByNumber macro will be broken if we change the
> > struct size. I do not think you can assume that no extension does that.
>
> In fact, there's a counterexample right here in pg_wait_sampling:
>
> https://github.com/postgrespro/pg_wait_sampling/blob/master/pg_wait_sampling.c#L343

Ughh. I stand corrected.

This seems a terrible interface from an ABI compatibility point of view, and exposing proclist_node_get() as an inline function, doubly so.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services