Thread: Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Tom Lane
Date:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Looks like a chunk of the buildfarm doesn't like this - presumably because
>> they use force_parallel_mode = regress. Seems ok to just force that to off in
>> this test?

> Ugh ... didn't occur to me to try that.  I'll take a look.

Hmm, so the problem is:

SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0;
ERROR:  cannot access temporary tables during a parallel operation

Why in the world is get_raw_page() marked as parallel safe?
It clearly isn't, given this restriction.

            regards, tom lane



Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Robert Haas
Date:
On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> Looks like a chunk of the buildfarm doesn't like this - presumably because
> >> they use force_parallel_mode = regress. Seems ok to just force that to off in
> >> this test?
>
> > Ugh ... didn't occur to me to try that.  I'll take a look.
>
> Hmm, so the problem is:
>
> SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0;
> ERROR:  cannot access temporary tables during a parallel operation
>
> Why in the world is get_raw_page() marked as parallel safe?
> It clearly isn't, given this restriction.

I suspect that restriction was overlooked when evaluating the marking
of this function.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmm, so the problem is:
>> 
>> SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0;
>> ERROR:  cannot access temporary tables during a parallel operation
>> 
>> Why in the world is get_raw_page() marked as parallel safe?
>> It clearly isn't, given this restriction.

> I suspect that restriction was overlooked when evaluating the marking
> of this function.

So it would seem.  PARALLEL RESTRICTED should work, though.

I'll check to see if any sibling functions have the same issue,
and push a patch to adjust them.

Presumably the parallel labeling has to be fixed as far back as
it's marked that way (didn't look).  Maybe we should push the
test change further back too, just to exercise this.

            regards, tom lane



Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Tom Lane
Date:
I wrote:
> I'll check to see if any sibling functions have the same issue,
> and push a patch to adjust them.
> Presumably the parallel labeling has to be fixed as far back as
> it's marked that way (didn't look).  Maybe we should push the
> test change further back too, just to exercise this.

Hmm, so this is easy enough to fix in HEAD and v15, as attached.
However, there's a problem in older branches: their newest
versions of pageinspect are older than 1.10, so this doesn't
work as-is.

We could imagine inventing versions like 1.9.1, and providing
a script pageinspect--1.9--1.9.1.sql to do what's done here
as well as (in later branches) pageinspect--1.9.1--1.10.sql that
duplicates pageinspect--1.9--1.10.sql, and then the same again for
1.8 and 1.7 for the older in-support branches.  That seems like
an awful lot of trouble for something that there have been no
field complaints about.

I'm inclined to apply this in HEAD and v15, and revert the
test change in v14, and call it good.

            regards, tom lane

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 5c0736564a..ad5a3ac511 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -13,7 +13,8 @@ OBJS = \
     rawpage.o

 EXTENSION = pageinspect
-DATA =  pageinspect--1.9--1.10.sql pageinspect--1.8--1.9.sql \
+DATA =  pageinspect--1.10--1.11.sql \
+    pageinspect--1.9--1.10.sql pageinspect--1.8--1.9.sql \
     pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
     pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
     pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
diff --git a/contrib/pageinspect/meson.build b/contrib/pageinspect/meson.build
index 3ec50b9445..25fa7dc20c 100644
--- a/contrib/pageinspect/meson.build
+++ b/contrib/pageinspect/meson.build
@@ -33,6 +33,7 @@ install_data(
   'pageinspect--1.7--1.8.sql',
   'pageinspect--1.8--1.9.sql',
   'pageinspect--1.9--1.10.sql',
+  'pageinspect--1.10--1.11.sql',
   'pageinspect.control',
   kwargs: contrib_data_args,
 )
diff --git a/contrib/pageinspect/pageinspect--1.10--1.11.sql b/contrib/pageinspect/pageinspect--1.10--1.11.sql
new file mode 100644
index 0000000000..8fa5e105bc
--- /dev/null
+++ b/contrib/pageinspect/pageinspect--1.10--1.11.sql
@@ -0,0 +1,28 @@
+/* contrib/pageinspect/pageinspect--1.10--1.11.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pageinspect UPDATE TO '1.11'" to load this file. \quit
+
+--
+-- Functions that fetch relation pages must be PARALLEL RESTRICTED,
+-- not PARALLEL SAFE, otherwise they will fail when run on a
+-- temporary table in a parallel worker process.
+--
+
+ALTER FUNCTION get_raw_page(text, int8) PARALLEL RESTRICTED;
+ALTER FUNCTION get_raw_page(text, text, int8) PARALLEL RESTRICTED;
+-- tuple_data_split must be restricted because it may fetch TOAST data.
+ALTER FUNCTION tuple_data_split(oid, bytea, integer, integer, text) PARALLEL RESTRICTED;
+ALTER FUNCTION tuple_data_split(oid, bytea, integer, integer, text, bool) PARALLEL RESTRICTED;
+-- heap_page_item_attrs must be restricted because it calls tuple_data_split.
+ALTER FUNCTION heap_page_item_attrs(bytea, regclass, bool) PARALLEL RESTRICTED;
+ALTER FUNCTION heap_page_item_attrs(bytea, regclass) PARALLEL RESTRICTED;
+ALTER FUNCTION bt_metap(text) PARALLEL RESTRICTED;
+ALTER FUNCTION bt_page_stats(text, int8) PARALLEL RESTRICTED;
+ALTER FUNCTION bt_page_items(text, int8) PARALLEL RESTRICTED;
+ALTER FUNCTION hash_bitmap_info(regclass, int8) PARALLEL RESTRICTED;
+-- brin_page_items might be parallel safe, because it seems to touch
+-- only index metadata, but I don't think there's a point in risking it.
+-- Likewise for gist_page_items.
+ALTER FUNCTION brin_page_items(bytea, regclass) PARALLEL RESTRICTED;
+ALTER FUNCTION gist_page_items(bytea, regclass) PARALLEL RESTRICTED;
diff --git a/contrib/pageinspect/pageinspect.control b/contrib/pageinspect/pageinspect.control
index 7cdf37913d..f277413dd8 100644
--- a/contrib/pageinspect/pageinspect.control
+++ b/contrib/pageinspect/pageinspect.control
@@ -1,5 +1,5 @@
 # pageinspect extension
 comment = 'inspect the contents of database pages at a low level'
-default_version = '1.10'
+default_version = '1.11'
 module_pathname = '$libdir/pageinspect'
 relocatable = true

Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Andres Freund
Date:
Hi,

On 2022-11-21 12:52:01 -0500, Robert Haas wrote:
> On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I wrote:
> > > Andres Freund <andres@anarazel.de> writes:
> > >> Looks like a chunk of the buildfarm doesn't like this - presumably because
> > >> they use force_parallel_mode = regress. Seems ok to just force that to off in
> > >> this test?
> >
> > > Ugh ... didn't occur to me to try that.  I'll take a look.
> >
> > Hmm, so the problem is:
> >
> > SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0;
> > ERROR:  cannot access temporary tables during a parallel operation
> >
> > Why in the world is get_raw_page() marked as parallel safe?
> > It clearly isn't, given this restriction.
> 
> I suspect that restriction was overlooked when evaluating the marking
> of this function.

It's somewhat sad to add this restriction - I've used get_raw_page() (+
other functions) to scan a whole database for a bug. IIRC that actually
did end up using parallelism, albeit likely not very efficiently.

Don't really have a better idea though.

It may be worth inventing a framework where a function could analyze its
arguments (presumably via prosupport) to determine the degree of
parallel safety, but this doesn't seem sufficient reason...

Greetings

Andres Freund



Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-21 12:52:01 -0500, Robert Haas wrote:
>> On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Why in the world is get_raw_page() marked as parallel safe?
>>> It clearly isn't, given this restriction.

> It's somewhat sad to add this restriction - I've used get_raw_page() (+
> other functions) to scan a whole database for a bug. IIRC that actually
> did end up using parallelism, albeit likely not very efficiently.
> Don't really have a better idea though.

Me either.

> It may be worth inventing a framework where a function could analyze its
> arguments (presumably via prosupport) to determine the degree of
> parallel safety, but this doesn't seem sufficient reason...

Maybe, but in this example you could only decide you were parallel
safe if the argument is an OID constant, which'd be pretty limiting.

If I were trying to find a better fix I'd be looking for ways for
parallel workers to be able to read the parent's temp tables.
(Perhaps that could tie in with the blue-sky discussion we had
the other day about allowing autovacuum on temp tables??)

            regards, tom lane



Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
"David G. Johnston"
Date:
On Mon, Nov 21, 2022 at 1:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-21 12:52:01 -0500, Robert Haas wrote:
>> On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Why in the world is get_raw_page() marked as parallel safe?
>>> It clearly isn't, given this restriction.

> It's somewhat sad to add this restriction - I've used get_raw_page() (+
> other functions) to scan a whole database for a bug. IIRC that actually
> did end up using parallelism, albeit likely not very efficiently.
> Don't really have a better idea though.

Me either.

> It may be worth inventing a framework where a function could analyze its
> arguments (presumably via prosupport) to determine the degree of
> parallel safety, but this doesn't seem sufficient reason...

Maybe, but in this example you could only decide you were parallel
safe if the argument is an OID constant, which'd be pretty limiting.

If I were trying to find a better fix I'd be looking for ways for
parallel workers to be able to read the parent's temp tables.
(Perhaps that could tie in with the blue-sky discussion we had
the other day about allowing autovacuum on temp tables??)



I don't suppose we want to just document the fact that these power-user non-core functions are unable to process temporary tables safely without first disabling parallelism for the session.

David J.

Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Andres Freund
Date:
Hi,

On 2022-11-21 15:12:15 -0500, Tom Lane wrote:
> If I were trying to find a better fix I'd be looking for ways for
> parallel workers to be able to read the parent's temp tables.
> (Perhaps that could tie in with the blue-sky discussion we had
> the other day about allowing autovacuum on temp tables??)

That'd be a nontrivial change, because we explicitly don't use any
locking for anything relating to localbuf.c. One possible benefit could
be that we could substantially reduce the code duplication between
"normal" bufmgr.c and localbuf.c.

Greetings,

Andres Freund



Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-21 15:12:15 -0500, Tom Lane wrote:
>> If I were trying to find a better fix I'd be looking for ways for
>> parallel workers to be able to read the parent's temp tables.
>> (Perhaps that could tie in with the blue-sky discussion we had
>> the other day about allowing autovacuum on temp tables??)

> That'd be a nontrivial change, because we explicitly don't use any
> locking for anything relating to localbuf.c. One possible benefit could
> be that we could substantially reduce the code duplication between
> "normal" bufmgr.c and localbuf.c.

I didn't say this was easy ;-).  Aside from locking, the local buffers
are inside the process's address space and not accessible from outside.
Maybe they could be mapped into a shared memory region instead?
And there are optimizations like commit a7212be8b that depend on the
assumption that nothing else is accessing our process's temp tables.
That'd need a lot of thought, if we don't want to give up all the
performance benefits of temp tables.

            regards, tom lane



Re: pgsql: Prevent instability in contrib/pageinspect's regression test.

From
Greg Stark
Date:
On Mon, 21 Nov 2022 at 15:01, Andres Freund <andres@anarazel.de> wrote:
>
> It's somewhat sad to add this restriction - I've used get_raw_page() (+
> other functions) to scan a whole database for a bug. IIRC that actually
> did end up using parallelism, albeit likely not very efficiently.
>
> Don't really have a better idea though.

Given how specific the use case is here a simple solution would be to
just have a dedicated get_raw_temp_page() and restrict get_raw_page()
to persistent tables.

I suppose slightly gilding it would be to make a get_raw_page_temp()
and get_raw_page_persistent() and then you could have get_raw_page()
call the appropropriate one. They would be parallel restricted except
for get_raw_page_persistent() and if you explicitly called it you
could get parallel scans otherwise you wouldn't.

-- 
greg