Thread: Re: pgsql: Prevent instability in contrib/pageinspect's regression test.
I wrote: > Andres Freund <andres@anarazel.de> writes: >> Looks like a chunk of the buildfarm doesn't like this - presumably because >> they use force_parallel_mode = regress. Seems ok to just force that to off in >> this test? > Ugh ... didn't occur to me to try that. I'll take a look. Hmm, so the problem is: SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0; ERROR: cannot access temporary tables during a parallel operation Why in the world is get_raw_page() marked as parallel safe? It clearly isn't, given this restriction. regards, tom lane
On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: > > Andres Freund <andres@anarazel.de> writes: > >> Looks like a chunk of the buildfarm doesn't like this - presumably because > >> they use force_parallel_mode = regress. Seems ok to just force that to off in > >> this test? > > > Ugh ... didn't occur to me to try that. I'll take a look. > > Hmm, so the problem is: > > SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0; > ERROR: cannot access temporary tables during a parallel operation > > Why in the world is get_raw_page() marked as parallel safe? > It clearly isn't, given this restriction. I suspect that restriction was overlooked when evaluating the marking of this function. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Hmm, so the problem is: >> >> SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0; >> ERROR: cannot access temporary tables during a parallel operation >> >> Why in the world is get_raw_page() marked as parallel safe? >> It clearly isn't, given this restriction. > I suspect that restriction was overlooked when evaluating the marking > of this function. So it would seem. PARALLEL RESTRICTED should work, though. I'll check to see if any sibling functions have the same issue, and push a patch to adjust them. Presumably the parallel labeling has to be fixed as far back as it's marked that way (didn't look). Maybe we should push the test change further back too, just to exercise this. regards, tom lane
I wrote: > I'll check to see if any sibling functions have the same issue, > and push a patch to adjust them. > Presumably the parallel labeling has to be fixed as far back as > it's marked that way (didn't look). Maybe we should push the > test change further back too, just to exercise this. Hmm, so this is easy enough to fix in HEAD and v15, as attached. However, there's a problem in older branches: their newest versions of pageinspect are older than 1.10, so this doesn't work as-is. We could imagine inventing versions like 1.9.1, and providing a script pageinspect--1.9--1.9.1.sql to do what's done here as well as (in later branches) pageinspect--1.9.1--1.10.sql that duplicates pageinspect--1.9--1.10.sql, and then the same again for 1.8 and 1.7 for the older in-support branches. That seems like an awful lot of trouble for something that there have been no field complaints about. I'm inclined to apply this in HEAD and v15, and revert the test change in v14, and call it good. regards, tom lane diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile index 5c0736564a..ad5a3ac511 100644 --- a/contrib/pageinspect/Makefile +++ b/contrib/pageinspect/Makefile @@ -13,7 +13,8 @@ OBJS = \ rawpage.o EXTENSION = pageinspect -DATA = pageinspect--1.9--1.10.sql pageinspect--1.8--1.9.sql \ +DATA = pageinspect--1.10--1.11.sql \ + pageinspect--1.9--1.10.sql pageinspect--1.8--1.9.sql \ pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \ pageinspect--1.5.sql pageinspect--1.5--1.6.sql \ pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \ diff --git a/contrib/pageinspect/meson.build b/contrib/pageinspect/meson.build index 3ec50b9445..25fa7dc20c 100644 --- a/contrib/pageinspect/meson.build +++ b/contrib/pageinspect/meson.build @@ -33,6 +33,7 @@ install_data( 'pageinspect--1.7--1.8.sql', 'pageinspect--1.8--1.9.sql', 'pageinspect--1.9--1.10.sql', + 'pageinspect--1.10--1.11.sql', 'pageinspect.control', kwargs: contrib_data_args, ) diff --git a/contrib/pageinspect/pageinspect--1.10--1.11.sql b/contrib/pageinspect/pageinspect--1.10--1.11.sql new file mode 100644 index 0000000000..8fa5e105bc --- /dev/null +++ b/contrib/pageinspect/pageinspect--1.10--1.11.sql @@ -0,0 +1,28 @@ +/* contrib/pageinspect/pageinspect--1.10--1.11.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION pageinspect UPDATE TO '1.11'" to load this file. \quit + +-- +-- Functions that fetch relation pages must be PARALLEL RESTRICTED, +-- not PARALLEL SAFE, otherwise they will fail when run on a +-- temporary table in a parallel worker process. +-- + +ALTER FUNCTION get_raw_page(text, int8) PARALLEL RESTRICTED; +ALTER FUNCTION get_raw_page(text, text, int8) PARALLEL RESTRICTED; +-- tuple_data_split must be restricted because it may fetch TOAST data. +ALTER FUNCTION tuple_data_split(oid, bytea, integer, integer, text) PARALLEL RESTRICTED; +ALTER FUNCTION tuple_data_split(oid, bytea, integer, integer, text, bool) PARALLEL RESTRICTED; +-- heap_page_item_attrs must be restricted because it calls tuple_data_split. +ALTER FUNCTION heap_page_item_attrs(bytea, regclass, bool) PARALLEL RESTRICTED; +ALTER FUNCTION heap_page_item_attrs(bytea, regclass) PARALLEL RESTRICTED; +ALTER FUNCTION bt_metap(text) PARALLEL RESTRICTED; +ALTER FUNCTION bt_page_stats(text, int8) PARALLEL RESTRICTED; +ALTER FUNCTION bt_page_items(text, int8) PARALLEL RESTRICTED; +ALTER FUNCTION hash_bitmap_info(regclass, int8) PARALLEL RESTRICTED; +-- brin_page_items might be parallel safe, because it seems to touch +-- only index metadata, but I don't think there's a point in risking it. +-- Likewise for gist_page_items. +ALTER FUNCTION brin_page_items(bytea, regclass) PARALLEL RESTRICTED; +ALTER FUNCTION gist_page_items(bytea, regclass) PARALLEL RESTRICTED; diff --git a/contrib/pageinspect/pageinspect.control b/contrib/pageinspect/pageinspect.control index 7cdf37913d..f277413dd8 100644 --- a/contrib/pageinspect/pageinspect.control +++ b/contrib/pageinspect/pageinspect.control @@ -1,5 +1,5 @@ # pageinspect extension comment = 'inspect the contents of database pages at a low level' -default_version = '1.10' +default_version = '1.11' module_pathname = '$libdir/pageinspect' relocatable = true
Hi, On 2022-11-21 12:52:01 -0500, Robert Haas wrote: > On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wrote: > > > Andres Freund <andres@anarazel.de> writes: > > >> Looks like a chunk of the buildfarm doesn't like this - presumably because > > >> they use force_parallel_mode = regress. Seems ok to just force that to off in > > >> this test? > > > > > Ugh ... didn't occur to me to try that. I'll take a look. > > > > Hmm, so the problem is: > > > > SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0; > > ERROR: cannot access temporary tables during a parallel operation > > > > Why in the world is get_raw_page() marked as parallel safe? > > It clearly isn't, given this restriction. > > I suspect that restriction was overlooked when evaluating the marking > of this function. It's somewhat sad to add this restriction - I've used get_raw_page() (+ other functions) to scan a whole database for a bug. IIRC that actually did end up using parallelism, albeit likely not very efficiently. Don't really have a better idea though. It may be worth inventing a framework where a function could analyze its arguments (presumably via prosupport) to determine the degree of parallel safety, but this doesn't seem sufficient reason... Greetings Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-11-21 12:52:01 -0500, Robert Haas wrote: >> On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Why in the world is get_raw_page() marked as parallel safe? >>> It clearly isn't, given this restriction. > It's somewhat sad to add this restriction - I've used get_raw_page() (+ > other functions) to scan a whole database for a bug. IIRC that actually > did end up using parallelism, albeit likely not very efficiently. > Don't really have a better idea though. Me either. > It may be worth inventing a framework where a function could analyze its > arguments (presumably via prosupport) to determine the degree of > parallel safety, but this doesn't seem sufficient reason... Maybe, but in this example you could only decide you were parallel safe if the argument is an OID constant, which'd be pretty limiting. If I were trying to find a better fix I'd be looking for ways for parallel workers to be able to read the parent's temp tables. (Perhaps that could tie in with the blue-sky discussion we had the other day about allowing autovacuum on temp tables??) regards, tom lane
Re: pgsql: Prevent instability in contrib/pageinspect's regression test.
From
"David G. Johnston"
Date:
On Mon, Nov 21, 2022 at 1:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-21 12:52:01 -0500, Robert Haas wrote:
>> On Mon, Nov 21, 2022 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Why in the world is get_raw_page() marked as parallel safe?
>>> It clearly isn't, given this restriction.
> It's somewhat sad to add this restriction - I've used get_raw_page() (+
> other functions) to scan a whole database for a bug. IIRC that actually
> did end up using parallelism, albeit likely not very efficiently.
> Don't really have a better idea though.
Me either.
> It may be worth inventing a framework where a function could analyze its
> arguments (presumably via prosupport) to determine the degree of
> parallel safety, but this doesn't seem sufficient reason...
Maybe, but in this example you could only decide you were parallel
safe if the argument is an OID constant, which'd be pretty limiting.
If I were trying to find a better fix I'd be looking for ways for
parallel workers to be able to read the parent's temp tables.
(Perhaps that could tie in with the blue-sky discussion we had
the other day about allowing autovacuum on temp tables??)
I don't suppose we want to just document the fact that these power-user non-core functions are unable to process temporary tables safely without first disabling parallelism for the session.
David J.
Hi, On 2022-11-21 15:12:15 -0500, Tom Lane wrote: > If I were trying to find a better fix I'd be looking for ways for > parallel workers to be able to read the parent's temp tables. > (Perhaps that could tie in with the blue-sky discussion we had > the other day about allowing autovacuum on temp tables??) That'd be a nontrivial change, because we explicitly don't use any locking for anything relating to localbuf.c. One possible benefit could be that we could substantially reduce the code duplication between "normal" bufmgr.c and localbuf.c. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-11-21 15:12:15 -0500, Tom Lane wrote: >> If I were trying to find a better fix I'd be looking for ways for >> parallel workers to be able to read the parent's temp tables. >> (Perhaps that could tie in with the blue-sky discussion we had >> the other day about allowing autovacuum on temp tables??) > That'd be a nontrivial change, because we explicitly don't use any > locking for anything relating to localbuf.c. One possible benefit could > be that we could substantially reduce the code duplication between > "normal" bufmgr.c and localbuf.c. I didn't say this was easy ;-). Aside from locking, the local buffers are inside the process's address space and not accessible from outside. Maybe they could be mapped into a shared memory region instead? And there are optimizations like commit a7212be8b that depend on the assumption that nothing else is accessing our process's temp tables. That'd need a lot of thought, if we don't want to give up all the performance benefits of temp tables. regards, tom lane
On Mon, 21 Nov 2022 at 15:01, Andres Freund <andres@anarazel.de> wrote: > > It's somewhat sad to add this restriction - I've used get_raw_page() (+ > other functions) to scan a whole database for a bug. IIRC that actually > did end up using parallelism, albeit likely not very efficiently. > > Don't really have a better idea though. Given how specific the use case is here a simple solution would be to just have a dedicated get_raw_temp_page() and restrict get_raw_page() to persistent tables. I suppose slightly gilding it would be to make a get_raw_page_temp() and get_raw_page_persistent() and then you could have get_raw_page() call the appropropriate one. They would be parallel restricted except for get_raw_page_persistent() and if you explicitly called it you could get parallel scans otherwise you wouldn't. -- greg