Thread: New Object Access Type hooks

New Object Access Type hooks

From
Mark Dilger
Date:
Hackers,

Over in [1], Joshua proposed a new set of Object Access Type hooks based on strings rather than Oids.

His patch was written to be applied atop my patch for granting privileges on gucs.

On review of his patch, I became uncomfortable with the complete lack of regression test coverage.  To be fair, he did
pastea bit of testing logic to the thread, but it appears to be based on pgaudit, and it is unclear how to include such
atest in the core project, where pgaudit is not assumed to be installed. 

First, I refactored his patch to work against HEAD and not depend on my GUCs patch.  Find that as v1-0001.  The
refactoringexposed a bit of a problem.  To call the new hook for SET and ALTER SYSTEM commands, I need to pass in the
Oidof a catalog table.  But since my GUC patch isn't applied yet, there isn't any such table (pg_setting_acl or
whatnot)to pass.  So I'm passing InvalidOid, but I don't know if that is right.  In any event, if we want a new API
likethis, we should think a bit harder about whether it can be used to check operations where no table Oid is
applicable.

Second, I added a new test directory, src/test/modules/test_oat_hooks, which includes a new loadable module with hook
implementationsand a regression test for testing the object access hooks.  The main point of the test is to log which
hooksget called in which order, and which hooks do or do not get called when other hooks allow or deny access.  That
informationshows up in the expected output as NOTICE messages. 

This second patch has gotten a little long, and I'd like another pair of eyes on this before spending a second day on
theeffort.  Please note that this is a quick WIP patch in response to the patch Joshua posted earlier today.  Sorry for
sometimesmissing function comments, etc.  The goal, if this design seems acceptable, is to polish this, hopefully with
Joshua'sassistance, and get it committed *before* my GUCs patch, so that my patch can be rebased to use it.  Otherwise,
ifthis is rejected, I can continue on the GUC patch without this. 

(FYI, I got a test failure from src/test/recovery/t/013_crash_restart.pl when testing v1-0001.  I'm not sure yet what
thatis about.) 



[1] https://www.postgresql.org/message-id/flat/664799.1647456444%40sss.pgh.pa.us#c9721c2da88d59684ac7ac5fc36f09c1

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: New Object Access Type hooks

From
Joshua Brindle
Date:
On Thu, Mar 17, 2022 at 11:21 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> Hackers,
>
> Over in [1], Joshua proposed a new set of Object Access Type hooks based on strings rather than Oids.
>
> His patch was written to be applied atop my patch for granting privileges on gucs.
>
> On review of his patch, I became uncomfortable with the complete lack of regression test coverage.  To be fair, he
didpaste a bit of testing logic to the thread, but it appears to be based on pgaudit, and it is unclear how to include
sucha test in the core project, where pgaudit is not assumed to be installed. 
>
> First, I refactored his patch to work against HEAD and not depend on my GUCs patch.  Find that as v1-0001.  The
refactoringexposed a bit of a problem.  To call the new hook for SET and ALTER SYSTEM commands, I need to pass in the
Oidof a catalog table.  But since my GUC patch isn't applied yet, there isn't any such table (pg_setting_acl or
whatnot)to pass.  So I'm passing InvalidOid, but I don't know if that is right.  In any event, if we want a new API
likethis, we should think a bit harder about whether it can be used to check operations where no table Oid is
applicable.
>
> Second, I added a new test directory, src/test/modules/test_oat_hooks, which includes a new loadable module with hook
implementationsand a regression test for testing the object access hooks.  The main point of the test is to log which
hooksget called in which order, and which hooks do or do not get called when other hooks allow or deny access.  That
informationshows up in the expected output as NOTICE messages. 
>
> This second patch has gotten a little long, and I'd like another pair of eyes on this before spending a second day on
theeffort.  Please note that this is a quick WIP patch in response to the patch Joshua posted earlier today.  Sorry for
sometimesmissing function comments, etc.  The goal, if this design seems acceptable, is to polish this, hopefully with
Joshua'sassistance, and get it committed *before* my GUCs patch, so that my patch can be rebased to use it.  Otherwise,
ifthis is rejected, I can continue on the GUC patch without this. 
>

This is great, thank you for doing this. I didn't even realize the OAT
hooks had no regression tests.

It looks good to me, I reviewed both and tested the module. I wonder
if the slight abuse of subid is warranted with brand new hooks going
in but not enough to object, I just hope this doesn't rise to the too
large to merge this late level.

> (FYI, I got a test failure from src/test/recovery/t/013_crash_restart.pl when testing v1-0001.  I'm not sure yet what
thatis about.) 
>
>
>
> [1] https://www.postgresql.org/message-id/flat/664799.1647456444%40sss.pgh.pa.us#c9721c2da88d59684ac7ac5fc36f09c1

>



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 18, 2022, at 7:16 AM, Joshua Brindle <joshua.brindle@crunchydata.com> wrote:
>
> This is great, thank you for doing this. I didn't even realize the OAT
> hooks had no regression tests.
>
> It looks good to me, I reviewed both and tested the module. I wonder
> if the slight abuse of subid is warranted with brand new hooks going
> in but not enough to object, I just hope this doesn't rise to the too
> large to merge this late level.

The majority of the patch is regression testing code, stuff which doesn't get installed.  It's even marked as
NO_INSTALLCHECK,so it won't get installed even as part of "make installcheck".  That seems safe enough to me. 

Not including tests of OAT seems worse, as it leaves us open to breaking the behavior without realizing we've done so.
Arefactoring of the core code might cause hooks to be called in a different order, something which isn't necessarily
wrong,but should not be done unknowingly. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/18/22 11:15, Mark Dilger wrote:
>
>> On Mar 18, 2022, at 7:16 AM, Joshua Brindle <joshua.brindle@crunchydata.com> wrote:
>>
>> This is great, thank you for doing this. I didn't even realize the OAT
>> hooks had no regression tests.
>>
>> It looks good to me, I reviewed both and tested the module. I wonder
>> if the slight abuse of subid is warranted with brand new hooks going
>> in but not enough to object, I just hope this doesn't rise to the too
>> large to merge this late level.


The core code is extracted from a current CF patch, so I think in
principle it's OK.


I haven't looked at it in detail, but regarding the test code I'm not
sure why there's a .control file, since this isn't a loadable extension,
not why there's a test_oat_hooks.h file.


> The majority of the patch is regression testing code, stuff which doesn't get installed.  It's even marked as
NO_INSTALLCHECK,so it won't get installed even as part of "make installcheck".  That seems safe enough to me.
 
>
> Not including tests of OAT seems worse, as it leaves us open to breaking the behavior without realizing we've done
so. A refactoring of the core code might cause hooks to be called in a different order, something which isn't
necessarilywrong, but should not be done unknowingly.
 
>

Yes, and in any case we've added test code after feature freeze in the past.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/17/22 23:21, Mark Dilger wrote:
> Hackers,
>
> Over in [1], Joshua proposed a new set of Object Access Type hooks based on strings rather than Oids.
>
> His patch was written to be applied atop my patch for granting privileges on gucs.
>
> On review of his patch, I became uncomfortable with the complete lack of regression test coverage.  To be fair, he
didpaste a bit of testing logic to the thread, but it appears to be based on pgaudit, and it is unclear how to include
sucha test in the core project, where pgaudit is not assumed to be installed.
 
>
> First, I refactored his patch to work against HEAD and not depend on my GUCs patch.  Find that as v1-0001.  The
refactoringexposed a bit of a problem.  To call the new hook for SET and ALTER SYSTEM commands, I need to pass in the
Oidof a catalog table.  But since my GUC patch isn't applied yet, there isn't any such table (pg_setting_acl or
whatnot)to pass.  So I'm passing InvalidOid, but I don't know if that is right.  In any event, if we want a new API
likethis, we should think a bit harder about whether it can be used to check operations where no table Oid is
applicable.


My first inclination is to say it's probably ok. The immediately obvious
alternative would be to create yet another set of functions that don't
have classId parameters. That doesn't seem attractive.

Modulo that issue I think patch 1 is basically ok, but we should fix the
comments in objectaccess.c.  Rather than "It is [the] entrypoint ..." we
should have something like "Oid variant entrypoint ..." and "Name
variant entrypoint ...", and also fix the function names in the comments.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 18, 2022, at 3:04 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> I haven't looked at it in detail, but regarding the test code I'm not
> sure why there's a .control file, since this isn't a loadable extension,
> not why there's a test_oat_hooks.h file.

The .control file exists because the test defines a loadable module which defines the hooks.  The test_oat_hooks.h file
wasextraneous, and has been removed in v2. 



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 21, 2022, at 8:41 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> My first inclination is to say it's probably ok. The immediately obvious
> alternative would be to create yet another set of functions that don't
> have classId parameters. That doesn't seem attractive.
>
> Modulo that issue I think patch 1 is basically ok, but we should fix the
> comments in objectaccess.c.  Rather than "It is [the] entrypoint ..." we
> should have something like "Oid variant entrypoint ..." and "Name
> variant entrypoint ...", and also fix the function names in the comments.

Joshua,

Do you care to create a new version of this, perhaps based on the v2-0001 patch I just posted?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/21/22 15:57, Mark Dilger wrote:
>> On Mar 18, 2022, at 3:04 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> I haven't looked at it in detail, but regarding the test code I'm not
>> sure why there's a .control file, since this isn't a loadable extension,
>> not why there's a test_oat_hooks.h file.
> The .control file exists because the test defines a loadable module which defines the hooks. 



To the best of my knowledge .control files are only used by extensions,
not by other modules. They are only referenced in
src/backend/commands/extension.c in the backend code. For example,
auto_explain which is a loadable module but not en extension does not
have one, and I bet if you remove it you'll find this will work just fine.


cheers


andrew


-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 21, 2022, at 1:30 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> 
> To the best of my knowledge .control files are only used by extensions,
> not by other modules. They are only referenced in
> src/backend/commands/extension.c in the backend code. For example,
> auto_explain which is a loadable module but not en extension does not
> have one, and I bet if you remove it you'll find this will work just fine.

Fixed, also with adjustments to Joshua's function comments.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: New Object Access Type hooks

From
Thomas Munro
Date:
On Fri, Mar 18, 2022 at 4:22 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> (FYI, I got a test failure from src/test/recovery/t/013_crash_restart.pl when testing v1-0001.  I'm not sure yet what
thatis about.)
 

Doesn't look like 0001 has anything to do with that...   Are you on a
Mac?  Did it look like this recent failure from CI?

https://cirrus-ci.com/task/4686108033286144

https://api.cirrus-ci.com/v1/artifact/task/4686108033286144/log/src/test/recovery/tmp_check/log/regress_log_013_crash_restart

https://api.cirrus-ci.com/v1/artifact/task/4686108033286144/log/src/test/recovery/tmp_check/log/013_crash_restart_primary.log

I have no idea what is going on there, but searching for discussion
brought me here...



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 21, 2022, at 10:03 PM, Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Fri, Mar 18, 2022 at 4:22 PM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> (FYI, I got a test failure from src/test/recovery/t/013_crash_restart.pl when testing v1-0001.  I'm not sure yet
whatthat is about.) 
>
> Doesn't look like 0001 has anything to do with that...   Are you on a
> Mac?

Yes, macOS Catalina, currently 10.15.7.

>  Did it look like this recent failure from CI?
>
> https://cirrus-ci.com/task/4686108033286144
>
https://api.cirrus-ci.com/v1/artifact/task/4686108033286144/log/src/test/recovery/tmp_check/log/regress_log_013_crash_restart
>
https://api.cirrus-ci.com/v1/artifact/task/4686108033286144/log/src/test/recovery/tmp_check/log/013_crash_restart_primary.log

I no longer have the logs for comparison.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 01:03, Thomas Munro wrote:
> On Fri, Mar 18, 2022 at 4:22 PM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> (FYI, I got a test failure from src/test/recovery/t/013_crash_restart.pl when testing v1-0001.  I'm not sure yet
whatthat is about.)
 
> Doesn't look like 0001 has anything to do with that...   Are you on a
> Mac?  Did it look like this recent failure from CI?


Probably not connected. It's working fine for me on Ubuntu/


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/21/22 19:08, Mark Dilger wrote:
>
>> On Mar 21, 2022, at 1:30 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> To the best of my knowledge .control files are only used by extensions,
>> not by other modules. They are only referenced in
>> src/backend/commands/extension.c in the backend code. For example,
>> auto_explain which is a loadable module but not en extension does not
>> have one, and I bet if you remove it you'll find this will work just fine.
> Fixed, also with adjustments to Joshua's function comments.
>

Pushed with slight adjustments - the LOAD was unnecessary as was the
setting of client_min_messages - the latter would have made buildfarm
animals unhappy.


Now you need to re-submit your GUCs patch I think.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Julien Rouhaud
Date:
Hi,

On Tue, Mar 22, 2022 at 10:41:05AM -0400, Andrew Dunstan wrote:
> 
> Pushed with slight adjustments - the LOAD was unnecessary as was the
> setting of client_min_messages - the latter would have made buildfarm
> animals unhappy.

For the record this just failed on my buildfarm animal:
https://brekka.postgresql.org/cgi-bin/show_stage_log.pl?nm=lapwing&dt=2022-03-22%2014%3A40%3A10&stg=misc-check.



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 22, 2022, at 8:14 AM, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Tue, Mar 22, 2022 at 10:41:05AM -0400, Andrew Dunstan wrote:
>>
>> Pushed with slight adjustments - the LOAD was unnecessary as was the
>> setting of client_min_messages - the latter would have made buildfarm
>> animals unhappy.
>
> For the record this just failed on my buildfarm animal:
> https://brekka.postgresql.org/cgi-bin/show_stage_log.pl?nm=lapwing&dt=2022-03-22%2014%3A40%3A10&stg=misc-check.

culicidae is complaining:

==~_~===-=-===~_~== pgsql.build/src/test/modules/test_oat_hooks/log/postmaster.log ==~_~===-=-===~_~==
2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  starting PostgreSQL 15devel on x86_64-pc-linux-gnu,
compiledby gcc (Debian 11.2.0-18) 11.2.0, 64-bit 
2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  listening on Unix socket
"/tmp/pg_regress-RiE7x8/.s.PGSQL.6280"
2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries
2022-03-22 14:53:27.202 UTC [2167006][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries
2022-03-22 14:53:27.203 UTC [2167009][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries
2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  checkpointer process (PID 2167006) exited with exit code 1
2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  terminating any other active server processes
2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  shutting down because restart_after_crash is off
2022-03-22 14:53:27.206 UTC [2166986][postmaster][:0] LOG:  database system is shut down
==~_~===-=-===~_~== pgsql.build/src/test/modules/test_rls_hooks/log/initdb.log ==~_~===-=-===~_~==


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 11:26, Mark Dilger wrote:
>
>> On Mar 22, 2022, at 8:14 AM, Julien Rouhaud <rjuju123@gmail.com> wrote:
>>
>> Hi,
>>
>> On Tue, Mar 22, 2022 at 10:41:05AM -0400, Andrew Dunstan wrote:
>>> Pushed with slight adjustments - the LOAD was unnecessary as was the
>>> setting of client_min_messages - the latter would have made buildfarm
>>> animals unhappy.
>> For the record this just failed on my buildfarm animal:
>> https://brekka.postgresql.org/cgi-bin/show_stage_log.pl?nm=lapwing&dt=2022-03-22%2014%3A40%3A10&stg=misc-check.
> culicidae is complaining:
>
> ==~_~===-=-===~_~== pgsql.build/src/test/modules/test_oat_hooks/log/postmaster.log ==~_~===-=-===~_~==
> 2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  starting PostgreSQL 15devel on x86_64-pc-linux-gnu,
compiledby gcc (Debian 11.2.0-18) 11.2.0, 64-bit
 
> 2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  listening on Unix socket
"/tmp/pg_regress-RiE7x8/.s.PGSQL.6280"
> 2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries
> 2022-03-22 14:53:27.202 UTC [2167006][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries
> 2022-03-22 14:53:27.203 UTC [2167009][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries
> 2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  checkpointer process (PID 2167006) exited with exit code
1
> 2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  terminating any other active server processes
> 2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  shutting down because restart_after_crash is off
> 2022-03-22 14:53:27.206 UTC [2166986][postmaster][:0] LOG:  database system is shut down
> ==~_~===-=-===~_~== pgsql.build/src/test/modules/test_rls_hooks/log/initdb.log ==~_~===-=-===~_~==
>
>


That seems quite weird. I'm not sure how it's getting loaded at all if
not via shared_preload_libraries


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> That seems quite weird. I'm not sure how it's getting loaded at all if
> not via shared_preload_libraries

Some other animals are showing this:

diff -U3 /home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
/home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
--- /home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out    2022-03-22 11:57:40.224991011
-0400
+++ /home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out    2022-03-22 11:59:59.998983366
-0400
@@ -48,6 +48,8 @@
 SELECT * FROM regress_test_table;
 NOTICE:  in executor check perms: superuser attempting execute
 NOTICE:  in executor check perms: superuser finished execute
+NOTICE:  in executor check perms: superuser attempting execute
+NOTICE:  in executor check perms: superuser finished execute
  t
 ---
 (0 rows)
@@ -95,6 +97,8 @@
                       ^
 NOTICE:  in executor check perms: non-superuser attempting execute
 NOTICE:  in executor check perms: non-superuser finished execute
+NOTICE:  in executor check perms: non-superuser attempting execute
+NOTICE:  in executor check perms: non-superuser finished execute
  t
 ---
 (0 rows)
@@ -168,6 +172,8 @@
                       ^
 NOTICE:  in executor check perms: superuser attempting execute
 NOTICE:  in executor check perms: superuser finished execute
+NOTICE:  in executor check perms: superuser attempting execute
+NOTICE:  in executor check perms: superuser finished execute
  t
 ---
 (0 rows)


I can duplicate that by adding "force_parallel_mode = regress"
to test_oat_hooks.conf, so a fair bet is that the duplication
comes from executing the same hook in both leader and worker.

            regards, tom lane



Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 12:02, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> That seems quite weird. I'm not sure how it's getting loaded at all if
>> not via shared_preload_libraries
> Some other animals are showing this:
>
> diff -U3 /home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
/home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
> --- /home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out    2022-03-22 11:57:40.224991011
-0400
> +++ /home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out    2022-03-22 11:59:59.998983366
-0400
> @@ -48,6 +48,8 @@
>  SELECT * FROM regress_test_table;
>  NOTICE:  in executor check perms: superuser attempting execute
>  NOTICE:  in executor check perms: superuser finished execute
> +NOTICE:  in executor check perms: superuser attempting execute
> +NOTICE:  in executor check perms: superuser finished execute
>   t 
>  ---
>  (0 rows)
> @@ -95,6 +97,8 @@
>                        ^
>  NOTICE:  in executor check perms: non-superuser attempting execute
>  NOTICE:  in executor check perms: non-superuser finished execute
> +NOTICE:  in executor check perms: non-superuser attempting execute
> +NOTICE:  in executor check perms: non-superuser finished execute
>   t 
>  ---
>  (0 rows)
> @@ -168,6 +172,8 @@
>                        ^
>  NOTICE:  in executor check perms: superuser attempting execute
>  NOTICE:  in executor check perms: superuser finished execute
> +NOTICE:  in executor check perms: superuser attempting execute
> +NOTICE:  in executor check perms: superuser finished execute
>   t 
>  ---
>  (0 rows)
>
>
> I can duplicate that by adding "force_parallel_mode = regress"
> to test_oat_hooks.conf, so a fair bet is that the duplication
> comes from executing the same hook in both leader and worker.
>
>             



OK, thanks. My test didn't include that one setting :-(


If I can't com up with a very quick fix I'll revert it.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 22, 2022, at 9:09 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> If I can't com up with a very quick fix I'll revert it.

The problem is coming from the REGRESS_exec_check_perms, which was included in the patch to demonstrate when the other
hooksfired relative to the ExecutorCheckPerms_hook, but since it is causing problems, I can submit a patch with that
removed. Give me a couple minutes.... 


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 22, 2022, at 9:11 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> Give me a couple minutes....


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
> The problem is coming from the REGRESS_exec_check_perms, which was included in the patch to demonstrate when the
otherhooks fired relative to the ExecutorCheckPerms_hook, but since it is causing problems, I can submit a patch with
thatremoved.  Give me a couple minutes.... 

Maybe better to suppress the audit messages if in a parallel worker?

            regards, tom lane



Re: New Object Access Type hooks

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 3/22/22 11:26, Mark Dilger wrote:
>> culicidae is complaining:
>> 2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries

> That seems quite weird. I'm not sure how it's getting loaded at all if
> not via shared_preload_libraries

After checking culicidae's config, I've duplicated this failure
by building with EXEC_BACKEND defined.  So I'd opine that there
is something broken about the method test_oat_hooks uses to
decide if it was loaded via shared_preload_libraries or not.
(Note that the failures appear to be coming out of auxiliary
processes such as the checkpointer.)

As a quick-n-dirty fix to avoid reverting the entire test module,
perhaps just delete this error check for now.

            regards, tom lane



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 22, 2022, at 9:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 3/22/22 11:26, Mark Dilger wrote:
>>> culicidae is complaining:
>>> 2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  test_oat_hooks must be loaded via
shared_preload_libraries
>
>> That seems quite weird. I'm not sure how it's getting loaded at all if
>> not via shared_preload_libraries
>
> After checking culicidae's config, I've duplicated this failure
> by building with EXEC_BACKEND defined.  So I'd opine that there
> is something broken about the method test_oat_hooks uses to
> decide if it was loaded via shared_preload_libraries or not.
> (Note that the failures appear to be coming out of auxiliary
> processes such as the checkpointer.)
>
> As a quick-n-dirty fix to avoid reverting the entire test module,
> perhaps just delete this error check for now.

Ok, done as you suggest:



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
> On Mar 22, 2022, at 9:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> As a quick-n-dirty fix to avoid reverting the entire test module,
>> perhaps just delete this error check for now.

> Ok, done as you suggest:

I only suggested removing the error check in _PG_init, not
changing the way the test works.

            regards, tom lane



Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 12:58, Tom Lane wrote:
> Mark Dilger <mark.dilger@enterprisedb.com> writes:
>> On Mar 22, 2022, at 9:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> As a quick-n-dirty fix to avoid reverting the entire test module,
>>> perhaps just delete this error check for now.
>> Ok, done as you suggest:
> I only suggested removing the error check in _PG_init, not
> changing the way the test works.
>
>             



Mark and I discussed this offline, and decided there was no requirement
for the module to be preloaded. Do you have a different opinion?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 22, 2022, at 9:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Ok, done as you suggest:
>
> I only suggested removing the error check in _PG_init, not
> changing the way the test works.

I should have been more explicit and said, "done as y'all suggest".  The "To" line of that email was to you and Andrew.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 3/22/22 12:58, Tom Lane wrote:
>> I only suggested removing the error check in _PG_init, not
>> changing the way the test works.

> Mark and I discussed this offline, and decided there was no requirement
> for the module to be preloaded. Do you have a different opinion?

No, I was actually about to make the same point: it seems to me there
are arguable use-cases for loading it shared, loading it per-session
(perhaps via ALTER USER SET or ALTER DATABASE SET to target particular
users/DBs), or even manually LOADing it.  So the module code should
not be prejudging how it's used.

On reflection, I withdraw my complaint about changing the way the
test script loads the module.  Getting rid of the need for a custom
.conf file simplifies the test module, and that seems good.
So I'm on board with Mark's patch now.

            regards, tom lane



Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 13:08, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 3/22/22 12:58, Tom Lane wrote:
>>> I only suggested removing the error check in _PG_init, not
>>> changing the way the test works.
>> Mark and I discussed this offline, and decided there was no requirement
>> for the module to be preloaded. Do you have a different opinion?
> No, I was actually about to make the same point: it seems to me there
> are arguable use-cases for loading it shared, loading it per-session
> (perhaps via ALTER USER SET or ALTER DATABASE SET to target particular
> users/DBs), or even manually LOADing it.  So the module code should
> not be prejudging how it's used.
>
> On reflection, I withdraw my complaint about changing the way the
> test script loads the module.  Getting rid of the need for a custom
> .conf file simplifies the test module, and that seems good.
> So I'm on board with Mark's patch now.
>
>             



OK, I have pushed that.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> OK, I have pushed that.

It seems like you could remove the NO_INSTALLCHECK restriction
too.  You already removed the comment defending it, and it
seems to work fine as an installcheck now if I remove that
locally.

Other nitpicks:

* the IsParallelWorker test could use a comment

* I notice a typo "permisisons" in test_oat_hooks.sql

            regards, tom lane



Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 14:01, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> OK, I have pushed that.
> It seems like you could remove the NO_INSTALLCHECK restriction
> too.  You already removed the comment defending it, and it
> seems to work fine as an installcheck now if I remove that
> locally.
>
> Other nitpicks:
>
> * the IsParallelWorker test could use a comment
>
> * I notice a typo "permisisons" in test_oat_hooks.sql
>
>             



Fixed


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Fixed

Now that we got past the hard failures, we can see that the test
falls over with (some?) non-default encodings, as for instance
here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2022-03-22%2020%3A23%3A13

I can replicate that by running the test under LANG=en_US.iso885915.
What I think is happening is:

(1) Rather unwisely, the relevant InvokeNamespaceSearchHook calls
appear in recomputeNamespacePath.  That means that their timing
depends heavily on accidents of caching.

(2) If we decide that we need an encoding conversion to talk to
the client, there'll be a lookup for the conversion function
early during session startup.  That will cause the namespace
search path to get computed then, before the test module has been
loaded and certainly before the audit GUC has been turned on.

(3) At the point where the test expects some audit notices
to come out, nothing happens because the search path is
already validated.

I'm inclined to think that (1) is a seriously bad idea,
not only because of this instability, but because

(a) the namespace cache logic is unlikely to cause the search-path
cache to get invalidated when something happens that might cause an
OAT hook to wish to change its decision, and

(b) this placement means that the hook is invoked during cache loading
operations that are likely to be super-sensitive to any additional
catalog accesses a hook might wish to do.  (I await the results of the
CLOBBER_CACHE_ALWAYS animals with trepidation.)

Now, if our attitude to the OAT hooks is that we are going to
sprinkle some at random and whether they are useful is someone
else's problem, then maybe these are not interesting concerns.

            regards, tom lane



Re: New Object Access Type hooks

From
Andres Freund
Date:
Hi,

On 2022-03-22 13:47:27 -0400, Andrew Dunstan wrote:
> OK, I have pushed that.

Seems like it might actually be good to test that object access hooks work
well in a parallel worker. How about going the other way and explicitly setting
force_parallel_mode = disabled for parts of the test and to enabled for
others?

- Andres



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 22, 2022, at 3:20 PM, Andres Freund <andres@anarazel.de> wrote:
>
> Seems like it might actually be good to test that object access hooks work
> well in a parallel worker. How about going the other way and explicitly setting
> force_parallel_mode = disabled for parts of the test and to enabled for
> others?

Wouldn't we get differing numbers of NOTICE messages depending on how many parallel workers there are?  Or would you
proposesetting the number of workers to a small, fixed value? 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
>> On Mar 22, 2022, at 3:20 PM, Andres Freund <andres@anarazel.de> wrote:
>> Seems like it might actually be good to test that object access hooks work
>> well in a parallel worker. How about going the other way and explicitly setting
>> force_parallel_mode = disabled for parts of the test and to enabled for
>> others?

> Wouldn't we get differing numbers of NOTICE messages depending on how
> many parallel workers there are?  Or would you propose setting the
> number of workers to a small, fixed value?

The value would have to be "1", else you are going to have issues
with notices from different workers being interleaved differently
from run to run.  You might have that anyway, due to interleaving
of leader and worker messages.

            regards, tom lane



Re: New Object Access Type hooks

From
Andres Freund
Date:
Hi,

On 2022-03-22 18:41:45 -0400, Tom Lane wrote:
> Mark Dilger <mark.dilger@enterprisedb.com> writes:
> >> On Mar 22, 2022, at 3:20 PM, Andres Freund <andres@anarazel.de> wrote:
> >> Seems like it might actually be good to test that object access hooks work
> >> well in a parallel worker. How about going the other way and explicitly setting
> >> force_parallel_mode = disabled for parts of the test and to enabled for
> >> others?
> 
> > Wouldn't we get differing numbers of NOTICE messages depending on how
> > many parallel workers there are?  Or would you propose setting the
> > number of workers to a small, fixed value?

Yes.


> The value would have to be "1", else you are going to have issues
> with notices from different workers being interleaved differently
> from run to run.

Yea. Possible one could work around those with some effort (using multiple
notification channels maybe), but there seems little to glean from multiple
workers that a single worker wouldn't show.


> You might have that anyway, due to interleaving of leader and worker
> messages.

That part could perhaps be addressed by setting parallel_leader_participation
= 0.

Greetings,

Andres Freund



Re: New Object Access Type hooks

From
Justin Pryzby
Date:
If I'm not wrong, this is still causing issues at least on cfbot/windows, even
since f0206d99.

https://cirrus-ci.com/task/5266352712712192
https://cirrus-ci.com/task/5061218867085312
https://cirrus-ci.com/task/5663822005403648
https://cirrus-ci.com/task/5744257246953472

https://cirrus-ci.com/task/5744257246953472
[22:26:50.939] test test_oat_hooks               ... FAILED      173 ms

https://api.cirrus-ci.com/v1/artifact/task/5744257246953472/log/src/test/modules/test_oat_hooks/regression.diffs
diff -w -U3 c:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
c:/cirrus/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
--- c:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out    2022-03-22 22:13:23.386895200 +0000
+++ c:/cirrus/src/test/modules/test_oat_hooks/results/test_oat_hooks.out    2022-03-22 22:26:51.104419600 +0000
@@ -15,12 +15,6 @@
 NOTICE:  in process utility: superuser finished CreateRoleStmt
 CREATE TABLE regress_test_table (t text);
 NOTICE:  in process utility: superuser attempting CreateStmt
-NOTICE:  in object access: superuser attempting namespace search (subId=0) [no report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
-                     ^
-NOTICE:  in object access: superuser finished namespace search (subId=0) [no report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
-                     ^
 NOTICE:  in object access: superuser attempting create (subId=0) [explicit]



Re: New Object Access Type hooks

From
Tom Lane
Date:
Justin Pryzby <pryzby@telsasoft.com> writes:
> If I'm not wrong, this is still causing issues at least on cfbot/windows, even
> since f0206d99.

That's probably a variant of the encoding dependency I described
upthread.

            regards, tom lane



Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 18:18, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Fixed
> Now that we got past the hard failures, we can see that the test
> falls over with (some?) non-default encodings, as for instance
> here:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2022-03-22%2020%3A23%3A13
>
> I can replicate that by running the test under LANG=en_US.iso885915.
> What I think is happening is:
>
> (1) Rather unwisely, the relevant InvokeNamespaceSearchHook calls
> appear in recomputeNamespacePath.  That means that their timing
> depends heavily on accidents of caching.
>
> (2) If we decide that we need an encoding conversion to talk to
> the client, there'll be a lookup for the conversion function
> early during session startup.  That will cause the namespace
> search path to get computed then, before the test module has been
> loaded and certainly before the audit GUC has been turned on.
>
> (3) At the point where the test expects some audit notices
> to come out, nothing happens because the search path is
> already validated.
>
> I'm inclined to think that (1) is a seriously bad idea,
> not only because of this instability, but because
>
> (a) the namespace cache logic is unlikely to cause the search-path
> cache to get invalidated when something happens that might cause an
> OAT hook to wish to change its decision, and
>
> (b) this placement means that the hook is invoked during cache loading
> operations that are likely to be super-sensitive to any additional
> catalog accesses a hook might wish to do.  (I await the results of the
> CLOBBER_CACHE_ALWAYS animals with trepidation.)
>
> Now, if our attitude to the OAT hooks is that we are going to
> sprinkle some at random and whether they are useful is someone
> else's problem, then maybe these are not interesting concerns.


So this was a pre-existing problem that the test has exposed? I don't
think we can just say "you deal with it", and if I understand you right
you don't think that either.

I could make the buildfarm quiet again by resetting NO_INSTALLCHECK
temporarily.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 3/22/22 18:18, Tom Lane wrote:
>> Now, if our attitude to the OAT hooks is that we are going to
>> sprinkle some at random and whether they are useful is someone
>> else's problem, then maybe these are not interesting concerns.

> So this was a pre-existing problem that the test has exposed? I don't
> think we can just say "you deal with it", and if I understand you right
> you don't think that either.

Yeah, my point exactly: the placement of those hooks needs to be rethought.
I'm guessing what we ought to do is let the cached namespace OID list
get built without interference, and then allow the OAT hook to filter
it when it's read.

> I could make the buildfarm quiet again by resetting NO_INSTALLCHECK
> temporarily.

I was able to reproduce it under "make check" as long as I had
LANG set to one of the troublesome values, so I'm not real sure
that that'll be enough.

            regards, tom lane



Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 20:07, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 3/22/22 18:18, Tom Lane wrote:
>>> Now, if our attitude to the OAT hooks is that we are going to
>>> sprinkle some at random and whether they are useful is someone
>>> else's problem, then maybe these are not interesting concerns.
>> So this was a pre-existing problem that the test has exposed? I don't
>> think we can just say "you deal with it", and if I understand you right
>> you don't think that either.
> Yeah, my point exactly: the placement of those hooks needs to be rethought.
> I'm guessing what we ought to do is let the cached namespace OID list
> get built without interference, and then allow the OAT hook to filter
> it when it's read.
>
>> I could make the buildfarm quiet again by resetting NO_INSTALLCHECK
>> temporarily.
> I was able to reproduce it under "make check" as long as I had
> LANG set to one of the troublesome values, so I'm not real sure
> that that'll be enough.
>
>             


The buildfarm only runs installcheck under different locales/encodings.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/22/22 20:11, Andrew Dunstan wrote:
> On 3/22/22 20:07, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> On 3/22/22 18:18, Tom Lane wrote:
>>>> Now, if our attitude to the OAT hooks is that we are going to
>>>> sprinkle some at random and whether they are useful is someone
>>>> else's problem, then maybe these are not interesting concerns.
>>> So this was a pre-existing problem that the test has exposed? I don't
>>> think we can just say "you deal with it", and if I understand you right
>>> you don't think that either.
>> Yeah, my point exactly: the placement of those hooks needs to be rethought.
>> I'm guessing what we ought to do is let the cached namespace OID list
>> get built without interference, and then allow the OAT hook to filter
>> it when it's read.
>>
>>> I could make the buildfarm quiet again by resetting NO_INSTALLCHECK
>>> temporarily.
>> I was able to reproduce it under "make check" as long as I had
>> LANG set to one of the troublesome values, so I'm not real sure
>> that that'll be enough.
>>
>>             
>
> The buildfarm only runs installcheck under different locales/encodings.


But you're right about the non-installcheck cases. fairywren had that
issue. I have committed a (tested) fix for those too to force
NO_LOCALE/UTF8.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Andres Freund
Date:
Hi,

I just rebased the meson tree (atop 75b1521dae1) and the test_oat_hooks test
fail on windows with meson. They don't with our "homegrown" buildsystem, but
just because it won't run them.

https://cirrus-ci.com/build/6101947223638016
https://cirrus-ci.com/task/5869668815601664?logs=check_world#L67

https://api.cirrus-ci.com/v1/artifact/task/5869668815601664/log/build/testrun/test_oat_hooks/pg_regress/regression.diffs

diff -w -U3 C:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
C:/cirrus/build/testrun/test_oat_hooks/pg_regress/results/test_oat_hooks.out
--- C:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out    2022-03-24 18:56:39.592048000 +0000
+++ C:/cirrus/build/testrun/test_oat_hooks/pg_regress/results/test_oat_hooks.out    2022-03-24 19:03:33.910466700
+0000
@@ -15,12 +15,6 @@
 NOTICE:  in process utility: superuser finished CreateRoleStmt
 CREATE TABLE regress_test_table (t text);
 NOTICE:  in process utility: superuser attempting CreateStmt
-NOTICE:  in object access: superuser attempting namespace search (subId=0) [no report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
-                     ^
-NOTICE:  in object access: superuser finished namespace search (subId=0) [no report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
-                     ^
 NOTICE:  in object access: superuser attempting create (subId=0) [explicit]
 NOTICE:  in object access: superuser finished create (subId=0) [explicit]
 NOTICE:  in object access: superuser attempting create (subId=0) [explicit]


I don't think this is meson specific...

Greetings,

Andres Freund



Re: New Object Access Type hooks

From
Andrew Dunstan
Date:
On 3/24/22 16:59, Andres Freund wrote:
> Hi,
>
> I just rebased the meson tree (atop 75b1521dae1) and the test_oat_hooks test
> fail on windows with meson. They don't with our "homegrown" buildsystem, but
> just because it won't run them.
>
> https://cirrus-ci.com/build/6101947223638016
> https://cirrus-ci.com/task/5869668815601664?logs=check_world#L67
>
https://api.cirrus-ci.com/v1/artifact/task/5869668815601664/log/build/testrun/test_oat_hooks/pg_regress/regression.diffs
>
> diff -w -U3 C:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
C:/cirrus/build/testrun/test_oat_hooks/pg_regress/results/test_oat_hooks.out
> --- C:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out    2022-03-24 18:56:39.592048000 +0000
> +++ C:/cirrus/build/testrun/test_oat_hooks/pg_regress/results/test_oat_hooks.out    2022-03-24 19:03:33.910466700
+0000
> @@ -15,12 +15,6 @@
>  NOTICE:  in process utility: superuser finished CreateRoleStmt
>  CREATE TABLE regress_test_table (t text);
>  NOTICE:  in process utility: superuser attempting CreateStmt
> -NOTICE:  in object access: superuser attempting namespace search (subId=0) [no report on violation, allowed]
> -LINE 1: CREATE TABLE regress_test_table (t text);
> -                     ^
> -NOTICE:  in object access: superuser finished namespace search (subId=0) [no report on violation, allowed]
> -LINE 1: CREATE TABLE regress_test_table (t text);
> -                     ^
>  NOTICE:  in object access: superuser attempting create (subId=0) [explicit]
>  NOTICE:  in object access: superuser finished create (subId=0) [explicit]
>  NOTICE:  in object access: superuser attempting create (subId=0) [explicit]
>
>
> I don't think this is meson specific...


Even if you use NO_LOCALE=1/ENCODING=UTF8 as the Makefile now does?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: New Object Access Type hooks

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 3/24/22 16:59, Andres Freund wrote:
>> I just rebased the meson tree (atop 75b1521dae1) and the test_oat_hooks test
>> fail on windows with meson.

> Even if you use NO_LOCALE=1/ENCODING=UTF8 as the Makefile now does?

Note that that's basically a workaround for buggy placement of the
OAT hooks, as per previous discussion.  I hope that we fix that bug
pretty soon, so it shouldn't really be a factor for the meson conversion.

            regards, tom lane



Re: New Object Access Type hooks

From
Andres Freund
Date:
Hi,

On 2022-03-24 17:44:31 -0400, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > On 3/24/22 16:59, Andres Freund wrote:
> >> I just rebased the meson tree (atop 75b1521dae1) and the test_oat_hooks test
> >> fail on windows with meson.
> 
> > Even if you use NO_LOCALE=1/ENCODING=UTF8 as the Makefile now does?

I didn't do that, no. I guess I should have re-checked that in the makefile /
reread the commit message in more detail - although it really doesn't provide
a lot of that. That's, uh, some shoveling problems under the carpet.

I'll do that for now then.


> Note that that's basically a workaround for buggy placement of the
> OAT hooks, as per previous discussion.  I hope that we fix that bug
> pretty soon, so it shouldn't really be a factor for the meson conversion.

Yea. I found it's a lot easier to rebase the meson tree frequently rather than
do it in larger batches, so I just do so every few days...

Greetings,

Andres Freund



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Mar 21, 2022, at 10:03 PM, Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Fri, Mar 18, 2022 at 4:22 PM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> (FYI, I got a test failure from src/test/recovery/t/013_crash_restart.pl when testing v1-0001.  I'm not sure yet
whatthat is about.) 
>
> Doesn't look like 0001 has anything to do with that...   Are you on a
> Mac?  Did it look like this recent failure from CI?
>
> https://cirrus-ci.com/task/4686108033286144
>
https://api.cirrus-ci.com/v1/artifact/task/4686108033286144/log/src/test/recovery/tmp_check/log/regress_log_013_crash_restart
>
https://api.cirrus-ci.com/v1/artifact/task/4686108033286144/log/src/test/recovery/tmp_check/log/013_crash_restart_primary.log
>
> I have no idea what is going on there, but searching for discussion
> brought me here...

I just got a crash in this test again.  Are you still interested?  I still have the logs.  No core file appears to have
beengenerated. 

The test failure is

not ok 5 - psql query died successfully after SIGQUIT

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
> I just got a crash in this test again.  Are you still interested?  I still have the logs.  No core file appears to
havebeen generated. 
> The test failure is
> not ok 5 - psql query died successfully after SIGQUIT

Hmm ... I can see one problem with that test:

ok( pump_until(
        $killme,
        $psql_timeout,
        \$killme_stderr,
        qr/WARNING:  terminating connection because of crash of another server process|server closed the connection
unexpectedly|connectionto server was lost/m 
    ),
    "psql query died successfully after SIGQUIT");

It's been a little while since that message looked like that.
Nowadays you get

WARNING:  terminating connection because of unexpected SIGQUIT signal
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

Usually the test would succeed anyway because of matching the
second or third regex alternative, but I wonder if there is
some other spelling of libpq's complaint that shows up
occasionally.  It'd be nice if we could see the contents of
$killme_stderr upon failure.

            regards, tom lane



Re: New Object Access Type hooks

From
Tom Lane
Date:
I wrote:
> Usually the test would succeed anyway because of matching the
> second or third regex alternative, but I wonder if there is
> some other spelling of libpq's complaint that shows up
> occasionally.  It'd be nice if we could see the contents of
> $killme_stderr upon failure.

OK, now I'm confused, because pump_until is very clearly
*trying* to report exactly that:

        if (not $proc->pumpable())
        {
            diag("pump_until: process terminated unexpectedly when searching for \"$until\" with stream:
\"$$stream\"");
            return 0;
        }

and if I intentionally break the regex then I do see this
output when running the test by hand:

# Running: pg_ctl kill QUIT 1922645
ok 4 - killed process with SIGQUIT
# pump_until: process terminated unexpectedly when searching for "(?^m:WARNING:  terminating connection because of
crashof another server process|server closed the connection foounexpectedly|connection to server was lost)" with
stream:"psql:<stdin>:9: WARNING:  terminating connection because of unexpected SIGQUIT signal 
# psql:<stdin>:9: server closed the connection unexpectedly
#       This probably means the server terminated abnormally
#       before or while processing the request.
# "
not ok 5 - psql query died successfully after SIGQUIT

Is our CI setup failing to capture stderr from TAP tests??

            regards, tom lane



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Apr 4, 2022, at 10:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Is our CI setup failing to capture stderr from TAP tests??

I'm looking into the way our TAP test infrastructure assigns port numbers to nodes, and whether that is reasonable
duringparallel test runs with nodes stopping and starting again.  On casual inspection, that doesn't seem ok, because
theCluster.pm logic to make sure nodes get unique ports doesn't seem to be thinking about other parallel tests running.
It will notice if another node is already bound to the port, but if another node has been killed and not yet restarted,
won'tthings go off the rails? 

I'm writing a parallel test just for this.  Will get back to you.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Tom Lane
Date:
I wrote:
> Is our CI setup failing to capture stderr from TAP tests??

Oh, I'm barking up the wrong tree.  This test must have been run
against HEAD between 6da65a3f9 (23 Feb) and 2beb4acff (31 Mar), when
pump_until indeed didn't print this essential information :-(

If you just got this failure, could you look in the log to
see if there's a pump_until report?

            regards, tom lane



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Apr 4, 2022, at 10:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Oh, I'm barking up the wrong tree.  This test must have been run
> against HEAD between 6da65a3f9 (23 Feb) and 2beb4acff (31 Mar), when
> pump_until indeed didn't print this essential information :-(
>
> If you just got this failure, could you look in the log to
> see if there's a pump_until report?

I was running `make -j12 check-world` against my local patched version of master:

commit 80399fa5f208c4acd4ec194c47e534ba8dd3ae7c (HEAD -> 0001)
Author: Mark Dilger <mark.dilger@enterprisedb.com>
Date:   Mon Mar 28 13:35:11 2022 -0700

    Allow grant and revoke of privileges on parameters

    Add new SET and ALTER SYSTEM privileges for configuration parameters
    (GUCs), and a new catalog, pg_parameter_acl, for tracking grants of
    these privileges.

    The privilege to SET a parameter marked USERSET is inherent in that
    parameter's marking and cannot be revoked.  This saves cycles when
    performing SET operations, as looking up privileges in the catalog
    can be skipped.  If we find that administrators need to revoke SET
    privilege on a particular variable from public, that variable can be
    redefined in future releases as SUSET with a default grant of SET to
    PUBLIC issued.

commit 4eb9798879680dcc0e3ebb301cf6f925dfa69422 (origin/master, origin/HEAD, master)
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   Mon Apr 4 10:12:30 2022 -0400

    Avoid freeing objects during json aggregate finalization

    Commit f4fb45d15c tried to free memory during aggregate finalization.
    This cause issues, particularly when used as a window function, so stop
    doing that.

    Per complaint by Jaime Casanova and diagnosis by Andres Freund

    Discussion: https://postgr.es/m/YkfeMNYRCGhySKyg@ahch-to


The test logs are attached.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
> The test logs are attached.

Server log looks as-expected:

2022-04-04 09:11:51.087 PDT [2084] 013_crash_restart.pl LOG:  statement: SELECT pg_sleep(3600);
2022-04-04 09:11:51.094 PDT [2083] 013_crash_restart.pl WARNING:  terminating connection because of unexpected SIGQUIT
signal
2022-04-04 09:11:51.095 PDT [2070] LOG:  server process (PID 2083) exited with exit code 2
2022-04-04 09:11:51.095 PDT [2070] DETAIL:  Failed process was running: INSERT INTO alive
VALUES($$in-progress-before-sigquit$$)RETURNING status; 
2022-04-04 09:11:51.095 PDT [2070] LOG:  terminating any other active server processes

I was hoping to see regress_log_013_crash_restart, though.

            regards, tom lane



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Apr 4, 2022, at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> I was hoping to see regress_log_013_crash_restart, though.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Apr 4, 2022, at 10:44 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> I'm writing a parallel test just for this.  Will get back to you.

Ok, that experiment didn't accomplish anything, beyond refreshing my memory regarding Cluster.pm preferring sockets
overports. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:

> # Running: pg_ctl kill QUIT 2083
> ok 4 - killed process with SIGQUIT
> # pump_until: process terminated unexpectedly when searching for "(?^m:WARNING:  terminating connection because of
crashof another server process|server closed the connection unexpectedly|connection to server was lost)" with stream:
"psql:<stdin>:9:WARNING:  terminating connection because of unexpected SIGQUIT signal 
> # psql:<stdin>:9: could not send data to server: Socket is not connected
> # "
> not ok 5 - psql query died successfully after SIGQUIT

And there we have it: the test wasn't updated for the new backend message
spelling, and we're seeing a different frontend behavior.  Evidently the
backend is dying before we're able to send the "SELECT 1;" to it.

I'm not quite sure whether it's a libpq bug that it doesn't produce the
"connection to server was lost" message here, but in any case I suspect
that we shouldn't be checking for the second and third regex alternatives.
The "terminating connection" warning absolutely should get through, and
if it doesn't we want to know about it.  So my proposal for a fix is
to change the regex to be just "WARNING:  terminating connection because
of unexpected SIGQUIT signal".

            regards, tom lane



Re: New Object Access Type hooks

From
Tom Lane
Date:
I wrote:
> The "terminating connection" warning absolutely should get through,

... oh, no, that's not guaranteed at all, since it's sent from quickdie().
So scratch that.  Maybe we'd better add "could not send data to server"
to the regex?

            regards, tom lane



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Apr 4, 2022, at 12:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
>> The "terminating connection" warning absolutely should get through,
>
> ... oh, no, that's not guaranteed at all, since it's sent from quickdie().
> So scratch that.  Maybe we'd better add "could not send data to server"
> to the regex?

If it fails in pqsecure_raw_write(), you get either "server closed the connection unexpectedly" or "could not send data
toserver".  Do we need to support pgtls_write() or pg_GSS_write(), which have different error messages?  Can anybody
runthe tests with TLS or GSS enabled?  I assume the test framework prevents this, but I didn't check too closely.... 

Is it possible that pgFlush will call pqSendSome which calls pqReadData before trying to write anything, and get back a
"couldnot receive data from server" from pqsecure_raw_read()? 

It's a bit hard to prove to myself which paths might be followed through this code.  Thoughts?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
>> On Apr 4, 2022, at 12:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So scratch that.  Maybe we'd better add "could not send data to server"
>> to the regex?

> If it fails in pqsecure_raw_write(), you get either "server closed the connection unexpectedly" or "could not send
datato server".  Do we need to support pgtls_write() or pg_GSS_write(), which have different error messages? 

Don't see why, since this test sets up a new cluster in which neither
is enabled.

> Is it possible that pgFlush will call pqSendSome which calls pqReadData before trying to write anything, and get back
a"could not receive data from server" from pqsecure_raw_read()? 

Yeah, it's plausible to get a failure on either the write or read side
depending on timing.

Perhaps libpq should be trying harder to make those cases look alike, but
this test is about server behavior not libpq behavior, so I'm inclined
to just make it lax.

            regards, tom lane



Re: New Object Access Type hooks

From
Mark Dilger
Date:

> On Apr 4, 2022, at 1:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Yeah, it's plausible to get a failure on either the write or read side
> depending on timing.
>
> Perhaps libpq should be trying harder to make those cases look alike, but
> this test is about server behavior not libpq behavior, so I'm inclined
> to just make it lax.

+1.

I've gotten this test failure only a few times in perhaps the last six months, so if we narrow the opportunity for test
failurewithout closing it entirely, we're just making the test failures that much harder to diagnose. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: New Object Access Type hooks

From
Tom Lane
Date:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
> On Apr 4, 2022, at 1:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Perhaps libpq should be trying harder to make those cases look alike, but
>> this test is about server behavior not libpq behavior, so I'm inclined
>> to just make it lax.

> +1.

> I've gotten this test failure only a few times in perhaps the last six months, so if we narrow the opportunity for
testfailure without closing it entirely, we're just making the test failures that much harder to diagnose. 

Done that way.

            regards, tom lane



Re: New Object Access Type hooks

From
Michael Paquier
Date:
On Thu, Mar 24, 2022 at 05:44:31PM -0400, Tom Lane wrote:
> Note that that's basically a workaround for buggy placement of the
> OAT hooks, as per previous discussion.  I hope that we fix that bug
> pretty soon, so it shouldn't really be a factor for the meson conversion.

So, this issue is still listed as an open item.  What should we do?
From what I get, the caching issues with the namespace lookup hook are
not new to v15, they just get exposed by the new test module
test_oat_hooks/.  FWIW, I would vote against moving around hook calls
in back branches as that could cause compatibility problems in
existing code relying on them, but it surely is unstable to keep these
when recomputing the search_path.

A removal from recomputeNamespacePath() implies an addition at the end
of fetch_search_path() and fetch_search_path_array().  Perhaps an
extra one in RangeVarGetCreationNamespace()?  The question is how much
of these we want, for example the search hook would be called now even
when doing relation-specific checks like RelationIsVisible() and the
kind.
--
Michael

Attachment

Re: New Object Access Type hooks

From
Michael Paquier
Date:
On Mon, Apr 18, 2022 at 03:50:11PM +0900, Michael Paquier wrote:
> A removal from recomputeNamespacePath() implies an addition at the end
> of fetch_search_path() and fetch_search_path_array().  Perhaps an
> extra one in RangeVarGetCreationNamespace()?  The question is how much
> of these we want, for example the search hook would be called now even
> when doing relation-specific checks like RelationIsVisible() and the
> kind.

I have been playing with this issue, and if we want to minimize the
number of times the list of namespaces in activeSearchPath gets
checked through the search hook, it looks like this is going to
require an additional cached list of namespace OIDs filtered through
InvokeNamespaceSearchHook().  However, it is unclear to me how we can
guarantee that any of the code paths forcing a recomputation of
activeSearchPath are not used for a caching phase, so it looks rather
easy to mess up things and finish with a code path using an unfiltered
activeSearchPath.  The set of *IsVisible() routines should be fine, at
least.

At the end, I am not sure that it is a wise time to redesign this
area close to beta2, so I would vote for leaving this issue aside for
now.  Another thing that we could do is to tweak the tests and silence
the part around OAT_NAMESPACE_SEARCH, which would increase the coverage
with installcheck, removing the need for ENCODING and NO_LOCALE.
--
Michael

Attachment