Thread: Hooks for session start and end, take two

Hooks for session start and end, take two

From
Michael Paquier
Date:
Hi all,

Attached is a patch set to respawn the issue of $subject which has
been discussed here:
https://www.postgresql.org/message-id/20170720204733.40f2b7eb.nagata@sraoss.co.jp

The patch has been committed once as of cd8ce3a but it got shortly
reverted after with 98d54bb because of buildfarm failures.

The root of the buildfarm issues was that session hooks cannot be
tested with a simple LOAD, hence we need to mark the test with
NO_INSTALLCHECK.  Unfortunately we lacked support for that in MSVC
scripts, until I solved that with 431f1599 when refactoring PGXS
makefile rules for regression tests.

While on it, I have done a review over the patch, cleaning it up a bit
and I found some issues, so the initial patch was not fully baked
either:
- previous hook calls were only called for normal backends, which was
incorrect as we define the backend so as we apply no backend-related
filtering for the hook.
- The README could be also more talkative.
- test_session_hooks--1.0.sql also got confused with the module name.
And actually there is no need to have the SQL and control files just
for a module loading a hook.  So it is enough to use MODULE_big for
this purpose.
- The query generated needs to use quote_literal_cstr for the string
values added to the query.
- sample_session_start_hook and sample_session_end_hook missed a
(void), causing a compiler warning on Windows.

Attached is an updated patch set to reintroduce the hook, as there was
a ask for it recently, fixing also the issue that we previously tried
to deal with.  I have tested the patch on Windows to make sure that
the test gets correctly bypassed, so this time we should not have any
buildfarm failures.

I am adding that to next CF.  Credits go of course to the initial
authors and reviewers of this feature.

Any opinions?
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
legrand legrand
Date:
Hello,

Thank you for the work done on this subject.
After starting to play with it, I have a question and a remark:

> - previous hook calls were only called for normal backends, which was
> incorrect as we define the backend so as we apply no backend-related
> filtering for the hook. 

Does that mean that all processes seen in pg_stat_activity like
- autovacuum launcher
- logical replication launcher
- background writer
- checkpointer
- walwriter
... 
- Parallel worker
are available with that hook (it seems not) ?

The patch refers to a 2017 copyright, that's all I found yet ;o)

Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Thu, Sep 26, 2019 at 09:57:57AM -0700, legrand legrand wrote:
> Does that mean that all processes seen in pg_stat_activity like
> - autovacuum launcher
> - logical replication launcher
> - background writer
> - checkpointer
> - walwriter
> ...
> - Parallel worker
> are available with that hook (it seems not) ?

All processes using PostgresMain() for their startup take this code
path like WAL senders and normal backend sessions, but not things
going through StartChildProcess() (WAL receiver, bgwriter, etc.) or
other processes like autovacuum processes which use a different start
code path.

> The patch refers to a 2017 copyright, that's all I found yet ;o)

Fixed, thanks!
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
Mike Palmiotto
Date:
On Fri, Sep 27, 2019 at 12:54 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Sep 26, 2019 at 09:57:57AM -0700, legrand legrand wrote:
> > Does that mean that all processes seen in pg_stat_activity like
> > - autovacuum launcher
> > - logical replication launcher
> > - background writer
> > - checkpointer
> > - walwriter
> > ...
> > - Parallel worker
> > are available with that hook (it seems not) ?
>
> All processes using PostgresMain() for their startup take this code
> path like WAL senders and normal backend sessions, but not things
> going through StartChildProcess() (WAL receiver, bgwriter, etc.) or
> other processes like autovacuum processes which use a different start
> code path.

There is another patchset submitted to this CF
(https://commitfest.postgresql.org/24/2259/) that attempts to
centralize all process startup to make use of the StartChildProcess
entrypoint. The patchset then introduces a fork_process hook that
would be hit by every backend type immediately after forking. Looking
at your patch, it seems like there won't be too much conflict between
the two, but I wonder if we can collaborate on the hook portion to
achieve both of our goals?

As noted in the related thread, I'm currently breaking out the
centralization patch into separate pieces: one for the infrastructure
change (including MyAuxProc adoption of the centralized startup), one
for each process type, and one for the fork_process hook. The
substance of the patchset will remain mostly the same if you want to
take a look.

I look forward to hearing your thoughts on the matter.

Thanks,
-- 
Mike Palmiotto
https://crunchydata.com



Re: Hooks for session start and end, take two

From
legrand legrand
Date:
Michael Paquier-2 wrote
> On Thu, Sep 26, 2019 at 09:57:57AM -0700, legrand legrand wrote:
>> Does that mean that all processes seen in pg_stat_activity like
>> - autovacuum launcher
>> - logical replication launcher
>> - background writer
>> - checkpointer
>> - walwriter
>> ... 
>> - Parallel worker
>> are available with that hook (it seems not) ?
> 
> All processes using PostgresMain() for their startup take this code
> path like WAL senders and normal backend sessions, but not things
> going through StartChildProcess() (WAL receiver, bgwriter, etc.) or
> other processes like autovacuum processes which use a different start
> code path.

OK I confirm:
- "client backend" appears at session start and end hook,
- "autovacuum worker" and "pg_background" only appears at session end hook
  (backend_start can be retreived from pg_stat_activity),
- "parallel workers" are not visible at all
  (because extension cannot assign XIDs during a parallel operation)

All seems fine to me.

Regards
PAscal






--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Hooks for session start and end, take two

From
Fabrízio de Royes Mello
Date:
On Fri, Sep 27, 2019 at 4:26 PM legrand legrand <legrand_legrand@hotmail.com> wrote:
>
> OK I confirm:
> - "client backend" appears at session start and end hook,
> - "autovacuum worker" and "pg_background" only appears at session end hook
>   (backend_start can be retreived from pg_stat_activity),
> - "parallel workers" are not visible at all
>   (because extension cannot assign XIDs during a parallel operation)
>
> All seems fine to me.
>

Hi all,

First of all thanks Michael for bringing this to life again.

I poked a little with the patch and everything is ok. Your check for normal backend on test_session_hooks is much simpler than I did before:

+    /* just consider normal backends */
+    if (MyBackendId == InvalidBackendId)
+        return;

But one thing came to my mind, why not in this first version we hook just normal backends? 

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Sat, Sep 28, 2019 at 10:43:36AM -0300, Fabrízio de Royes Mello wrote:
> I poked a little with the patch and everything is ok. Your check for normal
> backend on test_session_hooks is much simpler than I did before:
>
> +    /* just consider normal backends */
> +    if (MyBackendId == InvalidBackendId)
> +        return;
>
> But one thing came to my mind, why not in this first version we hook just
> normal backends?

This code path can only be taken by normal backends, so that would
apply, still I don't actually see why we should limit us here on the
backend side.  If for a reason or another those two code paths begin
to be taken by a backend with InvalidBackendId, then users of the
session start/end hook will need to think how to handle it if they
didn't from the start, which sounds like a good thing to me.
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
Fabrízio de Royes Mello
Date:

On Sun, Sep 29, 2019 at 10:29 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> This code path can only be taken by normal backends, so that would
> apply, still I don't actually see why we should limit us here on the
> backend side.  If for a reason or another those two code paths begin
> to be taken by a backend with InvalidBackendId, then users of the
> session start/end hook will need to think how to handle it if they
> didn't from the start, which sounds like a good thing to me.
>

Makes sense to me. I become a reviewer and run all tests (make check && make check-world) and everything is ok. Changed status to "ready for commiter".

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Mon, Sep 30, 2019 at 02:46:42PM -0300, Fabrízio de Royes Mello wrote:
> Makes sense to me. I become a reviewer and run all tests (make check &&
> make check-world) and everything is ok. Changed status to "ready for
> commiter".

Thanks.  There was an extra problem in the module hidden in the logs
of the regression tests: the autovacuum launcher has a backend ID
assigned at the moment of the session end hook (not the start), but it
is connected to no databases so I added a check on that in the start
and end hooks of the module.  That's more logic this way anyway as we
run a transaction in the test, and much better than using directly
GetUserIdAndSecContext() which would not assert for an invalid user
ID.  And done with e788bd9.
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
Andres Freund
Date:
Hi,

Moving the discussion of the issues leading to this being reverted back
to -hackers:

On 2019-10-02 23:52:31 +0900, Fujii Masao wrote:
> On Wed, Oct 2, 2019 at 10:08 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >
> > On 2019-Oct-02, Michael Paquier wrote:
> >
> > > On Wed, Oct 02, 2019 at 01:27:50PM +0900, Fujii Masao wrote:
> > > > If only session end hook is problematic, you will commit session start
> > > > hook again?
> > >
> > > Sure, it would be possible to cut the apple in half here.  Now my
> > > understanding was that both hooks were a set.  What do people think?
> >
> > I think that having just session start is still useful

> +1

Well, I think the patch actually needs to do a lot more design work to
be comittable and usable. It's very unclear how these hooks are actually
supposed to be used safely.  Issues I see:

- the startup hook isn't actually guaranteed to be able to write
  anything, we might be in hot-standby
- the startup hook doesn't just run in normal sessions, it also runs in
  walsenders, including ones not connected to a database. But for some
  implementation reasons it won't run for background workers.
- the shutdown hook placement means that any error triggered within the
  handler probably cannot be cleaned up properly anymore
- do we actually want to run code like this for e.g. FATAL errors?
- THERE IS NOT A SINGLE COMMENT EXPLAINING WHAT CAN BE SAFELY DONE IN THESE
  HOOKS. In fact, outside of the tests, the only comments in this are:
  /* Hook for plugins to get control at start and end of session */
  /* Hook for plugins to get control at start of session */
  /* Hook at session end */
  /* Hook for plugins to get control at end of session */


> Regarding session end hook, you can do the almost same thing as that hook
> by calling on_shmem_exit(), before_shmem_exit() or on_proc_exit() in
> other hook like session start hook. This approach also has the same issue
> discussed upthread, though. Anyway, I'm not sure if session end hook is
> "actually" necessary.

No, it's not actually the same (at least for
before_shmem_exit). ShutdownPostgres() runs deliberately as the *last*
before_shmem_exit to run before we switch over to tearing down shmem:

    /*
     * Set up process-exit callback to do pre-shutdown cleanup.  This is the
     * first before_shmem_exit callback we register; thus, this will be the
     * last thing we do before low-level modules like the buffer manager begin
     * to close down.  We need to have this in place before we begin our first
     * transaction --- if we fail during the initialization transaction, as is
     * entirely possible, we need the AbortTransaction call to clean up.
     */

 *
 * User-level cleanup, such as temp-relation removal and UNLISTEN, happens
 * via separate callbacks that execute before this one.  We don't combine the
 * callbacks because we still want this one to happen if the user-level
 * cleanup fails.


- Andres



Re: Hooks for session start and end, take two

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2019-10-02 23:52:31 +0900, Fujii Masao wrote:
>> Regarding session end hook, you can do the almost same thing as that hook
>> by calling on_shmem_exit(), before_shmem_exit() or on_proc_exit() in
>> other hook like session start hook. This approach also has the same issue
>> discussed upthread, though. Anyway, I'm not sure if session end hook is
>> "actually" necessary.

> No, it's not actually the same (at least for
> before_shmem_exit).

Yeah.  The important point here is that if you want to be able to execute
SQL, then you can't close down *ANY* subsystems before that.  Even in the
simple scenario exercised by the test case, imagine that there are user
triggers attached to the table it wants to insert a row in.  They could
execute anything at all.

Thus, we can't run any shutdown hooks before this one, if indeed we
consider it to be a shutdown hook at all.

A possible fix is to do it as the first action in proc_exit, but that will
fall foul of Andres' points about not wanting to do it in non-session
backends, nor in FATAL exits, nor in the case where a previous try failed.
Perhaps a better idea is to put it into PostgresMain's handling of client
EOF, despite the large comment saying not to add more code there.

(Or, maybe invent a new class of shutdown callbacks?
before_before_shmem_exit seems like a pretty grotty concept, but it's
more or less what we need here.  It would sure be nice if we had an
actual specification for what is allowed to happen in these different
classes of callbacks...)

In any case, the hook would have to be responsible for cancelling any open
transaction for itself, in order to have a clean environment to run SQL
code in.  It cannot piggyback on existing transaction-closing code to do
that, because any user-written code could throw an error inside the hook's
transaction and thereby break the invariant that we're not in a
transaction after that point.

Bottom line: there needs to be thought and testing of the case where the
executed SQL throws an error.  We need to recover from that and exit
cleanly, and the committed patch surely would not have.  Likewise there
should be a test for the case where we exit the session mid-transaction
or mid-failed-transaction.

BTW, my first thought about why the test case was failing was that it
had race conditions.  We now see that the problems were worse than that,
but before a new on-session-exit test goes in, you need to think about
it.  You have no guarantee that the callback will execute before the
client-side test script does its next action.  Thus, for example, I
wondered if the SELECTs were failing because the expected insertion
hadn't happened yet, or if the DROP ROLEs were failing because there
was still a live session logged in under the role.

            regards, tom lane



Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Wed, Oct 02, 2019 at 01:51:43PM -0400, Tom Lane wrote:
> Thus, we can't run any shutdown hooks before this one, if indeed we
> consider it to be a shutdown hook at all.
>
> A possible fix is to do it as the first action in proc_exit, but that will
> fall foul of Andres' points about not wanting to do it in non-session
> backends, nor in FATAL exits, nor in the case where a previous try failed.
> Perhaps a better idea is to put it into PostgresMain's handling of client
> EOF, despite the large comment saying not to add more code there.

Perhaps.

> (Or, maybe invent a new class of shutdown callbacks?
> before_before_shmem_exit seems like a pretty grotty concept, but it's
> more or less what we need here.  It would sure be nice if we had an
> actual specification for what is allowed to happen in these different
> classes of callbacks...)

It seems to me that we may actually be looking at a concept of
callback priority here.  Say you could assign a number to each
callback registered, and control their order at shutdown.  I am
wondering if we shouldn't do that anyway.  Likely there could be cases
where it makes sense to ensure the ordering of the actions where we
have dependencies between some subsystems?  For example
ShutdownPostgres is registered first, meaning that it runs last
anyway.  Now for other future things or even extensions there could be
an argument?

> In any case, the hook would have to be responsible for cancelling any open
> transaction for itself, in order to have a clean environment to run SQL
> code in.  It cannot piggyback on existing transaction-closing code to do
> that, because any user-written code could throw an error inside the hook's
> transaction and thereby break the invariant that we're not in a
> transaction after that point.

Still, it would mean that any module hooking in the area would need to
do the cleanup all the time...
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Wed, Oct 02, 2019 at 10:23:54AM -0700, Andres Freund wrote:
> - the startup hook isn't actually guaranteed to be able to write
>   anything, we might be in hot-standby

Right.

> - the startup hook doesn't just run in normal sessions, it also runs in
>   walsenders, including ones not connected to a database. But for some
>   implementation reasons it won't run for background workers.

The idea here is to have a hook which can be triggered at the start of
a process which can be externally triggered, which I guess is normal
even for WAL senders not connected to a database.

> - do we actually want to run code like this for e.g. FATAL errors?

That was the intention.

> - THERE IS NOT A SINGLE COMMENT EXPLAINING WHAT CAN BE SAFELY DONE IN THESE
>   HOOKS. In fact, outside of the tests, the only comments in this are:
>   /* Hook for plugins to get control at start and end of session */
>   /* Hook for plugins to get control at start of session */
>   /* Hook at session end */
>   /* Hook for plugins to get control at end of session */

We include in the backend code a dozen of hooks or so similar to this
one (excluding from the count GUC and reloption hooks and such).  For
most of them we do not provide much extensive documentation nor
explanations similar to that, so it is rather unclear what would be a
minimum and acceptable set of comments and/or documentation which
would be fine (the same comment could basically apply to the planner
or post-parse hooks).  Should we finally add a specific section in the
user-visible docs even if there has been reluctance to do so?  My
guess is that we go down to this kind of requirement if we want to be
able to never forget to add documentation for any kind of new hook.
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
Andres Freund
Date:
Hi,

On 2019-10-03 11:03:39 +0900, Michael Paquier wrote:
> On Wed, Oct 02, 2019 at 10:23:54AM -0700, Andres Freund wrote:
> > - the startup hook isn't actually guaranteed to be able to write
> >   anything, we might be in hot-standby
> 
> Right.
> 
> > - the startup hook doesn't just run in normal sessions, it also runs in
> >   walsenders, including ones not connected to a database. But for some
> >   implementation reasons it won't run for background workers.
> 
> The idea here is to have a hook which can be triggered at the start of
> a process which can be externally triggered, which I guess is normal
> even for WAL senders not connected to a database.

But what can you do in that situation? Without a database connection,
for example, you better not consider inserting anything into a table.


> > - do we actually want to run code like this for e.g. FATAL errors?
> 
> That was the intention.

I have *SERIOUS* problems with performing additional writing
transactions in case of a FATAL. Something might have thrown that for a
reason, and just carrying on executing an arbitrary amount of code, this
might even involve triggers etc, imo seriously is not OK.


> > - THERE IS NOT A SINGLE COMMENT EXPLAINING WHAT CAN BE SAFELY DONE IN THESE
> >   HOOKS. In fact, outside of the tests, the only comments in this are:
> >   /* Hook for plugins to get control at start and end of session */
> >   /* Hook for plugins to get control at start of session */
> >   /* Hook at session end */
> >   /* Hook for plugins to get control at end of session */
> 
> We include in the backend code a dozen of hooks or so similar to this
> one (excluding from the count GUC and reloption hooks and such).  For
> most of them we do not provide much extensive documentation nor
> explanations similar to that

I don't mean sgml docs, but rather code comments. Most hooks don't have
a caveat list as long as the hooks proposed here. And for most the
specific location they hook into is more obvious: Whereas e.g. planner
hook can just happen when planning happens, the proposed hooks here
could plausible be put into many different places.

If somebody had bothered to add an explanation as to why the hook
placements are a good idea, they might have noticed that the location
the shutdown hook was placed at was terrible.


> so it is rather unclear what would be a minimum and acceptable set of
> comments and/or documentation which would be fine (the same comment
> could basically apply to the planner or post-parse hooks).

Meh. As described above, I don't think most other hooks are as vague and
dangerous as these. Nor is the existance of underdocumented code an
argument for adding more.


> Should we finally add a specific section in the user-visible docs even
> if there has been reluctance to do so?  My guess is that we go down to
> this kind of requirement if we want to be able to never forget to add
> documentation for any kind of new hook.

This seems like an entirely separate discussion.


Greetings,

Andres Freund



Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Wed, Oct 02, 2019 at 09:09:53PM -0700, Andres Freund wrote:
> On 2019-10-03 11:03:39 +0900, Michael Paquier wrote:
>> The idea here is to have a hook which can be triggered at the start of
>> a process which can be externally triggered, which I guess is normal
>> even for WAL senders not connected to a database.
>
> But what can you do in that situation? Without a database connection,
> for example, you better not consider inserting anything into a table.

Adding extra custom logging information, or plug that information
elsewhere than Postgres.  I have use cases for that when it comes to
store external telemetry data or audit things for events happening
specifically in Postgres.

> I have *SERIOUS* problems with performing additional writing
> transactions in case of a FATAL. Something might have thrown that for a
> reason, and just carrying on executing an arbitrary amount of code, this
> might even involve triggers etc, imo seriously is not OK.

Well, hook authors can do a lot of stupid things..  Anyway it looks
that the end hook is out of scope as far as the discussion has gone
based on the lack of facility, and that there is still interest for
the start hook.

I am just attaching a separate patch which adds only the start hook.
The discussion could always respawn from that.  This takes care of the
HBA configuration when testing on Windows, and the module is still
named "hooks" intentionally to avoid a future rename.
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
Andres Freund
Date:
Hi,

On 2019-10-03 13:20:09 +0900, Michael Paquier wrote:
> On Wed, Oct 02, 2019 at 09:09:53PM -0700, Andres Freund wrote:
> > I have *SERIOUS* problems with performing additional writing
> > transactions in case of a FATAL. Something might have thrown that for a
> > reason, and just carrying on executing an arbitrary amount of code, this
> > might even involve triggers etc, imo seriously is not OK.
> 
> Well, hook authors can do a lot of stupid things..

I don't have much problems if they knowingly do stupid stuff. But if
what we provide basically only consists out of pitfalls, than we need to
reconsider the interface, or at least have guidance on how to avoid
them.


> Anyway it looks that the end hook is out of scope as far as the
> discussion has gone based on the lack of facility, and that there is
> still interest for the start hook.

I don't really think that's a fair summary of what's been discussed.


> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index e8d8e6f828..6d80cc2d64 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -171,6 +171,9 @@ static ProcSignalReason RecoveryConflictReason;
>  static MemoryContext row_description_context = NULL;
>  static StringInfoData row_description_buf;
>  
> +/* Hook for plugins to get control at start of session */
> +session_start_hook_type session_start_hook = NULL;
> +
>  /* ----------------------------------------------------------------
>   *        decls for routines only used in this file
>   * ----------------------------------------------------------------
> @@ -3968,6 +3971,9 @@ PostgresMain(int argc, char *argv[],
>      if (!IsUnderPostmaster)
>          PgStartTime = GetCurrentTimestamp();
>  
> +    if (session_start_hook)
> +        (*session_start_hook) ();
> +
>      /*
>       * POSTGRES main processing loop begins here
>       *
> diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
> index ec21f7e45c..f42935f019 100644
> --- a/src/include/tcop/tcopprot.h
> +++ b/src/include/tcop/tcopprot.h
> @@ -30,6 +30,10 @@ extern PGDLLIMPORT const char *debug_query_string;
>  extern int    max_stack_depth;
>  extern int    PostAuthDelay;
>  
> +/* Hook for plugins to get control at start session */
> +typedef void (*session_start_hook_type) (void);
> +extern PGDLLIMPORT session_start_hook_type session_start_hook;
> +
>  /* GUC-configurable parameters */
 
In this state I think this patch should be flat out rejected.

I'm seriously baffled at how this stuff is being pursued aggressively,
quite apparently without any serious design considerations. I mean this
stuff had to be backed out twice, yet it's being reproposed without much
consideration for concerns?

- Andres



Re: Hooks for session start and end, take two

From
Fabrízio de Royes Mello
Date:

On Thu, Oct 3, 2019 at 1:35 AM Andres Freund <andres@anarazel.de> wrote:
>
> [...]
>
> In this state I think this patch should be flat out rejected.
>

Ok.

> I'm seriously baffled at how this stuff is being pursued aggressively,
> quite apparently without any serious design considerations. I mean this
> stuff had to be backed out twice, yet it's being reproposed without much
> consideration for concerns?
>

And what if (again) for the first version of session start hook we do it just for:
- client backends
- background workers

For sure we'll need new design, but for now I'm can't imagine a use case for sessions different than listed above.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: Hooks for session start and end, take two

From
Andres Freund
Date:
Hi,

On 2019-10-03 11:19:58 -0300, Fabrízio de Royes Mello wrote:
> On Thu, Oct 3, 2019 at 1:35 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > [...]
> >
> > In this state I think this patch should be flat out rejected.
> >
> 
> Ok.
> 
> > I'm seriously baffled at how this stuff is being pursued aggressively,
> > quite apparently without any serious design considerations. I mean this
> > stuff had to be backed out twice, yet it's being reproposed without much
> > consideration for concerns?
> >
> 
> And what if (again) for the first version of session start hook we do it
> just for:
> - client backends
> - background workers
> 
> For sure we'll need new design, but for now I'm can't imagine a use case
> for sessions different than listed above.

I think what would need to be designed is something more like
RegisterXactCallback():

/*
 *    start- and end-of-transaction callbacks for dynamically loaded modules
 */
typedef enum
{
    XACT_EVENT_COMMIT,
    XACT_EVENT_PARALLEL_COMMIT,
    XACT_EVENT_ABORT,
    XACT_EVENT_PARALLEL_ABORT,
    XACT_EVENT_PREPARE,
    XACT_EVENT_PRE_COMMIT,
    XACT_EVENT_PARALLEL_PRE_COMMIT,
    XACT_EVENT_PRE_PREPARE
} XactEvent;

typedef void (*XactCallback) (XactEvent event, void *arg);

extern void RegisterXactCallback(XactCallback callback, void *arg);
extern void UnregisterXactCallback(XactCallback callback, void *arg);


which would be called at various parts of a processes lifetime. Maybe
something like:

1) shortly after process creation
2) shortly after shared memory attach
3) after database connection has been established
4) after database writes are possible (this excludes database-less walsenders)
5a) before closing database connection due to fatal error
    (with a note saying that no writes are ever allowed)
5b) before closing database connection due to normal exit
    (with a note saying that errors better be handled, and that no
    transaction may escape)
6) before shared memory detach
7) before final exit

Greetings,

Andres Freund



Re: Hooks for session start and end, take two

From
Mike Palmiotto
Date:
On Thu, Oct 3, 2019 at 11:09 AM Andres Freund <andres@anarazel.de> wrote:
>
> <snip>
> I think what would need to be designed is something more like
> RegisterXactCallback():
>
> /*
>  *      start- and end-of-transaction callbacks for dynamically loaded modules
>  */
> typedef enum
> {
>         XACT_EVENT_COMMIT,
>         XACT_EVENT_PARALLEL_COMMIT,
>         XACT_EVENT_ABORT,
>         XACT_EVENT_PARALLEL_ABORT,
>         XACT_EVENT_PREPARE,
>         XACT_EVENT_PRE_COMMIT,
>         XACT_EVENT_PARALLEL_PRE_COMMIT,
>         XACT_EVENT_PRE_PREPARE
> } XactEvent;
>
> typedef void (*XactCallback) (XactEvent event, void *arg);
>
> extern void RegisterXactCallback(XactCallback callback, void *arg);
> extern void UnregisterXactCallback(XactCallback callback, void *arg);
>
>
> which would be called at various parts of a processes lifetime. Maybe
> something like:
>
> 1) shortly after process creation
> 2) shortly after shared memory attach
> 3) after database connection has been established
> 4) after database writes are possible (this excludes database-less walsenders)
> 5a) before closing database connection due to fatal error
>     (with a note saying that no writes are ever allowed)
> 5b) before closing database connection due to normal exit
>     (with a note saying that errors better be handled, and that no
>     transaction may escape)
> 6) before shared memory detach
> 7) before final exit

This suggestion really resonates with me, as I had considered
something similar in the process centralization patchset (CF entry
linked upthread and discussion here:
https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3Q5RAoFQgW8EtA@mail.gmail.com).
It would be possible to build such a mechanism into that existing
architecture rather easily. Has anyone had a chance to review that
work?

Thanks,
-- 
Mike Palmiotto
https://crunchydata.com



RE: Hooks for session start and end, take two

From
"tsunakawa.takay@fujitsu.com"
Date:
From: Michael Paquier <michael@paquier.xyz>
> Adding extra custom logging information, or plug that information
> elsewhere than Postgres.  I have use cases for that when it comes to
> store external telemetry data or audit things for events happening
> specifically in Postgres.
> 
> Well, hook authors can do a lot of stupid things..  Anyway it looks
> that the end hook is out of scope as far as the discussion has gone
> based on the lack of facility, and that there is still interest for
> the start hook.

I've got interested in this.  What's the current status of this patch?  The CF entry shows it was committed.

https://commitfest.postgresql.org/25/2251/

But I understood not, because the relevant code doesn't appear in HEAD, and Git log shows that it was reverted.  Am I
correct?


I'm thinking of using this feature to address an issue raised by a user who is considering migrating from Oracle.
Oraclecaches SQL execution plans in shared memory and enables any session to use them.  He wants to hide the plan
creationtime from end users after database restart (perhaps the plan creation is slow.)  To do that in Oracle, he runs
possibleSQL statements in a session solely for warm-up before resuming service for end users.  Then, later new sessions
ofend users can use the cached execution plans.
 

Ideally, PostgreSQL should be able to share query plans among sessions without user intervention.  But I guess it would
bedifficult.  Instead, with this session start hook and the connection pooling, individual sessions can run possible
SQLstatements using PREPARE and EXECUTE once (forcing generic plans) at session start.
 



BTW, the user interface of the feature is unduly difficult for my use case and the use cases you mentioned.  I think
usersshould be able to use the feature only with SQL, without creating extensions.  That also allows the use of this
featurein DBaaS.  How about creating a new event trigger type, connect trigger, after Oracle's logon trigger?
 

CREATE EVENT TRIGGER mytrigger
    AFTER CONNECT [ON mydatabase]
    EXECUTE {PROCEDURE | FUNCTION} myproc();

If some hacker wants to do what SQL can't, he can specify a UDF in C.


Regards
Takayuki Tsunakawa



Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Fri, Dec 20, 2019 at 02:45:26AM +0000, tsunakawa.takay@fujitsu.com wrote:
> I've got interested in this.  What's the current status of this
> patch?  The CF entry shows it was committed.
>
> But I understood not, because the relevant code doesn't appear in
> HEAD, and Git log shows that it was reverted.  Am I correct?

The patch has been committed once as of e788bd9, then reverted as of
9555cc8 because it had a couple of fundamental issues and many people
were not happy with it.  The latest discussions point out to some more
advanced designs based on callbacks at certain points of a session
lifetime.  You may want to double-check on that first.
--
Michael

Attachment

Re: Hooks for session start and end, take two

From
Alvaro Herrera
Date:
On 2019-Dec-20, Michael Paquier wrote:

> On Fri, Dec 20, 2019 at 02:45:26AM +0000, tsunakawa.takay@fujitsu.com wrote:
> > I've got interested in this.  What's the current status of this
> > patch?  The CF entry shows it was committed.
> >
> > But I understood not, because the relevant code doesn't appear in
> > HEAD, and Git log shows that it was reverted.  Am I correct? 
> 
> The patch has been committed once as of e788bd9, then reverted as of
> 9555cc8 because it had a couple of fundamental issues and many people
> were not happy with it.

Hmm, should we mark the commitfest entry as rejected then?  Having it be
marked committed seems pretty confusing.  The next version of the patch
would have its own CF entry, I presume.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Hooks for session start and end, take two

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Dec-20, Michael Paquier wrote:
>> The patch has been committed once as of e788bd9, then reverted as of
>> 9555cc8 because it had a couple of fundamental issues and many people
>> were not happy with it.

> Hmm, should we mark the commitfest entry as rejected then?  Having it be
> marked committed seems pretty confusing.  The next version of the patch
> would have its own CF entry, I presume.

RWF seems appropriate.  We haven't rejected the concept altogether,
AFAICT.

            regards, tom lane



Re: Hooks for session start and end, take two

From
Alvaro Herrera
Date:
On 2019-Dec-20, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > On 2019-Dec-20, Michael Paquier wrote:
> >> The patch has been committed once as of e788bd9, then reverted as of
> >> 9555cc8 because it had a couple of fundamental issues and many people
> >> were not happy with it.
> 
> > Hmm, should we mark the commitfest entry as rejected then?  Having it be
> > marked committed seems pretty confusing.  The next version of the patch
> > would have its own CF entry, I presume.
> 
> RWF seems appropriate.  We haven't rejected the concept altogether,
> AFAICT.

Fair enough.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Hooks for session start and end, take two

From
Michael Paquier
Date:
On Fri, Dec 20, 2019 at 11:33:02AM -0300, Alvaro Herrera wrote:
> Fair enough.

And done:
https://commitfest.postgresql.org/25/2251/
Sorry for the late reply.
--
Michael

Attachment