Thread: Re: proposal: schema variables
On Tue, 2024-10-29 at 08:16 +0100, Pavel Stehule wrote: > again, necessary rebase I have started looking at patch 5, and I have some questions and comments. - The commit message is headed "memory cleaning after DROP VARIABLE", but the rest of the commit message speaks of sinval messages. These two things are independent, aren't they? And both lead to the need to validate the variables, right? Then this code comment would for example be wrong: /* true after accepted sinval message */ static bool needs_validation = false; It also becomes "true" after DROP VARIABLE, right? I am happy to fix the comment, but I want to understand the patch first. - I see that the patch adds cleanup of invalid session variable to each COMMIT. Is that a good idea? I'd expect that it is good enough to clean up whenever session variables are accessed. Calling remove_invalid_session_variables() during each COMMIT will affect all transactions, and I don't see the benefit. Also, do we need to call it during pg_session_variables()? Yours, Laurenz Albe
so 2. 11. 2024 v 6:46 odesílatel Laurenz Albe <laurenz.albe@cybertec.at> napsal:
On Tue, 2024-10-29 at 08:16 +0100, Pavel Stehule wrote:
> again, necessary rebase
I have started looking at patch 5, and I have some questions and comments.
- The commit message is headed "memory cleaning after DROP VARIABLE", but
the rest of the commit message speaks of sinval messages. These two
things are independent, aren't they? And both lead to the need to validate
the variables, right?
Maybe it can be formulated differently, but it is true. There are a lot of sinval messages, but in this case
only sinval messages related to DROP VARIABLE are interesting.
Then this code comment would for example be wrong:
/* true after accepted sinval message */
static bool needs_validation = false;
It also becomes "true" after DROP VARIABLE, right?
I am happy to fix the comment, but I want to understand the patch first.
sinval message can be raised by any operation over the pg_variable table.
<-><-->if (hashvalue == 0 || svar->hashvalue == hashvalue)
<-><-->{
<-><--><-->svar->is_valid = false;
+<-><--><-->needs_validation = true;
+<-><-->}
+<->}
<-><-->{
<-><--><-->svar->is_valid = false;
+<-><--><-->needs_validation = true;
+<-><-->}
+<->}
When I execute DROP VARIABLE, then the hash value is specified, but the hash can be zero for some massive cleaning, and there are other events that can send sinval message. I think an ANALYZE does this. So the comment /* true after accepted sinval message */ is more accurate than /* true after DROP VARIABLE */.
- I see that the patch adds cleanup of invalid session variable to each
COMMIT. Is that a good idea? I'd expect that it is good enough to clean
up whenever session variables are accessed.
Calling remove_invalid_session_variables() during each COMMIT will affect
all transactions, and I don't see the benefit.
1. Minimize the unwanted surprises for users that will check memory usage - So if you drop the variables, then the allocated space is released in possibly near time. The rule - allocated space is released, when in the next transaction you use any session variable looks a little bit crazy (although I think so there will not be real significant difference in functionality). Correct me, if I am wrong, but I don't remember any memory (or resource) cleaning in Postgres, that is delayed to second transactions. I agree, there is overhead of cleaning, but this can be very fast when the user doesn't use session variables, because the hash table with session variables is not initialized. I can imagine some usage some hooks there as alternative
2. The main reason why it is implemented is implementation of temporal variables with RESET or DROP on transaction end. Related code should be triggered at commit time, it cannot be delayed.
Also, do we need to call it during pg_session_variables()?
I think it can be removed. Originally pg_session_variables showed only valid variables, but it is not true now.
Regards
Pavel
Yours,
Laurenz Albe
On Sat, 2024-11-02 at 08:36 +0100, Pavel Stehule wrote: > so 2. 11. 2024 v 6:46 odesílatel Laurenz Albe <laurenz.albe@cybertec.at> napsal: > > - The commit message is headed "memory cleaning after DROP VARIABLE", but > > the rest of the commit message speaks of sinval messages. These two > > things are independent, aren't they? And both lead to the need to validate > > the variables, right? > > Maybe it can be formulated differently, but it is true. There are a lot of sinval messages, but in this case > only sinval messages related to DROP VARIABLE are interesting. Okay... > > Then this code comment would for example be wrong: > > > > /* true after accepted sinval message */ > > static bool needs_validation = false; > > > > It also becomes "true" after DROP VARIABLE, right? > > I am happy to fix the comment, but I want to understand the patch first. > > > > sinval message can be raised by any operation over the pg_variable table. I set a watchpoint on "needs_validation", and when I run DROP VARIABLE, it is called directly from the command: Hardware watchpoint 2: needs_validation Old value = false New value = true 0x00000000006ad44c in SessionVariableDropPostprocess (varid=<optimized out>) at session_variable.c:169 169 svar->drop_lxid = MyProc->vxid.lxid; (gdb) where #0 0x00000000006ad44c in SessionVariableDropPostprocess (varid=<optimized out>) at session_variable.c:169 #1 0x0000000000625dec in DropVariableById (varid=<optimized out>) at pg_variable.c:259 #2 0x00000000005fec57 in doDeletion (object=0x4ac85e8, flags=0) at dependency.c:1450 ... #8 0x00000000008bb7e0 in standard_ProcessUtility (pstmt=0x49ac700, queryString=0x49abbb0 "DROP VARIABLE a;", readOnlyTree=<optimizedout>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x49acac0, qc=0x7ffe19d2db60) at utility.c:1076 ... #12 0x00000000008b6742 in exec_simple_query (query_string=0x49abbb0 "DROP VARIABLE a;") at postgres.c:1283 Later, there is a sinval message and "pg_variable_cache_callback()" is called, which sets "needs_validation" again. Perhaps my confustion is as follows: if DROP VARIABLE sends an invalidation message, and the handler sets "needs_validation", why is it necessary to set "needs_validation" in SessionVariableDropPostprocess() too? If we didn't set "needs_validation" in SessionVariableDropPostprocess(), the comment would be true. > > - I see that the patch adds cleanup of invalid session variable to each > > COMMIT. Is that a good idea? I'd expect that it is good enough to clean > > up whenever session variables are accessed. > > Calling remove_invalid_session_variables() during each COMMIT will affect > > all transactions, and I don't see the benefit. > > If I remember well, there were two reasons why I did it. > > 1. Minimize the unwanted surprises for users that will check memory usage - So if you > drop the variables, then the allocated space is released in possibly near time. > The rule - allocated space is released, when in the next transaction you use any > session variable looks a little bit crazy (although I think so there will not be real > significant difference in functionality). > Correct me, if I am wrong, but I don't remember any memory (or resource) cleaning > in Postgres, that is delayed to second transactions. I agree, there is overhead of cleaning, > but this can be very fast when the user doesn't use session variables, because the hash table > with session variables is not initialized. I can imagine some usage some hooks there as alternative I don't think that is a good enough reason. Memory used by an idle backend is not totally predictable for other reasons as well: - the catalog cache - memory that PostgreSQL freed, but that is kept in the malloc arena so that it can be reused for the next malloc() call I believe that the disadvantage of keeping some memory allocated across transaction is not as bad as the pain of going through all variables on each COMMIT. If you have set one or two session variables and run a lot of statements in autocommit mode, that is an unnecessary overhead. > 2. The main reason why it is implemented is implementation of temporal variables > with RESET or DROP on transaction end. Related code should be triggered at > commit time, it cannot be delayed. That makes sense. If I remember right, temporary variables are an optional feature. So I suggest that you move AtPreEOXact_SessionVariables() to the patch that adds temporary session variables. > > Also, do we need to call it during pg_session_variables()? > > I think it can be removed. Originally pg_session_variables showed only valid variables, but it is not true now. Right. I'll wait for your reaction to my above suggestions before I start working on the comments. Yours, Laurenz Albe