Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Schema variables - new implementation for Postgres 15
Date
Msg-id 20220829090012.c6viynmm5lf2peqj@jrouhaud
Whole thread Raw
In response to Re: Schema variables - new implementation for Postgres 15  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Schema variables - new implementation for Postgres 15
List pgsql-hackers
Hi,

On Sat, Aug 27, 2022 at 01:17:45PM +0200, Pavel Stehule wrote:
>
> after some thinking I think that instead of sequence I can use LSN. The
> combination oid, LSN should be unique forever

Yeah I was about suggesting doing that instead of a sequence, so +1 for that
approach!

I've been spending a bit of time trying to improve the test coverage on the
protection for concurrently deleted and recreated variables, and thought that a
new isolation test should be enough.  I'm attaching a diff (in .txt extension)
that could be applied to 009-regress-tests-for-session-variables.patch, but
while working on that I discovered a few problems.

First, the pg_debug_show_used_session_variables() function reports what's
currently locally known, but there's no guarantee that
AcceptInvalidationMessages() will be called prior to its execution.  For
instance if you're in a transaction and already hold a lock on the function and
execute it again.

It therefore means that it can display that a locally cached variable isn't
dropped and still holds a value, while it's not the case.  While it may be
surprising, I think that's still the wanted behavior as you want to know what
is the cache state.  FTR this is tested in the last permutation in the attached
patch (although the expected output contains the up-to-date information, so you
can see the failure).

But if invalidation are processed when calling the function, the behavior seems
surprising as far as I can see the cleanup seems to be done in 2 steps: mark t
he hash entry as removed and then remove the hash entry.  For instance:

(conn 1) CREATE VARIABLE myvar AS text;
(conn 1) LET myvar = 'something';
(conn 2) DROP VARIABLE myvar;
(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name  | removed
--------+-------+---------
 public | myvar | t
(1 row)

(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name | removed
--------+------+---------
(0 rows)

Why are two steps necessary here, and is that really wanted?

Finally, I noticed that it's quite easy to get cache lookup failures when using
transactions.  AFAICS it's because the current code first checks in the local
cache (which often isn't immediately invalidated when in a transaction),
returns an oid (of an already dropped variable), then the code acquires a lock
on that non-existent variable, which internally accepts invalidation after the
lock is acquired.  The rest of the code can then fail with some "cache lookup
error" in the various functions as the invalidation has now been processed.
This is also tested in the attached isolation test.

I think that using a retry approach based on SharedInvalidMessageCounter change
detection, like RangeVarGetRelidExtended(), in IdentifyVariable() should be
enough to fix that class of problem, but maybe some other general functions
would need similar protection too.

While looking at the testing, I also noticed that the main regression tests
comments are now outdated since the new (and more permissive) approach for
dropped variable detection.  For instance:

+ ALTER TYPE public.svar_test_type DROP ATTRIBUTE c;
+ -- should to fail
+ SELECT public.svar;
+   svar   
+ ---------
+  (10,20)
+ (1 row)
+ 
+ ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
+ -- should to fail too (different type, different generation number);
+ SELECT public.svar;
+    svar   
+ ----------
+  (10,20,)
+ (1 row)

I'm also unsure if this one is showing a broken behavior or not:

+ CREATE VARIABLE public.avar AS int;
+ -- should to fail
+ SELECT avar FROM xxtab;
+  avar
+ ------
+    10
+ (1 row)
+ 
+ -- should be ok
+ SELECT public.avar FROM xxtab;
+  avar
+ ------
+ 
+ (1 row)


For reference, with the code as-is I get the following diff when testing the
attached isolation test:

--- /Users/rjuju/git/postgresql/src/test/isolation/expected/session-variable.out    2022-08-29 15:41:11.000000000
+0800
+++ /Users/rjuju/git/pg/pgmaster_debug/src/test/isolation/output_iso/results/session-variable.out    2022-08-29
15:42:17.000000000+0800
 
@@ -16,21 +16,21 @@
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

 step s1: BEGIN;
 step drop: DROP VARIABLE myvar;
 step val: SELECT myvar;
-ERROR:  column or variable "myvar" does not exist
+ERROR:  cache lookup failed for session variable 16386
 step sr1: ROLLBACK;

 starting permutation: let val dbg drop create dbg val
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

@@ -68,20 +68,16 @@
 schema|name |removed
 ------+-----+-------
 public|myvar|f
 (1 row)

 step drop: DROP VARIABLE myvar;
 step create: CREATE VARIABLE myvar AS text
 step dbg: SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema|name |removed
 ------+-----+-------
-public|myvar|t
+public|myvar|f
 (1 row)

 step val: SELECT myvar;
-myvar
------
-
-(1 row)
-
+ERROR:  cache lookup failed for session variable 16389
 step sr1: ROLLBACK;


Attachment

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: use ARM intrinsics in pg_lfind32() where available
Next
From: Matthias van de Meent
Date:
Subject: Re: effective_multixact_freeze_max_age issue