Re: Schema variables - new implementation for Postgres 15 - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Schema variables - new implementation for Postgres 15
Date
Msg-id CAFj8pRAhP32ROovyWxP-QuR+ADDUh8imt_Qr+sRnxMY-Gj8Vsg@mail.gmail.com
Whole thread Raw
In response to Re: Schema variables - new implementation for Postgres 15  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Schema variables - new implementation for Postgres 15
List pgsql-hackers
Hi

po 29. 8. 2022 v 11:00 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Sat, Aug 27, 2022 at 01:17:45PM +0200, Pavel Stehule wrote:
>
> after some thinking I think that instead of sequence I can use LSN. The
> combination oid, LSN should be unique forever

Yeah I was about suggesting doing that instead of a sequence, so +1 for that
approach!

I've been spending a bit of time trying to improve the test coverage on the
protection for concurrently deleted and recreated variables, and thought that a
new isolation test should be enough.  I'm attaching a diff (in .txt extension)
that could be applied to 009-regress-tests-for-session-variables.patch, but
while working on that I discovered a few problems.

First, the pg_debug_show_used_session_variables() function reports what's
currently locally known, but there's no guarantee that
AcceptInvalidationMessages() will be called prior to its execution.  For
instance if you're in a transaction and already hold a lock on the function and
execute it again.

It therefore means that it can display that a locally cached variable isn't
dropped and still holds a value, while it's not the case.  While it may be
surprising, I think that's still the wanted behavior as you want to know what
is the cache state.  FTR this is tested in the last permutation in the attached
patch (although the expected output contains the up-to-date information, so you
can see the failure).

But if invalidation are processed when calling the function, the behavior seems
surprising as far as I can see the cleanup seems to be done in 2 steps: mark t
he hash entry as removed and then remove the hash entry.  For instance:

(conn 1) CREATE VARIABLE myvar AS text;
(conn 1) LET myvar = 'something';
(conn 2) DROP VARIABLE myvar;
(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name  | removed
--------+-------+---------
 public | myvar | t
(1 row)

(conn 1) SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema | name | removed
--------+------+---------
(0 rows)

Why are two steps necessary here, and is that really wanted?

The value is removed in the first command, but at the end of transaction. pg_debug_show_used_session_variables is called before, and at this moment the variable should be in memory.

I enhanced pg_debug_show_used_session_variables about debug output for start and end, and you can see it.

(2022-08-30 19:38:49) postgres=# set client_min_messages to debug1;
SET
(2022-08-30 19:38:55) postgres=# CREATE VARIABLE myvar AS text;
DEBUG:  record for session variable "myvar" (oid:16390) was created in pg_variable
CREATE VARIABLE
(2022-08-30 19:39:03) postgres=# LET myvar = 'something';
DEBUG:  session variable "public.myvar" (oid:16390) has new entry in memory (emitted by WRITE)
DEBUG:  session variable "public.myvar" (oid:16390) has new value
LET
(2022-08-30 19:39:11) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
DEBUG:  pg_variable_cache_callback 84 2941368844
DEBUG:  session variable "public.myvar" (oid:16390) should be rechecked (forced by sinval)
DEBUG:  pg_debug_show_used_session_variables start
DEBUG:  effective call of sync_sessionvars_all()
DEBUG:  pg_debug_show_used_session_variables end
DEBUG:  session variable "public.myvar" (oid:16390) is removing from memory
┌────────┬───────┬─────────┐
│ schema │ name  │ removed │
╞════════╪═══════╪═════════╡
│ public │ myvar │ t       │
└────────┴───────┴─────────┘
(1 row)

(2022-08-30 19:39:32) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
DEBUG:  pg_debug_show_used_session_variables start
DEBUG:  pg_debug_show_used_session_variables end
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
└────────┴──────┴─────────┘
(0 rows)


But I missed call sync_sessionvars_all in the drop variable. If I execute this routine there I can fix this behavior and the cleaning in sync_sessionvars_all can be more aggressive.

After change

(2022-08-31 06:25:54) postgres=# let x = 10;
LET
(2022-08-31 06:25:59) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
│ public │ x    │ f       │
└────────┴──────┴─────────┘
(1 row)

-- after drop in other session

(2022-08-31 06:26:00) postgres=# SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
└────────┴──────┴─────────┘
(0 rows)




 

Finally, I noticed that it's quite easy to get cache lookup failures when using
transactions.  AFAICS it's because the current code first checks in the local
cache (which often isn't immediately invalidated when in a transaction),
returns an oid (of an already dropped variable), then the code acquires a lock
on that non-existent variable, which internally accepts invalidation after the
lock is acquired.  The rest of the code can then fail with some "cache lookup
error" in the various functions as the invalidation has now been processed.
This is also tested in the attached isolation test.

I think that using a retry approach based on SharedInvalidMessageCounter change
detection, like RangeVarGetRelidExtended(), in IdentifyVariable() should be
enough to fix that class of problem, but maybe some other general functions
would need similar protection too.

I did it, and with this change it passed the isolation test. Thank you for your important help!

 

While looking at the testing, I also noticed that the main regression tests
comments are now outdated since the new (and more permissive) approach for
dropped variable detection.  For instance:

+ ALTER TYPE public.svar_test_type DROP ATTRIBUTE c;
+ -- should to fail
+ SELECT public.svar;
+   svar   
+ ---------
+  (10,20)
+ (1 row)
+
+ ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
+ -- should to fail too (different type, different generation number);
+ SELECT public.svar;
+    svar   
+ ----------
+  (10,20,)
+ (1 row)


the comments are obsolete, fixed
 

+ CREATE VARIABLE public.avar AS int;
+ -- should to fail
+ SELECT avar FROM xxtab;
+  avar
+ ------
+    10
+ (1 row)
+
+ -- should be ok
+ SELECT public.avar FROM xxtab;
+  avar
+ ------
+
+ (1 row)

fixed
 


For reference, with the code as-is I get the following diff when testing the
attached isolation test:

--- /Users/rjuju/git/postgresql/src/test/isolation/expected/session-variable.out        2022-08-29 15:41:11.000000000 +0800
+++ /Users/rjuju/git/pg/pgmaster_debug/src/test/isolation/output_iso/results/session-variable.out       2022-08-29 15:42:17.000000000 +0800
@@ -16,21 +16,21 @@
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

 step s1: BEGIN;
 step drop: DROP VARIABLE myvar;
 step val: SELECT myvar;
-ERROR:  column or variable "myvar" does not exist
+ERROR:  cache lookup failed for session variable 16386
 step sr1: ROLLBACK;

 starting permutation: let val dbg drop create dbg val
 step let: LET myvar = 'test';
 step val: SELECT myvar;
 myvar
 -----
 test
 (1 row)

@@ -68,20 +68,16 @@
 schema|name |removed
 ------+-----+-------
 public|myvar|f
 (1 row)

 step drop: DROP VARIABLE myvar;
 step create: CREATE VARIABLE myvar AS text
 step dbg: SELECT schema, name, removed FROM pg_debug_show_used_session_variables();
 schema|name |removed
 ------+-----+-------
-public|myvar|t
+public|myvar|f
 (1 row)

 step val: SELECT myvar;
-myvar
------
-
-(1 row)
-
+ERROR:  cache lookup failed for session variable 16389
 step sr1: ROLLBACK;


attached updated patches

Regards

Pavel
 
Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH v1] fix potential memory leak in untransformRelOptions
Next
From: Andres Freund
Date:
Subject: Re: Tracking last scan time