Thread: proposal: session server side variables
Hi,
I have been working on this topic for a long time. Session server-side variables are one major missing feature in PL/pgSQL. Now I hope I can summarize the requirements for an implementation in Postgres:
2. It should not block an implementation of ANSI SQL modules - modules and PSM languages are a big chapter and should be implemented together, maybe from scratch - it isn't easy to inject them cleanly into our environment. Moreover, modules are partially redundant with schemas and with our extensions. This is the reason why I don't adopt the functionality described in the standard.
------------
This proposal does not cover Oracle's package variables and the related behavior. Since we do not have a full Ada environment, a partial implementation would be too complex and would introduce strange (foreign) behavior into our environment. However, Oracle's package variables could be emulated on top of the proposed layer, and that emulation could be genuinely secure - no security by obscurity.
Comments, notes?
Regards
Pavel
On 14 October 2016 at 13:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> Hi,
>
> long time I working on this topic. Session server side variables are one
> major missing feature in PLpgSQL. Now I hope, I can summarize requests for
> implementation in Postgres:

+1

> 2. accessed with respecting access rights:
>
> GRANT SELECT|UPDATE|ALL ON VARIABLE variable TO role
> REVOKE SELECT|UPDATE|ALL ON VARIABLE variable FROM role

This bit is important. For those wondering "why the hell would you want these, just (ab)use GUCs"... this is why.

Think RLS. Especially when we eventually have session start / at login triggers, but even before then, you can initialise some expensive state once at the start of the session, transfer it from the app, or whatever. You initialise it via a SECURITY DEFINER procedure so the session user does not have the rights to write to the variable, and it can only be set via arbitration from the database security logic. From then on your RLS policies, your triggers, etc, can all simply inspect the session variable.

People use package variables in another major database with a feature called virtual private database for something similar. So this will interest anyone who wants to make porting those users easier, too.

> 4. non transactional - the metadata are transactional, but the content is
> not.

but only within the session, right? You're not proposing some kind of inter-backend IPC where one backend sets a session var and another backend accesses it and sees the value set by the first session?

Speaking of which: parallel query. How do you envision this working in parallel query, where the workers are different backends? Especially since things like RLS are where it'd be quite desirable.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Pavel Stehule wrote:

> Session server side variables are one major missing feature in PLpgSQL.

I think this would also be useful outside of PL/pgSQL, to support query-level variables similar to what SQL Server does.
2016-10-14 9:56 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:
On 14 October 2016 at 13:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hi,
>
> long time I working on this topic. Session server side variables are one
> major missing feature in PLpgSQL. Now I hope, I can summarize requests for
> implementation in Postgres:
+1
> 2. accessed with respecting access rights:
>
> GRANT SELECT|UPDATE|ALL ON VARIABLE variable TO role
> REVOKE SELECT|UPDATE|ALL ON VARIABLE variable FROM role
This bit is important.
For those wondering "why the hell would you want these, just (ab)use
GUCs"... this is why.
Think RLS. Especially when we eventually have session start / at login
triggers, but even before then, you can initialise some expensive
state once at the start of the session, transfer it from the app, or
whatever. You initialise it via a SECURITY DEFINER procedure so the
session user does not have the rights to write to the variable, and it
can only be set via arbitration from the database security logic. From
then on your RLS policies, your triggers, etc, can all simply inspect
the session variable.
People use package variables in another major database with a feature
called virtual private database for something similar. So this will
interest anyone who wants to make porting those users easier, too.
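To make the access-control point concrete, here is a minimal sketch of the pattern described above, written in the proposal's syntax. CREATE VARIABLE, GRANT ... ON VARIABLE, and setvar/getvar are the proposed interface, not released PostgreSQL, and check_token and the object names are hypothetical:

```sql
-- Proposed syntax, for illustration only.
CREATE VARIABLE secure.app_user_id integer;
REVOKE ALL ON VARIABLE secure.app_user_id FROM PUBLIC;

-- The only write path is a SECURITY DEFINER function owned by a
-- privileged role, so the session user cannot forge the value.
CREATE FUNCTION app_login(uid integer, token text) RETURNS void
LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
  IF NOT check_token(uid, token) THEN  -- hypothetical validator
    RAISE EXCEPTION 'authentication failed';
  END IF;
  PERFORM setvar('secure.app_user_id', uid);
END;
$$;

-- RLS policies can then trust the variable's content.
CREATE POLICY only_own_rows ON accounts
  USING (owner_id = getvar('secure.app_user_id'));
```

The same shape is impossible with plain GUCs, because set_config/current_setting custom settings carry no ACL: any code running in the session can overwrite them.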
> 4. non transactional - the metadata are transactional, but the content is
> not.
but only within the session, right? You're not proposing some kind of
inter-backend IPC where one backend sets a session var and another
backend accesses it and sees the value set by the first session?
At this moment I propose only local (non-shared) variables. I hope access can be made safe with an IMMUTABLE access function.
In the first stage I would like to implement a basic set of features that can be enhanced in the future:
1. shared variables
2. variables initialized on session start, ...
Speaking of which: parallel query. How do you envision this working in
parallel query, where the workers are different backends? Especially
since things like RLS are where it'd be quite desirable.
It should be solved by IMMUTABLE PARALLEL SAFE access functions. The content of the variable would be processed at planning time, and then access from multiple processes is not necessary.
The default access function should be VOLATILE PARALLEL UNSAFE - but immutable variants can be defined and used (and I see a point in these functions, because with them the variables are accessed at query planning time).
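A sketch of what this seems to mean, assuming the getvar accessor from the proof-of-concept patch: a user-defined wrapper declared IMMUTABLE (and PARALLEL SAFE) lets the planner evaluate the variable once at planning time, so parallel workers never touch the variable's storage. Note this is only honest if the content really cannot change while plans that captured it are in use:

```sql
-- Hypothetical wrapper; getvar is the proposal's accessor, and
-- app.tenant_id is an illustrative variable name. IMMUTABLE is a
-- promise that the result never changes, which lets the planner
-- fold the value into the plan at planning time.
CREATE FUNCTION current_tenant() RETURNS integer
LANGUAGE sql IMMUTABLE PARALLEL SAFE
AS $$ SELECT getvar('app.tenant_id')::integer $$;

-- The comparison value is resolved before the query is parallelised.
SELECT * FROM orders WHERE tenant_id = current_tenant();
```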
Regards
Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi
2016-10-14 9:56 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:
On 14 October 2016 at 13:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hi,
>
> long time I working on this topic. Session server side variables are one
> major missing feature in PLpgSQL. Now I hope, I can summarize requests for
> implementation in Postgres:
+1
> 2. accessed with respecting access rights:
>
> GRANT SELECT|UPDATE|ALL ON VARIABLE variable TO role
> REVOKE SELECT|UPDATE|ALL ON VARIABLE variable FROM role
This bit is important.
For those wondering "why the hell would you want these, just (ab)use
GUCs"... this is why.
Think RLS. Especially when we eventually have session start / at login
triggers, but even before then, you can initialise some expensive
state once at the start of the session, transfer it from the app, or
whatever. You initialise it via a SECURITY DEFINER procedure so the
session user does not have the rights to write to the variable, and it
can only be set via arbitration from the database security logic. From
then on your RLS policies, your triggers, etc, can all simply inspect
the session variable.
People use package variables in another major database with a feature
called virtual private database for something similar. So this will
interest anyone who wants to make porting those users easier, too.
> 4. non transactional - the metadata are transactional, but the content is
> not.
but only within the session, right? You're not proposing some kind of
inter-backend IPC where one backend sets a session var and another
backend accesses it and sees the value set by the first session?
Speaking of which: parallel query. How do you envision this working in
parallel query, where the workers are different backends? Especially
since things like RLS are where it'd be quite desirable.
In the first stage the session variables should be marked parallel unsafe - but in the future a technique similar to the shared hash join could be used.
I am sending a proof of concept - it doesn't support access to fields of composite variables, but all other functionality is done.
Most important features:
1. the values are stored in native types
2. access to content is protected by ACL - like the content of tables
3. the content is not MVCC based - no UPDATE cost
4. a simple API allows access to the content of variables from any supported environment.
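Putting the four points together, usage of the proof of concept might look like this (illustrative only; the variable and role names are made up, and the exact setvar/getvar signatures are whatever the attached patch defines):

```sql
-- 1. Native typing: the variable has a declared composite type.
CREATE TYPE app_ctx AS (tenant_id int, login text);
CREATE VARIABLE app.ctx app_ctx;

-- 2. ACL protection, like a table's content.
REVOKE ALL ON VARIABLE app.ctx FROM PUBLIC;
GRANT SELECT ON VARIABLE app.ctx TO reporting_role;

-- 3/4. Simple function API; writes carry no MVCC cost.
SELECT setvar('app.ctx', ROW(7, 'alice')::app_ctx);
SELECT getvar('app.ctx');
```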
Regards
Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 14 October 2016 at 23:09, Pavel Stehule <pavel.stehule@gmail.com> wrote:

>> but only within the session, right? You're not proposing some kind of
>> inter-backend IPC where one backend sets a session var and another
>> backend accesses it and sees the value set by the first session?
>
> In this moment I propose only local (not shared variables). I hope so access
> can be safe with IMMUTABLE access function.

OK, good. Though I suspect you'll have a hard time with IMMUTABLE functions and need STABLE. I don't think it's correct to claim that these vars are immutable, since that'd allow users to do silly things like build them into index expressions. Splat.

> Default access function should VOLATILE PARALLEL UNSAFE - but immutable sets
> can be defined and used (and I see a sense of these function, because with
> these function the variables are accessed in query planning time).

I don't really understand the purpose of an immutable variable. It seems inherently contradictory.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
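The index-expression worry can be shown with plain, current PostgreSQL (no session variables needed): the planner accepts any IMMUTABLE function in an index expression, so a function labeled IMMUTABLE that secretly read mutable session state would silently build an index that disagrees with later reads. A sketch, with the hypothetical fake_const() standing in for a variable accessor:

```sql
-- fake_const() pretends to be constant; imagine it reading a
-- session variable instead of returning a literal.
CREATE FUNCTION fake_const() RETURNS int
LANGUAGE sql IMMUTABLE AS $$ SELECT 42 $$;

CREATE TABLE t (x int);

-- Accepted only because the function is declared IMMUTABLE. If the
-- "constant" later changed, the index contents and fresh evaluations
-- of the expression would diverge. Splat.
CREATE INDEX t_expr_idx ON t ((x + fake_const()));
```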
Hi
2016-11-25 15:32 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi

2016-10-14 9:56 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:

On 14 October 2016 at 13:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hi,
>
> long time I working on this topic. Session server side variables are one
> major missing feature in PLpgSQL. Now I hope, I can summarize requests for
> implementation in Postgres:
+1
> 2. accessed with respecting access rights:
>
> GRANT SELECT|UPDATE|ALL ON VARIABLE variable TO role
> REVOKE SELECT|UPDATE|ALL ON VARIABLE variable FROM role
This bit is important.
For those wondering "why the hell would you want these, just (ab)use
GUCs"... this is why.
Think RLS. Especially when we eventually have session start / at login
triggers, but even before then, you can initialise some expensive
state once at the start of the session, transfer it from the app, or
whatever. You initialise it via a SECURITY DEFINER procedure so the
session user does not have the rights to write to the variable, and it
can only be set via arbitration from the database security logic. From
then on your RLS policies, your triggers, etc, can all simply inspect
the session variable.
People use package variables in another major database with a feature
called virtual private database for something similar. So this will
interest anyone who wants to make porting those users easier, too.
> 4. non transactional - the metadata are transactional, but the content is
> not.
but only within the session, right? You're not proposing some kind of
inter-backend IPC where one backend sets a session var and another
backend accesses it and sees the value set by the first session?
Speaking of which: parallel query. How do you envision this working in
parallel query, where the workers are different backends? Especially
since things like RLS are where it'd be quite desirable.

In the first stage the session variables should be marked parallel unsafe - but in the future a technique similar to the shared hash join could be used.

I am sending a proof of concept - it doesn't support access to fields of composite variables, but all other functionality is done.

Most important features:

1. the values are stored in native types
2. access to content is protected by ACL - like the content of tables
3. the content is not MVCC based - no UPDATE cost
4. a simple API allows access to the content of variables from any supported environment.
next update - setattr, getattr functions are working now
notes, comments?
Regards
Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 28.11.2016 10:42, Pavel Stehule wrote:
>
> next update - setattr, getattr functions are working now
>
> notes, comments?
>
> Regards
>
> Pavel

It is interesting! Do you have plans to also support table variables? For example, like this:

create type composite_type_2 as (a int, b text);
create variable var7 composite_type_2;

select insertvar('var7','(10,Hello world\, Hello world\, Hello world)');
select insertvar('var7','(1000,"Hola, hola!")');

select * from getvar('var7');
  a   |                   b
------+---------------------------------------
   10 | Hello world, Hello world, Hello world
 1000 | Hola, hola!

Or is it a bad idea? Or is it not related to this patch?

We have an extension (https://github.com/postgrespro/pg_variables) that supports table-like variables, and it shows better performance than temporary tables.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Hi
2016-11-28 10:39 GMT+01:00 Artur Zakirov <a.zakirov@postgrespro.ru>:
On 28.11.2016 10:42, Pavel Stehule wrote:
next update - setattr, getattr functions are working now
notes, comments?
Regards
Pavel
It is interesting!
Do you have plans to support also table variables? For example, like this:
create type composite_type_2 as (a int, b text);
create variable var7 composite_type_2;
select insertvar('var7','(10,Hello world\, Hello world\, Hello world)');
select insertvar('var7','(1000,Hola, hola!)');
select * from getvar('var7');
  a   |                   b
------+---------------------------------------
   10 | Hello world, Hello world, Hello world
 1000 | Hola, hola!
Or it is a bad idea? Or it is not related to this patch?
At least in the first stage I have no plan to support tables. It opens lots of questions and requires lots of code - how to implement indexes, statistics, MVCC?
But a workaround is not hard - you can store an array of composite types.
postgres=# select setvar('a', array(select row(10,'ahoj')::test from generate_series(1,10)));
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│ a
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
│ {"(10,ahoj)","(10,ahoj)","(10,ahoj)","(10,ahoj)","(10,ahoj)","(10,ahoj)","(10,ahoj)","(10,ahoj)","(10,ahoj)","(10,aho
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
(1 row)
Time: 0,992 ms
postgres=# select * from unnest(getvar('a'));
┌────┬──────┐
│ a │ b │
╞════╪══════╡
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
│ 10 │ ahoj │
└────┴──────┘
(10 rows)
Fast append needs more significant work (and can be done in a next step), but Tom has already done almost all of it.
We have the extension (https://github.com/postgrespro/pg_variables). And it supports table like variables. It shows better performance against temporary tables.
--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Hi
Most important features:

1. the values are stored in native types
2. access to content is protected by ACL - like the content of tables
3. the content is not MVCC based - no any cost of UPDATE
4. simple API allows access to content of variables from any supported environment.

next update - setattr, getattr functions are working now
new update - rebased after partitioning patch
Regards
Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
2016-12-15 15:36 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi

Most important features:

1. the values are stored in native types
2. access to content is protected by ACL - like the content of tables
3. the content is not MVCC based - no any cost of UPDATE
4. simple API allows access to content of variables from any supported environment.

next update - setattr, getattr functions are working now

new update - rebased after partitioning patch
next update - with some initial doc
Regards
Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Hello Pavel,

> The session variables should be:

I have often wished I had such a feature; psql client-side :variables are just awful raw text things.

A few comments, mostly about the design:

> 1. persistent objects with temporal unshared typed content. The life of
> content should be limited by session or by transaction. The content is
> initialized to default (when it is defined) or to NULL when variable is
> first accessed in variable' time scope (session, transaction).
>
> CREATE VARIABLE [schema.]variable type [DEFAULT default_value]
> [TRANSACTION|SESION SCOPE]

I'm not sure of the order, and from a parser perspective it is nice to announce the type before the value. Maybe a SQL Server-like @-prefix would be nice, something like:

CREATE VARIABLE @foo TEXT DEFAULT 'hello' SCOPE SESSION;

> DROP VARIABLE [schema.]variable

In the long term, what would be the possible scopes? TRANSACTION, SESSION, PERSISTENT? Would some scopes be orthogonal (eg SHARED between sessions for a USER in a DATABASE, SHARED at the cluster level)? How to deal with namespace issues?

> 2. accessed with respecting access rights:
>
> GRANT SELECT|UPDATE|ALL ON VARIABLE variable TO role
> REVOKE SELECT|UPDATE|ALL ON VARIABLE variable FROM role

At least for transaction and session scopes it does not make sense that they would be accessible outside the session/transaction, so grant/revoke do not seem necessary?

> 3. accessed/updated with special function "getvar", "setvar":
>
> FUNCTION getvar(regclass) RETURNS type
> FUNCTION setvar(regclass, type) RETURNS void

From an aesthetical point of view, I do not like that much. If you use CREATE & DROP, then logically you should use ALTER:

CREATE VARIABLE @name TEXT DEFAULT 'calvin';
CREATE VARIABLE @name TEXT = 'calvin';
ALTER VARIABLE @name SET VALUE TO 'hobbes';
ALTER VARIABLE @name = 'hobbes';
DROP VARIABLE @name;

Maybe "SET" could be an option as well, but it is less logical:

SET @name = 'susie';

But then "SET @..." would just be a shortcut for ALTER VARIABLE.

Also a nicer way to reference them would be great, like SQL Server:

SELECT * FROM SomeTable WHERE name = @name;

A function may be called behind the scenes; I'm just arguing about the syntax here...

Important question: what nice syntax could assign the result of a query to a variable? Maybe it could be:

SET @name = query-returning-one-row; -- hmmm
SET @name FROM query-returning-one-row; -- maybe better

Or:

ALTER VARIABLE @name WITH one-row-query;

Special variables could allow getting the number of rows modified by the last operation, like in PL/pgSQL but at the SQL level?

> 4. non transactional - the metadata are transactional, but the content is
> not.

Hmmm... Do you mean:

CREATE VARIABLE foo INT DEFAULT 1 SCOPE SESSION;
BEGIN;
SET @foo = 2;
ROLLBACK;

Then @foo is 2 despite the rollback? Yuk!

I think that if the implementation is based on some system table for storage, then you could get the transaction properties for free, and it seems more logical to do so:

CREATE TEMPORARY TABLE pg_session_variables(name TEXT PRIMARY KEY, value TEXT, oidtype, ...);

CREATE VARIABLE @foo INTEGER; -- INSERT INTO TABLE ...

SELECT * FROM x WHERE name = @foo;
-- SELECT * FROM x WHERE name = (SELECT value::INT FROM pg_session_variables WHERE name='foo')

So maybe some simple syntactic rewriting would be enough? Or some SPI function?

--
Fabien.
2016-12-23 16:27 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,

The session variables should be:
I have often wished I had such a feature, psql client side :-variables are just awful raw text things.
A few comments, mostly about the design:

1. persistent objects with temporal unshared typed content. The life of
content should be limited by session or by transaction. The content is
initialized to default (when it is defined) or to NULL when variable is
first accessed in variable' time scope (session, transaction).
CREATE VARIABLE [schema.]variable type [DEFAULT default_value]
[TRANSACTION|SESION SCOPE]
I'm not sure of the order, and from a parser perspective it is nice to announce the type before the value.
I dislike this style a little bit - in my proposal the session variables are very close to sequences - and we don't have any special symbols for sequences.
Session secure variables are somewhat different from those in MSSQL or MySQL - so I would not use the same syntax.
I really would like to use pg_class as the base for variable metadata - conflicts are not possible, and I can reuse the safe GRANT/REVOKE mechanism.
With different syntax it would all lose sense - and I'd have to implement it again.
Maybe a SQL-server like @-prefix would be nice, something like:
CREATE VARIABLE @foo TEXT DEFAULT 'hello' SCOPE SESSION;

DROP VARIABLE [schema.]variable
In the long term, What would be the possible scopes?
TRANSACTION, SESSION, PERSISTANT ?
Would some scopes orthogonal (eg SHARED between sessions for a USER in a DATABASE, SHARED at the cluster level?).
I plan to support TRANSACTION and SESSION scope. Persistent or shared scope needs much more complex rules, and specialized extensions will handle it better.
How to deal with namespace issues?

2. accessed with respecting access rights:
GRANT SELECT|UPDATE|ALL ON VARIABLE variable TO role
REVOKE SELECT|UPDATE|ALL ON VARIABLE variable FROM role
At least for transaction and session scopes it does not make sense that they would be accessible outside the session/transaction, so grant/revoke do not seem necessary?
It is necessary - and I think it is a fundamental feature. Any other feature can be more or less replaced by extensions, but this one cannot, or not easily - you have to protect content against some users; some cookies and ids have to be protected. It can be used well with RLS.
The Ada language has packages and package variables. I would not introduce packages because they are redundant with schemas, but I need some mechanism for protecting content. I also would not introduce packages because then I would have a problem joining Ada packages with Perl and Python. Instead I introduce secure granted access. Moreover, I don't need to solve lexical scope - and I can use a widely used mechanism.
3. accessed/updated with special function "getvar", "setvar":
FUNCTION getvar(regclass) RETURNS type
FUNCTION setvar(regclass, type) RETURNS void
From an aesthetical point of view, I do not like that much.
If you use CREATE & DROP, then logically you should use ALTER:
CREATE VARIABLE @name TEXT DEFAULT 'calvin';
CREATE VARIABLE @name TEXT = 'calvin';
ALTER VARIABLE @name SET VALUE TO 'hobbes';
ALTER VARIABLE @name = 'hoobes';
DROP VARIABLE @name;
Maybe "SET" could be an option as well, but it is less logical:
SET @name = 'susie';
But then "SET @..." would just be a shortcut for ALTER VARIABLE.
I would like to use a SET statement too, but it is used for another purpose now. Using ALTER in this context looks strange to me - it is used for changing metadata, not a value.
The next step can be support for SQL statements.
With SQL support you could do:
SELECT varname;
SELECT * FROM compositevarname;
UPDATE varname SET value TO xxx;
UPDATE compositevarname SET field TO xxx;
Also a nicer way to reference them would be great, like SQL server.
SELECT * FROM SomeTable WHERE name = @name;
A function may be called behind the scene, I'm just arguing about the syntax here...
Important question, what nice syntax to assign the result of a query to a variable? Maybe it could be:
SET @name = query-returning-one-row; -- hmmm
SET @name FROM query-returning-one-row; -- maybe better
Or:
ALTER VARIABLE @name WITH one-row-query;
Special variables could allow to get the number of rows modified by the last option, like in PL/pgSQL but at the SQL level?

4. non transactional - the metadata are transactional, but the content is
not.
Hmmm... Do you mean:
CREATE VARIABLE foo INT DEFAULT 1 SCOPE SESSION;
BEGIN;
SET @foo = 2;
ROLLBACK;
Then @foo is 2 despite the roolback? Yuk!
This is similar to sequences.
If you need transactional content, then you should use tables.
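The sequence analogy is easy to check in stock PostgreSQL: sequence state, like the proposed variable content, survives a rollback:

```sql
CREATE SEQUENCE s;

BEGIN;
SELECT nextval('s');  -- returns 1
ROLLBACK;

-- The increment was not undone by the rollback.
SELECT nextval('s');  -- returns 2
```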
I think that if the implementation is based on some system table for storage, then you could get the transaction properties for free, and it seems more logical to do so:
CREATE TEMPORARY TABLE pg_session_variables(name TEXT PRIMARY KEY, value TEXT, oidtype, ...);
CREATE VARIABLE @foo INTEGER; -- INSERT INTO TABLE ...
SELECT * FROM x WHERE name = @foo;
-- SELECT * FROM x WHERE name = (SELECT value::INT FROM pg_session_variables WHERE name='foo')
So maybe some simple syntactic rewriting would be enough? Or some SPI function?
--
Fabien.
Hello,

> I little bit dislike this style - in my proposal the session variables are
> very near to a sequences - and we have not any special symbols for
> sequences.

Yep, but we do not need a syntax to reference a sequence either... it is automatic and usually hidden behind SERIAL. I know there is a NEXTVAL function, I just never call it, so it is fine... If I define a variable I expect to have to use it.

> Session secure variables are some different than in MSSQL or MySQL - so I
> would not to use same syntax.

I'm not sure why pg variables should be different from these other tools. What is the use case to cover? The few times I wished I had variables would have been covered by session-limited variables, for which grant/revoke do not make sense.

> I really would to use pg_class as base for metadata of variables -
> conflicts are not possible. I can reuse safe GRANT/REVOKE mechanism ..
> With different syntax it all lost sense - and I'll to implement it again.

I also hate having my time go down the drain, but this cannot be the justification for a feature.

> I have a plan to support TRANSACTION and SESSION scope.

That looks ok to me.

> Persistent or shared scope needs much more complex rules, and some
> specialized extensions will be better.

Or maybe they should be avoided altogether?

> [GRANT]
> It is necessary - and I think so it is fundamental feature - any other
> features can be more or less replaced by extensions, but this one cannot or
> not simply - you have to protect content against some users - some
> cookies, ids have to be protected. It can be used well with RLS.
>
> Ada language has packages, package variables. I would not to introduce
> packages because are redundant to schemas, but I need some mechanism for
> content protecting.

I do not understand why GRANT makes sense. If a variable is set by a session/tx and only accessible to this session/tx, then only the client who set it can get it back, so it is more of a syntactic commodity? What appropriate use case would need more?

> I would not to introduce packages, because than I will have problem with
> joining ADA packages with Perl, Python. Instead I introduce secure
> granted access. More - I don't need to solve lexical scope - and I can
> use a wide used mechanism.

>>> 3. accessed/updated with special function "getvar", "setvar":
>>>
>>> FUNCTION getvar(regclass) RETURNS type
>>> FUNCTION setvar(regclass, type) RETURNS void
>>
>> From an aesthetical point of view, I do not like that much.
>>
>> If you use CREATE & DROP, then logically you should use ALTER:
>>
>> CREATE VARIABLE @name TEXT DEFAULT 'calvin';
>> CREATE VARIABLE @name TEXT = 'calvin';
>> ALTER VARIABLE @name SET VALUE TO 'hobbes';
>> ALTER VARIABLE @name = 'hobbes';
>> DROP VARIABLE @name;
>>
>> Maybe "SET" could be an option as well, but it is less logical:
>>
>> SET @name = 'susie';
>>
>> But then "SET @..." would just be a shortcut for ALTER VARIABLE.
>
> I would to use a SET statement too. But it is used for another target now.
> Using ALTER in this content looks strange to me. It is used for changing
> metadata not a value.

ALTER SEQUENCE does allow changing its value? Or maybe use UPDATE, as you suggest below...

> Next step can be support of SQL statements
> With SQL support you can do
>
> SELECT varname;
> UPDATE varname SET value TO xxx;
> SELECT * FROM compositevarname;
> UPDATE compositevarname SET field TO xxx;

I'm not at ease with the syntax, because varname is both a value and a relation somehow... But maybe that makes sense? Not sure, I'll think about it.

>> Hmmm... Do you mean:
>>
>> CREATE VARIABLE foo INT DEFAULT 1 SCOPE SESSION;
>> BEGIN;
>> SET @foo = 2;
>> ROLLBACK;
>>
>> Then @foo is 2 despite the rollback? Yuk!
>
> This is similar to sequences.

That is not a good reason to do the same. Sequences are special objects for which the actual value is expected to be of no importance, only that it is different from the previous and the next. I do not think that "variables" should behave like that, because their value is important.

> If you need transactional content - then you should to use tables.

Why not. Maybe variables just need to be a syntactic convenience around that? A variable is a table with one row holding one value... In which case GRANT/REVOKE makes sense, because a table may be shared and persistent, thus not limited to a session or a transaction. That also allows setting constraints.

CREATE VARIABLE foo INT NOT NULL DEFAULT 1 SCOPE SESSION;
-> CREATE TEMPORARY TABLE foo(val INT NOT NULL DEFAULT 1) ONE ROW;
-> INSERT INTO foo VALUES();

@foo -> (SELECT val FROM foo LIMIT 1)
@foo.field -> (SELECT field FROM foo LIMIT 1)

SET @foo = 2; -> UPDATE foo SET val = 2;
SET @foo.field = 3; -> UPDATE foo SET field = 3;

DROP VARIABLE foo; -> DROP TABLE foo;

--
Fabien.
2016-12-23 18:46 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello,

I little bit dislike this style - in my proposal the session variables are
very near to a sequences - and we have not any special symbols for
sequences.
Yep, but we do not need a syntax to reference a sequence either... it is automatic and usually hidden behind SERIAL. I know there is a NEXTVAL function, I just never call it, so it is fine... If I define a variable I expect to have to use it.

Session secure variables are some different than in MSSQL or MySQL - so I
would not to use same syntax.
I'm not sure why pg variables should be different from these other tools.
Because MySQL variables are not declared - assignment is allowed anywhere - and MSSQL variables are not persistent. They are totally different creatures.
What is the use case to cover? The few times I wished I had variables would have been covered by session-limited variables, for which grant/revoke do not make sense.

I really would to use pg_class as base for metadata of variables -
conflicts are not possible. I can reuse safe GRANT/REVOKE mechanism ..
With different syntax it all lost sense - and I'll to implement it again.
I also hate having my time going down the drain, but this cannot be the justification for a feature.

I have a plan to support TRANSACTION and SESSION scope.

That looks ok to me.

Persistent or shared scope needs much more complex rules, and some specialized extensions will be better.

Or maybe they should be avoided altogether?

[GRANT].
It is necessary - and I think it is a fundamental feature - any other feature can be more or less replaced by extensions, but this one cannot, or not simply - you have to protect the content against some users - some cookies, ids have to be protected. It can be used well with RLS.

The Ada language has packages and package variables. I would not introduce packages because they are redundant with schemas, but I need some mechanism for protecting content.
I do not understand why GRANT makes sense. If a variable is set by a session/tx and only accessible to this session/tx, then only the client who put it there can get it back, so it is more of a syntactic convenience?
In one session you can use a lot of roles - some code can be used for securing interactive work, some for securing some API, sometimes you can secure access to some resources. You can switch between many roles by using security definer functions.
What appropriate use case would need more?

I would not introduce packages, because then I would have a problem joining Ada packages with Perl, Python. Instead I introduce secure granted access. More - I don't need to solve lexical scope - and I can use a widely used mechanism.

3. accessed/updated with special functions "getvar", "setvar":
FUNCTION getvar(regclass) RETURNS type
FUNCTION setvar(regclass, type) RETURNS void
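A sketch of how the pieces of this proposal could fit together for the RLS use case discussed earlier in the thread (CREATE VARIABLE, GRANT ... ON VARIABLE, setvar and getvar are all proposed syntax, not existing PostgreSQL; the schema and object names are illustrative):

```sql
-- proposed syntax, not implemented
CREATE VARIABLE myapp.current_tenant integer SESSION SCOPE;
REVOKE ALL ON VARIABLE myapp.current_tenant FROM PUBLIC;
GRANT SELECT ON VARIABLE myapp.current_tenant TO app_user;

-- only this SECURITY DEFINER function, owned by a privileged role,
-- would be able to set the variable
CREATE FUNCTION myapp.login(tenant integer) RETURNS void
LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
  -- authentication / arbitration logic would go here
  PERFORM setvar('myapp.current_tenant'::regclass, tenant);
END $$;

-- RLS policies could then simply read it
CREATE POLICY tenant_isolation ON myapp.documents
  USING (tenant_id = getvar('myapp.current_tenant'::regclass));
```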
From an aesthetical point of view, I do not like that much.
If you use CREATE & DROP, then logically you should use ALTER:
CREATE VARIABLE @name TEXT DEFAULT 'calvin';
CREATE VARIABLE @name TEXT = 'calvin';
ALTER VARIABLE @name SET VALUE TO 'hobbes';
ALTER VARIABLE @name = 'hobbes';
DROP VARIABLE @name;
Maybe "SET" could be an option as well, but it is less logical:
SET @name = 'susie';
But then "SET @..." would just be a shortcut for ALTER VARIABLE.
I would like to use a SET statement too. But it is used for another purpose now. Using ALTER in this context looks strange to me. It is used for changing metadata, not a value.
ALTER SEQUENCE does allow changing its value? Or maybe use UPDATE, as you suggest below...

Next step can be support of SQL statements
With SQL support you can do
SELECT varname;
UPDATE varname SET value TO xxx;
SELECT * FROM compositevarname;
UPDATE compositevarname SET field TO xxx;
I'm not at ease with the syntax because varname is both a value and a relation somehow... But maybe that makes sense? Not sure, I'll think about it.

Hmmm... Do you mean:
CREATE VARIABLE foo INT DEFAULT 1 SCOPE SESSION;
BEGIN;
SET @foo = 2;
ROLLBACK;
Then @foo is 2 despite the rollback? Yuk!
This is similar to sequences.
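For reference, this is the existing behaviour of sequences that Pavel is alluding to: the value advance is never rolled back with the transaction (runnable on any current PostgreSQL):

```sql
CREATE SEQUENCE s;

BEGIN;
SELECT nextval('s');   -- 1
ROLLBACK;

SELECT nextval('s');   -- 2: the increment survived the ROLLBACK
```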
That is not a good reason to do the same. Sequences are special objects for which the actual value is expected to be of no importance, only that it is different from the previous and the next. I do not think that "variables" should behave like that, because their value is important.

If you need transactional content - then you should use tables.
Why not.
Maybe variables just need be a syntactic convenience around that?
There is a pretty similar relation between sequences and tables, and between variables and tables.
A variable is a table with one row holding one value... In which case GRANT/REVOKE makes sense, because a table may be shared and persistent, thus is not limited to a session or a transaction.
That allows to set constraints.
In a first iteration the constraints can be implemented with domains - but there is no barrier to implementing constraints directly on variables.
CREATE VARIABLE foo INT NOT NULL DEFAULT 1 SCOPE SESSION/SESSION SCOPE;
-> CREATE TEMPORARY TABLE foo(val INT NOT NULL DEFAULT 1) ONE ROW;
-> INSERT INTO foo DEFAULT VALUES;
@foo
-> (SELECT val FROM foo LIMIT 1)
@foo.field
-> (SELECT field FROM foo LIMIT 1)
SET @foo = 2;
-> UPDATE foo SET val = 2;
SET @foo.field = 3;
-> UPDATE foo SET field = 3;
DROP VARIABLE foo;
-> DROP TABLE foo;
--
Fabien.
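Fabien's mapping above can be approximated today with a plain temporary table (a sketch only: nothing here enforces the "one row" property, and the proposed ONE ROW clause does not exist):

```sql
-- emulate: CREATE VARIABLE foo INT NOT NULL DEFAULT 1 SESSION SCOPE
CREATE TEMPORARY TABLE foo (val int NOT NULL DEFAULT 1);
INSERT INTO foo DEFAULT VALUES;

-- emulate: @foo
SELECT val FROM foo;

-- emulate: SET @foo = 2
UPDATE foo SET val = 2;

-- emulate: DROP VARIABLE foo
DROP TABLE foo;
```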
On 12/23/2016 08:20 AM, Pavel Stehule wrote:
> 2016-12-23 16:27 GMT+01:00 Fabien COELHO:
>> I have often wished I had such a feature, psql client side :-variables
>> are just awful raw text things.

Agreed.

>> A few comments, mostly about the design:
>>
>>> 1. persistent objects with temporal unshared typed content. The life of
>>> content should be limited by session or by transaction. The content is
>>> initialized to default (when it is defined) or to NULL when variable is
>>> first accessed in variable' time scope (session, transaction).
>>>
>>> CREATE VARIABLE [schema.]variable type [DEFAULT default_value]
>>> [TRANSACTION|SESSION SCOPE]

I haven't looked, but I take it the SQL standard is silent on the issue of variables?

> I really would to use pg_class as base for metadata of variables -
> conflicts are not possible. I can reuse safe GRANT/REVOKE mechanism ..

That would be very useful.

>> In the long term, What would be the possible scopes?
>>
>> TRANSACTION, SESSION, PERSISTANT ?
>>
>> Would some scopes orthogonal (eg SHARED between sessions for a USER in a
>> DATABASE, SHARED at the cluster level?).
>
> I have a plan to support TRANSACTION and SESSION scope. Persistent or
> shared scope needs much more complex rules, and some specialized extensions
> will be better.

I can see where persistent variables would be very useful though.

>>> 2. accessed with respecting access rights:
>>>
>>> GRANT SELECT|UPDATE|ALL ON VARIABLE variable TO role
>>> REVOKE SELECT|UPDATE|ALL ON VARIABLE variable FROM role
>>
>> At least for transaction and session scopes it does not make sense that
>> they would be accessible outside the session/transaction, so grant/revoke
>> do not seem necessary?
>
> It is necessary - and I think so it is fundamental feature - any other
> features can be more or less replaced by extensions, but this one cannot or
> not simply - you have to protect content against some users - some
> cookies, ids have to be protected. It can be used well with RLS.

How would this work for transaction and session scopes though? What would be the point -- no other access is possible other than what happens in the session. Do you envision something like

CREATE VARIABLE foo ...;
GRANT SELECT ON VARIABLE foo TO bob;
SET ROLE bob;

?

>>> 3. accessed/updated with special function "getvar", "setvar":
>>>
>>> FUNCTION getvar(regclass) RETURNS type
>>> FUNCTION setvar(regclass, type) RETURNS void
>>
>> From an aesthetical point of view, I do not like that much.
>>
>> If you use CREATE & DROP, then logically you should use ALTER:
>>
>> CREATE VARIABLE @name TEXT DEFAULT 'calvin';
>> CREATE VARIABLE @name TEXT = 'calvin';
>> ALTER VARIABLE @name SET VALUE TO 'hobbes';
>> ALTER VARIABLE @name = 'hobbes';
>> DROP VARIABLE @name;

Makes sense.

>> Maybe "SET" could be an option as well, but it is less logical:
>>
>> SET @name = 'susie';
>>
>> But then "SET @..." would just be a shortcut for ALTER VARIABLE.

Maybe. Not sure I like that.

> I would to use a SET statement too. But it is used for another target now.
> Using ALTER in this content looks strange to me. It is used for changing
> metadata not a value.
>
> Next step can be support of SQL statements
>
> With SQL support you can do
>
> SELECT varname;

+1

> SELECT * FROM compositevarname;

+1

> UPDATE varname SET value TO xxx;
> UPDATE compositevarname SET field TO xxx;

These need more thought I think.

>> Also a nicer way to reference them would be great, like SQL server.
>>
>> SELECT * FROM SomeTable WHERE name = @name;
>>
>> A function may be called behind the scene, I'm just arguing about the
>> syntax here...
>>
>> Important question, what nice syntax to assign the result of a query to a
>> variable? Maybe it could be:
>>
>> SET @name = query-returning-one-row; -- hmmm
>> SET @name FROM query-returning-one-row; -- maybe better
>>
>> Or:
>>
>> ALTER VARIABLE @name WITH one-row-query;
>>
>> Special variables could allow to get the number of rows modified by the
>> last option, like in PL/pgSQL but at the SQL level?

I think the SET syntax is growing on me, but I suspect there may be push back on overloading that syntax.

>>> 4. non transactional - the metadata are transactional, but the content
>>> is not.
>>
>> Hmmm... Do you mean:
>>
>> CREATE VARIABLE foo INT DEFAULT 1 SCOPE SESSION;
>> BEGIN;
>> SET @foo = 2;
>> ROLLBACK;
>>
>> Then @foo is 2 despite the rollback? Yuk!

Agreed

> This is similar to sequences.

I don't see how variables really have anything to do with sequences.

> If you need transactional content - then you should to use tables.

I definitely have use-cases where transactional variables would be useful.

>> I think that if the implementation is based on some system table for
>> storage, then you could get the transaction properties for free, and it
>> seems more logical to do so:
>>
>> CREATE TEMPORARY TABLE pg_session_variables(name TEXT PRIMARY KEY, value
>> TEXT, oidtype, ...);
>>
>> CREATE VARIABLE @foo INTEGER; -- INSERT INTO TABLE ...
>>
>> SELECT * FROM x WHERE name = @foo;
>> -- SELECT * FROM x WHERE name = (SELECT value::INT FROM
>> pg_session_variables WHERE name='foo')
>>
>> So maybe some simple syntactic rewriting would be enough? Or some SPI
>> function?

I was thinking along those lines too.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
> because MySQL variables are not declared - and allows assign everywhere -

Ok. I do not do MySQL.

> and MSSQL variables are not persistent.

Yep, but they might be?

> In one session you can use lot of roles - some code can be used for
> securing interactive work, some can be for securing some API, sometimes you
> can secure a access to some sources. You can switch lot of roles by using
> security definer functions.

Hmmm. Switching role within a transaction. I never did need that... but that is a use case.

>>> If you need transactional content - then you should to use tables.
>>
>> Why not.
>>
>> Maybe variables just need be a syntactic convenience around that?
>
> There is pretty similar relation between sequences and tables and variables
> and tables.

Yep. A sequence is a one row table, so a variable may also be a one row table, but with more flexibility about its type, and some nice syntactic sugar (like SERIAL which is syntactic sugar for CREATE SEQUENCE ...).

> In first iteration the constraint can be implemented with domains - but
> there is not any break to implement constraints directly on variables.

Hmmm. If a variable is implemented as a one row table, then constraints are already available there, as well as grant & revoke, they can be any type including composite, nearly nothing to implement to get...

A "one row" table would be a CREATE + one INSERT, UPDATE allowed, further INSERT and DELETE disallowed by construction. Then some syntactic sugar for variables (session => temporary table, persistent => standard table). Not sure about a "transaction variable", though... maybe an [unlogged] table automatically dropped on commit?

--
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> In first iteration the constraint can be implemented with domains - but
>> there is not any break to implement constraints directly on variables.

> Hmmm. If a variable is implemented as a one row table, then constraints
> are already available there, as well as grant & revoke, they can be any
> type including composite, nearly nothing to implement to get...

> A "one row" table would be a CREATE + one INSERT, UPDATE allowed, further
> INSERT and DELETE are disallowed by construction. Then some syntactic
> sugar for variables (session => temporary table, persistent => standard
> table). Not sure about a "transaction variable", though... maybe an
> [unlogged] table automatically dropped on commit?

I think it's entirely silly to be inventing something that's morally a one-row table, when we already have perfectly good one-row tables. The value of a server-side variable facility would be mostly that it doesn't have all the overhead implied by tables. I think that is a direct reason not to think about overhead like constraints, as well.

regards, tom lane
2016-12-23 19:28 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
because MySQL variables are not declared - and allows assign everywhere -
Ok. I do not do MySQL.

and MSSQL variables are not persistent.

Yep, but they might be?

In one session you can use lot of roles - some code can be used for
securing interactive work, some can be for securing some API, sometimes you
can secure a access to some sources. You can switch lot of roles by using
security definer functions.
Hmmm. Switching role within a transaction. I never did need that... but that is a use case.
Any application with security definer functions - it depends on the community - in some it is used heavily.
If you need transactional content - then you should to use tables.
Why not.
Maybe variables just need be a syntactic convenience around that?
There is pretty similar relation between sequences and tables and variables
and tables.
Yep. A sequence is a one row table, so a variable may also be a one row table, but with more flexibility about its type, and some nice syntactic sugar (like SERIAL which is syntactic sugar for CREATE SEQUENCE ...).

In first iteration the constraint can be implemented with domains - but there is not any break to implement constraints directly on variables.
Hmmm. If a variable is implemented as a one row table, then constraints are already available there, as well as grant & revoke, they can be any type including composite, nearly nothing to implement to get...
A "one row" table would be a CREATE + one INSERT, UPDATE allowed, further INSERT and DELETE disallowed by construction. Then some syntactic sugar for variables (session => temporary table, persistent => standard table). Not sure about a "transaction variable", though... maybe an [unlogged] table automatically dropped on commit?
Probably we have different expectations from variables. I don't expect a variable to be changed by any rollback.

What is the use case for transactional variables? I lack any experience there - I have written lots of plpgsql lines and never wanted it.

When I remove ACID and allow only one value - then the implementation can be simple and fast - a next step can be support for expandable types. Sure - anybody can use temporary tables, now and in the future. But it is slow - more so now, because we don't support global temporary tables. And ACID needs a lot of CPU time, possibly VACUUM, ...

Non-ACID variables are simple to implement, and simple to access directly from any PL (although I am thinking about better support in a 2nd phase for PLpgSQL).
--
Fabien.
Hello Tom,

>> Hmmm. If a variable is implemented as a one row table, then constraints
>> are already available there, as well as grant & revoke, they can be any
>> type including composite, nearly nothing to implement to get...
>>
>> A "one row" table would be a CREATE + one INSERT, UPDATE allowed, further
>> INSERT and DELETE are disallowed by construction. Then some syntactic
>> sugar for variables (session => temporary table, persistent => standard
>> table). Not sure about a "transaction variable", though... maybe an
>> [unlogged] table automatically dropped on commit?
>
> I think it's entirely silly

Thanks for "silly". Last time it was "academic". Is it better? :-)

> to be inventing something that's morally a one-row table, when we
> already have perfectly good one-row tables.

Hmmm. Although I can think of ways to ensure that a table is one row (unique, check, trigger, rules, whatever...), none are straightforward.

> The value of a server-side variable facility would be mostly that it
> doesn't have all the overhead implied by tables.

I do not know that. I think that discussing semantics and syntax does have value as well. ISTM that there are 4 intermixed issues related to server-side variables:

(1) what should be their possible semantics and capabilities, e.g. with respect to transactions, permissions, namespace, types, constraints, ... => what are the use cases for them?

(2) what should be their syntax => what do you want to write to use them?

(3) how to implement them

(4) how to optimize the implementation, eventually

I think that a special purpose variable infrastructure implied by your remark is just starting from the end point. The first three points seem relevant too because they help focus on other issues.

I'm not claiming that having variables as one-row tables is the best ever solution, but I do not think that it is such a terrible idea either. At least it provides a clear and consistent range of existing semantics out of the box (unlogged/temporary/standard, permissions...), thus provides a clean model for discussion, even if it is rejected for some reason afterwards. Also it seems easy to implement (as syntactic sugar) and play with, and I'm not sure that performance would be that bad, as session (temporary unlogged?) variables would probably simply stay in cache. If performance is an issue in concrete use cases they could be optimized in the end, to reduce time and space, which would be great.

> I think that is a direct reason not to think about overhead like
> constraints, as well.

Hmmm. If there is no constraint, the overhead is limited? If one has use for constraints, then they can pay the overhead?

--
Fabien.
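One of those not-quite-straightforward ways, for the record: a boolean primary key with a CHECK constraint caps the table at one row, and a rule blocks DELETE (a sketch with illustrative names; a trigger would work equally well):

```sql
CREATE TABLE app_var (
  one bool PRIMARY KEY DEFAULT true CHECK (one),  -- at most one row
  val int NOT NULL
);
INSERT INTO app_var (val) VALUES (1);  -- the single permitted row
-- any further INSERT fails with a duplicate-key error on "one"

CREATE RULE app_var_no_delete AS ON DELETE TO app_var DO INSTEAD NOTHING;

UPDATE app_var SET val = 2;  -- updates remain allowed
```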
On 12/23/16 4:24 PM, Fabien COELHO wrote:
> I think that a special purpose variable infrastructure implied by your
> remark is just starting from the end point. The first three points seem
> relevant too because they help focus on other issues.

If you want to ignore performance, there are things you can do with non-transactional variables that are simply not possible with tables. But even ignoring that, the performance cost of temp tables is massive compared to variables. Not only is the access far more complex, but bloating is a major problem (both in the table itself as well as in the catalog). That's part of the driver for all the discussion about things like permanent temp tables (which still leaves a bloat and performance problem in the table itself).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Hello Pavel,

>> Hmmm. Switching role within a transaction. I never did need that... but
>> that is a use case.
>
> Any application with security definer functions - depends on different
> communities - it is used sometimes strongly.

Hmmm. So I understand that you would like to do something like:

- call a secure function which sets a session variable with restricted permissions
- do some things which cannot access or change the variable
- call another secure function which can access, update, remove the variable...

> Probably we have different expectation from variables. I don't expect so
> variable can be changed by any rollback.

Indeed, it seems that we do not have the same expectations.

> What is use case for transactional variables? I miss any experience - I
> wrote lot plpgsql lines and newer would it.

Here are two use cases, which are neither good nor bad, but that I have in mind when I'm arguing.

(1) First use case I'm thinking of is software update, with persistent transactional variables, eg:

-- let assume we have application_version = 1
BEGIN;
-- lock things up
-- update application schema and data to version 2
-- set application_version = 2
-- unlock things
COMMIT;

I would not want the application_version to remain at 2 if the COMMIT fails, obviously. This is usually implemented with a one-row table, but some kind of variable syntax could be quite elegant. For this use case, a variable should be persistent, it does not need to be efficient, it should have permissions and should be transactional.

(2) Second use case I'm thinking of is some kind of large batch management.

-- variable batch_1_is_done = false
BEGIN;
-- try to do large batch 1...
-- set batch_1_is_done = true
COMMIT;
-- then test whether it worked, do some cleanup if not...
-- there are some discussions to get some \if in psql...

For this second example, I would not like batch_1_is_done to be true if the commit failed, but I do not think that any permissions would be useful, and it would be fine if it is just accessible from a session only.

> When I remove ACID, and allow only one value - then the implementation can
> be simple and fast - some next step can be support of expandable types.
> Sure - anybody can use temporary tables now and in future. But it is slow -
> more now, because we doesn't support global temporary tables. But ACID
> needs lot of CPU times, needs possible VACUUM, ...

Yep, but if you need persistent and transactional then probably you can accept less performant...

> No ACID variables are simple to implement, simple to directly accessible
> from any PL (although I am thinking about better support in 2nd phase for
> PLpgSQL).

ACID may be simple to implement with some kind of underlying table, or maybe a row in a table. How efficient it could be is another question, but if the feature does not allow some use cases, then it is not so interesting to have it. That is why I think that it is worth discussing "silly" semantics and syntax.

The namespace issue is unclear to me. Would a variable name clash with a table name? It should if you want to be able to write "SELECT stuff FROM variablename", which may or may not be a good idea.

--
Fabien.
2016-12-24 12:19 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,

Hmmm. Switching role within a transaction. I never did need that... but that is a use case.
Any application with security definer functions - depends on different
communities - it is used sometimes strongly.
Hmmm. So I understand that you would like to do something like:
- call a secure function which sets a session variable with restricted
permissions
- do some things which cannot access or change the variable
- call another secure function which can access, update, remove the
variable...

Probably we have different expectation from variables. I don't expect so
variable can be changed by any rollback.
Indeed, it seems that we do not have the same expectations.

What is use case for transactional variables? I miss any experience - I
wrote lot plpgsql lines and newer would it.
Here are two use cases, which are neither good nor bad, but that I have in mind when I'm arguing.
(1) First use case I'm thinking of is software update, with persistent transactional variables, eg:
-- let assume we have application_version = 1
BEGIN;
-- lock things up
-- update application schema and data to version 2
-- set application_version = 2
-- unlock things
COMMIT;
I would not want the application_version to remain at 2 if the COMMIT fails, obviously. This is usually implemented with a one-row table, but some kind of variable syntax could be quite elegant. For this use case, a variable should be persistent, it does not need to be efficient, it should have permissions and should be transactional.
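Use case (1) as it is typically implemented today, with a one-row table so that the version bump commits or rolls back together with the upgrade (table and column names are illustrative):

```sql
CREATE TABLE app_meta (application_version int NOT NULL);
INSERT INTO app_meta VALUES (1);

BEGIN;
-- ... update application schema and data to version 2 ...
UPDATE app_meta SET application_version = 2;
COMMIT;  -- with ROLLBACK instead, the version would stay at 1
```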
(2) Second use case I'm thinking of is some kind of large batch management.
-- variable batch_1_is_done = false
BEGIN;
-- try to do large batch 1...
-- set batch_1_is_done = true
COMMIT;
-- then test whether it worked, do some cleanup if not...
-- there are some discussions to get some \if in psql...
For this second example, I would not like batch_1_is_done to be true if the commit failed, but I do not think that any permissions would be useful, and it would be fine if it is just accessible from a session only.
On the server side you can use PLpgSQL and exception handling.

On the client side you can use the technique used in MSSQL, where variables are not transactional either.
BEGIN
-- servar state 1
statement;
-- servar state 2
statement;
COMMIT
We should introduce a client-side session variable :STATUS
\if eq(:STATUS, 'ok')
...
...
When I remove ACID, and allow only one value - then the implementation can
be simple and fast - some next step can be support of expandable types.
Sure - anybody can use temporary tables now and in future. But it is slow -
more now, because we doesn't support global temporary tables. But ACID
needs lot of CPU times, needs possible VACUUM, ...
Yep, but if you need persistent and transactional then probably you can accept less performance...

When you accept less performance, then you can use temporary tables. You can easily wrap them with a few polymorphic functions.
No ACID variables are simple to implement, simple to directly accessible
from any PL (although I am thinking about better support in 2nd phase for
PLpgSQL).
ACID may be simple to implement with some kind of underlying table, or maybe a row in a table. How efficient it could be is another question, but if the feature does not allow some use cases, then it is not so interesting to have it. That is why I think that it is worth discussing "silly" semantics and syntax.
The namespace issue is unclear to me. Would a variable name clash with a table name? It should if you want to be able write "SELECT stuff FROM variablename", which may or may not be a good idea.
It is based on history and experience - one fundamental issue of languages for stored procedures is the conflict between variables and SQL identifiers. When variables are based on pg_class, there is no possibility of any new conflict.

Moreover, I can use security related to schemas - it is partial coverage of package variables.
Regards
Pavel
--
Fabien.
Hello Jim,

> If you want to ignore performance, there are things you can do with
> non-transactional variables that are simply not possible with tables. But
> even ignoring that, the performance cost of temp tables is massive compared
> to variables.

Ok, then read "variables are like tables" instead of "variables are tables". Note that I'm definitely interested in performance in the end, but I'm first interested in discussing features.

Currently I'm unconvinced by a proposal of variables that are in the relation namespace (pg_class), are subject to GRANT/REVOKE, are managed with CREATE/DROP *but* are somehow partly non transactional, only alive through sessions, and only accessed through functions...

My current thinking is that the proposed design is either too heavy (pg_class, permissions) or too light (then why not possible persistence and/or transactions?): I would be more at ease with very light-weight typed? session variables stored in the server process "as is", without any attempt at pg_class & permissions, or with more full featured variables, but not something half-baked which seems designed for a particular use case that I do not have.

> Not only is the access far more complex, but bloating is a major problem
> (both in the table itself as well as in the catalog). That's part of the
> driver for all the discussion about things like permanent temp tables
> (which still leaves a bloat and performance problem in the table
> itself).

If a variable as currently discussed is in pg_class and subject to permissions, then probably it will cost on the catalog side anyway, and at least their existence would be transactional even if their value is not.

--
Fabien.
2016-12-26 8:30 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Jim,If you want to ignore performance, there are things you can do with non-transactional variables that are simply not possible with tables. But even ignoring that, the performance cost of temp tables is massive compared to variables.
Ok, then read "variables are like tables" instead of "variables are tables". Note that I'm definitely interested in performance in the end, but I'm first interested in discussing features.
Currently I'm unconvinced by a proposal of variables that are in the relation namespace (pg_class), are subject to GRANT/REVOKE, are managed with CREATE/DROP *but* are somehow partly non transactional, and only alive through sessions, and are only accessed through functions...
My current thinking is that the proposed design is either too heavy (pg_class, permissions) or too light (then why not possible persistence and/or transactions?): I would be more at ease with very light-weight typed? session variables stored in the server process "as is", without any attempt at pg_class & permissions, or with more full featured variables, but not something half-baked which seems designed for a particular use case that I do not have.Not only is the access far more complex, but bloating is a major problem (both in the table itself as well as in the catalog). That's part of the driver for all the discussion about things like permanent temp tables (which still leaves a bloat and performance problem in the table itself).
If a variable as currently discussed is in pg_class and subject to permissions, then probably it will cost on the catalog side anyway, and at least their existent would be transactional even if their value is not.
The access right check has a **constant** small cost (a check in the object cache). The proposed variables have no negative effect on catalogue bloat, because the metadata are persistent.

That is the reason why I use the statement "CREATE VARIABLE", not "DECLARE VARIABLE".
Regards
Pavel
--
Fabien.
Hello Pavel,

On Sat, 24 Dec 2016, Pavel Stehule wrote:

Maybe you could consider removing the part of the message that you are not responding to, so that it would be easier for the reader to see your answers and comments.

>> Hmmm. So I understand that you would like to do something like:
>>
>> - call a secure function which sets a session variable with restricted
>> permissions
>> - do some things which cannot access or change the variable
>> - call another secure function which can access, update, remove the
>> variable...

I'm still not clear with your use case. Did I read you correctly? ISTM that the above use case could be managed with insert/update/delete in a table with chosen permissions from the functions...

>> Yep, but if you need persistant and transactional then probably you can
>> accept less performant...
>
> When you accept less performance, then you can use temporary tables. You
> can easy wrap it by few polymorphic functions.

Probably. This is probably true as well from what I understood from your use case.

>> The namespace issue is unclear to me. Would a variable name clash with a
>> table name? It should if you want to be able write "SELECT stuff FROM
>> variablename", which may or may not be a good idea.
>
> It is based on history and experience - one fundamental issue of languages
> for stored procedures is a conflict of variables and SQL identifiers.

I agree that this is a pain, which could be solved by using a prefix, say $<name> for instance.

> When variables are based on pg_class, there are not possibility to any
> new conflict.

If variables are based on pg_class, ISTM that they will cost anyway.

To sum up my current opinion, taking into account your use case and Tom & Jim's arguments about performance, I think that variables should be either:

- full-featured database objects well integrated in the database logic: CREATE/ALTER/DROP, in pg_class, subject to standard permissions, constraints, transactions, possibly persistent... Basically like a one-row table, although the implementation should be more efficient, I agree.

** OR **

- very light-weight, a simple server process key-value store, which would not use CREATE/ALTER/DROP which suggest otherwise; they would not be subject to permissions nor transactions nor persistence but die with the session, goodbye. A possible concession to permissions would be to have a per-role store, and/or some visibility/accessibility declaration at creation time, but certainly not GRANT/REVOKE syntax which suggests a persistent database object.

I'm very reserved about anything in between these two options, which looks like a database object but is not really one, so I think that it creates confusion.

In both cases, the syntax should be nice and elegant... i.e. not only based on functions; probably it should use some prefix convention (@, $)...

For the light weight option:

DECLARE @someday DATE [ = <SQL EXPRESSION> ] [visibility restriction?];

... then use @someday as a possible value anywhere, which will be substituted quite early in the execution process, before planning.

-- update a variable value:
[SET, ASSIGN, ... nothing?] @someday = <SQL EXPRESSION>;

Ok, that is basically more or less the mysql syntax, too bad, but I think it makes sense for a lightweight object which should not look like a database object at all to avoid confusion.

As far as implementation is concerned, I would use a TEXT to TEXT hash table, and implicitly cast the result when substituting.

@var -> 'text value of var'::type_it_was_declared_with

--
Fabien.
Hi
> In both cases, the syntax should be nice and elegant... i.e. not only
> based on functions; probably it should use some prefix convention (@, $)...
>
> For the light-weight option:
>
>   DECLARE @someday DATE [ = <SQL EXPRESSION> ] [visibility restriction?];
>
> ... then use @someday as a possible value anywhere, which will be
> substituted quite early in the execution process, before planning.
>
>   -- update a variable value:
>   [SET, ASSIGN, ... nothing?] @someday = <SQL EXPRESSION>;
>
> Ok, that is basically more or less the mysql syntax, too bad, but I think
> it makes sense for a lightweight object which should not look like a
> database object at all, to avoid confusion.
>
> As far as implementation is concerned, I would use a TEXT to TEXT hash
> table, and implicitly cast the result when substituting:
>
>   @var -> 'text value of var'::type_it_was_declared_with
We are talking about two different features (although part of the name can be the same):

You are talking about light session variables like MSSQL or MySQL (with the same syntax); I am talking about secure session variables like Oracle package variables (with similar access syntax).

Theoretically, both could be implemented - but they cannot be implemented together; they are partially different features. My proposal is clearly related to the analogy with Oracle package variables and should help people who are migrating from Oracle, or who write applications in the Oracle style - database first, almost all logic in the database.
I have two important reasons why I insist on a pg_class base:

1. Security - it is a really fundamental part. Dynamically created variables (like MySQL's) cannot be secure - anybody can create a variable with their own preferred visibility.

2. The possibility of static checking by plpgsql_check - without an entry in pg_class (or another catalogue table) I do not have any valid information about the type or existence of a variable.
Although I am not a supporter of MySQL- or MSSQL-style variables (due to possible issues with plpgsql_check), I am not strongly against such an implementation with the same syntax. But it is a different feature, with different benefits and costs.
I didn't propose packages (and package variables) due to issues in the multilingual PostgreSQL environment, and because they are redundant with PostgreSQL schemas. Instead I proposed >>secure global session variables<< (global like global temporary tables).
Currently, light session variables can be implemented as a not-so-big extension. Secure session variables depend on pg_class internals.
I am not sure if we need special symbols - it is only traditional. Set/get functions can do the same work - for years we have used the same technique for sequences. The setter function is simple. Currently it is impossible to write an elegant getter function, because the analyzer has limited support for functions returning "any".

It would be nice to have a special hook for functions that return "any", to push some other external information there.
Regards
Pavel
Hello Pavel,

> you are talking about light session variables like MSSQL or MySQL (with
> the same syntax); I am talking about secure session variables like Oracle
> package variables (with similar access syntax).

Hmmm. I do not know this Oracle stuff... After looking at the online documentation, my understanding is that "Oracle package variables" refers to full-fledged database objects; in particular they are not session limited. The example I found is about a variable holding the total number of employees, with functions hire & fire (well, they call it remove_emp) to update it when inserting or deleting an employee.

AFAICS they are shared between backends, subject to transactions and permissions, constraints and so on. So they look more like the first category I outlined, and probably they cost as any persistent database object, which makes sense. They constitute a consistent design.

This is *not* what you are proposing.

> [...] I have two important reasons why I insist on a pg_class base.
>
> 1. security .. it is a really fundamental part
> 2. the possibility of static checking by plpgsql_check - without an entry
> in pg_class (or another catalogue table) I do not have any valid
> information about the type or existence of a variable.

Hmmm. I'm not quite convinced that putting session variables in pg_class is a good idea, because it adds significant costs for the use case of "standard" simple session variables, which is quite more probable than session-but-with-permissions variables.

As far as security is concerned, ISTM that a limited but still useful access control can be implemented for a key-value store with simple session variables, see below.

As far as typing is concerned, ISTM that it can be done as well for session variables by going through text and using casts when setting and getting values, or through some other simple ad-hoc checking.

> Although I am not a supporter of MySQL- or MSSQL-style variables (due to
> possible issues with plpgsql_check), I am not strongly against such an
> implementation with the same syntax. But it is a different feature, with
> different benefits and costs.
> I didn't propose packages (and package variables) due to issues in the
> multilingual PostgreSQL environment, and because they are redundant with
> PostgreSQL schemas. Instead I proposed >>secure global session
> variables<< (global like global temporary tables).

That's where I leave you and start disagreeing, because it is not consistent: you are proposing session variables that do not look like session variables and are somehow costly.

I could agree with real "secure global variables" as in Oracle packages, a consistent kind of database object which stores a persistent value safely and securely. That would cost, but that is life in a database: you get great things for a price. Probably that could be implemented as a row in some special table, or as a one-row table, or whatever.

I could also agree with à la MS or MY-SQL session variables that look like session variables, with limited ambition, light-weight and inexpensive.

I disagree with having half-baked stuff, where something looks like a database object (i.e. CREATE/ALTER/DROP/GRANT/REVOKE) but is really a session object with strange properties. I also disagree with the pg_class approach, as it creates in effect an expensive session object while a simple session object would cost much less and would be much more useful.

To summarize, I still think that your design is not consistent, even if it makes sense to some degree wrt the implementation.

A possible compromise I have proposed is to have some declared access restrictions on simple session variables, so that say only the owner can access them, but they should stay and look like light-weight session variables nevertheless. That could look like:

  SET ROLE Admin;
  DECLARE @secure_variable INTEGER RESTRICT; -- only accessible to Admin
  SET @secure_variable = 3;

  SET ROLE BasicUser;
  SELECT @secure_variable; -- say NULL or error does not exist...

  SET ROLE Admin;
  SELECT @secure_variable; -- 3

> Currently, light session variables can be implemented as a not-so-big
> extension.

Sure. I would review that as well.

-- 
Fabien.
2016-12-26 13:08 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> A possible compromise I have proposed is to have some declared access
> restrictions on simple session variables, so that say only the owner can
> access them, but they should stay and look like light-weight session
> variables nevertheless. That could look like:
>
>   SET ROLE Admin;
>   DECLARE @secure_variable INTEGER RESTRICT; -- only accessible to Admin
>   SET @secure_variable = 3;
>
>   SET ROLE BasicUser;
>   SELECT @secure_variable; -- say NULL or error does not exist...
What happens if BasicUser also does DECLARE @secure_variable?

There is no granularity of rights - you cannot grant access to specific roles ...
>   SET ROLE Admin;
>   SELECT @secure_variable; -- 3
I am sorry, I don't see the benefit in your proposal. Probably the only thing we will agree on is that there is no agreement between us :(
Regards
Pavel
> A possible compromise I have proposed is to have some declared access
> restrictions on simple session variables, so that say only the owner can
> access them, but they should stay and look like light-weight session
> variables nevertheless. That could look like:
>
>   SET ROLE Admin;
>   DECLARE @secure_variable INTEGER RESTRICT; -- only accessible to Admin
Why introduce another security system?
Regards
Pavel
Hello Pavel,

>> AFAICS they are shared between backends, [...] They constitute a
>> consistent design.
>
> no
> http://stackoverflow.com/questions/2383061/scope-of-oracle-package-level-variables

If stackoverflow says so, too bad for me :-) Now I do not understand the point of the example I read in the Oracle documentation: why have an employee count accessed by some functions if it is reset on each new session?

So I do retract "they constitute a consistent design". It looks more like a PL/SQL-confined hack.

Note that Oracle also seems to have session variables, set with DEFINE and referenced with &variable.

>> [...] That could look like:
>>
>> SET ROLE Admin;
>> DECLARE @secure_variable INTEGER RESTRICT; -- only accessible to Admin
>> SET @secure_variable = 3;
>>
>> SET ROLE BasicUser;
>> SELECT @secure_variable; -- say NULL or error does not exist...
>
> what will be if BasicUser does DECLARE @secure_variable

Then there would be a distinct global @secure_variable unrelated to the previous one, which would be hidden from Admin, who would see their own private @secure_variable. Maybe "restrict" is not the right word, though; let us use "private".

  SET ROLE User1;
  -- use @var: does not exist in scope error
  DECLARE @var INTEGER PRIVATE;
  SET @var = 1;
  -- use @var: get 1

  SET ROLE User2;
  -- use @var: does not exist in scope error
  DECLARE @var INTEGER PUBLIC;
  SET @var = 2;
  -- use @var: get 2

  SET ROLE User1;
  -- use @var: get 1 (private version)

  SET ROLE User3;
  -- use @var: get 2 (public version created by User2)

> There are not any granularity of rights - you cannot to grant access ...

Indeed, at least not directly. With the above version you can just control whether everybody or only the owner has access.

However, with some minimal extra effort, the owner of a private session variable could provide a grantable function for accessing this variable: the benefit would be that the function is permanent, i.e. it would not need to be granted each time the variable is used; it could be done once and for all.

  CREATE FUNCTION setSecret(INT) SECURITY DEFINER ... AS $$
    DECLARE IF NOT EXISTS @secret TEXT PRIVATE;
    SET @secret = $1;
  $$ LANGUAGE SQL;

  CREATE FUNCTION useSecret(TEXT) SECURITY DEFINER TEXT AS $$
    -- would fail if @secret has not been set yet...
    SELECT sha256sum(@secret || ':' || $1);
  $$ LANGUAGE SQL;

  CREATE FUNCTION getSecret() RETURNS TEXT SECURITY DEFINER AS $$
    DECLARE IF NOT EXISTS @secret TEXT PRIVATE;
    SELECT @secret;
  $$ LANGUAGE SQL;

  -- then
  REVOKE/GRANT ... ON FUNCTION set/use/getSecret(...);

> I am sorry, I don't see benefit in your proposal.

The benefit I see is to have MS/MY-SQL/Oracle-like light-weight (inexpensive, untransactional) session variables and still a minimal access control, which might be enough for significant use cases.

If more is really needed, consider the function hack, or maybe some one-row table with all the power of GRANT. Ok, the table solution is more heavy-weight, but then this is also for a special requirement, and it would work as well for persistence.

> Probably there will be only one agreement, so there are not agreement
> between us :(

It seems so. I do believe that I am trying to propose a solution which takes into account your use case as I understand it (you did not confirm nor infirm), which is to store securely but not safely some kind of temporary data between different function calls with SECURITY DEFINER within the same session.

I'm trying to avoid "special-case" medium-weight (i.e. pg_class-based) session variables with permissions, which could preclude MY/MS-SQL/Oracle-like light-weight session variables, which I think are interesting in their own right.

-- 
Fabien.
2016-12-26 15:53 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> If stackoverflow says so, too bad for me :-) Now I do not understand the
> point of the example I read in the Oracle documentation: why have an
> employee count accessed by some functions if it is reset on each new
> session?
Please, can you send a link?
Here is some better documentation: https://docs.oracle.com/cd/E11882_01/appdev.112/e25519/packages.htm#LNPLS99926
I am sure that package variables are not shared between sessions/backends - because Oracle uses a different mechanism for interprocess communication - I wrote about it in Orafce:
"When a session references a package item, Oracle Database instantiates the package for that session. Every session that references a package has its own instantiation of that package."
"When a session references a package item, Oracle Database instantiates the package for that session. Every session that references a package has its own instantiation of that package."
> [ grantable function example to access a private session variable... ]
I am sorry, but it is not secure. Theoretically it can work if you have a guaranteed order of function calls, but what if you don't?
regards
Pavel
Hello Pavel,

>> SET ROLE Admin;
>> DECLARE @secure_variable INTEGER RESTRICT; -- only accessible to Admin
>
> Why introduce another security system?

That is a good question. I would prefer to avoid it and just have simple session variables... but this is not what you want, so I'm trying to find a compromise which both gives you the feature you are seeking and would keep session variables inexpensive, i.e. without catalog costs.

A simplistic PUBLIC/PRIVATE permission on simple session variables can be done with nothing (no additional data structures):

  store:   hash_put(<owner id if private or nothing> || '@varname', value);
  consult: if exists(<owner id> || '@varname') then return it
           else if exists('@varname') then return it
           else "error: variable does not exist"

Now if you can offer an inexpensive GRANT/REVOKE on simple session variables, i.e. without catalog changes, then I think I would agree to it, even if I would not like it much.

The reason I "do not like it much" is subjective. I associate SQL permission commands (GRANT, REVOKE...) with real SQL "objects" (i.e. persistent, transactional, secured things subject to CREATE/ALTER/DROP...). However, light-weight session variables are not really like that.

Also, if you can have inexpensive GRANT/REVOKE then probably I would also have to accept "CREATE SESSION VARIABLE @foo", because it would be consistent to have it with GRANT/REVOKE. I would "not like it much" either to have CREATE for a non-persistent object, but that is life...

However, I understood that for permissions you do need "pg_class", which means catalog changes on session variable creation, which means expensive for simple session variables, so not desirable.

-- 
Fabien.
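[Editor's note: the store/consult scheme above, combined with the User1/User2/User3 scenario from earlier in the thread, can be replayed in a toy Python model. Purely illustrative; the function names are invented.]

```python
# Toy model of the owner-qualified key scheme: a PRIVATE variable is
# stored under <role><name>; lookup tries the current role's private
# key first, then falls back to the public key.
store = {}

def declare(role, name, private, value=None):
    key = (role + name) if private else name
    store[key] = value

def set_var(role, name, value):
    key = role + name if (role + name) in store else name
    if key not in store:
        raise KeyError("variable does not exist")
    store[key] = value

def get_var(role, name):
    if role + name in store:      # private version wins for its owner
        return store[role + name]
    if name in store:             # otherwise the public version
        return store[name]
    raise KeyError("variable does not exist")

# Replaying the scenario from the thread:
declare("User1", "@var", private=True);  set_var("User1", "@var", 1)
declare("User2", "@var", private=False); set_var("User2", "@var", 2)
```

User1 then sees their private value, while User3 (who declared nothing) sees the public version created by User2.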
2016-12-26 17:15 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> However, I understood that for permissions you do need "pg_class", which
> means catalog changes on session variable creation, which means expensive
> for simple session variables, so not desirable.
I don't understand the word "expensive" here. How often do you create/drop these variables? The deployment cycle is similar to functions. I don't propose any feature that could cause bloating of the system catalogue.

The variable metadata will be persistent across sessions. I can imagine a local (temporary) session variable whose metadata life cycle is tied to the session, like our current temporary tables. But that is not in this proposal.
Regards
Pavel
> Please, can you send a link?

My badly interpreted PL/SQL example was on the same page you point to below:

> Here is some better documentation:
> https://docs.oracle.com/cd/E11882_01/appdev.112/e25519/packages.htm#LNPLS99926

There is a private 'number_hired' which, given its name, I thought was counting the number of employees, but it was just counting the number of "hire_employee" calls in the current session... Not very interesting.

> I am sure that package variables are not shared between sessions/backends

Indeed, I misinterpreted the Oracle documentation example.

>> [ grantable function example to access a private session variable... ]
>
> I am sorry, but it is not secure. Theoretically it can work if you have a
> guaranteed order of function calls, but what if you don't?

I'm not sure I understand. If you do not grant/revoke permissions as you want on the functions, then they can be invoked by anybody. My point is that it is *possible* to tune permissions so as to control exactly who may access a private session variable.

That is exactly the same with a grantable session variable: if you have not done the necessary grant/revoke, there is no difference?

-- 
Fabien.
2016-12-26 17:33 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> My point is that it is *possible* to tune permissions so as to control
> exactly who may access a private session variable. That is exactly the
> same with a grantable session variable: if you have not done the necessary
> grant/revoke, there is no difference?
If you use the DECLARE IF NOT EXISTS pattern, you cannot be sure that somebody else did it before you. It only works if you are the first to create the variables in the session.

An object is trustworthy only if it is fixed in a schema - because you have to have the rights to modify the schema. In my proposal, only a trustworthy user can create a variable in some schema. A non-trustworthy user can use the public schema, or we can support temporary objects (similar to your proposal) in a hypothetical schema "private". There are strong tools in Postgres for enforcing security, and I would like to use them.
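[Editor's note: Pavel's objection can be illustrated with a toy model (Python, purely illustrative; the store, roles, and value are invented): with dynamically created variables, whoever declares first fixes the visibility, so a DECLARE IF NOT EXISTS inside a SECURITY DEFINER function can be hijacked.]

```python
# If an attacker declares @secret PUBLIC before the secure function
# declares it PRIVATE, the function's DECLARE IF NOT EXISTS is a no-op
# and the "secret" lands in a variable everyone can read.
store = {}  # name -> {"owner": role, "public": bool, "value": ...}

def declare_if_not_exists(role, name, public):
    if name not in store:
        store[name] = {"owner": role, "public": public, "value": None}

def set_var(name, value):
    store[name]["value"] = value

def get_var(role, name):
    v = store[name]
    if not v["public"] and role != v["owner"]:
        raise PermissionError("private variable")
    return v["value"]

# Attacker wins the race and declares the variable PUBLIC:
declare_if_not_exists("Evil", "@secret", public=True)
# The SECURITY DEFINER function then runs its usual sequence:
declare_if_not_exists("Admin", "@secret", public=False)  # no-op!
set_var("@secret", "s3cret-token")
```

With catalog-fixed variables, the attacker could not pre-create the object in a schema they have no rights to modify, which is Pavel's point.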
Regards
Pavel
Hello Pavel,

> I don't understand the word "expensive" here.

Hmmm...

> How often do you create/drop these variables?

Hmmm... As for "session variables" à la MY/MS-SQL, ISTM that a variable is "created" each time it is asked for, and it disappears completely at the end of the session... So you can have some kind of minimal SQL scripting with variables executed server-side, without resorting to a PL. Also useful in interactive mode, although there the performance is less of an issue.

> The deployment cycle is similar to functions. I don't propose any feature
> that could cause bloating of the system catalogue.

Hmmm....

> The variable metadata will be persistent across sessions. I can imagine a
> local (temporary) session variable whose metadata life cycle is tied to
> the session, like our current temporary tables. But that is not in this
> proposal.

So it seems that I misunderstood a key detail in your proposal. Please accept my apology for my slow-wittedness. It is better to discuss in front of a whiteboard...

Now I understand that you want to create a kind of "persistent" session variable... that is, at each new session the variable is instantiated in the session "automatically", whether the session will use it or not... Or is it instantiated on demand, i.e. when explicitly accessed? Could you confirm the interpretation? ISTM that "on demand" would be better.

As you gathered, I was understanding that the "CREATE VARIABLE & GRANTS" were to be issued in each session for each variable used, inducing catalog changes each time, hence my quite heavy ranting...

What I understand now is still a somewhat strange object, but nevertheless the CREATE, DROP, GRANT, REVOKE at least are more justified, because then the object is somehow really persistent in the database, even if not with a value. So I'll have to think about it...

A few more questions:

Bar the grantability, why wouldn't simple session variables work for this purpose? That is, what is the added value of having them "declared" permanently, compared to created on demand, if the creation is really light-weight?

ISTM that in the Oracle package version they are linked to PL/SQL; they are not part of the SQL layer itself, so maybe they are only created when some PL/SQL from the package is invoked, and not created otherwise?

How would this feature interact with the kind of non-persistent "simple" session variables found in MY/MS/Oracle SQL? One of my concerns is that such a feature should not preclude other kinds of session variables. And how would it interact with some "fully persistent/shared" variables?

-- 
Fabien.
2016-12-26 18:20 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Hello Pavel,
>
>> I don't understand to "expensive" word.
>
> Hmmm...
>
>> How much often you create/drop these variables?
>
> Hmmm... As for "session variables" à la MY/MS-SQL, ISTM that a variable is "created" each time it is asked for, and it disappears completely at the end of the session... So you can have some kind of minimal SQL scripting with variables executed server-side, without resorting to a PL. Also useful in interactive mode, although there performance is less of an issue.
>
>> The deployment cycle is similar to functions. I don't propose any
>> feature, that can enforce bloating of system catalogue.
>
> Hmmm....
>
>> The variable metadata will be persistent across sessions. I can imagine the
>> local (temporal) session variable where life cycle of metadata will be
>> related to session like our current temporary tables. But it is not in this
>> proposal.
>
> So it seems that I misunderstood a key detail in your proposal. Please accept my apology for my slow-wittedness. It is better to discuss in front of a white board...
Sure, mainly due to my language skills.
> Now I understand that you want to create a kind of "persistent" session variable... that is, at each new session the variable is instantiated in the session "automatically", whether the session will use it or not... Or is it instantiated on demand, i.e. when explicitly accessed?
I am starting with the simple default behaviour: a variable is initialized on demand (when it is first used in a session). Note that in my concept, initialization and creation are different things.

In the future, initialization could be joined with login: we could have an autologin function (trigger), or login-initialized variables (and if the default expression is a function call, those functions could serve as autologin functions).
> Could you confirm the interpretation? ISTM that "on demand" would be better.
What I propose now, and what is implemented in the prototype, is "on demand".
> As you gathered I was understanding that the "CREATE VARIABLE & GRANTS" were to be issued in each session for each variable used, inducing catalog changes each time, hence my quite heavy ranting...
>
> What I understand now is still a somehow strange object, but nevertheless the CREATE, DROP, GRANT, REVOKE at least are more justified because the object is somehow really persistent in the database, even if not with a value.
The metadata are persistent, like functions or sequences. Only the value is tied to the session.
> So I'll have to think about it...
>
> A few more questions:
>
> Bar the grantability, why wouldn't simple session variables work for this purpose? That is, what is the added value of having them "declared" permanently, compared to created on demand, if the creation is really lightweight?
The rights should be persistent, and they should be attached to some persistent object. Hypothetically we could introduce a new kind of object, but that would disallow future use of direct DML and SELECT statements.
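Attaching the rights to a persistent catalog object would let the usual ACL machinery apply directly. A minimal sketch using the syntax from the original proposal at the top of the thread (CREATE VARIABLE and GRANT/REVOKE ... ON VARIABLE are proposed syntax, not yet implemented; schema and role names here are purely illustrative):

```sql
-- metadata created once, stored in the catalogue like a function or sequence;
-- only the value is per-session and non-transactional
CREATE VARIABLE secure_stuff.user_is_auditor boolean DEFAULT false;

-- rights attach to the persistent object, so they survive across sessions
REVOKE ALL ON VARIABLE secure_stuff.user_is_auditor FROM PUBLIC;
GRANT SELECT ON VARIABLE secure_stuff.user_is_auditor TO app_user;
GRANT UPDATE ON VARIABLE secure_stuff.user_is_auditor TO app_admin;
```

Without a persistent object, there would be nothing stable for GRANT and REVOKE to reference, which is the point being made here.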
> ISTM that in the Oracle package version, they are linked to PL/SQL, they are not part of SQL layer itself, so maybe they are only created when some PL/SQL from the package is invoked, and not created otherwise?
PL/SQL is a different creature - it lives outside the SQL catalogue, in packages. I would not reimplement that for PL/pgSQL, for the following reasons: we have schemas (which can be used like Oracle packages), we have extensions (which can partially serve as Oracle packages), and we mix PL languages - I have often mixed PL/pgSQL and PL/Perlu. So a 1:1 mapping from Oracle is not a good fit for Postgres.
> How would this feature interact with a kind of non-persistent "simple" session variables as found in MY/MS/Oracle SQL? One of my concerns is that such a feature should not preclude other kinds of session variables.
It depends. In my terminology your proposal is for "untrusted temporary local session variables". It could share 50% of the code - more if the implementation is based on getter/setter functions, less if it is based on grammar support.

These variables would not be declared explicitly - they would be declared implicitly by being set. They would not use any schema, but could use getter/setter functions, so you could write:

select setvar('@var1', 10);
select getvar('@var1');

I am a little bit afraid of using the variables directly in a query - inside dedicated functions we (and users) have control over volatility; direct use could behave unexpectedly for users.
> And how would it interact with some "fully persistent/shared" variables?
I don't have any use case for this, and I don't know of any similar feature in other database systems. Oracle uses dbms_pipe or dbms_alert for interprocess communication.

I think it would be possible to implement. If it is not ACID, then it could work for custom statistics counters. If it should be ACID, then a table is better. As far as I know, share-nothing design is now preferred in parallel systems - so to me this looks like a somewhat dangerous feature, and I see only one use case - custom statistics - where a possible race condition is not a hard issue.

But I don't plan to implement it in the first stage. There should be a strong use case before implementing any complex feature in shared memory. Nothing in this implementation prevents adding it in the future.
Regards
Pavel
--
Fabien.
2016-12-26 19:13 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
> 2016-12-26 18:20 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>
>> And how would it interact with some "fully persistent/shared" variables?
>
> I have not any use case for this. I don't know about any similar feature in other database systems. Oracle uses dbms_pipe or dbms_alert for interprocess communication.
>
> I am thinking it is possible to implement. If it is not ACID, then it can work as custom statistic counters. If it should be ACID, then it is better to use a table. What I know, share-nothing design is now preferred in parallel systems - so for me it looks like a little bit dangerous feature - and I see only one use case - custom statistics - where a possible race condition is not a hard issue.
>
> But I don't plan to implement it in the first stage. There should be a strong use case for implementing any complex feature in shared memory. But this implementation doesn't prevent implementing it in the future.
For custom statistics, an extension based on a background worker may be better - shared variables carry a risk of race conditions, and developers are only people.

I don't have a firm opinion on this - the implementation should not be hard, but I am not sure this is a gun I would want to give to users.
Regards
Pavel
--
Fabien.
On Fri, Dec 23, 2016 at 4:00 PM, Joe Conway <mail@joeconway.com> wrote:
>
> >> In the long term, What would be the possible scopes?
> >>
> >> TRANSACTION, SESSION, PERSISTANT ?
> >>
> >> Would some scopes orthogonal (eg SHARED between sessions for a USER in a
> >> DATABASE, SHARED at the cluster level?).
> >
> > I have a plan to support TRANSACTION and SESSION scope. Persistent or
> > shared scope needs much more complex rules, and some specialized extensions
> > will be better.
>
>
> I can see where persistent variables would be very useful though.
>
I'm thinking about PERSISTENT VARIABLES; maybe they can be a redesign of our hard-coded "reloptions", with the ability to let users create their own customized ones. If we think more carefully, we already have some persistent variables with specialized context: reloptions (hardcoded), security labels (local and shared catalog) and comments (local and shared catalog). Was I clear enough?
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
2016-12-27 21:38 GMT+01:00 Fabrízio de Royes Mello <fabriziomello@gmail.com>:
> Can you talk more about your thoughts?
On Fri, Dec 23, 2016 at 4:00 PM, Joe Conway <mail@joeconway.com> wrote:
>
> >> In the long term, What would be the possible scopes?
> >>
> >> TRANSACTION, SESSION, PERSISTANT ?
> >>
> >> Would some scopes orthogonal (eg SHARED between sessions for a USER in a
> >> DATABASE, SHARED at the cluster level?).
> >
> > I have a plan to support TRANSACTION and SESSION scope. Persistent or
> > shared scope needs much more complex rules, and some specialized extensions
> > will be better.
>
>
> I can see where persistent variables would be very useful though.
>
> I'm thinking about PERSISTENT VARIABLES and maybe they can be the redesign of our hard coded "reloptions", with the ability to provide users to create their own customized. If we think more carefully we already have some persistent variables with specialized context: reloptions (hardcoded), security labels (local and shared catalog) and comments (local and shared catalog). I was clear enough?
What is the difference from a table values(key varchar primary key, value text)? Where is the benefit?
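For reference, the plain-table alternative Pavel is alluding to is just an ordinary key/value relation with standard DML; the table and key names below are illustrative only:

```sql
-- an ordinary catalogue-backed, transactional key/value store
CREATE TABLE session_settings (
  key   varchar PRIMARY KEY,
  value text
);

-- upsert a "variable" with plain, fully transactional DML
INSERT INTO session_settings VALUES ('max_batch_size', '1000')
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value;

-- read it back
SELECT value FROM session_settings WHERE key = 'max_batch_size';
```

The trade-off raised elsewhere in the thread: such a table is transactional and MVCC-managed, whereas the proposed session variables are deliberately non-transactional, memory-resident per-session state.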
Best regards
Pavel
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
On Tue, Dec 27, 2016 at 6:55 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
> 2016-12-27 21:38 GMT+01:00 Fabrízio de Royes Mello <fabriziomello@gmail.com>:
>>
>>
>> On Fri, Dec 23, 2016 at 4:00 PM, Joe Conway <mail@joeconway.com> wrote:
>> >
>> > >> In the long term, What would be the possible scopes?
>> > >>
>> > >> TRANSACTION, SESSION, PERSISTANT ?
>> > >>
>> > >> Would some scopes orthogonal (eg SHARED between sessions for a USER in a
>> > >> DATABASE, SHARED at the cluster level?).
>> > >
>> > > I have a plan to support TRANSACTION and SESSION scope. Persistent or
>> > > shared scope needs much more complex rules, and some specialized extensions
>> > > will be better.
>> >
>> >
>> > I can see where persistent variables would be very useful though.
>> >
>>
>> Can you talk more about your though??
>>
>> I'm thinking about PERSISTENT VARIABLES and maybe they can be the redesign of our hard coded "reloptions", with the ability to provide users to create their own customized. If we think more carefully we already have some persistent variables with specialized context: reloptions (hardcoded), security labels (local and shared catalog) and comments (local and shared catalog). I was clear enough?
>
>
> What is difference from table Values(keys varchar primary key, value text) ? Where is any benefit?
>
IMHO it's a totally different thing, because with this approach we'll have:
- dependency tracking (aka pg_depend)
- a different LockLevel for each persistent variable (reloption)
- syscache to improve performance
And it's a way to store metadata about metadata, i.e. properties of our objects. This can be useful for:
- extension authors, to store their own properties with each object
- Postgres developers, to have a simple way to add a new reloption by just adding a new row to bootstrap data
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
Fabien, I don't really see the point of "persistent variables". What benefit do they add over relations? You can add a simple function to fetch a tuple if you want it not to look like a subquery. Do it with heap access in C if you like, save the (trivial) planning costs.

I do see value to two different things discussed here:

* Pavel's proposal for persistent-declaration, non-persistent-value session variables with access control. These will be very beneficial with RLS, where we need very fast lookups. Their purpose is that you can set up expensive state with SECURITY DEFINER functions, C-level functions, etc, then test it very cheaply later from RLS and from other functions. Their advantage over relations is very cheap, fast access.

I can maybe see global temporary relations being an alternative to these, if we optimise by using a tuplestore to back them and only switch to a relfilenode if the relation grows. The pg_catalog entries would be persistent so you could GRANT or REVOKE access to them, etc. Especially if we optimised the 1-row case this could work. It'd be less like what Oracle does, but might let us re-use more functionality and have fewer overlapping features. Pavel?

* Fabien's earlier mention of transient session / query variables, a-la MySQL or MS-SQL. They're obviously handy for more general purpose programming, but our strict division between SQL and plpgsql makes them a bit less useful than in MS-SQL with T-SQL. I think it's a very separate topic to this and should be dealt with in a separate thread if/when someone wants to work on them.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hello Craig,

> Fabien, I don't really see the point of "persistent variables". What
> benefit do they add over relations?

A relation is a set of values; a variable is a scalar with one value. It is always possible to declare a set and use it as a singleton, but somehow it seems cleaner to ask for what you want and have the database maintain the singleton property just like any other constraint.

Behind the scenes a "persistent variable" would probably be implemented as a row in a special table or some kind of one-row table... So there is no deep semantic difference, but mostly a syntactic one: you ask for a variable and you use it as a variable, i.e. there can be a simple, well-integrated syntax to get its value without having to "SELECT FROM" or resort to functions.

> You can add a simple function to fetch a tuple if you want it not to
> look like a subquery.

ISTM that if there are some kind of (persistent/session/...) variables, there should be a simple direct way of getting a variable's value, like @var or &var or whatever. If one must write pg_get_variable_value('var')::ZZZ, it somehow defeats the purpose, as "(SELECT var FROM some_table)" is shorter.

> I do see value to two different things discussed here:
>
> * Pavel's proposal for persistent-declaration, non-persistent-value
> session variables with access control. [...]

Yep, that is one. I missed the half-persistence property at the beginning...

> * Fabien's earlier mention of transient session / query variables, a-la
> [...] I think it's a very separate topic to this and should be dealt
> with in a separate thread if/when someone wants to work on them.

Yes and no: ISTM that at least a global design should be discussed *before* some kind of special-case variables (session-alive, persistent-in-existence-but-not-in-value, not-transactional, subject-to-permissions, not-subject-to-constraints...) are introduced, so that the special case does not preclude the possible future existence of other types of variables.

Then I would be more at ease with having a special case implemented first, knowing that others may come and fit neatly, both semantically and syntactically.

I'm bothered by the half-persistence proposed, because it interferes both with possible session (light-weight, only in memory) and persistent (heavy-weight, in catalog) variables.

Also, I'm not yet convinced that simple privatizable transient/session variables would not be enough to fit the use case, so that for the same price there would be session variables for all, not only special ones with permissions.

--
Fabien.
2016-12-28 14:19 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Hello Craig,
>
>> Fabien, I don't really see the point of "persistent variables". What
>> benefit do they add over relations?
>
> A relation is a set of values, a variable is a scalar with one value.
>
> It is always possible to declare a set and use it as a singleton, but somehow it seems cleaner to ask for what you want and have the database maintain the singleton property just like any other constraint.
>
> Behind the scene a "persistent variable" would probably be implemented as a row in a special table or some kind of one-row table... So there is no deep semantical difference, but mostly a syntactic one: you ask for a variable and you use it as a variable, i.e. there can be a simple well integrated syntax to get its value without having to "SELECT FROM" or resorting to functions.
>
>> You can add a simple function to fetch a tuple if you want it not to
>> look like a subquery.
>
> ISTM that if there are some kind of (persistent/session/...) variables, there should be a simple direct way of getting its value, like @var or &var or whatever. If one must write pg_get_variable_value('var')::ZZZ, it somehow defeats the purpose, as "(SELECT var FROM some_table)" is shorter.
Just a note - the getter function returns a typed value, so no additional casting is necessary.
>> I do see value to two different things discussed here:
>>
>> * Pavel's proposal for persistent-declaration, non-persistent-value
>> session variables with access control. [...]
>
> Yep, that is one. I missed the half-persistence property at the beginning...
>
>> * Fabien's earlier mention of transient session / query variables, a-la
>> [...] I think it's a very separate topic to this and should be dealt
>> with in a separate thread if/when someone wants to work on them.
>
> Yes and no: ISTM that at least a global design should be discussed *before* some kind of special-case variables (session-alive, persistent-in-existence-but-not-in-value, not-transactional, subject-to-permissions, not-subject-to-constraints...) are introduced, so that the special case does not preclude the possible future existence of other types of variables.
>
> Then I would be more at ease with having a special case implemented first, knowing that others may come and fit neatly, both semantically and syntaxically.
>
> I'm bothered by the half-persistence proposed, because it interferes both with possible session (light-weight, only in memory) and persistent (heavy-weight, in catalog) variables.
>
> Also, I'm not yet convinced that simple privatizable transcient/session variables would not be enough to fit the use case, so that for the same price there would be session variables for all, not only special ones with permissions.
--
Fabien.
On 28 December 2016 at 21:19, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> Also, I'm not yet convinced that simple privatizable transcient/session
> variables would not be enough to fit the use case, so that for the same
> price there would be session variables for all, not only special ones with
> permissions.

Since, unlike Oracle, we don't have compiled packages or plan-caching above the session level, there's not the same hard requirement for the variable definition to be persistent.

So... maybe? The main question then becomes how you integrate access control.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2016-12-28 4:40 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
> Fabien, I don't really see the point of "persistent variables". What
> benefit do they add over relations?
>
> You can add a simple function to fetch a tuple if you want it not to
> look like a subquery. Do it with heap access in C if you like, save
> the (trivial) planning costs.
>
> I do see value to two different things discussed here:
>
> * Pavel's proposal for persistent-declaration, non-persistent-value
> session variables with access control. These will be very beneficial
> with RLS, where we need very fast lookups. Their purpose is that you
> can set up expensive state with SECURITY DEFINER functions, C-level
> functions, etc, then test it very cheaply later from RLS and from
> other functions. Their advantage over relations is very cheap, fast
> access.
>
> I can maybe see global temporary relations being an alternative to
> these, if we optimise by using a tuplestore to back them and only
> switch to a relfilenode if the relation grows. The pg_catalog entries
> would be persistent so you could GRANT or REVOKE access to them, etc.
> Especially if we optimised the 1-row case this could work. It'd be
> less like what Oracle does, but might let us re-use more functionality
> and have fewer overlapping features. Pavel?
This is a hard question - I have different expectations from variables than from tables. Sure, there could be an optimization for one-row global temporary tables, but it fails on the first update. And with this optimization we could not use the current heap access functionality - or we would increase the complexity of already quite complex code. From a successful global temp tables implementation I expect an unbloated catalogue plus all relation-related functionality - indexes, column statistics, usage statistics, MVCC. I can't imagine a one-row optimization there. Moreover it has a limit - it is pretty hard to implement expandable types there.

I see it as very useful to define variables as memory-based, NOT ACID, non-MVCC storage (always), while tables are ACI(D) and MVCC. Minimalism is great, but it has to have a practical limit - we have variables inside/outside plpgsql, although we could use temporary tables instead.

There is a secondary question of whether we can introduce some better interface for writing getter/setter functions, where current relations could be used.
> * Fabien's earlier mention of transient session / query variables,
> a-la MySQL or MS-SQL. They're obviously handy for more general purpose
> programming, but our strict division between SQL and plpgsql makes
> them a bit less useful than in MS-SQL with T-SQL. I think it's a very
> separate topic to this and should be dealt with in a separate thread
> if/when someone wants to work on them.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2016-12-28 15:00 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 28 December 2016 at 21:19, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Also, I'm not yet convinced that simple privatizable transcient/session
> variables would not be enough to fit the use case, so that for the same
> price there would be session variables for all, not only special ones with
> permissions.
> Since, unlike Oracle, we don't have compiled packages or plan-caching
> above the session level, there's not the same hard requirement for the
> variable definition to be persistent.
>
> So... maybe? The main question then becomes how you integrate access control.
For security, the variable should be persistent.

If you want to do static analysis (which you usually do), then the variable should be persistent.

Currently a big issue for plpgsql_check is working with temporary tables. Local objects or dynamic SQL are a stop for static checking.
Regards
Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
>> Also, I'm not yet convinced that simple privatizable transcient/session
>> variables would not be enough to fit the use case, [...]
>
> So... maybe? The main question then becomes how you integrate access control.

For what it's worth, permissions on persistent functions could be used to control access to private-to-a-role transient/session variables, see:

https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg300651.html

The good point is that the implementation on top of session variables would be trivial and very efficient: just prefix the variable with the owner id in the key-value storage. The bad point is that there is no full-featured GRANT on the variable itself, and it looks ad hoc, somehow.

--
Fabien.
> For security the variable should be persistent.

Why should they? If it is a session variable, being created when needed or used with the right type could be enough?

> If you would to do statical analyse (what you usually would), then variable
> should be persistent.

I do not understand what static analysis you would need/want to do on session variables.

> Currently the big issue of plpgsql_check is work with temporary tables.

Do you mean that temporary tables are too slow/costly?

> Local objects or dynamic sql is stop for static check.

Hmm. If something is dynamic, it is not static, but I do not understand your point.

--
Fabien.
2016-12-28 15:38 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>> For security the variable should be persistent.
>
> Why should they? If it is a session variable, being created when needed or used with the right type could be enough?
You cannot trust some fuzzy object - or you have to play a hard game with securing the content - hashing, encoding, decoding - which is slow and CPU intensive.
>> If you would to do statical analyse (what you usually would), then variable
>> should be persistent.
>
> I do not understand what static analysis you would need/want to do on session variables.
>
>> Currently the big issue of plpgsql_check is work with temporary tables.
>
> Do you mean that temporary tables are too slow/costly?
No, I mean that when you use temporary tables inside plpgsql functions, static analysis like plpgsql_check is almost impossible.
>> Local objects or dynamic sql is stop for static check.
>
> Hmm. If something is dynamic, it is not static, but I do not understand your point.
I cannot speak for you, but a lot of people prefer static analysis of code. Static analysis can only be done on static (persistent) metadata. You cannot do it with dynamic objects (not fixed in a schema).
regards
Pavel
--
Fabien.
On 28 December 2016 at 22:04, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> For security the variable should be persistent.
>
> If you would to do statical analyse (what you usually would), then variable
> should be persistent.
>
> Currently the big issue of plpgsql_check is work with temporary tables.
> Local objects or dynamic sql is stop for static check.

Someone asked me off-list what use cases such a thing would have, since it seems not to be spelled out very clearly in this discussion. I think we're all assuming knowledge here. So:

* Session starts
* app does SELECT setup_user('user-auth-key-data', 'some-other-blob')
** setup_user is SECURITY DEFINER to 'appadmin'
** 'appadmin' owns a variable IS_AUDITOR. Other roles have only read access to it.
** setup_user(...) does whatever expensive/slow work it has to do
** setup_user sets USER_IS_AUDITOR var
* Later RLS policies simply reference USER_IS_AUDITOR var. They don't need to know the 'user-auth-key-data', or do whatever expensive processing that it does.
* Other later triggers, etc, also reference USER_IS_AUDITOR
* User cannot make themselves an auditor by SETting USER_IS_AUDITOR

That's the general idea.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
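The steps above could be sketched in the proposed syntax. Everything here is hypothetical: CREATE VARIABLE and GRANT ... ON VARIABLE come from the proposal at the top of the thread, while the SET VARIABLE setter and the direct reference in the policy are placeholder syntax (the thread has not settled on direct references versus getter functions); schema, role and table names are illustrative only:

```sql
-- done once by appadmin: the metadata is persistent, the value is per-session
CREATE VARIABLE secure.is_auditor boolean DEFAULT false;
REVOKE ALL ON VARIABLE secure.is_auditor FROM PUBLIC;
GRANT SELECT ON VARIABLE secure.is_auditor TO app_user;

-- SECURITY DEFINER, so the body runs as appadmin, which may write the variable
CREATE FUNCTION setup_user(auth_key text, blob text) RETURNS void
SECURITY DEFINER LANGUAGE plpgsql AS $$
BEGIN
  -- ... whatever expensive/slow validation of auth_key happens here ...
  SET VARIABLE secure.is_auditor = true;  -- placeholder setter syntax
END;
$$;

-- later, RLS policies and triggers just read the variable cheaply;
-- the session user cannot set it, having only SELECT rights
CREATE POLICY auditor_read ON audit_log
  FOR SELECT USING (secure.is_auditor);
```

The access-control point is the whole mechanism: the expensive check runs once behind a SECURITY DEFINER boundary, and everything downstream reads a value the session user cannot forge.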
Hello,

>> Why should they? If it is a session variable, being created when needed or
>> used with the right type could be enough?
>
> You cannot to trust some fuzzy object - or you have to play hard game with
> securing content - hashing, coding, decoding - it is slow, cpu intensive

I'm afraid that I do not understand. I do not think that a hash get in memory is particularly costly. I do not see how you could do less than that to manage variables with associated values.

>> Currently the big issue of plpgsql_check is work with temporary tables.
>
> No, I mean so when you use temporary tables inside plpgsql functions, then
> the static analyze like plpgsql check is almost impossible.
>
> I cannot to speak instead you, but lot of people prefer static analyze of
> code.

Hmmm... I know a little bit about that field. ISTM that you are speaking of the current capabilities of a particular static analysis tool, but I'm not sure that the tool's capabilities could not be enhanced to manage more things.

> The static analyze can be done only on static (persistent metadata).

A session variable is a bit like a global temporary variable. A static analysis tool could take a global temporary variable into account.

> You cannot do it with dynamic (unfixed in schema) objects.

The private session variables I suggested have a fixed (static) name, and their type could be inferred by a static analysis tool, eg:

  ...
  DECLARE @foo BOOLEAN PRIVATE;
  -- I know that there is a private boolean variable "@foo" of unknown value
  SET @foo = TRUE;
  -- I know that @foo is true...
  ...

--
Fabien.
2016-12-28 16:42 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Hello,
>
>>> Why should they? If it is a session variable, being created when needed or
>>> used with the right type could be enough?
>>
>> You cannot to trust some fuzzy object - or you have to play hard game with
>> securing content - hashing, coding, decoding - it is slow, cpu intensive
>
> I'm afraid that I do not understand. I do not think that a hash get in memory is particularly costly. I do not see how you could do less than that to manage variables with associated values.
>
>>> Currently the big issue of plpgsql_check is work with temporary tables.
>>
>> No, I mean so when you use temporary tables inside plpgsql functions, then
>> the static analyze like plpgsql check is almost impossible.
>>
>> I cannot to speak instead you, but lot of people prefer static analyze of
>> code.
>
> Hmmm... I know a little bit about that field, ISTM that you are speaking of the current capabilities of a particular static analysis tool, but I'm not sure that the tool capabilities could not be enhanced to manage more things.
It cannot be - static analysis is limited to function scope; in static analysis you don't know the order of calls.
>> The static analyze can be done only on static (persistent metadata).
>
> A session variable is a bit like a global temporary variable. A static analysis tool could take into account a global temporary variable.
>
>> You cannot do it with dynamic (unfixed in schema) objects.
>
> The private session variables I suggested have a fixed (static) name, and their type could be infered by a static analysis tool, eg:
>
>   ...
>   DECLARE @foo BOOLEAN PRIVATE;
>   -- I know that there is private a boolean variable "@foo" of unknown value
>   SET @foo = TRUE;
>   -- I know that @foo is true...
>   ...
This is not too friendly - you have to repeat the DECLARE in every function. What if somebody somewhere writes @fooo, or uses DECLARE @foo integer instead? There is a big space for errors.
Regards
Pavel
--
Fabien.
My 0.02€ to try to illustrate a possible private-session-variable-based implementation for this use case:

> * Session starts

  \c app

> * app does SELECT setup_user('user-auth-key-data', 'some-other-blob')

  SELECT setup_user('fjshdfjkshfjks', 'jklfsjfklsjfk');

> ** setup_user is SECURITY DEFINER to 'appadmin'

  -- appadmin did:
  CREATE FUNCTION setup_user(TEXT, TEXT)
  RETURNS BOOLEAN SECURITY DEFINER AS $$

> ** 'appadmin' owns a variable IS_AUDITOR. Other roles have only read
> access to it.

Not sure how it is used afterwards... is it the same as USER_IS_AUDITOR?

> ** setup_user(...) does whatever expensive/slow work it has to do

  ... checks, updates, whatever

> ** setup_user sets USER_IS_AUDITOR var

  -- declare a private session variable
  DECLARE @user_is_auditor BOOLEAN PRIVATE;
  -- set its value to whatever appropriate
  SET @user_is_auditor = ???;
  -- return its value
  RETURN @user_is_auditor;
  $$ LANGUAGE xxx;

> * Later RLS policies simply reference USER_IS_AUDITOR var. They don't
> need to know the 'user-auth-key-data', or do whatever expensive
> processing that it does.

  -- appadmin did:
  CREATE FUNCTION isUserAuditor()
  RETURNS BOOLEAN SECURITY DEFINER AS $$
  -- say the variable is just confirmed if it exists already in the session?
  DECLARE @user_is_auditor BOOLEAN PRIVATE;
  RETURN @user_is_auditor;
  $$ LANGUAGE xxx;

> * Other later triggers, etc, also reference USER_IS_AUDITOR

The variable is not directly referenced; one would have to call isUserAuditor() to access the private session value, but then you can GRANT/REVOKE whatever you want on the access function.

> * User cannot make themselves an auditor by SETting USER_IS_AUDITOR

Indeed, the user cannot access the private variable; only appadmin can, and probably root could.

The user could create its own private session variable @user_is_auditor, or a public session variable of the same name. Those would be distinct variables which would not influence isUserAuditor, which would use its own.

--
Fabien.
2016-12-28 16:57 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> My 0.02€ to try to illustrate a possible private session variable based implementation for this use case:
>
>> * Session starts
>
> \c app
>
>> * app does SELECT setup_user('user-auth-key-data', 'some-other-blob')
>
> SELECT setup_user('fjshdfjkshfjks', 'jklfsjfklsjfk');
>
>> ** setup_user is SECURITY DEFINER to 'appadmin'
>
> -- appadmin did:
> CREATE FUNCTION setup_user(TEXT, TEXT)
> RETURNS BOOLEAN SECURITY DEFINER AS $$
>
>> ** 'appadmin' owns a variable IS_AUDITOR. Other roles have only read
>> access to it.
>
> not sure how it is used afterwards... is it the same as USER_IS_AUDITOR?
>
>> ** setup_user(...) does whatever expensive/slow work it has to do
>
> ... checks, updates, whatever
>
>> ** setup_user sets USER_IS_AUDITOR var
>
> -- declare a private session variable
> DECLARE @user_is_auditor BOOLEAN PRIVATE;
> -- set its value to whatever appropriate
> SET @user_is_auditor = ???;
> -- return its value
> RETURN @user_is_auditor;
> $$ LANGUAGE xxx;
>
>> * Later RLS policies simply reference USER_IS_AUDITOR var. They don't
>> need to know the 'user-auth-key-data', or do whatever expensive
>> processing that it does.
>
> -- appadmin did:
> CREATE FUNCTION isUserAuditor()
> RETURNS BOOLEAN SECURITY DEFINER AS $$
> -- say the variable is just confirmed if it exists already in the session?
> DECLARE @user_is_auditor BOOLEAN PRIVATE;
> RETURN @user_is_auditor;
> $$ LANGUAGE xxx;
>
>> * Other later triggers, etc, also reference USER_IS_AUDITOR
>
> The variable is not directly referenced; one would have to call isUserAuditor() to access the private session value, but then you can GRANT/REVOKE whatever you want on the access function.
>
>> * User cannot make themselves an auditor by SETting USER_IS_AUDITOR
>
> Indeed, the user cannot access the private variable; only appadmin can, and probably root could.
>
> The user could create their own private session variable @user_is_auditor, or a public session variable of the same name. Those would be distinct variables which would not influence isUserAuditor, which would use its own.
So which is worse? I add one new entry in pg_class and one entry in pg_attribute. You wrote two entries into pg_proc - and moreover you have to ensure the consistency of these two functions.
Regards
Pavel
--
Fabien.
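To make the catalog-backed side of this comparison concrete, Pavel's approach for the same use case might look roughly like the following. The exact keywords are illustrative only - none of this DDL exists yet, and the schema and role names are placeholders:

```sql
-- One pg_class entry plus one pg_attribute entry describe the variable;
-- the metadata is transactional, the content is session-local.
CREATE VARIABLE app.user_is_auditor AS boolean DEFAULT false;

-- Access is governed by the ordinary ACL machinery, so only a
-- SECURITY DEFINER setup function owned by appadmin may write it,
-- while RLS policies and triggers may read it directly.
REVOKE ALL ON VARIABLE app.user_is_auditor FROM PUBLIC;
GRANT SELECT ON VARIABLE app.user_is_auditor TO app_user;
GRANT UPDATE ON VARIABLE app.user_is_auditor TO appadmin;
```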
On 12/28/16 9:57 AM, Fabien COELHO wrote:
>> * Other later triggers, etc, also reference USER_IS_AUDITOR
>
> The variable is not directly referenced, one would have to call
> isUserAuditor() to access the private session value, but then you can
> GRANT/REVOKE whatever you want on the access function.

Why force users to create Yet Another Function as a getter?

There's 2 big points that I think keep getting missed:

1) Variables would be completely non-transactional. The only way you can do that today is to use a "non-standard" language (such as plperl or plpython), or by creating a custom GUC (which is ugly because it necessitates changing postgresql.conf and is only text). A solution to this problem would be to provide a plpgsql equivalent to plperl or plpython's session hashes. I'm sure there are use cases that would be satisfied by simply doing that, but...

2) Variables provide permissions. Theoretically you could allow the hypothetical plpgsql session variables in (1) to be marked private, but that means you now have to keep all those variables on a per-role basis, users are forced to create accessor functions, and you run a serious risk of confusion from getting the function ownerships wrong. That certainly seems no better than defining permanent variables and giving them permissions (as Pavel suggested). More importantly, the security definer trick you're suggesting has a fatal flaw: you can't call one SECDEF function from another SECDEF function. So as soon as you have multiple privileged roles making use of variables, there's a serious risk of not being able to make use of these private variables at all.

Now maybe pg_class is absolutely the wrong place to store info about predefined variables, but that's an implementation detail, not a design flaw.

Some other points:

We should protect for the possibility of truly global (as in cross-session) variables. Presumably these would have to be pre-defined via DDL before use. These would be uniquely valuable as a means of communication between sessions that are connected to different databases. I could also see use in cross-database in-memory queues. AFAIK both of these would be pretty easy to do with the shared memory infrastructure we now have.

It would be nice if we could come up with a plan for what permanently defined temp tables looked like, so the syntax and operation was similar to the permanently defined session variables that Pavel is proposing. That said, given how long that has been an open issue, I think it's completely unfair to stonewall this feature if we can't get permanent temp tables figured out.

While permanent temp tables would eliminate some objections to storing "session variables", the fact still remains that any kind of table would still be MVCC, and that is NOT always what you want. It would be nice if whatever syntax was decided for defined session variables allowed room for "variables" that were actually MVCC, because sometimes that actually is what you want. Yes, you could simulate the same thing with functions, but why make users do all that work if we could easily provide the same functionality? These should probably be called something other than "variables", but presumably all the other syntax and settings could be the same. Again, it's not the job of this proposal to boil that ocean, but it would be nice to leave the option open.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Hello Jim,

> 1) Variables would be completely non-transactional. [...] A solution to
> this problem would be to provide a plpgsql equivalent to plperl or
> plpython's session hashes.

That is what I have in mind with "session variables" à la MS/MY SQL, but at the SQL level, not PL/pgSQL level.

> 2) Variables provide permissions. Theoretically you could allow the
> hypothetical plpgsql session variables in (1) to be marked private, but
> that means you now have to keep all those variables on a per-role basis,

No, I think a common "session hash" could be used, with some basic prefix convention; there is no need for a per-role hash.

> users are forced to create accessor functions,

Yes, if the private variable should be accessed. If the variable is private, then it is private and nothing is needed. Idem for public.

> and you run a serious risk of confusion from getting the function
> ownerships wrong.

One can get the permissions on a special session variable wrong as well... I do not see how it differs.

> That certainly seems no better than defining permanent variables and
> giving them permissions (as Pavel suggested).

As far as permissions are concerned, ISTM that it is the same. The point is that there would be only the usual fast "session variables", like MS/MY/Oracle SQL, and they could be used for Pavel's rather special use case as well.

> More importantly, the security definer trick you're suggesting has a
> fatal flaw: you can't call one SECDEF function from another SECDEF
> function.

I do not see why there would be such a restriction?

postgres@db> CREATE FUNCTION secfunc() RETURNS TEXT SECURITY DEFINER
  AS $$ SELECT CURRENT_USER::TEXT; $$ LANGUAGE SQL;

fabien@db> CREATE FUNCTION secfunc2() RETURNS TEXT SECURITY DEFINER
  AS $$ SELECT secfunc() || ' - ' || CURRENT_USER; $$ LANGUAGE SQL;

*@db> SELECT secfunc2(); -- returns: "postgres - fabien" from both sides...

> So as soon as you have multiple privileged roles making use of
> variables, there's a serious risk of not being able to make use of these
> private variables at all.

?

> Now maybe pg_class is absolutely the wrong place to store info about
> predefined variables, but that's an implementation detail, not a design
> flaw.

Yes, but that is not my only point.

> We should protect for the possibility of truly global (as in cross-session)
> variables.

Yes, why not... Although having "cross-session session variables" seems to create some problems of its own... Would they be cross-database as well?

> Presumably these would have to be pre-defined via DDL before use.

Probably.

> These would be uniquely valuable as a means of communication between sessions
> that are connected to different databases.

Why not.

> I could also see use in cross-database in-memory queues. AFAIK both of
> these would be pretty easy to do with the shared memory infrastructure
> we now have.

Yep, that would mean keeping some "session hash" in shared buffers.

> It would be nice if we could come up with a plan for what permanently defined
> temp tables looked like, so the syntax and operation was similar to the
> permanently defined session variables that Pavel is proposing.

Yes. And basic session variables as well.

> That said, given how long that has been an open issue I think it's
> completely unfair to stonewall this feature if we can't get permanent
> temp tables figured out.

...

> While permanent temp tables would eliminate some objections to store "session
> variables", the fact still remains that any kind of table would still be
> MVCC, and that is NOT always what you want.

Yes, I understood that. But sometimes it may be what is wanted.

> It would be nice if whatever syntax was decided for defined session variables
> allowed room for "variables" that were actually MVCC, because sometimes that
> actually is what you want.

Yep, that is what I'm arguing about: having a global variable design, and then maybe just a special case implemented inside this design.

> Yes, you could simulate the same thing with functions, but why make
> users do all that work if we could easily provide the same
> functionality?

The "easily" is unclear. E.g. if specially declared, permissioned, partially persistent session variables preclude future basic session variables, or their efficiency, or their syntax, it would be a problem. Hence the point of discussing before proceeding.

> These should probably be called something other than
> "variables", but presumably all the other syntax and settings could be
> the same. Again, it's not the job of this proposal to boil that ocean,
> but it would be nice to leave the option open.

Yep.

--
Fabien.
2016-12-28 17:53 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
> On 12/28/16 9:57 AM, Fabien COELHO wrote:
>>> * Other later triggers, etc, also reference USER_IS_AUDITOR
>>
>> The variable is not directly referenced, one would have to call
>> isUserAuditor() to access the private session value, but then you can
>> GRANT/REVOKE whatever you want on the access function.
>
> Why force users to create Yet Another Function as a getter?
>
> There's 2 big points that I think keep getting missed:
>
> 1) Variables would be completely non-transactional. The only way you can do that today is to use a "non-standard" language (such as plperl or plpython), or by creating a custom GUC (which is ugly because it necessitates changing postgresql.conf and is only text). A solution to this problem would be to provide a plpgsql equivalent to plperl or plpython's session hashes. I'm sure there are use cases that would be satisfied by simply doing that, but...
>
> 2) Variables provide permissions. Theoretically you could allow the hypothetical plpgsql session variables in (1) to be marked private, but that means you now have to keep all those variables on a per-role basis, users are forced to create accessor functions, and you run a serious risk of confusion from getting the function ownerships wrong. That certainly seems no better than defining permanent variables and giving them permissions (as Pavel suggested). More importantly, the security definer trick you're suggesting has a fatal flaw: you can't call one SECDEF function from another SECDEF function. So as soon as you have multiple privileged roles making use of variables, there's a serious risk of not being able to make use of these private variables at all.
>
> Now maybe pg_class is absolutely the wrong place to store info about predefined variables, but that's an implementation detail, not a design flaw.
We can talk about the implementation. For me, a variable is an object that holds data. Moreover, I would like to anchor this object in a schema without possible collisions with tables. I plan to support SELECT for access and UPDATE for changes in the future, so using pg_class is natural. On the other hand, a lot of fields in pg_class are not used for variables at all. I would like to fix variables in some schema, but I do not want to have to solve possible collisions between variables and other SQL objects, due to possible direct access inside SQL statements in the future.
> Some other points:
>
> We should protect for the possibility of truly global (as in cross-session) variables. Presumably these would have to be pre-defined via DDL before use. These would be uniquely valuable as a means of communication between sessions that are connected to different databases. I could also see use in cross-database in-memory queues. AFAIK both of these would be pretty easy to do with the shared memory infrastructure we now have.
I didn't write anything that would close off this possibility.
> It would be nice if we could come up with a plan for what permanently defined temp tables looked like, so the syntax and operation was similar to the permanently defined session variables that Pavel is proposing. That said, given how long that has been an open issue I think it's completely unfair to stonewall this feature if we can't get permanent temp tables figured out.
>
> While permanent temp tables would eliminate some objections to store "session variables", the fact still remains that any kind of table would still be MVCC, and that is NOT always what you want.
>
> It would be nice if whatever syntax was decided for defined session variables allowed room for "variables" that were actually MVCC, because sometimes that actually is what you want. Yes, you could simulate the same thing with functions, but why make users do all that work if we could easily provide the same functionality? These should probably be called something other than "variables", but presumably all the other syntax and settings could be the same. Again, it's not the job of this proposal to boil that ocean, but it would be nice to leave the option open.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
On 12/28/16 11:29 AM, Fabien COELHO wrote:
> Hello Jim,
>
>> 1) Variables would be completely non-transactional. [...] A solution
>> to this problem would be to provide a plpgsql equivalent to plperl or
>> plpython's session hashes.
>
> That is what I have in mind with "session variables" à la MS/MY SQL, but
> at the SQL level, not PL/pgSQL level.

I'm just saying there might be use for a plpgsql equivalent to the session hashes that other PLs provide, but that's a different issue.

>> users are forced to create accessor functions,
>
> Yes if the private variable should be accessed. If the variable is
> private, then it is private and nothing is needed. Idem for public.

Why force the extra busywork? Just allow for them to be public.

For that matter, if we're going to start introducing private objects, that certainly needs to be thought through.

>> and you run a serious risk of confusion from getting the function
>> ownerships wrong.
>
> One can get the permissions on special session variable wrong as well...
> I do not see how it differs.

It's a lot harder to mess up an explicit grant than it is to mess up object ownership.

>> More importantly, the security definer trick you're suggesting has a
>> fatal flaw: you can't call one SECDEF function from another SECDEF
>> function.
>
> I do not see why there would be such a restriction?
>
> postgres@db> CREATE FUNCTION secfunc() RETURNS TEXT SECURITY DEFINER
>   AS $$ SELECT CURRENT_USER::TEXT; $$ LANGUAGE SQL;
>
> fabien@db> CREATE FUNCTION secfunc2() RETURNS TEXT SECURITY DEFINER
>   AS $$ SELECT secfunc() || ' - ' || CURRENT_USER; $$ LANGUAGE SQL;
>
> *@db> SELECT secfunc2(); -- returns: "postgres - fabien" from both sides...

Perhaps I've got the restrictions on SECDEF wrong, but I know there's problems with it. Certainly one issue is you can't change roles back to the calling user. Maybe they would be workable in this case, but it's just a bunch of extra busywork for the user that serves no real purpose.

>> We should protect for the possibility of truly global (as in
>> cross-session) variables.
>
> Yes, why not... Although having "cross-session session variable" seems
> to create some problems of its own... Would they be cross-database as well?

Yes. It'd be a shared catalog.

...

>> Yes, you could simulate the same thing with functions, but why make
>> users do all that work if we could easily provide the same functionality?
>
> The "easily" is unclear. E.g. if specially declared, permissioned, partially
> persistent session variables preclude future basic session variables, or
> their efficiency, or their syntax, it would be a problem. Hence the
> point of discussing before proceeding.

Then IMHO what needs to happen is to have a discussion on actual syntax instead of calling into question the value of the feature. Following this thread has been very painful because the communications have not been very clear. Focus on grammar would probably be a big improvement in that regard.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 12/28/2016 10:17 AM, Jim Nasby wrote:
> Then IMHO what needs to happen is to have a discussion on actual syntax
> instead of calling into question the value of the feature. Following
> this thread has been very painful because the communications have not
> been very clear. Focus on grammar would probably be a big improvement in
> that regard.

+1

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
2016-12-28 19:17 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
> On 12/28/16 11:29 AM, Fabien COELHO wrote:
>> Hello Jim,
>>
>>> 1) Variables would be completely non-transactional. [...] A solution
>>> to this problem would be to provide a plpgsql equivalent to plperl or
>>> plpython's session hashes.
>>
>> That is what I have in mind with "session variables" à la MS/MY SQL, but
>> at the SQL level, not PL/pgSQL level.
>
> I'm just saying there might be use for a plpgsql equivalent to the session hashes that other PLs provide, but that's a different issue.
>
>>> users are forced to create accessor functions,
>>
>> Yes if the private variable should be accessed. If the variable is
>> private, then it is private and nothing is needed. Idem for public.
>
> Why force the extra busywork? Just allow for them to be public.
>
> For that matter, if we're going to start introducing private objects, that certainly needs to be thought through.
>
>>> and you run a serious risk of confusion from getting the function
>>> ownerships wrong.
>>
>> One can get the permissions on special session variable wrong as well...
>> I do not see how it differs.
>
> It's a lot harder to mess up an explicit grant than it is to mess up object ownership.
>
>>> More importantly, the security definer trick you're suggesting has a
>>> fatal flaw: you can't call one SECDEF function from another SECDEF
>>> function.
>>
>> I do not see why there would be such a restriction?
>>
>> postgres@db> CREATE FUNCTION secfunc() RETURNS TEXT SECURITY DEFINER
>>   AS $$ SELECT CURRENT_USER::TEXT; $$ LANGUAGE SQL;
>>
>> fabien@db> CREATE FUNCTION secfunc2() RETURNS TEXT SECURITY DEFINER
>>   AS $$ SELECT secfunc() || ' - ' || CURRENT_USER; $$ LANGUAGE SQL;
>>
>> *@db> SELECT secfunc2(); -- returns: "postgres - fabien" from both sides...
>
> Perhaps I've got the restrictions on SECDEF wrong, but I know there's problems with it. Certainly one issue is you can't change roles back to the calling user.
>
> Maybe they would be workable in this case, but it's just a bunch of extra busywork for the user that serves no real purpose.
>
>>> We should protect for the possibility of truly global (as in
>>> cross-session) variables.
>>
>> Yes, why not... Although having "cross-session session variable" seems
>> to create some problems of its own... Would they be cross-database as well?
>
> Yes. It'd be a shared catalog.
>
> ...
>
>>> Yes, you could simulate the same thing with functions, but why make
>>> users do all that work if we could easily provide the same functionality?
>>
>> The "easily" is unclear. E.g. if specially declared, permissioned, partially
>> persistent session variables preclude future basic session variables, or
>> their efficiency, or their syntax, it would be a problem. Hence the
>> point of discussing before proceeding.
>
> Then IMHO what needs to happen is to have a discussion on actual syntax instead of calling into question the value of the feature. Following this thread has been very painful because the communications have not been very clear. Focus on grammar would probably be a big improvement in that regard.
I don't think so. There are some significant questions:

1. Should "variables" be fixed in a schema? Should the metadata be persistent?

2. Should we use GRANT/REVOKE statements for "variables"?

3. Should a "variable" name be unique within a schema, like tables, indexes and sequences?

4. Should the content of "variables" be non-transactional or transactional?

5. Which modes should we support, and which should be the default: "unshared" or "shared"?

That is all. The whole discussion is about these questions. They create a multidimensional space that can be covered, but we cannot implement everything in one stage.
We can have schema-unbound variables declared by

DECLARE @xx AS int, and accessed with get('@xx') and set('@xx', val)

and schema-bound variables created by

CREATE VARIABLE public.xx AS int, and accessed with get('public.xx'), set('public.xx', val)

We can have both; I don't see any problem. Implementing the DECLARE statement doesn't eliminate implementing the CREATE statement. I can understand that somebody who doesn't need secure variables doesn't need the CREATE statement. But if you need secure variables, then a PRIVATE flag is, at a minimum, redundant with our standard security design.

My proposal doesn't eliminate Fabien's proposal. Fabien's proposal can be good enough for Fabien and for interactive work, but it is not good enough for some checking and security usage.
Regards
Pavel
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
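Side by side, the two styles Pavel distinguishes above might look as follows. This is proposed syntax only; the get/set accessors are the placeholders from the mail, not an existing API:

```sql
-- Schema-unbound, declared per session, no persistent metadata:
DECLARE @xx AS int;
SELECT set('@xx', 42);
SELECT get('@xx');

-- Schema-bound, created once with persistent metadata, so it can be
-- secured with GRANT/REVOKE like any other catalog object:
CREATE VARIABLE public.xx AS int;
SELECT set('public.xx', 42);
SELECT get('public.xx');
```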
Hello Jim,

>> Yes if the private variable should be accessed. If the variable is
>> private, then it is private and nothing is needed. Idem for public.
>
> Why force the extra busywork? Just allow for them to be public.

There is no extra busywork; having possibly-private session variables costs nearly nothing, and I'm arguing that they could be enough for Pavel's special use case.

> For that matter, if we're going to start introducing private objects, that
> certainly needs to be thought through.

Yep. It is a discussion.

>> One can get the permissions on special session variable wrong as well...
>> I do not see how it differs.
>
> It's a lot harder to mess up an explicit grant than it is to mess up object
> ownership.

ISTM that it rather depends on the default permissions on the object... If a variable object is publicly accessible by default, then you have to take care about revoke & grant and you can mess up with it. Moreover, it also suggests that session variables could be private by default.

>> *@db> SELECT secfunc2(); -- returns: "postgres - fabien" from both sides...
>
> Perhaps I've got the restrictions on SECDEF wrong, but I know there's
> problems with it. Certainly one issue is you can't change roles back to the
> calling user.

It is the same with suid executables on Unix; I do not see it as an issue.

> Maybe they would be workable in this case, but it's just a bunch of extra
> busywork for the user that serves no real purpose.

Pavel's special case is the purpose.

> Then IMHO what needs to happen is to have a discussion on actual syntax
> instead of calling into question the value of the feature.

I think that the discussion should be *both* about syntax and semantics.

> Following this thread has been very painful because the communications
> have not been very clear.

Yep. The fact that I initially missed the "half persistence" property of Pavel's design has not helped.

I suggest we move the discussion to the following wiki page that I have just bootstrapped:

https://wiki.postgresql.org/wiki/Variable_Design

--
Fabien
Hello Pavel,

>> Hmmm... I know a little bit about that field, ISTM that you are speaking
>> of the current capabilities of a particular static analysis tool, but I'm
>> not sure that the tool capabilities could not be enhanced to manage more
>> things.
>
> It cannot be - the static analyze is limited to function scope - in static
> analyze you don't know a order of calls.

I have been doing interprocedural static analysis for the last 25 years, and I can assure you that those techniques are not limited to the scope of a function. As for global variables, I agree that you may prove more things about them if you know the order of calls.

>> The private session variables I suggested have a fixed (static) name, and
>> their type could be inferred by a static analysis tool, eg:
>>
>>   ...
>>   DECLARE @foo BOOLEAN PRIVATE;
>>   -- I know that there is a private boolean variable "@foo" of unknown value
>>   SET @foo = TRUE;
>>   -- I know that @foo is true...
>>   ...
>
> This is not too friendly

Friendly is subjective. ISTM that it gets the job done with minimal syntax and implementation. It avoids getter/setter functions, which are unfriendly to me.

> - you have to repeat DECLARE in every function.

That is the same in nearly all programming languages: you have to declare external variables somehow before using them, that is life. Some declarations could be avoided if an unknown variable is assumed to have the untyped value NULL by default.

> What if somebody somewhere writes @fooo

NULL? Unknown variable error?

> or uses DECLARE @foo integer instead.

It would not matter if it is someone else, because @foo would be their private version. If it is yourself, an error could be raised if a session variable is found to be declared with two distinct types. A static analysis tool would be able to detect that as well.

> There is big space for errors.

Whatever the features and syntax, you can always shoot yourself in the foot.

I have opened a wiki page to help with this discussion:

https://wiki.postgresql.org/wiki/Variable_Design

--
Fabien.
2016-12-29 9:36 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Hello Pavel,
>
>>> Hmmm... I know a little bit about that field, ISTM that you are speaking
>>> of the current capabilities of a particular static analysis tool, but I'm
>>> not sure that the tool capabilities could not be enhanced to manage more
>>> things.
>>
>> It cannot be - the static analyze is limited to function scope - in static
>> analyze you don't know a order of calls.
>
> I have been doing interprocedural static analysis for the last 25 years, and I can assure you that those techniques are not limited to the scope of a function. As for global variables, I agree that you may prove more things about them if you know the order of calls.
>
>>> The private session variables I suggested have a fixed (static) name, and
>>> their type could be inferred by a static analysis tool, eg:
>>>
>>>   ...
>>>   DECLARE @foo BOOLEAN PRIVATE;
>>>   -- I know that there is a private boolean variable "@foo" of unknown value
>>>   SET @foo = TRUE;
>>>   -- I know that @foo is true...
>>>   ...
>>
>> This is not too friendly
>
> Friendly is subjective. ISTM that it gets the job done with minimal syntax and implementation. It avoids getter/setter functions, which are unfriendly to me.
The getter/setter functions are the implementation only in the first step. I said that this step would not be the last.
>> - you have to repeat DECLARE in every function.
>
> That is the same in nearly all programming languages: you have to declare external variables somehow before using them, that is life. Some declarations could be avoided if an unknown variable is assumed to have the untyped value NULL by default.
>
>> What if somebody somewhere writes @fooo
>
> NULL? Unknown variable error?
>
>> or uses DECLARE @foo integer instead.
>
> It would not matter if it is someone else, because @foo would be their private version. If it is yourself, an error could be raised if a session variable is found to be declared with two distinct types. A static analysis tool would be able to detect that as well.
>
>> There is big space for errors.
>
> Whatever the features and syntax, you can always shoot yourself in the foot.
I disagree - some concepts are more robust, others less.
Regards
Pavel
> I have opened a wiki page to help with this discussion:
>
> https://wiki.postgresql.org/wiki/Variable_Design
>
> --
> Fabien.
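The error case discussed in this exchange, the same private variable declared with two distinct types in one session, could then be reported either at runtime or by a static analysis tool. A hypothetical illustration of the proposed DECLARE syntax:

```sql
-- hypothetical: second declaration with a conflicting type is an error
DECLARE @foo BOOLEAN PRIVATE;   -- ok: @foo is boolean for this session
DECLARE @foo INTEGER PRIVATE;   -- error: @foo already declared as boolean
```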
>> CREATE FUNCTION setup_user(TEXT, TEXT)
>> RETURNS BOOLEAN SECURITY DEFINER AS $$

>> CREATE FUNCTION isUserAuditor()
>> RETURNS BOOLEAN SECURITY DEFINER AS $$
>
> so what is worse - I did one new entry in pg_class and one entry in
> pg_attributes. You wrote two entries in pg_proc function - more you have to
> ensure consistency of these functions.

You are not comparing the same perimeter; the setup_user() function is necessary to both approaches for the described use case where a read-only value is needed.

With your approach:

1. CREATE VARIABLE secure_stuff SESSION SCOPE ...
2. REVOKE/GRANT ... on VARIABLE secure_stuff
3. CREATE FUNCTION setup_user(...)

With this approach:

1. CREATE FUNCTION access_secure_stuff(...)
2. REVOKE/GRANT ... on FUNCTION access_secure_stuff
3. CREATE FUNCTION setup_user(...)

The REVOKE/GRANT are basically the same on VARIABLE and on FUNCTION. So it is not really that different as far as catalog entry count is concerned.

The benefit is that it avoids a special concept and uses a more generic one, i.e. basic session variables. The added cost is that a two-line function must be written, which does not look like a big issue to implement a pretty special use case.

--
Fabien.
2016-12-29 9:46 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>>> CREATE FUNCTION setup_user(TEXT, TEXT)
>>> RETURNS BOOLEAN SECURITY DEFINER AS $$
>
>>> CREATE FUNCTION isUserAuditor()
>>> RETURNS BOOLEAN SECURITY DEFINER AS $$
>>
>> so what is worse - I did one new entry in pg_class and one entry in
>> pg_attributes. You wrote two entries in pg_proc function - more you have to
>> ensure consistency of these functions.
>
> You are not comparing the same perimeter; the setup_user() function is necessary to both approaches for the described use case where a read-only value is needed:
>
> With your approach:
>
> 1. CREATE VARIABLE secure_stuff SESSION SCOPE ...
> 2. REVOKE/GRANT ... on VARIABLE secure_stuff
> 3. CREATE FUNCTION setup_user(...)
>
> With this approach:
>
> 1. CREATE FUNCTION access_secure_stuff(...)
> 2. REVOKE/GRANT ... on FUNCTION access_secure_stuff
> 3. CREATE FUNCTION setup_user(...)
>
> The REVOKE/GRANT are basically the same on VARIABLE and on FUNCTION.
> So it is not really that different as far as catalog entry count is concerned.
>
> The benefit is that it avoids a special concept and uses a more generic one, i.e. basic session variables.
There is a big difference: your concept is missing any safe point. You have to specify the same information multiple times.

I am sorry, this discussion is going in a cycle; there is no sense in continuing.
Regards
Pavel
> The added cost is that a two-line function must be written, which does not look like a big issue to implement a pretty special use case.
>
> --
> Fabien.
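Spelled out, the function-encapsulation variant that Fabien lists as "this approach" might read as follows, assuming the private session variable syntax from his earlier sketch; access_secure_stuff and LANGUAGE xxx are placeholders, not existing objects:

```sql
-- appadmin did: the private variable is reachable only through this
-- SECURITY DEFINER getter, which is what GRANT/REVOKE then controls.
CREATE FUNCTION access_secure_stuff() RETURNS boolean
SECURITY DEFINER AS $$
  DECLARE @secure_stuff BOOLEAN PRIVATE;
  RETURN @secure_stuff;
$$ LANGUAGE xxx;

REVOKE ALL ON FUNCTION access_secure_stuff() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION access_secure_stuff() TO app_user;
```

The trade-off both sides circle around: the ACL lives on the function rather than on the variable itself, at the cost of keeping the getter and the setup function consistent by hand.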
Please, could you remove the part of the mail you are not responding to and just keep the relevant part?

>> Whatever the features and syntax, you can always shoot yourself in the
>> foot.
>
> I disagree

Hmmm... I have succeeded in shooting myself in the foot with possibly every feature of every language I have used. This is called experience... in the end you do know how NOT to do things :-)

> - some concepts are more robust, others less.

Sure. The use-case under discussion is about ONE session variable holding an expensive-to-compute security status which can be consulted by other functions. I think that one can probably get it right with both approaches, even if it is on the second try...

--
Fabien.
2016-12-29 9:57 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Please, could you remove the part of the mail you are not responding to and just keep the relevant part?
>
>>> Whatever the features and syntax, you can always shoot yourself in the foot.
>>
>> I disagree
>
> Hmmm... I have succeeded in shooting myself in the foot with possibly every feature of every language I have used. This is called experience... in the end you do know how NOT to do things:-)
>
>> - some concepts are more robust, other less.
>
> Sure. The use-case under discussion is about ONE session variable holding an expensive to compute security status which can be consulted by other functions. I think that probably one can have it right with both approaches, even if it is on the second try...
The robustness is not only about content, but about code maintenance, the possibility to quickly find errors, or better, not to make them at all.
--
Fabien.
> There is big difference - you concept missing any safe point. You have to specify same information more times.

Not necessarily, and if so maybe twice. I'm ok to recognize that it is a difference between both approaches, and an inconvenience of the one I'm proposing. I also see inconveniences in the other design, so there will not be any perfect solution, IMO. That is the point of discussing.

> I am sorry, this discussion is in cycle - there is no sense to continue.

If the only open option is to agree with your initial design, then obviously this is not a path for reaching a consensus.

Could you put your ideal (final) design proposition on the wiki page? That would avoid repeating the same cyclic arguments, they would be written only once...

--
Fabien.
2016-12-29 10:11 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>> There is big difference - you concept missing any safe point. You have to specify same information more times.
>
> Not necessarily, and if so maybe twice. I'm ok to recognize that it is a difference between both approaches, and an inconvenience of the one I'm proposing. I also see inconveniences in the other design, so there will not be any perfect solution, IMO. That is the point of discussing.
>
>> I am sorry, this discussion is in cycle - there is no sense to continue.
>
> If the only open option is to agree with your initial design, then obviously this is not a path for reaching a consensus.

There are two concepts - both can be implemented and used (even together). Both concepts have some advantages and some disadvantages. It is hard to expect full agreement, because everybody has different preferences.

I understand that your proposal may be more readable for you, but to me its design of usage and security does not look good. It is acceptable without PRIVATE and similar flags, but it is not designed to be secure. (MySQL has nothing similar; I don't know if MSSQL has something, but probably not.) Ok, we have different priorities. For you it is not usual that within one session the secure context is switched multiple times. It is usual for me, and for the applications that I write.

> Could you put your ideal (final) design proposition on the wiki page? That would avoid repeating the same cyclic arguments, they would be written only once...

Yes, I'll do it.
Regards
Pavel
--
Fabien.
2016-12-29 10:32 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
> 2016-12-29 10:11 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>
>>> There is big difference - you concept missing any safe point. You have to specify same information more times.
>>
>> Not necessarily, and if so maybe twice. I'm ok to recognize that it is a difference between both approaches, and an inconvenient of the one I'm proposing. There also see inconvenients to the other design as well, so there will not be any perfect solution, IMO. That is the point of discussing.
>>
>>> I am sorry, this discussion is in cycle - there is no sense to continue.
>>
>> If the only open option is to agree with your initial design, then obviously this is not a path for reaching a consensus.
>
> There are two concepts - both can be implemented, and used (can be used together). Both these concepts has some advantage and some disadvantages. It is hard to expect, so there is possible full agreement - because everybody has different preferences.
>
> I understand so for you can be your proposal more readable, but for me, your design of usage and security looks not well. It is acceptable without PRIVATE flags and similar flags. It is not designed be secure. (MySQL has nothing similar, I don't know if MSSQL has some, but probably not). Ok. We have different priorities. For you is not usual so in one session there can be more more times switch of secure context. It is usual for me, and for applications what I write.
>
>> Could you put your ideal (final) design proposition on the wiki page? That would avoid repeating the same cyclic arguments, they would be written only once...
>
> yes, I'll do it.
But I'll remove some strange ideas. Why persistent variables?
Please, one argument. We have tables. What is wrong on tables?
Anything what will be persistent will have similar performance like tables.
Regards
Pavel
--
Fabien.
Hello Pavel,

> There are two concepts - both can be implemented, and used (can be used together).

That is one point I would like to ascertain clearly and explicitly, so having the various designs side by side, e.g. in the wiki page, would help seeing if and where they interact.

The second point I am keen on discussing is how the proposed designs provide a solution to different use cases, and at what cost. I've added sections about use cases (I have listed 3) and how they could be supported in the wiki page.

> Both these concepts has some advantage and some disadvantages. It is hard to expect, so there is possible full agreement - because everybody has different preferences.

Sure.

> I understand so for you can be your proposal more readable, but for me, your design of usage and security looks not well.

Yep, there are pros and cons to all proposals. I wish they were listed somewhere, and possibly discussed, because some pros/cons depend on some detailed features.

> It is acceptable without PRIVATE flags and similar flags. It is not designed be secure.

Indeed. I've taken this point somehow into account and changed my proposal so that session variables are private by default, and now I'm not even sure that there should exist public session variables at all...

> (MySQL has nothing similar, I don't know if MSSQL has some, but probably not). Ok. We have different priorities. For you is not usual so in one session there can be more more times switch of secure context. It is usual for me, and for applications what I write.

I have added a section in the wiki to succinctly present existing stuff in other products.

>> Could you put your ideal (final) design proposition on the wiki page?
> yes, I'll do it.

Good!

--
Fabien.
>> yes, I'll do it.
>
> But I'll remove some strange ideas.

Why? I rather expect that you would comment that you find them strange and argue why: there is no reason to "remove" a concept idea as such, at least early in a discussion process...

> Why persistent variables?

Because *you* want persistent session variables... I did not invent it, I just removed the "session" word and generalized the concept. Sometimes one wants to store sets, sometimes one wants to only store a value.

> Please, one argument. We have tables. What is wrong on tables?

Nothing is wrong as such. Cons arguments are: the syntax is verbose just for one scalar, and the one-row property is currently not easily enforced by pg, AFAIK.

Note that I'm not claiming that it should be implemented, but if some kind of half-persistent variables are implemented, I think it should be consistent with possibly fully-persistent variables as well, even if they are not implemented immediately, or ever.

> Anything what will be persistent will have similar performance like tables.

Yes, sure. It is a different use case. Argue in the wiki!

--
Fabien.
2016-12-29 11:42 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>> yes, I'll do it.
>>
>> But I'll remove some strange ideas.
>
> Why?
>
> I rather expect that you would comment that you find them strange and argue why: there is no reason to "remove" a concept idea as such at least early in a discussion process...
>
>> Why persistent variables?
>
> Because *you* want persistent session variables... I did not invent it, I just removed the "session" word and generalized the concept.
I never talked about persistent data. I talked about persistent metadata. I really don't propose any substitution for tables (relations). I never did.
The used terminology is not 100% clean and natural - maybe a better name is "global temporary unshared non-transactional non-relational storage", if I use terminology related to temporary tables.
I don't see any sense in having two similar storages or two redundant access methods - not at the PostgreSQL level.
> Sometimes one wants to store sets, sometimes one wants to only store value.
>
>> Please, one argument. We have tables. What is wrong on tables?
> Nothing is wrong as such. Cons arguments are: the syntax is verbose just for one scalar, and the one-row property is currently not easily enforced by pg, AFAIK.
>
> Note that I'm not claiming that it should be implemented, but if some kind of half-persistent variables are implemented, I think it should be consistent with possibly fully-persistent variable as well, even if they are not implemented immediately, or ever.
There is a small language barrier :) - I am thinking about features that we should implement, not about everything that could be implemented. There are lots of possibilities to be found in the universe, but only a few make technical/practical sense.
SQL is a verbose language - I like that, because it is different - the developer has to know when he is working with the database, with persistent data. The cost of access to that data is significantly higher. When this information is hidden from the developer, the result is terribly slow.
Regards
Pavel
>> Anything what will be persistent will have similar performance like tables.
>
> Yes, sure. It is a different use case. Argue in the wiki!
--
Fabien.
> I newer talked about persistent data. I talked about persistent metadata.

Sure, I finally understood that detail. Now if I hear "persistent variable", I by default understand that both metadata and data are persistent... It requires some effort to understand the subtlety.

> I really don't propose any possible substitution of tables (relations). I newer did it.

Sure.

> The used terminology is not 100% clean and natural - maybe better name is "global temporary unshared untransactional unrelational storage" -

Hmmm. Too long:-) But these properties need to be spelled out.

> [...] I don't see any sense to have two similar storages or two redundant access methods - not in PostgreSQL level.

Then say so in the wiki in the cons.

Personally, I'm not sure. Maybe having a clean way of declaring a one-row "singleton" table enforced by postgresql would be enough.

--
Fabien.
2016-12-29 14:25 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>> I newer talked about persistent data. I talked about persistent metadata.
>
> Sure, I finally understood that detail. Now if I hear "persistent variable", I by default understand that both metadata and data are persistent... It requires some effort to understand the subtelty.
>
>> I really don't propose any possible substitution of tables (relations). I newer did it.
>
> Sure.
>
>> The used terminology is not 100% clean and natural - maybe better name is "global temporary unshared untransactional unrelational storage" -
>
> Hmmm. Too long:-) But these properties need to be spelled out.
>
>> [...] I don't see any sense to have two similar storages or two redundant access methods - not in PostgreSQL level.
>
> Then say so in the wiki in the cons.
>
> Personnaly, I'm not sure. Maybe having a clean way of declaring a one-row "singleton" table enforced by postgresql would be enough.
There is a singleton table :)
create table foo(x integer unique not null default 1 check(x = 1), y integer);
insert into foo(y) values(100);
analyze foo;
The storage is not important and is not interesting - any behavior of persistent objects other than MVCC can be a big surprise for users.
What is interesting are the getter functions - they can be volatile or stable/immutable - which matters, because then the value can be used by the planner.
For example - MySQL's @var is volatile - it can be changed within a query - which means you cannot use it as a constant for the planner :( - the behavior will be the same (with the same risks to performance) as using a plpgsql variable in a query.
With getter functions you can play a bigger game.
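The volatility point can be illustrated with a getter function. Declared STABLE, the planner knows the value cannot change during a single statement and can treat it much like a constant (a sketch; the function name and the GUC used as backing storage are illustrative assumptions):

```sql
-- Hypothetical getter for a session variable.  Because it is STABLE,
-- its result cannot change within one statement, so the planner can
-- fold it into the plan (e.g. for index selection).
CREATE FUNCTION get_current_tenant() RETURNS integer
STABLE LANGUAGE sql AS $$
  SELECT current_setting('app.tenant_id')::integer;
$$;

-- The planner can then use the result much like a constant:
--   SELECT * FROM orders WHERE tenant_id = get_current_tenant();
-- A VOLATILE getter, like MySQL's @var, would have to be re-evaluated
-- per row, blocking such optimizations.
```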
Regards
Pavel
--
Fabien.
2016-12-29 14:41 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
> 2016-12-29 14:25 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>
>>> I newer talked about persistent data. I talked about persistent metadata.
>>
>> Sure, I finally understood that detail. Now if I hear "persistent variable", I by default understand that both metadata and data are persistent... It requires some effort to understand the subtelty.
>>
>>> I really don't propose any possible substitution of tables (relations). I newer did it.
>>
>> Sure.
>>
>>> The used terminology is not 100% clean and natural - maybe better name is "global temporary unshared untransactional unrelational storage" -
>>
>> Hmmm. Too long:-) But these properties need to be spelled out.
>>
>>> [...] I don't see any sense to have two similar storages or two redundant access methods - not in PostgreSQL level.
>>
>> Then say so in the wiki in the cons.
>>
>> Personnaly, I'm not sure. Maybe having a clean way of declaring a one-row "singleton" table enforced by postgresql would be enough.
>
> There is a singleton table :)
>
> create table foo(x integer unique not null default 1 check(x = 1), y integer);
> insert into foo(y) values(100);
> analyze foo;
>
> The storage is not important and is not interesting - any different behave for persistent objects different than MVCC can be big surprise for users.
Our sequences are simple, persistent, and not ACID - I have found lots of people who cannot accept that - not quickly.
> What is interesting are getter functions - they can be volatile or stable/immutable - what can be interesting, because then the value can be used by planner.
>
> For example - MySQL @var is volatile - can be changed in query - that's mean, you cannot use it as const for planner :( - the behave will be same (with same risks to performance) like using plpgsql variable in query.
>
> With getter functions you can do bigger game.
>
> Regards
> Pavel
--
Fabien.
> There is a singleton table :) > > create table foo(x integer unique not null default 1 check(x = 1), y integer); > insert into foo(y) values(100); > analyze foo; I know this one. It can be empty, which a singleton cannot be. For a singleton table, you should have one and only one row, you cannot insert or delete, so this is only part of the real thing. > For example - MySQL @var is volatile - can be changed in query - that's > mean, you cannot use it as const for planner :( Indeed, because of the ":=" within in a SELECT query, the variable is updated at each round. Yuk. -- Fabien.
2016-12-29 20:23 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
>> There is a singleton table :)
>>
>> create table foo(x integer unique not null default 1 check(x = 1), y integer);
>> insert into foo(y) values(100);
>> analyze foo;
>
> I know this one. It can be empty, which a singleton cannot be. For a singleton table, you should have one and only one row, you cannot insert or delete, so this is only part of the real thing.
A subselect is a "singleton" - it returns one row every time.
I am not sure the term "singleton" is valid in a relational database.
>> For example - MySQL @var is volatile - can be changed in query - that's mean, you cannot use it as const for planner :(
>
> Indeed, because of the ":=" within a SELECT query, the variable is updated at each round. Yuk.
--
Fabien.
On 12/29/2016 02:23 PM, Fabien COELHO wrote:

>> There is a singleton table :)
>>
>> create table foo(x integer unique not null default 1 check(x = 1), y integer);
>> insert into foo(y) values(100);
>> analyze foo;
>
> I know this one. It can be empty, which a singleton cannot be. For a singleton table, you should have one and only one row, you cannot insert or delete, so this is only part of the real thing.

Surely we can do a bit better than that, if that's what you really want. Create the table with an initial row and add a trigger preventing anything except update.

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
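Andrew's suggestion could be sketched like this - an illustrative example, not code from the thread (table, function, and trigger names are assumptions):

```sql
-- A singleton table: create it with its one row, then reject
-- anything except UPDATE with a statement-level trigger.
CREATE TABLE app_state(x integer PRIMARY KEY DEFAULT 1 CHECK (x = 1),
                       y integer);
INSERT INTO app_state(y) VALUES (100);

CREATE FUNCTION app_state_guard() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  RAISE EXCEPTION 'app_state is a singleton: % not allowed', TG_OP;
END;
$$;

CREATE TRIGGER app_state_singleton
  BEFORE INSERT OR DELETE OR TRUNCATE ON app_state
  FOR EACH STATEMENT EXECUTE PROCEDURE app_state_guard();

-- Only this remains possible:
--   UPDATE app_state SET y = 42;
```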
> Not sure if term "singleton" is valid in relation database.

The term is valid in mathematics: the relational model is an extension of set theory, e.g. the algebra includes set operators such as union, intersection, difference... In set theory, a singleton designates an exactly-one-element set. At least Wikipedia and my dwindling memories tell me so:-)

--
Fabien.
>> I know this one. It can be empty, which a singleton cannot be. For a singleton table, you should have one and only one row, you cannot insert or delete, so this is only part of the real thing.
>
> Surely we can do a bit better than that, if that's what you really want. Create the table with an initial row and add a trigger preventing anything except update.

Yes.

I just meant that this is not available easily "out of the box", but it is obviously doable with some effort... which people would seldom do.

--
Fabien.
On 30 December 2016 at 14:50, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

>>> I know this one. It can be empty, which a singleton cannot be. For a singleton table, you should have one and only one row, you cannot insert or delete, so this is only part of the real thing.
>>
>> Surely we can do a bit better than that, if that's what you really want. Create the table with an initial row and add a trigger preventing anything except update.
>
> Yes.
>
> I just meant that this is not available easily "out of the box", but it is obviously doable with some effort... which people would seldom do.

Sure... but every feature has a cost too, in maintenance and reliability. This needs to have compelling use cases, and it's not free to add multiple similar-ish/overlapping features. So some agreement on what we should actually have is needed, along with a compelling case for what it delivers that we can't already do.

Pavel's personal requirements include that it be well suited for static analysis of plpgsql using his plpgsql_check tool. So he wants persistent definitions. You expect different things from variables, to the point where it's a different feature. It's unlikely two unrelated variable implementations will be accepted.

I think progress can only be achieved by setting out requirements, a feature matrix, and which proposed implementations can deliver which desired features. Then go from there.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> Pavel's personal requirements include that it be well suited for static analysis of plpgsql using his plpgsql_check tool. So he wants persistent definitions.

I've been in static analysis for the last 25 years, and the logic of this statement fails me.

Pavel wants the plpgsql_check tool to be able to statically analyse session variables, which is fine with me. It does not follow that the definition must be "persistent" as such; it follows that it should be available in the PL/pgSQL script so that a tool which reads it can check it by looking at the code. IMO this is a lesser requirement.

I do not think that a feature should be designed around the current limitations of a particular external tool, esp. if said tool can be improved at a reasonable cost.

> I think progress can only be achieved by setting out requirements, a feature matrix, and which proposed implementations can deliver which desired features. Then go from there.

Yep. I've bootstrapped a wiki page, feel free to improve it as you see fit:

https://wiki.postgresql.org/wiki/Variable_Design

--
Fabien.
On 30 December 2016 at 16:46, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

>> Pavel's personal requirements include that it be well suited for static analysis of plpgsql using his plpgsql_check tool. So he wants persistent definitions.
>
> I've been in static analysis for the last 25 years, and the logic of this statement fails me.

I have no opinion here, as I've not seen plpgsql_check nor do I understand the issues Pavel perceives with having dynamic definitions of variables.

All I'm saying is that you two are talking around in circles by repeating different requirements to each other, and it's not going to get anywhere unless you both change your approach. It sounds like you're already trying to do that.

> I do not think that a feature should be designed around the current limitations of a particular external tool, esp. if said tool can be improved at a reasonable cost.

Not arguing there.

I was initially inclined to favour Pavel's proposal because it fits a RLS use case I was somewhat interested in. But so would dynamic variables resolved at runtime so long as they were fast.

Personally I don't much care what the result is, so long as it can satisfy some kind of reasonable security isolation, such that role A can set it, B can read it but not set it, and role C can do neither. Preferably without resorting to creating SECURITY DEFINER accessors, since they're messy and slow. Support for data typing would also be nice too.

If it doesn't deliver security controls then IMO there's not much advantage over (ab)using GUCs with current_setting(...).

Exploring the other areas discussed:

Personally I think MVCC, persistent variables are a totally unnecessary idea that solves a problem we don't have. But maybe I don't understand your use cases. I expect anything like that would land up using a pg_catalog relation as a k/v-like store with different columns for different types or something of the like, which is really something the user can do well enough for themselves. I don't see the point at all.

Non-MVCC persistent variables would probably be prohibitively expensive to make crash-safe, and seem even more pointless.

Now, I can see shared variables whose state is visible across backends but is neither MVCC nor persistent being a fun toy, albeit not one I find likely to be particularly useful personally. But we can probably already do that in extensions, we've got most if not all of the needed infrastructure. Because we're a shared-nothing-by-default system, such variables will probably need shared memory segments that need to be allocated and, if new vars are added or their values grow too much, re-allocated. Plus locks to control access. All of which we can already do. Most of the uses I can think of for such things are met reasonably well by advisory locking already, and I expect most of the rest would be met by autonomous commit, so it feels a bit like a feature looking for a use-case.

So .... lets take a step back or eight and ask "why?"

Pavel:

* Why is it so necessary for plpgsql variables to be implemented as persistent entities that are in the catalogs in order for you to achieve the static checking you want to? Is this due to limitations of your approach in plpgsql_check, or more fundamental issues? Explain.

Fabien:

* What use do you have for persistent-data variables? Please set out some use cases where they solve problems that are currently hard to solve or greatly improve on the existing solutions.

* On what basis do you _oppose_ persistently defining variables in the catalogs as their own entities? (My own objection is that "temporary variables" would make our existing catalog bloat issues for temp objects even worse).

* Do you expect temporary/transient session variables to come into existence when first set, or to require some form of explicit definition?

Everyone:

* Does anyone care about or want variables whose value is shared across sessions? If so, why? Set out use cases.

* Does anyone care about or want variables whose value becomes visible as soon as set, i.e. non-MVCC? If so, why? Set out use cases.

* Does anyone care about or want variables whose value is persistent on-disk across restarts and/or crashes, maybe recorded in WAL for replication, etc? If so, justify how this is better than a relation in real-world practical terms.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 30 December 2016 at 17:29, Craig Ringer <craig@2ndquadrant.com> wrote:

> So .... lets take a step back or eight and ask "why?"

Oh, and speaking of, I see Pavel's approach as looking for a PostgreSQL-adapted way to do something like Oracle's PL/SQL package variables. Right Pavel?

If so, their properties are, as far as I as a non-Oracle-person can tell:

* Can be package-private or public. If public, can be both got and set by anyone. If private, can be got and set directly only by code in the package. (Our equivalent is "by the owner"). As far as I can tell outside access to package-private variables still uses the variable get/set syntax, but is automatically proxied via getter/setter methods defined in the package, if defined, otherwise inaccessible.

* Value not visible across sessions. Ever.

* Can have an initialiser / DEFAULT value.

* Non-persistent, value lost at session end.

A typical example, where package variables are init'd from a table: http://www.dba-oracle.com/plsql/t_plsql_global_data.htm which relies on package initializers, something we don't have (but can work around easily enough with a little verbosity). This shows both public vars and package-private ones.

See also https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/constantvar_declaration.htm

I believe these package variable properties are the properties Pavel seeks to model/emulate. Declared statically, value persistent only within the same session, non-transactional, can be private. Certainly there's nothing here that requires us to allow GRANTs. Simple ownership tests would supply us with similar functionality to what Oracle users have, allowing for our lack of packages and inability to hide the _existence_ of an object, only its contents.

My views:

I am personally overwhelmingly opposed to variables that automagically create themselves when dereferenced, a-la Perl. Write $serialised (english spelling) not $serialized (US spelling) and you get a silent null. Fun! Hell. No. This is why failure to "use strict" in Perl is a near-criminal offense.

I'd also strongly prefer to require vars to be declared before first use. Again, like "use strict", and consistent with how Pg behaves elsewhere. Otherwise we need some kind of magic syntax to say "this is a variable", plus vars that get created on first assignment suck almost as badly as ones that're null on undefined dereference. Spend half an hour debugging and figure out that you typo'd an assignment. Again, "use strict".

I fail to see any real utility to cross-session vars, persistent or otherwise, at this point. Use a normal or unlogged relation.

I don't see the point of untyped variables with no ownership or rights controls. (ab)use a GUC. Note that you can achieve both xact-scoped and session-scoped that way, with xact-scoped vars assigned using SET LOCAL being unwound on xact end.

Unless we also propose to add ON CONNECT triggers, I think some kind of persistency of declaration is useful but not critical. We'll land up with apps sending preambles of declarations on session start otherwise. But the most compelling use cases are for things where there'll be a procedure invoked by the user or app on connect anyway, so it can declare stuff there.

I'm utterly unconvinced that it's necessary to have them in the catalogs to achieve static checking.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
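The GUC (ab)use mentioned above, including the SET LOCAL transaction-scoped behaviour, already works today. A sketch, using a hypothetical custom "myapp."-prefixed setting:

```sql
-- Session-scoped: the value survives until the connection closes.
SET myapp.current_user_status = 'auditor';
SELECT current_setting('myapp.current_user_status');

-- Transaction-scoped: SET LOCAL is unwound at commit/rollback,
-- after which the session-level value is visible again.
BEGIN;
SET LOCAL myapp.current_user_status = 'elevated';
SELECT current_setting('myapp.current_user_status');
COMMIT;
SELECT current_setting('myapp.current_user_status');

-- Caveats: the values are untyped text, and any role in the session
-- can both read and overwrite them - no ownership or GRANT control.
```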
2016-12-30 10:29 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 30 December 2016 at 16:46, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
>> Pavel's personal requirements include that it be well suited for
>> static analysis of plpgsql using his plpgsql_check tool. So he wants
>> persistent definitions.
>
>
> I've been in static analysis for the last 25 years, and the logic of this
> statement fails me.
I have no opinion here, as I've not seen plpgsql_check nor do I
understand the issues Pavel perceives with having dynamic definitions
of variables.
All I'm saying is that you two are talking around in circles by
repeating different requirements to each other, and it's not going to
get anywhere unless you both change your approach. It sounds like
you're already trying to do that.
> I do not think that a feature should be designed around the current
> limitations of a particular external tool, esp. if said tool can be improved
> at a reasonable cost.
Not arguing there.
I was initially inclined to favour Pavel's proposal because it fits a
RLS use case I was somewhat interested in. But so would dynamic
variables resolved at runtime so long as they were fast.
Personally I don't much care what the result is, so long as it can
satisfy some kind of reasonable security isolation, such that role A
can set it, B can read it but not set it, and role C can do neither.
Preferably without resorting to creating SECURITY DEFINER accessors,
since they're messy and slow. Support for data typing would also be
nice too.
If it doesn't deliver security controls then IMO there's not much
advantage over (ab)using GUCs with current_setting(...).
Exploring the other areas discussed:
Personally I think MVCC, persistent variables are a totally
unnecessary idea that solves a problem we don't have. But maybe I
don't understand your use cases. I expect anything like that would
land up using a pg_catalog relation as a k/v-like store with different
columns for different types or something of the like, which is really
something the user can do well enough for themselves. I don't see the
point at all.
Non-MVCC persistent variables would probably be prohibitively
expensive to make crash-safe, and seem even more pointless.
Now, I can see shared variables whose state is visible across backends
but is neither MVCC nor persistent being a fun toy, albeit not one I
find likely to be particularly useful personally. But we can probably
already do that in extensions, we've got most if not all of the needed
infrastructure. Because we're a shared-nothing-by-default system, such
variables will probably need shared memory segments that need to be
allocated and, if new vars are added or their values grow too much,
re-allocated. Plus locks to control access. All of which we can
already do. Most of the uses I can think of for such things are met
reasonably well by advisory locking already, and I expect most of the
rest would be met by autonomous commit, so it feels a bit like a
feature looking for a use-case.
So .... lets take a step back or eight and ask "why?"
Pavel:
* Why is it so necessary for plpgsql variables to be implemented as
persistent entities that are in the catalogs in order for you to
achieve the static checking you want to? Is this due to limitations of
your approach in plpgsql_check, or more fundamental issues? Explain.
There are a few reasons:
1. plpgsql_check cannot know the order in which functions are called, so any dynamically created object and the operations on it are not checkable by plpgsql_check (or any other tool). If you create a variable in one function, that information is not available when checking another function.
2. You can write hints, as in Fabien's proposal, but that approach does not protect against typos. It is much safer to have one "created" variable than many "declared" variables and to trust that all declarations are valid. A created variable lives in exactly one place, so a simple check can tell whether a reference to it is valid. With a query against the system catalogue you can get the list of all variables, so you can see very simply which variables are redundant, obsolete, or wrong.
3. The catalogue objects allow fine-grained access rights. Without them you can only have "PRIVATE" variables, and then you have to grant access via SECURITY DEFINER functions. There is no simple answer to the question of who can work with a variable or who has access; you have to check the rights on the getter functions. When variables are catalogue objects, the answer is simple. Catalogue objects give us "declared" security; without them we have to construct the security ourselves. For me, declared security is stronger.
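For context, the SECURITY DEFINER construction mentioned in point 3 can be sketched today over a custom GUC. This is a hypothetical sketch (all object and GUC names are made up, and it needs a PostgreSQL server to run):

```sql
-- "Private" session state held in a custom GUC, with writes arbitrated
-- through a SECURITY DEFINER function.
CREATE FUNCTION set_app_user(uid integer) RETURNS void AS $$
BEGIN
    -- ... credential checks would go here ...
    PERFORM set_config('myapp.user_id', uid::text, false);  -- session scope
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE FUNCTION get_app_user() RETURNS integer AS $$
    SELECT current_setting('myapp.user_id')::integer;
$$ LANGUAGE sql STABLE;

-- Weakness: nothing stops the session user from running
--   SET myapp.user_id = '42';
-- directly, which is exactly why catalogue-level GRANT/REVOKE on a real
-- variable object would be stronger than this constructed security.
```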
Regards
Pavel
Fabien:
* What use do you have for persistent-data variables? Please set out
some use cases where they solve problems that are currently hard to
solve or greatly improve on the existing solutions.
* On what basis do you _oppose_ persistently defining variables in the
catalogs as their own entities? (My own objection is that "temporary
variables" would make our existing catalog bloat issues for temp
objects even worse).
* Do you expect temporary/transient session variables to come into
existence when first set, or to require some form of explicit
definition?
Everyone:
* Does anyone care about or want variables whose value is shared
across sessions? If so, why? Set out use cases.
* Does anyone care about or want variables whose value becomes visible
as soon as set, i.e. non-MVCC? If so, why? Set out use cases.
* Does anyone care about or want variables whose value is persistent
on-disk across restarts and/or crashes, maybe recorded in WAL for
replication, etc? If so, justify how this is better than a relation in
real-world practical terms.
Thank you for this summary
Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hello Craig,

A long mail with many questions, which I tried to answer clearly; the result is too long...

> [...] I have no opinion here, as I've not seen plpgsql_check nor do I
> understand the issues Pavel perceives with having dynamic definitions of
> variables.

I understand that Pavel assumes that a static analysis tool cannot take a dynamic variable into account, hence his reserve.

> All I'm saying is that you two are talking around in circles by
> repeating different requirements to each other, and it's not going to
> get anywhere unless you both change your approach. It sounds like
> you're already trying to do that.

Yep, that is why I have created the wiki page: so that at least an argument need not be repeated cyclically, it is written down once.

> [...] I was initially inclined to favour Pavel's proposal because it
> fits a RLS use case I was somewhat interested in. But so would dynamic
> variables resolved at runtime so long as they were fast.

Fitting the needs of use cases is the key point, obviously.

> [...] Preferably without resorting to creating SECURITY DEFINER
> accessors, since they're messy and slow.

I'm not sure what you mean by "messy", but if you can substantiate it, it can be an argument. Please feel free to explain it on the wiki.

As for "slow", I have just tested the overheads with pgbench, comparing a direct arithmetic operation (as a proxy for a fast session variable consultation) to constant-returning plpgsql functions with SECURITY DEFINER and SECURITY INVOKER, on a direct socket connection, with prepared statements:

  select 1 + 0    : 0.020 ms
  select one_sd() : 0.024 ms
  select one_si() : 0.024 ms

I do not think there is a significant "slowness" issue with using function calls.

> Exploring the other areas discussed:
>
> Personally I think MVCC, persistent variables are a totally unnecessary
> [...] But maybe I don't understand your use cases.

I've done a survey of the schemas of projects based on databases, mysql or pgsql. A significant number of them use a common pattern based on a one-row table, essentially to hold some scalar information about the application version and to facilitate upgrades. However the one-row property is just hoped for, and in principle a database is about declaring constraints that are enforced afterwards. I see two clean solutions to this use case: declaring tables as one-row, or having scalar objects.

> Now, I can see shared variables whose state is visible across backends
> but is neither MVCC nor persistent being a fun toy, albeit not one I
> find likely to be particularly useful personally.

Yep, I'm skeptical as well. I would like to see a convincing use case.

> Pavel:
>
> * Why is it so necessary for plpgsql variables to be implemented as
> persistent entities that are in the catalogs in order for you to
> achieve the static checking you want to? Is this due to limitations of
> your approach in plpgsql_check, or more fundamental issues? Explain.

A note about this question not addressed to me: currently "plpgsql_check" cannot analyze any session variables as no such concept exists, with or without persistent declarations.

> Fabien:
>
> * What use do you have for persistent-data variables? Please set out
> some use cases where they solve problems that are currently hard to
> solve or greatly improve on the existing solutions.

It is about the often-seen one-row pattern, which I think should be enforced either with some singleton table declaration, or with scalar objects.

> * On what basis do you _oppose_ persistently defining variables in the
> catalogs as their own entities?

I understand that you are speaking of "persistent session variables". For me a database is about persistence (metadata & data) with safety (transactions) and security (permissions)... and maybe performance :-)

Pavel's proposal creates a new object with 2 (secure metadata-persistence) out of these 4 properties... I'm not at ease with introducing a new half-database concept into a database. I could accept it for a convincing use case that absolutely requires it for deep reasons.

On the other hand there are dynamic session variables (mysql, mssql, oracle have some variants) which are useful on their own without pretending to be database objects (no CREATE/ALTER/DROP, GRANT/REVOKE). If these can handle the special case and avoid a new special half-database concept, I would prefer that. The key point about all this is to discuss, understand and evaluate the involved use cases.

> (My own objection is that "temporary variables" would make our existing
> catalog bloat issues for temp objects even worse).

I do agree that inefficient temporary variables are worthless, but ISTM that Pavel's proposal is not exactly about temporary variables, it is about temporary-valued permanent variables. So there is no temporary (on the fly) variable as such, and if it is extended for this purpose then indeed the catalog costs look expensive.

> * Do you expect temporary/transient session variables to come into
> existence when first set, or to require some form of explicit
> definition?

It is still an open question in the proposal I have written on the wiki. Opinions are welcome. A declaration is required for typing if a variable is set, but it may not be necessary for consulting the variable if the default value of non-existing variables is NULL; but then typos are not errors, although they could be warnings... MySQL does on-assignment limited typing, e.g. "SET @foo = (...)::UNSIGNED". I prefer a declaration, but I could live with that kind of thing, although I do not like it because what happens on a subsequent "SET @foo = '17'::TEXT" is unclear... is the type changed or is the value cast to unsigned anyway? MySQL does the latter.

> * Does anyone care about or want variables whose value is shared
> across sessions? If so, why? Set out use cases.

Rather add them to the wiki, please!

> * Does anyone care about or want variables whose value becomes visible
> as soon as set, i.e. non-MVCC? If so, why? Set out use cases.

Idem, in the wiki!

(1) Having some kind of variable, especially in interactive mode, allows one to manipulate previous results and reuse them later, without having to resort to repeated sub-queries or to retype non-trivial values. Client-side psql :-variables are untyped and unescaped, thus not very convenient for this purpose.

(2) Writing a simple psql script which can be parameterized to some degree by setting some variables and then including the script.

> * Does anyone care about or want variables whose value is persistent
> on-disk across restarts and/or crashes, maybe recorded in WAL for
> replication, etc? If so, justify how this is better than a relation in
> real-world practical terms.

IMO it is *not* better, but I wish that I could declare a table as a singleton. See the wiki.

--
Fabien.
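The one-row pattern Fabien describes could already be enforced with ordinary constraints rather than hoped for. A minimal sketch (the table and column names are made up):

```sql
-- A "singleton" table: the primary-key column can only ever hold true,
-- so at most one row can exist.
CREATE TABLE app_info (
    singleton boolean PRIMARY KEY DEFAULT true CHECK (singleton),
    version   text NOT NULL
);

INSERT INTO app_info (version) VALUES ('1.0');
-- A second INSERT fails with a unique violation:
--   INSERT INTO app_info (version) VALUES ('2.0');  -- ERROR
```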
2016-12-30 11:03 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 30 December 2016 at 17:29, Craig Ringer <craig@2ndquadrant.com> wrote:
> So ... let's take a step back or eight and ask "why?"
Oh, and speaking of, I see Pavel's approach as looking for a
PostgreSQL-adapted way to do something like Oracle's PL/SQL package
variables. Right Pavel?
That was the main motivation - the question was how to share some information securely between function calls within one session.
PostgreSQL is specific in its multi-language support, but the purpose is the same.
If so, their properties are, as far as I as a non-Oracle-person can tell:
* Can be package-private or public. If public, can be both got and set
by anyone. If private, can be got and set directly only by code in
package. (Our equivalent is "by the owner"). As far as I can tell
outside access to package-private variables still uses the variable
get/set syntax, but is automatically proxied via getter/setter methods
defined in the package, if defined, otherwise inaccessible.
* Value not visible across sessions. Ever.
* Can have an initialiser / DEFAULT value.
* Non-persistent, value lost at session end.
A typical example, where package variables are init'd from a table:
http://www.dba-oracle.com/plsql/t_plsql_global_data.htm
which relies on package initializers, something we don't have (but can
work around easily enough with a little verbosity).
This shows both public vars and package-private ones.
See also https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/constantvar_declaration.htm
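The init-from-a-table pattern in the linked example can be approximated without package initializers by lazily loading a custom GUC on first access. A rough sketch under made-up names (`config`, `pkg.tax_rate`), needing a PostgreSQL server:

```sql
-- Emulates a session-lifetime "package variable" whose value is loaded
-- from a table the first time it is read in the session.
CREATE TABLE config (tax_rate numeric NOT NULL);

CREATE FUNCTION get_tax_rate() RETURNS numeric AS $$
DECLARE
    v text := current_setting('pkg.tax_rate', true);  -- NULL if never set
BEGIN
    IF v IS NULL OR v = '' THEN
        -- set_config(..., false) makes the cached value session-scoped
        SELECT set_config('pkg.tax_rate', tax_rate::text, false)
          INTO v
          FROM config;
    END IF;
    RETURN v::numeric;
END;
$$ LANGUAGE plpgsql;
```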
I believe these package variable properties are the properties Pavel
seeks to model/emulate. Declared statically, value persistent only
within the same session, non-transactional, can be private.
Certainly there's nothing here that requires us to allow GRANTs.
Simple ownership tests would supply us with similar functionality to
what Oracle users have, allowing for our lack of packages and
inability to hide the _existence_ of an object, only its contents.
Packages have their own scope, so any access from within the package is allowed. I cannot do that in Postgres without explicitly written setter/getter functions. So GRANTs remove the need to write SECURITY DEFINER envelope functions.
Sure - the owner doesn't need it. If your application runs as one user, or if you are the owner, then you don't need to use GRANT.
My views:
I am personally overwhelmingly opposed to variables that automagically
create themselves when dereferenced, a-la Perl. Write $serialised
(english spelling) not $serialized (US spelling) and you get a silent
null. Fun! Hell. No. This is why failure to "use strict" in Perl is a
near-criminal offense.
I'd also strongly prefer to require vars to be declared before first
use. Again, like "use strict", and consistent with how Pg behaves
elsewhere. Otherwise we need some kind of magic syntax to say "this is
a variable", plus vars that get created on first assignment suck
almost as badly as ones that're null on undefined dereference. Spend
half an hour debugging and figure out that you typo'd an assignment.
Again, "use strict".
I fail to see any real utility to cross-session vars, persistent or
otherwise, at this point. Use a normal or unlogged relation.
I don't see the point of untyped variables with no ownership or rights
controls. (ab)use a GUC. Note that you can achieve both xact-scoped
and session-scoped that way, with xact-scoped vars assigned using SET
LOCAL being unwound on xact end.
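The two GUC scopes Craig mentions behave roughly like this (the `my.var` name is arbitrary; this is a sketch against a PostgreSQL session, not part of any proposal):

```sql
-- Transaction-scoped: SET LOCAL is unwound when the transaction ends.
BEGIN;
SET LOCAL my.var = 'xact value';
SELECT current_setting('my.var');        -- 'xact value'
COMMIT;
SELECT current_setting('my.var', true);  -- back to NULL/'' (unset)

-- Session-scoped: a plain SET survives until the session ends.
SET my.var = 'session value';
SELECT current_setting('my.var');        -- 'session value'
```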
Unless we also propose to add ON CONNECT triggers, I think some kind
of persistency of declaration is useful but not critical. We'll land
up with apps sending preambles of declarations on session start
otherwise. But the most compelling use cases are for things where
there'll be a procedure invoked by the user or app on connect anyway,
so it can declare stuff there. I'm utterly unconvinced that it's
necessary to have them in the catalogs to achieve static checking.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Please Pavel, could you avoid citing a whole long mail just for commenting one point?

>> * Why is it so necessary for plpgsql variables to be implemented as
>> persistent entities that are in the catalogs in order for you to
>> achieve the static checking you want to? Is this due to limitations of
>> your approach in plpgsql_check, or more fundamental issues? Explain.
>
> There are few reasons:
>
> 1. plpgsql_check cannot to know a order of calls of functions.

Indeed.

> So any dynamic created object and related operations are not checkable
> by plpgsql_check (or any tool).

NO. Your first sentence does not imply this very general statement.

Some things that I think can be statically proved within a session, that would cover some security concerns:

(1) For statically named private dynamic variables declared/used at different points, it can be checked, without relying on the function order, that all declarations are consistent, i.e. the same type and the same default value if any.

(2) Then the value of the variable is either the default value (e.g. NULL) or the assigned value at the points of assignment, which must be a valid value for the type, otherwise the assignment would have failed.

(3) If there is only one assignment in the code, then you know that the variable can only have been assigned a non-default value from that point. Probably nice to have in a security context, but it requires you to be sure that you have access to all the code...

(4) For a session boolean, for code "IF @var IS NOT NULL AND NOT @var THEN RAISE 'cannot access'; END" in a sequence, one can prove that for any pl code after this point in the sequence @var has previously been assigned true, otherwise the exception would have been raised.

AFAICS, for "static" session variables the only difference is that the declaration consistency (1) is slightly more built-in, although you still have to check that no function DROPs/re-CREATEs the session variable with a different type, which is quite close to checking (1) for a dynamic session variable. Other properties (2), (3), (4) are exactly the same.

> 2. [...] 3.

Please could you put your pros & cons on the wiki as well? Or don't you want to use it?

--
Fabien.
2016-12-30 14:45 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Please Pavel, could you avoid citing a whole long mail just for commenting one point?
>
>>> * Why is it so necessary for plpgsql variables to be implemented as
>>> persistent entities that are in the catalogs in order for you to
>>> achieve the static checking you want to? Is this due to limitations of
>>> your approach in plpgsql_check, or more fundamental issues? Explain.
>>
>> There are few reasons:
>>
>> 1. plpgsql_check cannot to know a order of calls of functions.
>
> Indeed.
>
>> So any dynamic created object and related operations are not checkable by plpgsql_check (or any tool).
>
> NO. Your first sentence does not imply this very general statement.
If you don't have a unique definition, you cannot do it. There is no way to tell the difference between a typo and a decision. We are talking about static analysis - so the code is not executed - and you don't know which function will be started first.
> Some things that I think can be statically proved within a session, that would cover some security concerns:
>
> (1) For statically named private dynamic variables declared/used at different points it can be checked without relying on the function order that all declarations are consistent, i.e. the same type, same default value if any.
What is a "static" private dynamic variable? You are using new terminology. Static variables like C-language static variables are not usable - you want to share content between different functions.
> (2) Then the value of the variable is either the default value (eg NULL) or the assigned value at points of assignement, which must be a valid value for the type, otherwise the assignement would have failed.
>
> (3) If there is only one assignment in the code, then you know that the
> variable can only have been assigned a non default value from this point.
> Probably nice to have in a security context, but it requires you to be sure that you have access to all the code...
>
> (4) For a session boolean, then for code "IF @var IS NOT NULL AND NOT @var THEN RAISE 'cannot access'; END", in a sequence, then one can prove that for any pl code after this point in the sequence @var has been previously assigned to true, otherwise the exception would have been raised.
You are speaking about runtime checks. There is no problem with defining some rules for runtime checks.
What is not possible in your design:
1.
BEGIN
RAISE NOTICE '%', @var;
END;
Without "execution", you cannot say whether the usage of @var is correct or not.
OK, we can use a DECLARE statement:
DECLARE @var EXTERNAL
BEGIN
RAISE NOTICE '%', @var;
END;
OK, with that I can do a static check - but:
1. I have to repeat the DECLARE statement every time.
2. It is not possible to find a typo in the DECLARE statement.
Regards
Pavel
> AFAICS, for "static" session variables the only difference is that the declaration consistency (1) is slightly more built-in, although you still have to check that no function DROP/re-CREATE the session variable with a different type, which is quite close to checking (1) for a dynamic session variable.
>
> Other properties (2), (3), (4) are exactly the same.
>
>> 2. [...] 3.
>
> Please could you put your pros & cons in the wiki as well?
> Or don't you want to use it?
>
> --
> Fabien.
2016-12-30 12:01 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
2016-12-30 10:29 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:

On 30 December 2016 at 16:46, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
>> Pavel's personal requirements include that it be well suited for
>> static analysis of plpgsql using his plpgsql_check tool. So he wants
>> persistent definitions.
>
>
> I've been in static analysis for the last 25 years, and the logic of this
> statement fails me.
I have no opinion here, as I've not seen plpgsql_check nor do I
understand the issues Pavel perceives with having dynamic definitions
of variables.
All I'm saying is that you two are talking around in circles by
repeating different requirements to each other, and it's not going to
get anywhere unless you both change your approach. It sounds like
you're already trying to do that.
> I do not think that a feature should be designed around the current
> limitations of a particular external tool, esp. if said tool can be improved
> at a reasonable cost.
Not arguing there.
I was initially inclined to favour Pavel's proposal because it fits a
RLS use case I was somewhat interested in. But so would dynamic
variables resolved at runtime so long as they were fast.
Personally I don't much care what the result is, so long as it can
satisfy some kind of reasonable security isolation, such that role A
can set it, B can read it but not set it, and role C can do neither.
Preferably without resorting to creating SECURITY DEFINER accessors,
since they're messy and slow. Support for data typing would also be
nice too.
If it doesn't deliver security controls then IMO there's not much
advantage over (ab)using GUCs with current_setting(...).
Exploring the other areas discussed:
Personally I think MVCC, persistent variables are a totally
unnecessary idea that solves a problem we don't have. But maybe I
don't understand your use cases. I expect anything like that would
land up using a pg_catalog relation as a k/v-like store with different
columns for different types or something of the like, which is really
something the user can do well enough for themselves. I don't see the
point at all.
Non-MVCC persistent variables would probably be prohibitively
expensive to make crash-safe, and seem even more pointless.
Now, I can see shared variables whose state is visible across backends
but is neither MVCC nor persistent being a fun toy, albeit not one I
find likely to be particularly useful personally. But we can probably
already do that in extensions, we've got most if not all of the needed
infrastructure. Because we're a shared-nothing-by-default system, such
variables will probably need shared memory segments that need to be
allocated and, if new vars are added or their values grow too much,
re-allocated. Plus locks to control access. All of which we can
already do. Most of the uses I can think of for such things are met
reasonably well by advisory locking already, and I expect most of the
rest would be met by autonomous commit, so it feels a bit like a
feature looking for a use-case.
So ... let's take a step back or eight and ask "why?"
Pavel:
* Why is it so necessary for plpgsql variables to be implemented as
persistent entities that are in the catalogs in order for you to
achieve the static checking you want to? Is this due to limitations of
your approach in plpgsql_check, or more fundamental issues? Explain.

There are a few reasons:

1. plpgsql_check cannot know the order in which functions are called, so any dynamically created object and the operations on it are not checkable by plpgsql_check (or any other tool). If you create a variable in one function, that information is not available when checking another function.

2. You can write hints, as in Fabien's proposal, but that approach does not protect against typos. It is much safer to have one "created" variable than many "declared" variables and to trust that all declarations are valid. A created variable lives in exactly one place, so a simple check can tell whether a reference to it is valid. With a query against the system catalogue you can get the list of all variables, so you can see very simply which variables are redundant, obsolete, or wrong.

3. The catalogue objects allow fine-grained access rights. Without them you can only have "PRIVATE" variables, and then you have to grant access via SECURITY DEFINER functions. There is no simple answer to the question of who can work with a variable or who has access; you have to check the rights on the getter functions. When variables are catalogue objects, the answer is simple. Catalogue objects give us "declared" security; without them we have to construct the security ourselves. For me, declared security is stronger.
My concept is similar to "global" variables in the C language. You can use them without header files - but then maybe the linker fails, maybe it fails at runtime. For me, the information in the catalogue is exactly what a declaration in a header file does.
Regards
Pavel
Regards

Pavel
Fabien:
* What use do you have for persistent-data variables? Please set out
some use cases where they solve problems that are currently hard to
solve or greatly improve on the existing solutions.
* On what basis do you _oppose_ persistently defining variables in the
catalogs as their own entities? (My own objection is that "temporary
variables" would make our existing catalog bloat issues for temp
objects even worse).
* Do you expect temporary/transient session variables to come into
existence when first set, or to require some form of explicit
definition?
Everyone:
* Does anyone care about or want variables whose value is shared
across sessions? If so, why? Set out use cases.
* Does anyone care about or want variables whose value becomes visible
as soon as set, i.e. non-MVCC? If so, why? Set out use cases.
* Does anyone care about or want variables whose value is persistent
on-disk across restarts and/or crashes, maybe recorded in WAL for
replication, etc? If so, justify how this is better than a relation in
real-world practical terms.

Thank you for this summary

Pavel
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hello again,

>>> So any dynamic created object and related operations are not checkable by
>>> plpgsql_check (or any tool).
>>
>> NO. Your first sentence does not imply this very general statement.
>
> If you have not unique definition, you cannot to it. There is not
> possibility different between typo and decision. We are talking about
> static analyze - so code should not be executed - and you don't know
> what function will be started first.

Yes, I assure you that I really know how static analysis works... All the properties I described below may be proved without executing the code, that was my point...

>> Some things that I think can be statically proved within a session, that
>> would cover some security concerns:
>>
>> (1) For statically named private dynamic variables declared/used at
>> different points it can be checked without relying on the function order
>> that all declarations are consistent, i.e. the same type, same default
>> value if any.
>
> what is "static" private dynamic variable ? You are using new terminology.

They are my session variables, I just spelled out some properties to be precise about what I am stating, otherwise it is a mess. The name of the variable is "static" (statically named), i.e. it is known directly in the code. However the variable creation and value are "dynamic".

> Static variables like Clang static variables are not usable - you would to
> share content between different functions.

Sure. I mean static as in "static analysis", i.e. by looking at the code without executing it, as you put it.

>> (2) (3) (4) [...]
>
> You are speaking about runtime check.

Not really, I was speaking about properties statically provable, which I understood was your concern. Now the properties proved may imply a runtime assumption, for instance that the code has executed without error up to some point in the program, which is basically impossible to prove statically.

> BEGIN
>   RAISE NOTICE '%', @var;
> END;
>
> Without "execution", you cannot to say if usage of @var is correct or not

It depends on your definition of "correct". For this very instance it would not matter: if the variable was not created beforehand, then an error is raised because it does not exist; if it was created beforehand, then an error is raised because that is what the code is doing... So an error is always raised if the variable is not right.

> ok, we can use a DECLARE var
>
> DECLARE @var EXTERNAL

I do not know what you mean by 'EXTERNAL'.

> BEGIN
>   RAISE NOTICE '%', @var;
> END;
>
> ok, I can do static check - but
> 1. anytime I have to repeat DECLARE statement

Yes, twice in the complete use case: once for the function which checks the credentials, once for the getter function.

> 2. there is not possible to find typo in DECLARE statement

It is possible to find "some" typos, depending on the code: you can check that a variable is both assigned and used somewhere, otherwise it is very probably a typo. Perl does that *statically*, before executing a script.

--
Fabien.
2016-12-30 15:34 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Hello again,
>
>>>> So any dynamic created object and related operations are not checkable by
>>>> plpgsql_check (or any tool).
>>>
>>> NO. Your first sentence does not imply this very general statement.
>>
>> If you have not unique definition, you cannot to it. There is not
>> possibility different between typo and decision. We are talking about
>> static analyze - so code should not be executed - and you don't know
>> what function will be started first.
>
> Yes, I assure you that I really know how static analysis works... All the
> properties I described below may be proved without executing the code,
> that was my point...
>
>>> Some things that I think can be statically proved within a session, that
>>> would cover some security concerns:
>>>
>>> (1) For statically named private dynamic variables declared/used at
>>> different points it can be checked without relying on the function order
>>> that all declarations are consistent, i.e. the same type, same default
>>> value if any.
>>
>> what is "static" private dynamic variable ? You are using new terminology.
>
> They are my session variable, I just spelled out some properties to be
> precise about what I am stating, otherwise it is a mess. The name of the
> variable is "static" (statically-named), i.e. it is known directly in the
> code. However the variable creation and value are "dynamic".
>
>> Static variables like Clang static variables are not usable - you would to
>> share content between different functions.
>
> Sure. I mean static as in "static analysis", i.e. by looking at the code
> without executing it, as you put it.
>
>>> (2) (3) (4) [...]
>>
>> You are speaking about runtime check.
>
> Not really, I was speaking about properties statically provable, which I
> understood was your concern. Now the properties proved may imply a runtime
> assumption, for instance that the code has executed without error up to
> some point in the program, which is basically impossible to prove
> statically.
>
>> BEGIN
>>   RAISE NOTICE '%', @var;
>> END;
>>
>> Without "execution", you cannot to say if usage of @var is correct or not
>
> It depends about your definition of "correct".
>
> For this very instance it would not matter: if the variable was not created
> beforehand, then an error is raised because it does not exist, if it was
> created beforehand, then an error is raised because that is what the code
> is doing... So an error is always raised if the variable is not right.
>
>> ok, we can use a DECLARE var
>>
>> DECLARE @var EXTERNAL
>
> I do not know what you mean by 'EXTERNAL'.

it means "used as shared in session"

>> BEGIN
>>   RAISE NOTICE '%', @var;
>> END;
>>
>> ok, I can do static check - but
>> 1. anytime I have to repeat DECLARE statement
>
> Yes, twice in the complete use case: one for the function which checks the
> credentials, one for the getter function.
>
>> 2. there is not possible to find typo in DECLARE statement
>
> It is possible to find "some" typos, depending on the code: you can check
> that a variable is both assigned and used somewhere, otherwise it is very
> probably a typo. Perl does that *statically*, before executing a script.
"Before execution"... I am speaking about "without execution".

Theoretically, if you check all possible functions, you can find some issues - but you have to check every function on the server. Otherwise you cannot find a typo in a DECLARE statement.
Regards
Pavel
--
Fabien.
>>> DECLARE @var EXTERNAL
>>
>> I do not know what you mean by 'EXTERNAL'.
>
> it means "used as shared in session"

Shared by whom? There is no such thing in the proposal I have made on the wiki or in mails. In the proposal, variables are private to the role which creates them.

>> It is possible to find "some" typos, depending on the code: you can check
>> that a variable is both assigned and used somewhere, otherwise it is very
>> probably a typo. Perl does that *statically*, before executing a script.
>
> "Before execution" .. I am speaking "without execution".

Before execution is also without execution. You can run "perl -c" to get the warnings.

> theoretically, if you check all possible functions, you can find some
> issues - but you have to check all function on server.

Yes, sure. It seems to be the same with your proposal: if a hidden function drops and recreates a session variable with a different type, or changes its permissions, then some static checks are void as well; that is life. Also, a SQL function may access and modify the variables unexpectedly, which would be missed by a PL/pgSQL analysis tool... There is no miracle. Basically, there may be issues even if the static analysis tool says that all is well.

> Elsewhere you cannot to find typo in DECLARE statement.

Indeed, there probably exists some class of typos that may not be found by some static analysis implementations on PL/pgSQL functions which use basic session variables.

By the way, are you planning to contribute to the wiki?

https://wiki.postgresql.org/wiki/Variable_Design

--
Fabien.
2016-12-30 18:39 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
DECLARE @var EXTERNAL
I do not know what you mean by 'EXTERNAL'.
it means "used as shared in session"
Shared by whom? There is no such thing in the proposal I have made on the wiki or in mails. In the proposal variables are private to the role which creates them.
shared by functions in session.
It is possible to find "some" typos, depending on the code: you can check
that a variable is both assigned and used somewhere, otherwise it is very
probably a typo. Perl does that *statically*, before executing a script.
"Before execution" .. I am speaking "without execution".
Before execution is also without execution. You can run "perl -c" to get the warning.

theoretically, if you check all possible functions, you can find some
issues - but you have to check all function on server.
Yes, sure. It seems to be the same with your proposal: if a hidden function drops and recreates a session variable with a different type, or changes its permission, then some static checks are void as well, that is life. Also, a SQL function may access and modify the variables unexpectedly, that would be missed by a PL/pgSQL analysis tool... There is no miracle.
No - metadata, in my design, are persistent - like tables - so you don't have to assume that any function can drop a variable. The deployment of secure variables is the same as table deployment. Nothing dynamic there.
Basically, there may be issues even if a static analysis tool says that all is well.

Elsewhere you cannot find a typo in a DECLARE statement.
Indeed, probably there exists some class of typos that may not be found by some static analysis implementations on PL/pgSQL functions which uses basic session variables.
yes, and I would not like to add any new case that cannot be covered by plpgsql check. Dynamic SQL and our temporary tables are enough issues already.
By the way, are you planing to contribute to the wiki?
https://wiki.postgresql.org/wiki/Variable_Design
I wrote my notes there.
--
Fabien.
On 30 December 2016 at 21:00, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> As for "slow", I have just tested overheads with pgbench, comparing a direct
> arithmetic operation (as a proxy to a fast session variable consultation) to
> constant returning plpgsql functions with security definer and security
> invoker, on a direct socket connection, with prepared statements:
>
> select 1 + 0 : 0.020 ms
> select one_sd() : 0.024 ms
> select one_si() : 0.024 ms

That's one call per executor run. Not really an effective test.
Consider cases like row security where you're testing 10000 rows.
Hopefully the planner will inline the test if it's a function declared
stable, but it may not.

> However the one-row property is just hoped for, and on principle a database
> is about declaring constraints that are enforced afterwards.
>
> I see two clean solutions to this use case: declaring tables as one row, or
> having scalar objects.

I agree that's a common issue.

The unique partial index on 1 hack in postgres works, though it's ugly.
Adding a whole new different storage concept seems like massive
overkill for this problem, which is minor and already easily solved.

Someone could make 1-row tables prettier with a new constraint type
instead maybe, if it's really considered that ugly. Personally I'd
just document the unique expression index hack.

CREATE UNIQUE INDEX onerow ON mytable((1));

>> * On what basis do you _oppose_ persistently defining variables in the
>> catalogs as their own entities?
>
> I understand that you are speaking of "persistent session variables".
>
> For me a database is about persistence (metadata & data) with safety
> (transactions) and security (permissions)... and maybe performance:-)
>
> Pavel's proposal creates a new object with 2 (secure metadata-persistence)
> out of 4 properties... I'm not at ease with introducing a new half-database
> concept in a database.

I strongly disagree. If you want "all-database" properties ... use tables.

We generally add new features when that's not sufficient to achieve
something. Most notably SEQUENCEs, which deliberately violate
transaction isolation and atomicity in order to deliver a compelling
benefit not otherwise achievable.

Similarly for advisory locking.

> On the other hand there are dynamic session variables (mysql, mssql, oracle
> have some variants) which are useful on their own without pretending to be
> database objects (no CREATE/ALTER/DROP, GRANT/REVOKE).

We have precedent here for sequences. Yes, they do confuse users, but
they're also VERY useful, and the properties of variables would be
clearer IMO.

I'm not especially attached to doing them as database objects; I'm
just as happy with something declared at session start by some
function that then intends to set and use the variable. But I don't
think your argument against a DDL-like approach holds water.

>> (My own objection is that "temporary variables" would make our existing
>> catalog bloat issues for temp objects even worse).
>
> I do agree that inefficient temporary variables are worthless, but ISTM that
> Pavel's proposal is not exactly about temporary variables, it is about
> temporary-valued permanent-variables. So there is no temporary (on the fly)
> variable as such, and if it is extended for this purpose then indeed the
> catalog costs look expensive.

I meant that we'd certainly want CREATE TEMPORARY VARIABLE for ones
that go away at end of session, if we were going to have
catalog-object-like variables. Which would result in catalog bloat.

> (1) Having some kind of variable, especially in interactive mode, allows to
> manipulate previous results and reuse them later, without having to resort
> to repeated sub-queries or to retype non trivial values.
>
> Client side psql :-variables are untyped and unescaped, thus not very
> convenient for this purpose.

You can currently (ab)use user defined GUCs for this. Ugly, but
effective, and honestly something we could bless into general use if
we decided to. It's not that bad.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2016-12-31 1:16 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 30 December 2016 at 21:00, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> As for "slow", I have just tested overheads with pgbench, comparing a direct
> arithmetic operation (as a proxy to a fast session variable consultation) to
> constant returning plpgsql functions with security definer and security
> invoker, on a direct socket connection, with prepared statements:
>
> select 1 + 0 : 0.020 ms
> select one_sd() : 0.024 ms
> select one_si() : 0.024 ms
That's one call per executor run. Not really an effective test.
Consider cases like row security where you're testing 10000 rows.
Hopefully the planner will inline the test if it's a function declared
stable, but it may not.
> However the one-row property is just hoped for, and on principle a database
> is about declaring constraints that are enforced afterwards.
>
> I see two clean solutions to this use case: declaring tables as one row, or
> having scalar objects.
I agree that's a common issue.
The unique partial index on 1 hack in postgres works, though it's ugly.
Adding a whole new different storage concept seems like massive
overkill for this problem, which is minor and already easily solved.
Someone could make 1-row tables prettier with a new constraint type
instead maybe, if it's really considered that ugly. Personally I'd
just document the unique expression index hack.
CREATE UNIQUE INDEX onerow ON mytable((1));
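Spelled out, the unique-expression-index hack works like this; a minimal sketch (the table and column names are illustrative):

```sql
-- Sketch of the one-row-table hack: a unique index on the constant (1)
-- makes every row collide with every other, so at most one row can exist.
CREATE TABLE mytable (val int);
CREATE UNIQUE INDEX onerow ON mytable((1));

INSERT INTO mytable VALUES (42);   -- succeeds
INSERT INTO mytable VALUES (43);   -- fails with a unique violation
UPDATE mytable SET val = 43;       -- updating the single row is still fine
```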
>> * On what basis do you _oppose_ persistently defining variables in the
>> catalogs as their own entities?
>
> I understand that you are speaking of "persistent session variables".
>
> For me a database is about persistence (metadata & data) with safety
> (transactions) and security (permissions)... and maybe performance:-)
>
> Pavel's proposal creates a new object with 2 (secure metadata-persistence)
> out of 4 properties... I'm not at ease with introducing a new half-database
> concept in a database.
I strongly disagree. If you want "all-database" properties ... use tables.
We generally add new features when that's not sufficient to achieve
something. Most notably SEQUENCEs, which deliberately violate
transaction isolation and atomicity in order to deliver a compelling
benefit not otherwise achievable.
Similarly for advisory locking.
> On the other hand there are dynamic session variables (mysql, mssql, oracle
> have some variants) which are useful on their own without pretending to be
> database objects (no CREATE/ALTER/DROP, GRANT/REVOKE).
We have precedent here for sequences. Yes, they do confuse users, but
they're also VERY useful, and the properties of variables would be
clearer IMO.
I'm not especially attached to doing them as database objects; I'm
just as happy with something declared at session start by some
function that then intends to set and use the variable. But I don't
think your argument against a DDL-like approach holds water.
>> (My own objection is that "temporary variables" would make our existing
>> catalog bloat issues for temp objects even worse).
>
>
> I do agree that inefficient temporary variables are worthless, but ISTM that
> Pavel's proposal is not exactly about temporary variables, it is about
> temporary-valued permanent-variables. So there is no temporary (on the fly)
> variable as such, and if it is extended for this purpose then indeed the
> catalog costs look expensive.
I meant that we'd certainly want CREATE TEMPORARY VARIABLE for ones
that go away at end of session, if we were going to have
catalog-object-like variables. Which would result in catalog bloat.
Because our catalog is MVCC, the bloating is unavoidable - but if we implement global temporary tables, then the metadata of temporary objects can be stored there and the main catalogue can stay stable.

But the question is: when would you use local temporary variables, when you cannot use global variables? Probably in ad hoc scripts, in interactive work, ... The impact on the catalogue is minimal.

The performance problems can appear in PL usage or in intensive application usage - and there global variables can be used.

An analogy with our temporary tables: if we can use global temporary tables in critical PL code, then local temporary tables can be a nice feature, perfect for interactive work, and nobody has to fix catalogue bloat.

Designing the possibility of local temporary variables is minimal work. I am not afraid about performance when developers can use global variables as an option.
Regards
Pavel
> (1) Having some kind of variable, especially in interactive mode, allows to
> manipulate previous results and reuse them later, without having to resort
> to repeated sub-queries or to retype non trivial values.
>
> Client side psql :-variables are untyped and unescaped, thus not very
> convenient for this purpose.
You can currently (ab)use user defined GUCs for this. Ugly, but
effective, and honestly something we could bless into general use if
we decided to. It's not that bad.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
>> unexpectedly, that would be missed by a PL/pgSQL analysis tool... There is
>> no miracle.
>
> No - metadata, in my design, are persistent - like tables - so you don't
> calculate so any functions can drop a variables. The deployment of secure
> variables is same like table deployment. No dynamic there.

You are missing my point: static analysis is about proving properties. If you need metadata to be persistent, then you should check that this is the case, i.e. the static analysis must check that there are no metadata changes anywhere. For instance, an analysis tool should reject a function which contains:

GRANT UPDATE ON VARIABLE super_secret_variable TO PUBLIC;

Or does:

DROP VARIABLE super_secret;
CREATE VARIABLE super_secret ...;

If a static analysis tool is specific to one language, then it can only check that all is well in functions in those languages, but as there may be functions written in other languages as well, the check is somehow partial. This is not a bad thing, it just illustrates that you cannot check everything. That is quality assurance.

>> [...] Indeed, probably there exists some class of typos that may not be
>> found by some static analysis implementations on PL/pgSQL functions
>> which use basic session variables.
>
> yes, and I would not to append any new case that cannot be covered by
> plpgsql check. Dynamic SQL and our temporal tables are enough issues
> already.

I'm not sure that I understand these sentences.

> I wrote my notes there.

Great! I restructured a little bit and tried to improve the English. I also added questions where some statements seem too optimistic, or are unclear to me.

--
Fabien.
2016-12-31 17:51 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
unexpectedly, that would be missed by a PL/pgSQL analysis tool... There is
no miracle.
No - metadata, in my design, are persistent - like tables - so you don't
calculate so any functions can drop a variables. The deployment of secure
variables is same like table deployment. No dynamic there.
You are missing my point: static analysis is about proving properties. If you need metadata to be persistent, then you should check that this is the case, i.e. the static analysis must check that there are no metadata changes anywhere. For instance, an analysis tool should reject a function which contains:
GRANT UPDATE ON VARIABLE super_secret_variable TO PUBLIC;
It doesn't need to reject these functions - but this information is not visible in any other function. And these tasks you do in deployment scripts - not in functions.
Or does:
DROP VARIABLE super_secret;
CREATE VARIABLE super_secret ...;
But you don't do that in functions - these variables are persistent - you don't create or drop them inside functions. The content is secure, so you don't need to hide this variable from others.
If a static analysis tool is specific to one language, then it can only check that all is well in functions in those languages, but as there may be functions written in other languages as well, the check is somehow partial. This is not a bad thing, it just illustrates that you cannot check everything. That is quality assurance.

[...] Indeed, probably there exists some class of typos that may not be found by some static analysis implementations on PL/pgSQL functions which use basic session variables.
yes, and I would not like to add any new case that cannot be covered by plpgsql check. Dynamic SQL and our temporary tables are enough issues already.
I'm not sure that I understand these sentences.
so I don't prefer any design that increases the area where plpgsql_check cannot work.
I wrote my notes there.
Great! I restructured a little bit and tried to improve the English. I also added questions where some statements seem too optimistic, or are unclear to me.
we have just different perspectives
Regards
Pavel
--
Fabien.
Hello Craig,

>> As for "slow", I have just tested overheads with pgbench, comparing a direct
>> arithmetic operation (as a proxy to a fast session variable consultation) to
>> constant returning plpgsql functions with security definer and security
>> invoker, on a direct socket connection, with prepared statements:
>>
>> select 1 + 0 : 0.020 ms
>> select one_sd() : 0.024 ms
>> select one_si() : 0.024 ms
>
> That's one call per executor run. Not really an effective test.

I really did 10 calls per transaction. For one call it was 24 ms vs 28 ms. However I'm not sure of the respective overheads of the protocol, planner and executor, though.

> Consider cases like row security where you're testing 10000 rows.

Another test: calling 1,000,000 times one_sd() or one_si() in a plpgsql loop seems to cost about 1.1 seconds on my laptop. I'd say that the function call is about 2/3 of that time, the rest is on the loop and exit test.

SELECT NOW();
DO LANGUAGE plpgsql $$
DECLARE count INT DEFAULT 0;
BEGIN
  LOOP
    count := count + ONE_SD();
    EXIT WHEN count = 1000000;
  END LOOP;
END; $$;
SELECT NOW();

Based on this evidence, I continue to think that there is no significant performance issue with calling simple security definer functions.

> Hopefully the planner will inline the test if it's a function declared
> stable, but it may not.

Indeed they are, so the planner should factor out the test when possible.

>>> * On what basis do you _oppose_ persistently defining variables in the
>>> catalogs as their own entities?
>>
>> I understand that you are speaking of "persistent session variables".
>>
>> For me a database is about persistence (metadata & data) with safety
>> (transactions) and security (permissions)... and maybe performance:-)
>>
>> Pavel's proposal creates a new object with 2 (secure metadata-persistence)
>> out of 4 properties... I'm not at ease with introducing a new half-database
>> concept in a database.
>
> I strongly disagree. If you want "all-database" properties ... use tables.

Sure. I am not sure what you are disagreeing with, as I'm just describing Pavel's proposal...

> We generally add new features when that's not sufficient to achieve
> something. Most notably SEQUENCEs, which deliberately violate
> transaction isolation and atomicity in order to deliver a compelling
> benefit not otherwise achievable.

Yes, sure.

>> On the other hand there are dynamic session variables (mysql, mssql, oracle
>> have some variants) which are useful on their own without pretending to be
>> database objects (no CREATE/ALTER/DROP, GRANT/REVOKE).
>
> We have precedent here for sequences. Yes, they do confuse users, but
> they're also VERY useful, and the properties of variables would be
> clearer IMO.

Yep. But my point is that before adding a new strange object type I would prefer that there is no other solution.

> I'm not especially attached to doing them as database objects; I'm
> just as happy with something declared at session start by some
> function that then intends to set and use the variable. But I don't
> think your argument against a DDL-like approach holds water.

I have expectations about objects held by a database, and these new objects fail them. If you do not have expectations, then all is fine.

>> (1) Having some kind of variable, especially in interactive mode, allows to
>> manipulate previous results and reuse them later, without having to resort
>> to repeated sub-queries or to retype non trivial values.
>>
>> Client side psql :-variables are untyped and unescaped, thus not very
>> convenient for this purpose.
>
> You can currently (ab)use user defined GUCs for this.

How? It seems that I have missed the syntax to assign the result of a query to a user-defined GUC, and to reuse it simply in a query.

--
Fabien.
>> DROP VARIABLE super_secret;
>> CREATE VARIABLE super_secret ...;
>
> But you don't do it in functions - these variables are persistent - you
> don't create it or drop inside functions. The content is secure, so you
> don't need to hide this variable against other.

ISTM that you are still missing my point.

I understood that you want a static analysis tool to reassure you about how your session variables are manipulated. I do not see how such a tool can give any assurance without checking that the variable metadata are not changed by some malicious code inserted in a function.

>>> I'm not sure that I understand these sentences.
>>
>> so I don't prefer any design that increase a area where plpgsql_check
>> should not work.

My assumption is that plpgsql_check can be improved. For instance, I assume that if "secure session variables" are added, then it will be enhanced to do some checking about these and take them into account. If "simple session variables" are added, I assume that it would also be updated accordingly.

>>> I wrote my notes there.
>>
>> Great! I restructured a little bit and tried to improve the English. I
>> also added questions when some statements that I think are too optimistic,
>> or are unclear to me.
>
> we have just different perspectives

I'm trying to have sentences that are both clear and true. If I think that a sentence is imprecise because it is missing a key hypothesis, then I try to improve it, whether it is mine or someone else's.

--
Fabien.
If you do not have expectations, then all is fine.

(1) Having some kind of variable, especially in interactive mode, allows to
manipulate previous results and reuse them later, without having to resort
to repeated sub-queries or to retype non trivial values.
Client side psql :-variables are untyped and unescaped, thus not very
convenient for this purpose.
You can currently (ab)use user defined GUCs for this.
How? It seems that I have missed the syntax to assign the result of a query to a user-defined guc, and to reuse it simply in a query.
postgres=# select set_config('myvar.text', (select current_timestamp::text), false);
+-------------------------------+
| set_config |
+-------------------------------+
| 2016-12-31 18:56:42.894246+01 |
+-------------------------------+
(1 row)
Time: 0,448 ms
postgres=# select current_setting('myvar.text');
+-------------------------------+
| current_setting |
+-------------------------------+
| 2016-12-31 18:56:42.894246+01 |
+-------------------------------+
(1 row)
--
Fabien.
2016-12-31 18:46 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
DROP VARIABLE super_secret;
CREATE VARIABLE super_secret ...;
But you don't do it in functions - these variables are persistent - you
don't create it or drop inside functions. The content is secure, so you
don't need to hide this variable against other.
ISTM that you are still missing my point.
I understood that you want a static analysis tool to re-assure you about how your session variables are manipulated. I do not see how such a tool can give any assurance without checking that the variable meta-data are not changed by some malicious code inserted in a function.
if you are afraid of this, then just use grep to verify the functions that contain this code. It is the same as with tables - you can generate them dynamically, but it is risky - similar to using dynamic SQL. Sure, there are exceptions - but there are rules for PL: don't use dynamic SQL if it is not absolutely necessary, use SQL security, not your own, ...
I'm not sure that I understand these sentences.
so I don't prefer any design that increase a area where plpgsql_check
should not work.
My assumption is that plpgsql_check can be improved. For instance, I assume that if "secure session variables" are added, then it will be enhanced to do some checking about these and take them into account. If "simple session variables" are added, I assume that it would also be updated accordingly.
with simple session variables there is no safe point - no authoritative point. Sure, I can do something - I can introduce some hints, etc. - but it is a workaround, nothing more - it is like C development without header files.
I wrote my notes there.
Great! I restructured a little bit and tried to improve the English. I
also added questions when some statement that I think are too optimistic,
or are unclear to me.
we have just different perspectives
I'm trying to have sentences that are both clear and true. If I think that a sentence is imprecise because it is missing a key hypothesis, then I try to improve it, whether it is mine or someone else's.
--
Fabien.
Hello Pavel, and Happy new year!

>>>> (1) Having some kind of variable, especially in interactive mode, allows to
>>>> manipulate previous results and reuse them later, without having to
>>>> resort to repeated sub-queries or to retype non trivial values.
>>>>
>>>> Client side psql :-variables are untyped and unescaped, thus not very
>>>> convenient for this purpose.
>>>
>>> You can currently (ab)use user defined GUCs for this.
>>
>> How? It seems that I have missed the syntax to assign the result of a
>> query to a user-defined guc, and to reuse it simply in a query.
>
> postgres=# select set_config('myvar.text', (select
> current_timestamp::text), false);

Thanks for the pointer! The documentation is rather scarce...

They are indeed session- or transaction-alive. They seem to be user-private, which is good. However they are text only, casts are needed in practice as shown by your example, and I find the syntax quite unfriendly for interactive use. I'm not sure about performance.

I have added a subsection about them in the wiki.

--
Fabien.
Hello Craig, and happy new year,

> Someone asked me off-list what use cases such a thing would have,
> since it seems not to be spelled out very clearly in this discussion.
> I think we're all assuming knowledge here.
>
> So.
>
> * Session starts
> * app does SELECT setup_user('user-auth-key-data', 'some-other-blob')
> ** setup_user is SECURITY DEFINER to 'appadmin'
> ** 'appadmin' owns a variable IS_AUDITOR. Other roles have only read
> access to it.
> ** setup_user(...) does whatever expensive/slow work it has to do
> ** setup_user sets USER_IS_AUDITOR var
> * Later RLS policies simply reference USER_IS_AUDITOR var. They don't
> need to know the 'user-auth-key-data', or do whatever expensive
> processing that it does.
> * Other later triggers, etc, also reference USER_IS_AUDITOR
> * User cannot make themselves an auditor by SETting USER_IS_AUDITOR
>
> That's the general idea.

After giving it some thought, I have a question about this use case wrt transactions:

What if setup_user() succeeds as a function but the transaction it belongs to fails for some reason (e.g. deferred constraints, some other operation related to setting the user up but outside of this function fails, there is a replication issue... whatever, a transaction may fail by definition)?

ISTM that the security model requires that USER_IS_AUDITOR is reverted, so although it is definitely a session variable, it must be transactional (MVCC) nevertheless.

--
Fabien.
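Under today's server, the setup_user pattern described above can only be approximated with a custom GUC set from a SECURITY DEFINER function. A hedged sketch follows - all names (appadmin, myapp.user_is_auditor, setup_user) are illustrative, and note that it deliberately lacks the property the thread is asking for, since any role can overwrite a GUC:

```sql
-- Sketch only: emulating the USER_IS_AUDITOR pattern with a custom GUC.
-- All names here are made up for illustration. This is NOT a substitute
-- for GRANT-able session variables: nothing stops the session user from
-- simply SETting myapp.user_is_auditor themselves.
CREATE FUNCTION setup_user(auth_key text) RETURNS void
LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
  -- ... expensive validation of auth_key would go here ...
  PERFORM set_config('myapp.user_is_auditor', 'true', false);
END; $$;

-- An RLS policy could then consult the flag, e.g.:
-- CREATE POLICY audit_rows ON audit_log FOR SELECT
--   USING (current_setting('myapp.user_is_auditor', true) = 'true');
```

The missing piece, per the proposal, is exactly the GRANT/REVOKE layer that would make the flag writable only through the arbitrating function.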
2017-01-01 11:28 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel, and Happy new year!

(1) Having some kind of variable, especially in interactive mode, allows to
manipulate previous results and reuse them later, without having to
resort to repeated sub-queries or to retype non trivial values.
Client side psql :-variables are untyped and unescaped, thus not very
convenient for this purpose.
You can currently (ab)use user defined GUCs for this.
How? It seems that I have missed the syntax to assign the result of a
query to a user-defined guc, and to reuse it simply in a query.

postgres=# select set_config('myvar.text', (select
current_timestamp::text), false);
Thanks for the pointer! The documentation is rather scarce...
They are indeed session or transaction-alive. They seem to be user-private, which is good. However they are text only, casts are needed in practice as shown by your example, and I find the syntax quite unfriendly for interactive use. I'm not sure about performance.
With some simple getter/setter functions you can get better comfort.
For non-text variables you need one more cast - probably only "date" and "timestamp" can be noticeably slower.
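Pavel's getter/setter suggestion can be sketched as thin SQL wrappers that hide the text casts; a sketch only, with made-up names (myvar.ts, set_ts, get_ts):

```sql
-- Sketch: typed wrappers around a custom GUC, hiding the text casts.
-- set_config(name, value, is_local) and current_setting(name) are the
-- built-in GUC functions; everything else here is illustrative.
CREATE FUNCTION set_ts(v timestamptz) RETURNS text
LANGUAGE sql AS $$
  SELECT set_config('myvar.ts', v::text, false);
$$;

CREATE FUNCTION get_ts() RETURNS timestamptz
LANGUAGE sql STABLE AS $$
  SELECT current_setting('myvar.ts')::timestamptz;
$$;

-- SELECT set_ts(now());
-- SELECT get_ts();   -- comes back typed, no explicit cast in the query
```

The caller then works with typed values; the text round-trip (and its cost) is confined to the two wrappers.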
Regards
Pavel
I have added a subsection about them in the wiki.
--
Fabien.
On 1 Jan. 2017 20:03, "Fabien COELHO" <coelho@cri.ensmp.fr> wrote:
What if setup_user() succeeds as a function but the transaction it belongs to fails for some reason (eg deferred constraints, other operation related to setting user up but outside of this function fails, there is replication issue... whatever, a transaction may fail by definition)?
ISTM that the security models requires that USER_IS_AUDITOR is reverted, so although it is definitely a session variable, it must be transactional (MVCC) nevertheless.
No strong opinion here.
IMO the simplest answer should be the main focus here: if it's session level, it's session level. Not kinda-session-level kinda-transaction-level.
I can see occasional uses for what you describe though. If we landed up with an xact scope option like we have for SET LOCAL GUCs, the option to mark it ON COMMIT RESET or ON COMMIT SET would be useful I guess. I'm not sure if it's worth the complexity.
I guess defaulting to rolling back variable effects on xact rollback would be ok too. Just kind of limiting.
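For reference, the xact-scoped behaviour Craig mentions already exists for GUCs: the third argument of set_config (is_local) corresponds to SET LOCAL and makes the value revert at transaction end. A sketch with an illustrative myvar.flag name:

```sql
-- Sketch: transaction-local GUC via is_local = true (the SET LOCAL form).
BEGIN;
SELECT set_config('myvar.flag', 'on', true);  -- visible only in this xact
SELECT current_setting('myvar.flag');         -- 'on' inside the xact
COMMIT;                                       -- the local value reverts here
-- Note: the local value reverts on COMMIT as well as on ROLLBACK, so this
-- gives ON COMMIT RESET semantics, not the commit-keeps / rollback-reverts
-- semantics a security use case would need.
```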
2017-01-02 3:06 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 1 Jan. 2017 20:03, "Fabien COELHO" <coelho@cri.ensmp.fr> wrote:
What if setup_user() succeeds as a function but the transaction it belongs to fails for some reason (eg deferred constraints, other operation related to setting user up but outside of this function fails, there is replication issue... whatever, a transaction may fail by definition)?
ISTM that the security model requires that USER_IS_AUDITOR is reverted, so although it is definitely a session variable, it must be transactional (MVCC) nevertheless.

No strong opinion here.

IMO the simplest answer should be the main focus here: if it's session level, it's session level. Not kinda-session-level kinda-transaction-level.

I can see occasional uses for what you describe though. If we landed up with an xact scope option like we have for SET LOCAL GUCs, the option to mark it ON COMMIT RESET or ON COMMIT SET would be useful I guess. I'm not sure if it's worth the complexity.
My proposal included support for transaction scope - an ON COMMIT RESET clause should be ok.
Regards
Pavel
I guess defaulting to rolling back variable effects on xact rollback would be ok too. Just kind of limiting.
Hello Craig,

>> What if setup_user() succeeds as a function but the transaction it
>> belongs to fails for some reason (eg deferred constraints, other
>> operation related to setting user up but outside of this function
>> fails, there is replication issue... whatever, a transaction may fail
>> by definition)?
>>
>> ISTM that the security model requires that USER_IS_AUDITOR is
>> reverted, so although it is definitely a session variable, it must be
>> transactional (MVCC) nevertheless.
>
> No strong opinion here.
>
> IMO the simplest answer should be the main focus here: if it's session
> level, it's session level. Not kinda-session-level kinda-transaction-level.

There is no contradiction between session level & transactions: a session executes transactions, fine. TEMP tables are MVCC *and* session level.

> I can see occasional uses for what you describe though.

My question is not strictly about use, it is about a key security point related to the presented use case, which is about security. The whole discussion of the thread is about somehow-secured session variables.

ISTM that if the transaction setting the value fails while the secure variable says that all is well, thus allowing other operations to proceed believing that the credentials have been vetted when in reality they have not, that's no good.

So my understanding of this use case is that the involved session variable which holds the state must be transactional. Other use cases may have different requirements and security implications.

> If we landed up with an xact scope option like we have for SET LOCAL
> GUCs,

ISTM that it is a little different. The GUC local option makes the variable value always disappear after the xact, whether it succeeds or not. The semantics needed here is that the value must disappear if the xact fails but not if it succeeds, which is the current behavior of GUCs with is_local=FALSE.

> the option to mark it ON COMMIT RESET or ON COMMIT SET would be
> useful I guess. I'm not sure if it's worth the complexity.

My question right now is rather to determine what are the precise and hard requirements of the use case.

> I guess defaulting to rolling back variable effects on xact rollback
> would be ok too. Just kind of limiting.

Yep, the variable value must be rolled back, I think.

--
Fabien.
2017-01-02 10:39 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Craig,

What if setup_user() succeeds as a function but the transaction it belongs to fails for some reason (eg deferred constraints, other operation related to setting user up but outside of this function fails, there is replication issue... whatever, a transaction may fail by definition)?
ISTM that the security models requires that USER_IS_AUDITOR is reverted, so although it is definitely a session variable, it must be transactional (MVCC) nevertheless.
No strong opinion here.
IMO the simplest answer should be the main focus here: if it's session
level, it's session level. Not kinda-sesion-level kinda-transaction-level.
There is no contradiction between session level & transactions: a session executes transactions, fine. TEMP tables are MVCC *and* session level.I can see occasional uses for what you describe though.
My question is not strictly about use, it is about a key security point related to the presented use case, which is about security. The whole discussion of the thread being about somehow-secured session variables.
ISTM that if the transaction setting the value fails and the secure variable says that all is well thus allows other operations to proceed believing that the credentials have been veted while in reality they have not, that's no good.
So my understanding of this use case is that the involved session variable which hold the state must be transactional. Other use cases may have different requirements and security implications.If we landed up with an xact scope option like we have for SET LOCAL GUCs,
ISTM that it is a little different. The GUC local option makes the variable value always disappear after the xacts, whether it succeeds or not. The semantics needed here is that the value must disappear if the xact fails but not if it succeeds, which is the current behavior of GUCs with is_local=FALSE.the option to mark it ON COMMIT RESET or ON COMMIT SET would be useful I guess. I'm not sure if it's worth the complexity.
My question right now is rather to determine what are the precise and hard requirements of the use case.I guess defaulting to rolling back variable effects on xact rollback would be ok too. Just kind of limiting.
Yep, the variable value must be rolled back, I think.
Attention! A rollback is significantly more expensive than a RESET.
There is no product where variables are transactional - we should not reinvent the wheel.
Regards
Pavel
--
Fabien.
Hello Pavel,

> In my proposal was support for transaction scope - ON COMMIT RESET clause
> should be ok

Could you update the wiki, both the proposal and the use-case implementation, to reflect this point?

Moreover, is there any actual use case for non-transactional secure half-persistent session variables? AFAICS the "secure" part implies both permissions and transactional for the presented security-related use case. If there is no use case for these combined features, then ISTM that you should update the proposal so that the variables are always transactional, which is simpler, more consistent, and I think more acceptable.

Also, you used a TEMPORARY session variable in one implementation, but this is not described in the proposal; I think it is worth mentioning it there as well.

--
Fabien.
>> Yep, the variable value must be rolled back, I think.
>
> Attention! rollback is significantly expensive than RESET.

I'm quite unclear about the difference... Transactional for an unshared only-in-memory session object is probably easy to implement, no WAL is needed... So I do not see the difference.

> There are no any product where variables are transactional - we should not
> to create wheel.

Well, AFAICS PostgreSQL GUCs are transactional.

--
Fabien.
2017-01-02 11:48 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,

In my proposal was support for transaction scope - ON COMMIT RESET clause should be ok

Could you update the wiki, both the proposal and the use-case implementation, to reflect this point?

Moreover, is there any actual use case for non-transactional secure half-persistent session variables? AFAICS the "secure" part implies both permissions and transactional for the presented security-related use case. If there is no use case for these combined features, then ISTM that you should update the proposal so that the variables are always transactional, which is simpler, more consistent, and I think more acceptable.
If you are transaction sensitive, then you have to be sensitive to subtransactions - then the work is much more complex.
Is there a use case where you would want to play with transactions and variables and RESET is not enough?
Also, you used a TEMPORARY session variable in one implementation, but this is not described in the proposal, I think it is worth mentioning it there as well.
I will fix it.
Regards
Pavel
--
Fabien.
>>> Attention! rollback is significantly expensive than RESET.
>>
>> I'm quite unclear about the difference... Transactional for an unshared
>> only-in-memory session object is probably easy to implement, no WAL is
>> needed... So I do not see the difference.
>
> you have to store previous value

This does not fully answer my question. Maybe RESET would put NULL instead of the previous value on a rollback? If so, I must admit that I do not see any fundamental issue with temporarily holding the initial value of an in-memory session variable so as to be able to roll it back if required...

>>> There are no any product where variables are transactional - we should
>>> not to create wheel.
>>
>> Well, AFAICS PostgreSQL GUCs are transactional.
>
> that is exception ..

That is just logic: if you make an argument based on "it does not exist", then the argument is void if someone produces a counter-example.

> show me some transactional variables from mssql, oracle, db2

I do not really know these three particular products. All I can say is that from a semantic point of view the contents of any one-row temporary relation is somehow a transactional session variable. However I do not know whether the 3 cited products have temporary tables, this is just a guess.

--
Fabien.
Hello,

>>> In my proposal was support for transaction scope - ON COMMIT RESET
>>> clause should be ok
>>
>> Could you update the wiki, both the proposal and the use-case
>> implementation, to reflect this point?
>>
>> Moreover, is there any actual use-case for non-transactional secure
>> half-persistent session variables? AFAICS the "secure" part implies both
>> permissions and transactional for the presented security-related use case.
>> If there is no use case for these combined features, then ISTM that you
>> should update the proposal so that the variables are always transactional,
>> which is both simpler, more consistent, and I think more acceptable.
>
> If you are transaction sensitive, then you have to be sensitive to
> subtransactions - then the work is much more complex.

Maybe, probably, I do not really know. For now, I'm trying to determine how the proposals fit Craig's use case. The current status is that both proposals are useless because the use case needs "some" transactional property for security. But probably some improvements are possible.

> Is there use case, when you would to play with transactions and variables
> and RESET is not enough?

I do not know. If you explain more clearly what is meant by a "RESET" on a variable when the transaction fails, then maybe I can have an opinion. Currently I'm just guessing in the dark at the precise intended semantics.

--
Fabien.
2017-01-02 16:55 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello,

In my proposal was support for transaction scope - ON COMMIT RESET clause should be ok
Could you update the wiki, both the proposal and the use-case
implementation, to reflect this point?
Moreover, is there any actual use-case for non-transactional secure
half-persistent session variables? AFAICS the "secure" part implies both
permissions and transactional for the presented security-related use case.
If there is no use case for these combined features, then ISTM that you
should update to proposal so that the variables are always transactional,
which is both simpler, more consistent, and I think more acceptable.
If you are transaction sensitive, then you have to be sensitive to
subtransactions - then the work is much more complex.
Maybe, probably, I do not really know. For now, I'm trying to determine how the proposals fit Craig's use case.
The current status is that both proposals are useless because the use case needs "some" transactional property for security. But probably some improvements are possible.

Is there a use case where you would want to play with transactions and variables and RESET is not enough?
I do not know. If you explain more clearly what is meant by a "RESET" on a variable when the transaction fails, then maybe I can have an opinion. Currently I'm just guessing in the dark the precise intended semantics.
reset can mean "set to default"
Now when I think about it - this scenario is not interesting for PL - it can probably be interesting for some interactive work. In PL you can handle transactions - so you know whether or not there was any exception. And if you didn't handle the exception, then you are in a "needs rollback" state, so you cannot do anything - not even look at the variable value. In PL the transaction start is usually what matters - a difficult question is whether that can mean a subtransaction start too.
Regards
Pavel
--
Fabien.
Hello Pavel,

PLEASE, could you remove the parts of emails you are not responding to when replying in the thread? THANKS.

>>>> The current status is that both proposals are useless because the use
>>>> case needs "some" transactional property for security. But probably
>>>> some improvements are possible.
>>>
>>> Is there use case, when you would to play with transactions and
>>> variables and RESET is not enough?
>>
>> I do not know. If you explain more clearly what is meant by a "RESET" on a
>> variable when the transaction fails, then maybe I can have an opinion.
>> Currently I'm just guessing in the dark the precise intended semantics.
>
> reset can means "set to default"

"Can"? The question is what it means in your proposal, not what it may mean. So I understand that it means "the variable is always reset to its default value at the end of the transaction".

> Now when I though about it - this scenario is not interesting for PL -
> probably can be interesting for some interactive work. In PL you can
> handle transactions - so you know if was or was not any exceptions. And
> if you didn't handle the exception, then you are in "need rollback
> state", so you cannot to anything - look on variable value too. In PL is
> usually important transaction start - difficult question if it can means
> subtransaction start too.

What I understand from this use case variation is that the secure variable is expected to be set & used only *within* a single transaction, although across multiple functions typically called from some server-side PL script, so that its value outside of the transaction does not matter wrt security concerns. Did I understand correctly?

For this use case, ISTM that the scope of the variable is necessarily the transaction, not the session, i.e. like using "set_config(..., TRUE)".

--
Fabien.
2017-01-03 13:03 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,
PLEASE, could you remove the parts of emails you are not responding to when replying in the thread? THANKS.

The current status is that both proposals are useless because the use case needs "some" transactional property for security. But probably some improvements are possible.
Is there a use case where you would want to play with transactions and variables and RESET is not enough?
I do not know. If you explain more clearly what is meant by a "RESET" on a
variable when the transaction fails, then maybe I can have an opinion.
Currently I'm just guessing in the dark the precise intended semantics.
reset can mean "set to default"
"can"? The question is what does it mean in your proposal, not what it may mean. So I understand that it means "variable always reset to its default value at the end of the transaction".
yes
Now when I think about it - this scenario is not interesting for PL - it can probably be interesting for some interactive work. In PL you can handle transactions - so you know whether or not there was any exception. And if you didn't handle the exception, then you are in a "needs rollback" state, so you cannot do anything - not even look at the variable value. In PL the transaction start is usually what matters - a difficult question is whether that can mean a subtransaction start too.
What I understand from this use case variation is that the secure variable is expected to be set & used only *within* a single transaction, although across multiple functions typically called from some server-side PL script, so that its value outside of the transaction does not matter wrt security concerns. Did I understand correctly?
When you are running under only one transaction, then you don't need to solve resetting variables on rollback, because you cannot do anything when the system fails. Only when you are handling an exception does the system continue, and then you can set the variable or not.

The usual scenario of using secure content is not related to transactions - it is related to the session.

There are two kinds of functions:

A. slow and expensive ones that create the secure content/context

B. others that use the secure content/context

When you are running A, then some secure context is initialised or it is invalidated. When A fails, then B doesn't work. When A is successful, then the context is valid until another call of A. The next call of A sets the context or invalidates it.

Transactions play no part in this game.

The content of a variable (used in PL) is defined by scope - not by successful or unsuccessful transactions. The security context will stay valid even if the user has to roll back.
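For illustration only, this two-function split could be emulated today with a custom GUC. All names here (app.ctx, build_context, use_context) are hypothetical, and a custom GUC lacks the access control the proposal would add:

```sql
-- Hypothetical emulation of the A/B pattern with a custom GUC.
-- A: slow function that establishes the secure context for the session.
CREATE FUNCTION build_context() RETURNS void AS $$
BEGIN
  -- ... expensive vetting work would go here ...
  PERFORM set_config('app.ctx', 'valid', false);  -- false = session scope
END;
$$ LANGUAGE plpgsql;

-- B: cheap function that refuses to run unless A succeeded.
CREATE FUNCTION use_context() RETURNS text AS $$
BEGIN
  IF current_setting('app.ctx', true) IS DISTINCT FROM 'valid' THEN
    RAISE EXCEPTION 'secure context not initialised';
  END IF;
  RETURN 'ok';
END;
$$ LANGUAGE plpgsql;
```

Note that since set_config(..., false) is transactional, if build_context() runs inside a transaction that later rolls back, app.ctx reverts as well - which is exactly the property under debate in this thread.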
For this use-case, ISTM that the scope of the variable is necessarily the transaction, not the session, i.e. like using "set_config(..., TRUE)".
--
Fabien.
Hello again,

*** PLEASE, could you remove the parts of emails you are not responding to when replying in the thread? THANKS. ***

>> [...] Did I understand?

I guess that the answer is "no" :-)

> When you are running under only one transaction, then you don't need to
> solve reset variables on rollback, because you cannot do anything when
> system fails. Only when you are handling a exception, then system
> continue, and you can or you cannot to set the variable.

Sorry, I do not understand these sentences.

> The usual scenario of using secure content is not related to
> transactions - it is related to session. There are two kind of functions
>
> A. slow and expensive that creates secure content/context
> B. other that use secure content/context
>
> When you are running A, then some secure context is initialised or it is
> invalided. When A fails, then B doesn't work.

Yes, I understand that it is the expected property: B must not work if A has failed... I'm trying to understand what properties are required on the session variables wrt how A ran to achieve this.

> When A is successful, then context is valid to another call of A. Next
> call of A set context or invalidate context. The transactions play
> nothing in this game.

Anything in PostgreSQL always runs under a transaction... My concern is the following:

  -- in a session, there is a transaction
  BEGIN;
  SELECT A(...);
  -- in A:
  --  -> check this and that ...
  --  -> insert in log ...
  --  -> update something else ...
  --  -> all seems fine...
  --  SET SESSION VARIABLE status_ok = TRUE;
  --  -> could do something else...
  -- return from A
  ROLLBACK;
  -- the commit fails, because some deferred trigger somewhere is unhappy
  -- or the user changed their mind...

Now A has failed, but this could not be known from within the function, and status_ok is wrong. If the session proceeds with:

  SELECT B();

Then B believes that A succeeded, which is not the case.
The key issue is that the final status (commit or rollback) of the containing transaction cannot be known from within the function, so the session variables cannot reflect this status.

So somehow the status_ok variable must be (1) rolled back to its previous value, or (2) removed, or (3) set to FALSE, or (4) set to NULL. It cannot be TRUE if the transaction containing A has failed, for the security as I understand it.

Maybe it could work with subtransactions: A calls A', A' succeeds (return, COMMIT is ok), *then* sets user_status = ok. The session variable reflects that A' succeeded, and if A fails later it is ok because the security is based on the success of A', not the one of A. However, I'm not sure how subtransactions can be stated simply within a session in pg.

> The content of variable (used in PL) is defined by scope - not by
> successful or unsuccessful transactions. The security content will be valid
> although user have to do rollback.

I do not understand how the "security context" can be valid if there has been a rollback which has cancelled all operations: some log may not have been written, for instance, which would be a key assumption for establishing the validity of the security context.

--
Fabien.
2017-01-03 15:40 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello again,
*** PLEASE, could you remove the parts of emails you are not responding to
when replying in the thread? THANKS. ***

[...] Did I understand?

I guess that the answer is "no" :-)

When you are running under only one transaction, then you don't need to solve reset variables on rollback, because you cannot do anything when system fails. Only when you are handling a exception, then system continue, and you can or you cannot to set the variable.

Sorry, I do not understand these sentences.

The usual scenario of using secure content is not related to transactions - it is related to session. There are two kind of functions
A. slow and expensive that creates secure content/context
B. other that use secure content/context
When you are running A, then some secure context is initialised or it is
invalided. When A fails, then B doesn't work.
Yes, I understand that it is the expected property: B must not work if A has failed... I'm trying to understand what properties are required on the session variables wrt how A ran to achieve this.

When A is successful, then context is valid to another call of A. Next call of A set context or invalidate context. The transactions play nothing in this game.

Anything in PostgreSQL always runs under a transaction... My concern is the following:
-- in a session, there is a transaction
BEGIN;
SELECT A(...);
-- in A:
-- -> check this and that ...
-- -> insert in log ...
-- -> update something else ...
-- -> all seems fine...
-- SET SESSION VARIABLE status_ok = TRUE;
-- -> could do something else...
-- return from A
ROLLBACK;
-- the commit fails, because some deferred trigger somewhere is unhappy
-- or the user changed its mind...
Now A has failed, but this could not be known from within the function, and the status_ok is wrong. If the session proceeds with:
SELECT B();
Then B believes that A succeeded, which is not the case.
No, just your design is unhappy
SELECT A(..)
SET SESSION VARIABLE status_ok = false;
-- do all; if it fails there, then the following line fails too, or is never executed
SET SESSION VARIABLE status_ok = true;
or
SET SESSION VARIABLE status_ok = true
TRY
do something
CATCH
ROLLBACK
SET SESSION VARIABLE status_ok = false
I can do both in current PL.
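A rough PL/pgSQL rendering of the second pattern, using a custom GUC app.status_ok as a stand-in for the proposed session variable (names hypothetical; the BEGIN/EXCEPTION block plays the role of TRY/CATCH via its implicit savepoint):

```sql
-- Hypothetical sketch: set the flag, do the work, clear the flag on error.
CREATE FUNCTION setup_user() RETURNS void AS $$
BEGIN
  PERFORM set_config('app.status_ok', 'true', false);
  -- ... "do something": checks, inserts, updates ...
EXCEPTION WHEN OTHERS THEN
  -- the block's work is rolled back to the implicit savepoint,
  -- then the flag is cleared before re-raising the error
  PERFORM set_config('app.status_ok', 'false', false);
  RAISE;
END;
$$ LANGUAGE plpgsql;
```

Fabien's objection below still applies: if the surrounding transaction fails *after* setup_user() returns (e.g. a deferred constraint at COMMIT), the handler never runs. With a transactional store such as set_config the flag happens to revert anyway, but a non-transactional session variable would be left at 'true'.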
The key issue is that the final status (commit or rollback) of the containing transaction cannot be known from within the function, so the session variables cannot reflect this status.
So somehow the status_ok variable must be (1) rolled back to its previous value, or (2) removed, or (3) set to FALSE, or (4) set to NULL. It cannot be TRUE if the transaction containing A has failed, for the security as I understand it.
you don't need to roll back the variable if you write A well
Maybe it could work with subtransactions: A calls A', A' succeeds (return, COMMIT is ok), *then* sets user_status = ok. The session variable reflects that A' succeeded, and if A fails later it is ok because the security is based on the success of A', not the one of A. However, I'm not sure how subtransactions can be stated simply within a session in pg.

The content of variable (used in PL) is defined by scope - not by successful or unsuccessful transactions. The security content will be valid although user have to do rollback.
I do not understand how the "security context" can be valid if there has been a rollback which has cancelled all operations: some log may not have been written, for instance, which would be a key assumption for establishing the validity of the security context.
If you use the patterns that I wrote, the security context will always be valid
--
Fabien.
****** PLEASE ******
COULD YOU REMOVE THE PARTS OF EMAILS YOU ARE NOT RESPONDING TO WHEN REPLYING IN THE THREAD?
****** THANKS ******

>> [...] Then B believes that A succeeded, which is not the case.
>
> No, just your design is unhappy
> SELECT A(..)
> SET SESSION VARIABLE status_ok = false;
> -- do all, if fails there,
> -- then follow line fails too, or never be executed
> SET SESSION VARIABLE status_ok = true;

My point is that there is no commit in this code: the commit is performed *AFTER* the last set session, and it may fail then.

> or
>
> SET SESSION VARIABLE status_ok = true
> TRY
>   do something
> CATCH
>   ROLLBACK
>   SET SESSION VARIABLE status_ok = false
>
> Both I can do in current PL

The fact that "do something" worked does not prevent the overall transaction from failing later, reverting "do something" and leaving status_ok as true.

>> The key issue is that the final status (commit or rollback) of the
>> containing transaction cannot be known from within the function, so the
>> session variables cannot reflect this status.
>>
>> So somehow the status_ok variable must be (1) rolled back to the previous
>> value or (2) removed or (3) set to FALSE or (4) set to NULL. It cannot be
>> TRUE if the transaction containing A has failed, for the security as I
>> understand it.
>
> you don't need do rollback variable if you write well A

My point is that A may still fail *after* setting the variable, because it is in a transaction.

> If you use patterns that I wrote - the security context will be valid
> always.

No: this pattern assumes that operations started in the "TRY" zone cannot fail later on... This assumption is false because of possible deferred triggers, for instance. See attached example:

  NOTICE: SET secured = FALSE
  NOTICE: SET secured = TRUE
  ERROR: insert or update on table "log" violates foreign key constraint "log_sid_fkey"
  DETAIL: Key (sid)=(3) is not present in table "stuff".

The error occurs after secured has been set to TRUE.

--
Fabien.
2017-01-03 17:33 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
****** PLEASE ******
COULD YOU REMOVE THE PARTS OF EMAILS YOU ARE NOT RESPONDING TO WHEN REPLYING IN THE THREAD?
****** THANKS ******

[...] Then B believes that A succeeded, which is not the case.

No, just your design is unhappy:

SELECT A(..)
SET SESSION VARIABLE status_ok = false;
-- do all, if fails there,
-- then follow line fails too, or never be executed
SET SESSION VARIABLE status_ok = true;
My point is that there is no commit in this code: the commit is performed *AFTER* the last set session, and it may fail then.
or
SET SESSION VARIABLE status_ok = true
TRY
do something
CATCH
ROLLBACK
SET SESSION VARIABLE status_ok = false
Both I can do in current PL
The fact that "do something" worked does not preclude the overall transaction to work and to revert "do something" and let status_ok as true.The key issue is that the final status (commit or rollback) of theyou don't need do rollback variable if you write well A
containing transaction cannot be known from within the function, so the
session variables cannot reflect this status.
So somehow the status_ok variable must be (1) rolledback to previous value
or (2) removed (3) set to FALSE or (4) set to NULL. It cannot be TRUE if A
containing transactions has failed for the security as I understand it.
My point is that A may still fail *after* setting the variable, because it is in a transaction.If you use patterns that I wrote - the security context will be valid
always.
No: This pattern assumes that operations started in the "TRY" zone cannot fail later on... This assumption is false because of possible deferred triggers for instance. See attached example:
ok .. it is pretty artificial, but ok. In this case the reset to NULL on ROLLBACK should be enough.
NOTICE: SET secured = FALSE
NOTICE: SET secured = TRUE
ERROR: insert or update on table "log" violates foreign key constraint "log_sid_fkey"
DETAIL: Key (sid)=(3) is not present in table "stuff".
The error occurs after secured has been set to TRUE.
It is possible only if you use deferred constraints. It is hard to imagine this scenario in functions like A. You probably would not want to risk losing log information on a rollback, so you would use elog there or some form of autonomous transaction.
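A minimal reconstruction of the kind of deferred-constraint failure the attachment demonstrates (table and constraint names follow the error messages quoted above; the exact script is a guess):

```sql
-- Reconstruction of the deferred-constraint scenario: the FK check
-- only fires at COMMIT, after "secured" has already been set to TRUE.
CREATE TABLE stuff (sid int PRIMARY KEY);
CREATE TABLE log (sid int REFERENCES stuff(sid)
                  DEFERRABLE INITIALLY DEFERRED);

BEGIN;
SELECT set_config('app.secured', 'false', false);
INSERT INTO log VALUES (3);   -- accepted for now: the FK check is deferred
SELECT set_config('app.secured', 'true', false);
COMMIT;  -- fails here: key (sid)=(3) is not present in "stuff"
```

Because set_config is transactional, app.secured reverts when the COMMIT fails; a non-transactional session variable would be left at 'true', which is exactly the hole being discussed.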
--
Fabien.
>> ****** PLEASE ******
>> COULD YOU REMOVE THE PARTS OF EMAILS YOU ARE NOT RESPONDING TO WHEN
>> REPLYING IN THE THREAD?
>> ****** THANKS ******

Hmmm. It seems that you can't. You should, really.

>>> If you use patterns that I wrote - the security context will be valid
>>> always.
>>
>> No: This pattern assumes that operations started in the "TRY" zone
>> cannot fail later on... This assumption is false because of possible
>> deferred triggers for instance. See attached example:
>
> ok .. it is pretty artificial, but ok.

Good. We seem to agree that some kind of transactional support is needed for the use case, which is pretty logical.

> In this case the reset to NULL on ROLLBACK should be enough.

Probably. So I expect that you are going to update your proposal somehow to provide some transactional properties.

Note that if you have some mechanism for doing a NULL on rollback, then why not just keep and restore the previous value if needed? This just means that you have a transactional variable, which is fine from my point of view. As I already wrote, session variables are memory only, so transactional does not involve costs such as WAL.

Also note that user-defined GUCs already implement the transactional property, so probably the mechanism is already available and can be reused.

--
Fabien.
On 1/3/17 10:33 AM, Fabien COELHO wrote:
> ****** PLEASE ******
>
> COULD YOU REMOVE THE PARTS OF EMAILS YOU ARE NOT RESPONDING TO WHEN
> REPLYING IN THE THREAD?
>
> ****** THANKS ******

+1. Frankly, I've been skipping most of your (Pavel) replies in this thread because of this.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2017-01-03 18:52 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
****** PLEASE ******
COULD YOU REMOVE THE PARTS OF EMAILS YOU ARE NOT RESPONDING TO WHEN
REPLYING IN THE THREAD?
****** THANKS ******
Hmmm. It seems that you can't. You should, really.
I am sorry - the Gmail client hides these parts from me. I'll clean it up more.
If you use patterns that I wrote - the security context will be valid always.
No: This pattern assumes that operations started in the "TRY" zone cannot fail later on... This assumption is false because of possible deferred triggers for instance. See attached example:
ok .. it is pretty artificial, but ok.
Good. We seem to agree that some kind of transactional support is needed for the use case, which is pretty logical.

In this case the reset to NULL on ROLLBACK should be enough.
Probably.
So I expect that you are going to update your proposal somehow to provide some transactional properties.
Note that if you have some mechanism for doing a NULL on rollback, then why not just keep and restore the previous value if needed? This just means that you have a transactional variable, which is fine from my point of view. As I already wrote, session variables are memory only, so transactional does not involve costs such as WAL.
There is no cost such as WAL - but on any update, you have to check whether this is the first update in the transaction, and if it is, then you have to create a new memory context and register a callback that will be evaluated on rollback.
There is probably not a big difference between RESET and UNDO in implementation complexity. You have to do a partial implementation of MVCC. No simple code.
Also note that user-defined GUCs already implement the transactional property, so probably the mechanism is already available and can be reused.
GUCs are stack based - the value doesn't depend on whether the transaction was successful or not.
--
Fabien.
Hello,

> Probably there is not big difference between RESET and UNDO in complexity
> of implementation. You have to do partial implementation of MVCC. No simple
> code.

I think so; yes; indeed.

>> Also note that user-defined GUCs already implements the transactional
>> property, so probably the mecanism is already available and can be reused.
>
> GUC are stack based - the value doesn't depends if transaction was
> successful or not.

Hmmm... this looks transactional to me:

  SELECT set_config('x.x', 'before', FALSE); -- 'before'
  BEGIN;
  SELECT set_config('x.x', 'within', FALSE); -- 'within'
  ROLLBACK;
  SELECT current_setting('x.x'); -- 'before'
  BEGIN;
  SELECT set_config('x.x', 'inside', FALSE); -- 'inside'
  COMMIT;
  SELECT current_setting('x.x'); -- 'inside'

I would say the stack is needed for SAVEPOINT:

  SELECT set_config('x.x', 'before', FALSE); -- 'before'
  BEGIN;
  SELECT set_config('x.x', 'within', FALSE); -- 'within'
  SAVEPOINT within;
  SELECT set_config('x.x', 'inside', FALSE); -- 'inside'
  SELECT current_setting('x.x'); -- 'inside'
  ROLLBACK TO SAVEPOINT within;
  SELECT current_setting('x.x'); -- 'within'
  SELECT set_config('x.x', 'further', FALSE); -- 'further'
  ROLLBACK;
  SELECT current_setting('x.x'); -- 'before'

So basically the use case needs GUCs with some access control. Or just role-private GUCs and some access function tricks would do as well for the use case. At least it is probably much easier to add privacy to GUCs than to (re)implement permissions and MVCC on some session variables. And it would be nice if GUCs could be typed as well...

--
Fabien.
2017-01-03 20:56 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello,

Probably there is not big difference between RESET and UNDO in complexity
of implementation. You have to do partial implementation of MVCC. No simple
code.
I think so; yes; indeed.

Also note that user-defined GUCs already implements the transactional
property, so probably the mecanism is already available and can be reused.
GUC are stack based - the value doesn't depends if transaction was
successful or not.
Hmmm... this looks transactional to me:
SELECT set_config('x.x', 'before', FALSE); -- 'before'
BEGIN;
SELECT set_config('x.x', 'within', FALSE); -- 'within'
ROLLBACK;
SELECT current_setting('x.x'); -- 'before'
BEGIN;
SELECT set_config('x.x', 'inside', FALSE); -- 'inside'
COMMIT;
SELECT current_setting('x.x'); -- 'inside'
I would say the stack is needed for SAVEPOINT:
SELECT set_config('x.x', 'before', FALSE); -- 'before'
BEGIN;
SELECT set_config('x.x', 'within', FALSE); -- 'within'
SAVEPOINT within;
SELECT set_config('x.x', 'inside', FALSE); -- 'inside'
SELECT current_setting('x.x'); -- 'inside'
ROLLBACK TO SAVEPOINT within;
SELECT current_setting('x.x'); -- 'within'
SELECT set_config('x.x', 'further', FALSE); -- 'further'
ROLLBACK;
SELECT current_setting('x.x'); -- 'before'
So basically the use case needs GUCs with some access control. Or just role-private GUCs and some access function tricks would do as well for the use case. At least it is probably much easier to add privacy to gucs than to (re)implement permissions and MVCC on some session variables. And it would be nice if GUCs could be typed as well...
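The "access function tricks" mentioned here could look like the following sketch (all names hypothetical): a SECURITY DEFINER pair guards a custom GUC so that only the definer's vetting logic can set it:

```sql
-- Hypothetical sketch: emulate an access-controlled session variable
-- with a custom GUC wrapped in SECURITY DEFINER functions.
CREATE FUNCTION set_status_ok(val boolean) RETURNS void AS $$
BEGIN
  -- arbitrary vetting logic would go here before granting the flag
  PERFORM set_config('app.status_ok', val::text, false);
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE FUNCTION get_status_ok() RETURNS boolean AS $$
  SELECT COALESCE(current_setting('app.status_ok', true), 'false')::boolean;
$$ LANGUAGE sql STABLE SECURITY DEFINER;

REVOKE EXECUTE ON FUNCTION set_status_ok(boolean) FROM PUBLIC;
```

The known weakness, and the reason the thread argues for real privileged variables: nothing stops the session user from running SET app.status_ok = 'true' directly, since custom GUCs carry no permissions of their own.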
With respect, I don't share your opinion - it is not enough for usage like package variables, which usually should not have any dependency on transactions.
Moreover it is dynamic - it would be a hard inconsistency to implement a CREATE or DECLARE statement for a GUC. So it is outside my proposal (and my goal).
Regards
Pavel
--
Fabien.
> With respect, I don't share your opinion - it is not enough for usage like
> package variables, which usually should not have any dependency on
> transactions.

I'm not sure I understand your point. If Oracle provides unsafe package variables that can fool auditors, it is not a sufficient reason for Pg to provide the same doubtful feature. And if they have sub-transactions then their feature may not necessarily be unsafe, at least if the coding is careful, but this point does not apply to pg.

> Moreover, GUCs are dynamic - it would be a hard inconsistency to implement a
> CREATE or DECLARE statement for them. So this is outside my proposal (and my
> goal).

I have added a few questions/remarks about your updated proposal in the wiki. Feel free to update/answer/discuss these.

I have also updated and simplified the "simple session variable" description, because now I'm convinced that they must be transactional, and that a distinct declaration statement is a pain.

--
Fabien.
2017-01-04 9:56 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
With respect, I don't share your opinion - it is not enough for usage like
package variables, which usually should not have any dependency on
transactions.
I'm not sure I understand your point. If Oracle provides unsafe package variables that can fool auditors, it is not a sufficient reason for Pg to provide the same doubtful feature. And if they have sub-transactions then their feature may not necessarily be unsafe, at least if the coding is careful, but this point does not apply to pg.
Unsafe is the wrong word - you are the first person I know of who expects transactional behaviour from variables. Variables come from the procedural world, where there are no transactions.
Your mental model of variables is pretty artificial - it is strange that Oracle, MSSQL and DB2 did not find in 30 years that variables should be transactional.
I agree that there can be some advantages - but I disagree that transactional behaviour is a major and required feature. There are possible artefacts on the border between the transactional and untransactional worlds, so developers should use patterns that reduce the negative impact of these artefacts.
Moreover, GUCs are dynamic - it would be a hard inconsistency to implement a
CREATE or DECLARE statement for them. So this is outside my proposal (and my goal).
I have added a few questions/remarks about your updated proposal in the wiki. Feel free to update/answer/discuss these.
I have also updated and simplified the "simple session variable" description, because now I'm convinced that they must be transactional, and that a distinct declaration statement is a pain.
I respect your opinion and don't agree with it.
Regards
Pavel
--
Fabien.
On 4 January 2017 at 17:31, Pavel Stehule <pavel.stehule@gmail.com> wrote:

>> I have also updated and simplified the "simple session variable"
>> description, because now I'm convinced that they must be transactional, and
>> that a distinct declaration statement is a pain.
>
> I respect your opinion and don't agree with it.

Yeah. I'm pretty overwhelmingly unconvinced too.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hello,

>> I'm not sure I understand your point. If Oracle provides unsafe package
>> variables that can fool auditors, it is not a sufficient reason for Pg to
>> provide the same doubtful feature. And if they have sub-transactions then
>> their feature may not necessarily be unsafe, at least if the coding is
>> careful, but this point does not apply to pg.
>
> Unsafe is the wrong word - you are the first person I know of who expects
> transactional behaviour from variables. Variables come from the procedural
> world, where there are no transactions.

We have established that the correctness of the security-context use case presented by Craig requires transactional variables. This is not my fault. If you present a new feature to implement this use case, then it must match the case requirements.

> Your mental model of variables is pretty artificial - it is strange that
> Oracle, MSSQL and DB2 did not find in 30 years that variables should be
> transactional.

As already said, Pg GUCs are transactional, so is Pg out of its mind? Maybe it is not the case in Oracle when programmed with PL/SQL; then fine. As I said, your pattern can be correct iff a sub-transaction is used. If Oracle has sub-transactions then untransactional variables can be used for the use case by setting them outside the security verification transaction. So what is maybe fine in Oracle is not fine with Pg without subtransactions.

> I agree that there can be some advantages - but I disagree that transactional
> behaviour is a major and required feature.

Hmmm. I strongly oppose adding a feature which does not correctly implement the use case it is designed for.

> There are possible artefacts on the border between the transactional and
> untransactional worlds, so developers should use patterns that reduce the
> negative impact of these artefacts.

I do not think that probabilistic security is a good sales pitch.
Moreover there is no particular issue with implementing the needed feature; all the mechanisms are already available in Pg, so it looks masochistic to refuse to implement a feature which is already available and happens to be necessary for the use-case correctness.

--
Fabien.
>> I respect your opinion and don't agree with it.
>
> Yeah. I'm pretty overwhelmingly unconvinced too.

I'm lost.

The security-related use case you have presented stores the status of the verification in a variable. If the variable is untransactional, then it has been shown that the variable status may say ok while the verification has actually failed. This means that subsequent operations would be executed believing wrongly that the security check was ok. Not good.

Moreover, there is no special cost in implementing transactional behaviour on session variables, as it is already done by pg. It can probably be reused. An alternative is to implement sub (nested) transactions, like Oracle and MS SQL Server... but that would be quite some work.

So basically I do not see any point in not doing transactional variables.

--
Fabien.
> An alternative is to implement sub (nested) transactions, like Oracle
> and MS SQL Server... but that would be quite some work.

As a complement, a search showed that IBM DB2, cited as a reference by Pavel, has AUTONOMOUS transactions, which look pretty much like nested transactions. The documentation presents an interesting security-related use case:

https://www.ibm.com/developerworks/data/library/techarticle/dm-0907autonomoustransactions/

The idea is that an application must record an attempt to access data even if the attempt fails and is rolled back.

This feature, used carefully within an appropriate pattern, would allow ensuring that if the setup transaction fails then the session status is FALSE. One possible inconsistency which may arise with sub-xacts is that the status may stay FALSE while the setup has succeeded; however, this is on the safe side with respect to the security use case.

--
Fabien.
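The DB2 audit pattern described above can be sketched as a toy simulation. No real database is involved and all names are invented for illustration: the "autonomous" unit commits its audit record immediately, so the record survives even when the enclosing transaction rolls back.

```python
# Toy model of the autonomous-transaction audit pattern: the audit entry
# is "committed" independently of the outer transaction, so a failed
# (rolled-back) access attempt still leaves a trace.

audit_log = []        # persists regardless of the outer transaction's fate

def autonomous_audit(entry):
    # An autonomous transaction commits on its own, immediately.
    audit_log.append(entry)

def access_data(user, allowed):
    outer_writes = []                 # the outer transaction's pending work
    autonomous_audit((user, "attempt"))
    if not allowed:
        # Outer transaction rolls back: its writes vanish, but the audit
        # entry above has already been committed.
        raise PermissionError(user)
    outer_writes.append((user, "data read"))
    return outer_writes               # outer transaction commits

try:
    access_data("eve", allowed=False)
except PermissionError:
    pass
# audit_log still records eve's failed attempt despite the rollback
```

The design point mirrors the DB2 article: the audit write must not share the fate of the transaction it observes.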
On 4 Jan. 2017 19:03, "Fabien COELHO" <coelho@cri.ensmp.fr> wrote:

>>> I respect your opinion and don't agree with it.
>>
>> Yeah. I'm pretty overwhelmingly unconvinced too.
>
> I'm lost.
>
> The security-related use case you have presented stores the status of the
> verification in a variable. If the variable is untransactional, then it has
> been shown that the variable status may say ok while the verification has
> actually failed.

That's only a concern if the setting xact performs writes. If it's a read-only lookup, all it has to do is set the variable last.

I agree that transactional variables whose value assignments come into effect on commit would be useful. Like we have for NOTIFY. I do not agree that they are _necessary_ such that a feature is not useful without them. Nor do I agree that they are necessary for security-related use.

> Moreover, there is no special cost in implementing transactional behaviour on
> session variables, as it is already done by pg. It can probably be reused.

Um, what? No, not at all.

GUCs are scoped, but not transactional, in the sense that a SET LOCAL _overrides_ the outer SET for the lifetime of the xact or until overwritten by some later SET LOCAL. On xact end, whether rollback or commit, we just unwind the whole scope by popping a stack. Additional handling is present for subxacts and special scopes like functions with SET.

Transactional assignments would instead need some kind of (sub)transaction stack that flattens onto the outer layer on commit or is popped and discarded on rollback. Not overwhelmingly expensive or hard, but not something we have already.

We'd _definitely_ need to be able to declare such variables, so we could specify their ON COMMIT behaviour (which GUCs don't have) and define their scope (like we do for GUCs).

> An alternative is to implement sub (nested) transactions, like Oracle and MS
> SQL Server... but that would be quite some work.

What? We have those already, see SAVEPOINT and ROLLBACK TO SAVEPOINT.
Unless you mean autonomous transactions, which are not really nested; they're closer to having the outer xact suspend while another xact works, then resuming the outer xact.
2017-01-04 14:33 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
An alternative is to implement sub (nested) transactions, like Oracle and MS SQL Server... but that would be quite some work.
As a complement, a search showed that IBM DB2, cited as a reference by Pavel, has AUTONOMOUS transactions, which look pretty much like nested transactions. The documentation presents an interesting security-related use case:
I had autonomous transactions in mind.
https://www.ibm.com/developerworks/data/library/techarticle/dm-0907autonomoustransactions/
The idea is that an application must record an attempt to access data even if the attempt fails and is rolled back.
Now we can emulate this feature with dblink; there are also patches in the commitfest based on background workers, and that emulation will be cheaper.
Regards
Pavel
This feature, used carefully within an appropriate pattern, would allow ensuring that if the setup transaction fails then the session status is FALSE. One possible inconsistency which may arise with sub-xacts is that the status may stay FALSE while the setup has succeeded; however, this is on the safe side with respect to the security use case.
--
Fabien.
Hello,

>> The security-related use case you have presented stores the status of
>> the verification in a variable. If the variable is untransactional,
>> then it has been shown that the variable status may say ok while the
>> verification has actually failed.
>
> That's only a concern if the setting xact performs writes.

Sure. However I do not see the point of proposing a feature which works only sometimes, on the condition that the security setup does NOT involve storing data in the database (!), otherwise it may be insecure in some cases, sorry mates. That does not look too serious, especially if the use-case concern is all about security.

> If it's a read-only lookup, all it has to do is set the variable last.

Yep, I guess it would probably work for read-only transactions.

> I agree that transactional variables whose value assignments come into
> effect on commit would be useful. Like we have for NOTIFY. I do not
> agree that they are _necessary_ such that a feature is not useful
> without them. Nor do I agree that they are necessary for security
> related use.

The feature would be clearly misleading without transactional support, because people would use it with the false expectation that it works securely, which is not the case.

>> Moreover, there is no special cost in implementing transactional behaviour
>> on session variables, as it is already done by pg. It can probably be
>> reused.
>
> Um, what? No, not at all.
>
> GUCs are scoped, but not transactional, [...]

The documentation is very scarce, so I have tested it.

All the tests I have done with commit & rollback on session variables (SET SESSION) have shown clean transactional behavior, with the value reverted on ROLLBACK, whether intentional or automatic, and the new value kept on COMMIT. See the attached scripts for instance.

LOCAL variables are a different thing, they just disappear at the end of the transaction; it is more a scoping thing.
> We'd _definitely_ need to be able to declare such variables, so we
> could specify their ON COMMIT behaviour (which GUCs don't have)

Hmmm. We do not have to declare any ON COMMIT behaviour for TABLES, they are just transactional.

> and define their scope (like we do for GUCs).

I'm fine with defining scopes.

>> An alternative is to implement sub (nested) transactions, like Oracle
>> and MS SQL Server... but that would be quite some work.
>
> What? We have those already, see SAVEPOINT and ROLLBACK TO SAVEPOINT.

No, that is not what I meant.

> Unless you mean autonomous transactions, which are not really nested,

Yes, that is why I wrote "nested" above.

> they're closer to having the outer xact suspend while another xact
> works, then resuming the outer xact.

Yep. The point is that you can test the success of the nested transaction before setting the status.

--
Fabien.
Um, what? No, not at all.
GUCs are scoped, but not transactional, [...]
The documentation is very scarce, so I have tested it.
All tests I have done with commit & rollback on session variables (SET SESSION) have shown a clean transactional behavior, with the value reverted on ROLLBACK, whether intentional or automatic, or the new value set on COMMIT. See attached scripts for instance.
Your test shows that SET SESSION does not have transactional behaviour - the transaction fails, but the value is not reverted to NULL.
It is a good example of an antipattern for this routine type :)
Pavel
Fabien.
> Now we can emulate this feature with dblink; there are also patches in the
> commitfest based on background workers, and that emulation will be cheaper.

I had not noticed that "background session" proposal. That's definitely an interesting feature to have for some use cases. Dblink implies a new connection I think, which is pretty expensive.

I wish that the proposal were language independent, like DB2's simple AUTONOMOUS declaration on a function. It seems quite heavily tied to PL/pgSQL right now.

--
Fabien.
>> See attached scripts for instance.
>
> Your test shows that SET SESSION does not have transactional behaviour - the
> transaction fails, but the value is not reverted to NULL.

There are *two* function calls; the first fails and the second succeeds. Here is the trace with some comments:

[...]
## SET SESSION x.x = 'null';
SET
-- previous has set x.x = 'null'

## SELECT setupSecurityContext(3);
-- first setup... function call
NOTICE: SET secured = FALSE
NOTICE: SET secured = TRUE
-- there is a SET to 'ok' just after this print
-- at the end the transaction fails:
ERROR: insert or update on table "log" violates foreign key constraint "log_sid_fkey"
DETAIL: Key (sid)=(3) is not present in table "stuff".
-- no result is displayed from the SELECT

## SHOW x.x;
null
-- the value is the initial value, it has been reverted

## SELECT setupSecurityContext(2);
-- second setup... function call
NOTICE: SET secured = FALSE
NOTICE: SET secured = TRUE
-- 't' (true) is displayed, the function succeeded
t

## SHOW x.x;
ok
-- the new value is shown

--
Fabien.
2017-01-04 17:30 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Now we can this feature emulate with dblink, and there are patches in
commitfest based on background workers, and emulation will be cheaper.
I had not noticed that "background session" proposal. That's definitely an interesting feature to have for some use cases. Dblink implies a new connection I think, pretty expensive. I wish that the proposal would be language independent, like DB2 simple AUTONOMOUS declaration on a function. It seems quite heavily linked to PL/pgSQL right now.
Maybe a year ago there was a discussion about autonomous transaction design - Robert proposed transaction scope - something like 'BEGIN AUTONOMOUS' - while I proposed function level. The syntax is not very important - this functionality is interesting - mainly for logging to tables - but there are risks. It is again the border between the transactional and untransactional worlds - autonomous transactions are "untransactional" from the outer transaction's perspective - so some unwanted artefacts or risks are possible there, and application design should respect that.
Regards
Pavel
--
Fabien.
2017-01-04 17:58 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
See attached scripts for instance.
Your test shows that SET SESSION does not have transactional behaviour - the
transaction fails, but the value is not reverted to NULL.
There are *two* function calls; the first fails and the second succeeds. Here is the trace with some comments:
[...]
## SET SESSION x.x = 'null';
SET
-- previous has set x.x = 'null'
## SELECT setupSecurityContext(3);
-- first setup... function call
NOTICE: SET secured = FALSE
NOTICE: SET secured = TRUE
-- there is a SET to 'ok' just after this print
-- at the end the transaction fails:
ERROR: insert or update on table "log" violates foreign key constraint "log_sid_fkey"
DETAIL: Key (sid)=(3) is not present in table "stuff".
-- no result is displayed from the SELECT
## SHOW x.x;
null
-- the value is the initial value, it has been reverted
## SELECT setupSecurityContext(2);
-- second setup... function call
NOTICE: SET secured = FALSE
NOTICE: SET secured = TRUE
-- 't' (true) is displayed, the function succeeded
t
## SHOW x.x;
ok
-- the new value is shown
OK, understood.
The logic depends on transactions and on nesting level (nesting doesn't depend on transactions only):
/*
* Do GUC processing at transaction or subtransaction commit or abort, or
* when exiting a function that has proconfig settings, or when undoing a
* transient assignment to some GUC variables. (The name is thus a bit of
* a misnomer; perhaps it should be ExitGUCNestLevel or some such.)
* During abort, we discard all GUC settings that were applied at nesting
* levels >= nestLevel. nestLevel == 1 corresponds to the main transaction.
*/
void
AtEOXact_GUC(bool isCommit, int nestLevel)
Probably we should use CallXactCallbacks instead - then there is no performance impact when there are no transactional variables.
Regards
Pavel
--
Fabien.
> OK, understood.

Good. So we seem to agree that GUCs are transactional?

> The logic depends on transactions and on nesting level (nesting doesn't
> depend on transactions only)

Yep, it probably also happens with LOCAL, which hides the previous value and restores the initial one when exiting.

> void AtEOXact_GUC(bool isCommit, int nestLevel)
>
> Probably we should use CallXactCallbacks instead - then there is no
> performance impact when there are no transactional variables.

I do not understand your point.

It is a very good thing that GUCs are transactional, and this should not be changed; it is a useful feature! Much more useful than non-transactional.

Moreover, I think that transactional is expensive when writing things to disk, but in memory the overhead is reduced, and if you need it then you need it.

--
Fabien.
2017-01-04 18:49 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
OK, understood.

Good. So we seem to agree that GUCs are transactional?

The logic depends on transactions and on nesting level (nesting doesn't
depend on transactions only)

Yep, it probably also happens with LOCAL, which hides the previous value and restores the initial one when exiting.

void AtEOXact_GUC(bool isCommit, int nestLevel)

Probably we should use CallXactCallbacks instead - then there is no
performance impact when there are no transactional variables.

I do not understand your point.
It is on the critical path, so every check increases the time spent at transaction end.
Regards
Pavel
It is a very good thing that GUCs are transactional, and this should not be changed, it is a useful feature! Much more useful than non transactional.
Personally, I have never used it - although I often use nesting.
regards
Pavel
Moreover I think that transactional is expensive when writing things to disk, but in memory the overhead is reduced, and if you need it then you need it.
--
Fabien.
> [...] It is on the critical path, so every check increases the time spent at
> transaction end.

Hmmm... Everything executed is on the critical path...

>> It is a very good thing that GUCs are transactional, and this should not
>> be changed; it is a useful feature! Much more useful than non-transactional.
>
> Personally, I have never used it - although I often use nesting.

Your position is contradictory:

First you put forward a variable-with-permissions for a special use case; you insist that correctness is key and must be checked with static analysis tools that audit code, and that dynamic variables are too ugly for the purpose. Fine, even if I disagree with some details, there is some logic in that: security, audit, checks... why not.

Then when one shows that correctness requires that the variable be transactional, this is not so important anymore, based on the fact that some big companies do not do it like that, and suddenly it is enough that it probably works sometimes. And when the fact that pg already supports transactional variables is pointed out, just what the use case needs... then you suggest removing the property.

What can I say? You've lost me, really.

--
Fabien.
2017-01-04 19:56 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
[...] It is on the critical path, so every check increases the time spent at transaction end.

Hmmm... Everything executed is on the critical path...

It is a very good thing that GUCs are transactional, and this should not
be changed; it is a useful feature! Much more useful than non-transactional.

Personally, I have never used it - although I often use nesting.
Your position is contradictory:
First you put forward a variable-with-permissions for a special use case; you insist that correctness is key and must be checked with static analysis tools that audit code, and that dynamic variables are too ugly for the purpose. Fine, even if I disagree with some details, there is some logic in that: security, audit, checks... why not.
Then when one shows that correctness requires that the variable be transactional, this is not so important anymore, based on the fact that some big companies do not do it like that, and suddenly it is enough that it probably works sometimes. And when the fact that pg already supports transactional variables is pointed out, just what the use case needs... then you suggest removing the property.
GUCs are designed for a different purpose - I don't know why somebody added transaction support there; I understand and remember the nesting support.
Look at the code - the GUC-related code is around 10K lines, of which probably 5K lines matter for this purpose.
There are more reasons why I would not use GUCs:
0. They are not designed to be secure - there is a different security model - read-only, superuser, others.
1. They are dynamic - not persistent - so they cannot simply be used as package variables.
2. The placing is different - custom GUCs require a prefix - I prefer using our schemas, because schemas are used in pg like packages in Oracle.
3. A large number of GUCs decreases the performance of transaction and subtransaction end.
4. Every RDBMS uses untransactional variables - it should be the default, optimized behaviour.
Regards
Pavel
What can I say? You've lost me, really.
--
Fabien.
On 5 January 2017 at 01:49, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
>> ok understand
>
> Good. So we seem to agree that GUCS are transactional?

No. We don't agree. They aren't.

The effects of SET LOCAL are reverted whether you commit or rollback. The effects of SET SESSION are never reverted, whether you commit or roll back.

craig=> SET x.s = 'x';
SET
craig=> BEGIN;
BEGIN
craig=> SET LOCAL x.s = 'y';
SET
craig=> COMMIT;
COMMIT
craig=> SHOW x.s;
 x.s
-----
 x
(1 row)

There are simply different scopes, one of which is the transaction scope.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 5 January 2017 at 08:35, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 5 January 2017 at 01:49, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>>
>>> ok understand
>>
>> Good. So we seem to agree that GUCS are transactional?
>
> No. We don't agree. They aren't.

Uh. I take that back.

craig=> SET x.s = 'x';
SET
craig=> BEGIN;
BEGIN
craig=> SET x.s = 'y';
SET
craig=> ROLLBACK;
ROLLBACK
craig=> SHOW x.s;
 x.s
-----
 x
(1 row)

I'm surprised, I never knew this.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
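The two psql sessions above can be captured in one toy model (illustrative only; invented names, not the server's actual GUC machinery): SET LOCAL is a scope discarded at transaction end whether it commits or rolls back, while a SET SESSION issued inside a transaction only sticks if the transaction commits.

```python
# Toy model of the GUC behaviour observed in the thread:
#   - SET LOCAL overrides vanish on COMMIT *and* on ROLLBACK (scoping);
#   - SET SESSION inside a xact is kept on COMMIT, reverted on ROLLBACK.

class Gucs:
    def __init__(self):
        self.values = {}      # committed session values
        self.pending = None   # SET SESSION made inside an open transaction
        self.local = None     # SET LOCAL overrides

    def begin(self):
        self.pending, self.local = {}, {}

    def set_session(self, name, value):
        (self.values if self.pending is None else self.pending)[name] = value

    def set_local(self, name, value):
        if self.local is not None:    # outside a xact, no lasting effect
            self.local[name] = value

    def show(self, name):
        for layer in (self.local, self.pending):
            if layer and name in layer:
                return layer[name]
        return self.values.get(name)

    def commit(self):
        self.values.update(self.pending or {})   # SET SESSION survives commit
        self.pending = self.local = None         # SET LOCAL never survives

    def rollback(self):
        self.pending = self.local = None         # everything reverted

# First session: SET LOCAL then COMMIT -> back to 'x'
g = Gucs()
g.set_session('x.s', 'x')
g.begin(); g.set_local('x.s', 'y'); g.commit()
# g.show('x.s') -> 'x'

# Second session: SET then ROLLBACK -> back to 'x'
g.begin(); g.set_session('x.s', 'y'); g.rollback()
# g.show('x.s') -> 'x'
```

The split between a `local` layer (always discarded) and a `pending` layer (applied only on commit) is one way to reconcile "scoped" and "transactional" in a single picture.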
On 01/04/2017 04:36 PM, Craig Ringer wrote:
> On 5 January 2017 at 08:35, Craig Ringer <craig@2ndquadrant.com> wrote:
>> On 5 January 2017 at 01:49, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>>> Good. So we seem to agree that GUCS are transactional?
>>
>> No. We don't agree. They aren't.
>
> Uh. I take that back.
>
> craig=> SET x.s = 'x';
> SET
> craig=> BEGIN;
> BEGIN
> craig=> SET x.s = 'y';
> SET
> craig=> ROLLBACK;
> ROLLBACK
> craig=> SHOW x.s;
>  x.s
> -----
>  x
> (1 row)
>
> I'm surprised, I never knew this.

(I have not been able to keep up with the sheer volume on this thread, but this caught my eye...)

Yeah -- I found it surprising when I first discovered it too. My opinion is that the design for variables should not behave this way.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
>>> Good. So we seem to agree that GUCS are transactional?
>
> I'm surprised, I never knew this.

I must admit that it was also a (good) surprise for me.

The documentation says it:

"""
If SET (or equivalently SET SESSION) is issued within a transaction that is later aborted, the effects of the SET command disappear when the transaction is rolled back. Once the surrounding transaction is committed, the effects will persist until the end of the session, unless overridden by another SET.
"""

But I have not found anything clear about user-defined parameters.

--
Fabien.
2017-01-05 10:59 GMT+01:00 Fabien COELHO <fabien.coelho@mines-paristech.fr>:
Good. So we seem to agree that GUCS are transactional?
I'm surprised, I never knew this.
I must admit that it was also a (good) surprise for me.
The documentation says it:
"""
If SET (or equivalently SET SESSION) is issued within a transaction that is later aborted, the effects of the SET command disappear when the transaction is rolled back. Once the surrounding transaction is committed, the effects will persist until the end of the session, unless overridden by another SET.
"""
But I have not found anything clear about user-defined parameters.
Pavel
--
Fabien.
Hello Pavel,

> There are more reasons why I would not use GUCs:
> 0. They are not designed to be secure - there is a different security
> model - read-only, superuser, others.

Sure, GUCs as-is are not enough, but the model can be extended instead of re-inventing the wheel with a new kind of variable.

> 1. They are dynamic - not persistent - so they cannot simply be used as
> package variables.

Half-persistence (in definition, not in value) is not a key feature needed by the use case.

> 2. The placing is different - custom GUCs require a prefix - I prefer using
> our schemas, because schemas are used in pg like packages in Oracle.

Idem.

> 3. A large number of GUCs decreases the performance of transaction and
> subtransaction end.

That is life. The presented use case really needs only one variable.

> 4. Every RDBMS uses untransactional variables - it should be the default,
> optimized behaviour.

Hmmm. Untransactional variables do **NOT** fit the use case; they just work "sometimes", which is not acceptable.

I've spent too much time on reviewing this proposal. My conclusion is:

- A clear use case linked to security setups has been presented which requires some kind of secure (i.e. with access control) session variables, currently not available in pg, which has user-defined GUCs that are dynamic, untyped (TEXT), public, and transactional.

- You have proposed a NEW kind of session variable which is:
  (1) statically typed, declared permanently in the catalog, in the schema/table namespace
  (2) session-lived in value
  (3) untransactional, as you insist on that (your 4. above)
  (4) with permissions

My feedback is that:

- The proposed feature does not fit the presented use case it is intended for. There is no use case for untransactional secure session variables. The proposal should be amended so that the variables display by default some transactional properties, because that is required for correctly implementing the use case.
- Personally, I'm not convinced that a NEW type of session variable is a good thing, as pg already has one, and two is one too many. I would find it more useful to enhance the existing dynamic session variables with, in order of importance:
  (1) private/public visibility (as Oracle does with package vars); this point is enough to implement the presented use case.
  (2) typing (casting is a pain)
  (3) improved syntax (set_config & current_setting is a pain)

Eventually, unrelated to the use case, but in line with your motivations as I understand them:
  (4) add an option to make a GUC non-transactional, iff there is a clear use case for that (maybe debug?).
  (5) have some "permanent" GUC type declarations (maybe editing the config file does that already, by the way?)

--
Fabien.
2017-01-05 11:39 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,

There are more reasons why I would not use GUCs:

0. They are not designed to be secure - there is a different security model -
read-only, superuser, others.

Sure, GUCs as-is are not enough, but the model can be extended instead of re-inventing the wheel with a new kind of variable.

1. They are dynamic - not persistent - so they cannot simply be used as
package variables.

Half-persistence (in definition, not in value) is not a key feature needed by the use case.

2. The placing is different - custom GUCs require a prefix - I prefer using our
schemas, because schemas are used in pg like packages in Oracle.

Idem.

3. A large number of GUCs decreases the performance of transaction and
subtransaction end.

That is life. The presented use case really needs only one variable.

4. Every RDBMS uses untransactional variables - it should be the default,
optimized behaviour.

Hmmm. Untransactional variables do **NOT** fit the use case; they just work "sometimes", which is not acceptable.
I've spent too much time on reviewing this proposal. My conclusion is:
- a clear use case linked to security setups has been presented
which requires some kind of secure (i.e. with access control) session
variables, currently not available in pg which has user-defined GUC
which are dynamic, untyped (TEXT), public, transactional.
- you have proposed a NEW kind of session variables which is:
(1) statically typed, declared permanently in the catalog, in the
schema/table namespace
(2) values are session alive
(3) untransactional, as you insist on that (your 4. above)
(4) with permissions
My feedback is that:
- The proposed feature does not fit the presented use case it is intended
for. There is no use case for untransactional secure session variables.
The proposal should be amended so that the variables display by default
some transactional properties because it is required for correctly
implementing the use case.
- Personally, I'm not convinced that a NEW type of session variable is
a good thing as pg already has one, and two is one too many. I would
find it more useful to enhance existing dynamic session variables with,
by order of importance:
(1) private/public visibility (as Oracle does with package vars).
this point is enough to implement the presented use case.
(2) typing (casting is a pain)
(3) improved syntax (set_config & current_setting is a pain)
Eventually, unrelated to the use case, but in line with your motivations as I understand them:
(4) add an option to make a GUC non transactional, iff there is
a clear use case for that (maybe debug?).
(5) have some "permanent" GUC type declarations (maybe editing the
config file does that already, by the way?)
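For reference, the existing dynamic session variables that points (2) and (3) complain about look like this (the `myapp.*` names are illustrative; `set_config` and `current_setting` are the actual functions):

```sql
-- Everything goes through text: the cast on every read is the pain of (2),
-- and the function-call syntax is the pain of (3).
SELECT set_config('myapp.user_level', '42', false);  -- false = session lifetime
SELECT current_setting('myapp.user_level')::int;     -- 42, after an explicit cast
```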
Thank you for your work on this topic.
Unfortunately, there is significant disagreement between us on this topic. I see schema-based persistent metadata and catalog-based security as fundamental features. Editing the config file is not acceptable in any view.
Best regards
Pavel
--
Fabien.
On 1/5/17 4:59 AM, Pavel Stehule wrote: > > - Personally, I'm not convinced that a NEW type of session variable is > a good thing as pg already has one, and two is one too many. I would > find it more useful to enhance existing dynamic session variables > with, > by order of importance: > > (1) private/public visibility (as Oracle does with package vars). > this point is enough to implement the presented use case. > > (2) typing (casting is a pain) > > (3) improved syntax (set_config & current_setting is a pain) > > Eventually, unrelated to the use case, but in line with your > motivations as I understand them: > > (4) add an option to make a GUC non transactional, iff there is > a clear use case for that (maybe debug?). > > (5) have some "permanent" GUC type declarations (maybe editing the > config file does that already, by the way?) > > > Thank you for your work on this topic. > > Unfortunately, there is significant disagreement between us on this topic. > I see schema-based persistent metadata and catalog-based security as > fundamental features. Editing the config file is not acceptable in any view. I generally agree with that. That said, it probably wouldn't be hard to "register" GUCs during backend startup, based on what's in the catalog for the database you're connecting to. There's certainly already a place in the code to do this, since you can set per-database values for GUCs. That said, IIRC GUCs are set up in such a way that you could just create a new stack upon connection. Actually, I think that'd need to happen anyway, otherwise these variables are going to look like GUCs even though they're not. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
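The "place in the code" Jim mentions is the one that applies per-database GUC values at connection time; that existing mechanism looks like this (the `myapp.tenant` name is illustrative):

```sql
-- Per-database GUC values live in the pg_db_role_setting catalog
-- and are applied automatically at backend startup:
ALTER DATABASE mydb SET myapp.tenant = 'acme';
-- any new session connected to mydb then sees:
SHOW myapp.tenant;   -- acme
```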
On 6 January 2017 at 08:44, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: >> (1) private/public visibility (as Oracle does with package vars). >> this point is enough to implement the presented use case. Agreed. >> (2) typing (casting is a pain) We already have typed GUCs and allow them to be user-defined. See DefineCustomBoolVariable, DefineCustomIntVariable, etc. What we lack is a way to declare and use typed dynamically user-defined GUCs at runtime without a C extension. We also lack user interface to load and store values without going via their text representation; there's no current_setting_int4 etc. So if we allow for now the idea that we'd extend the GUC model to do this (which I'm not at all sure is a good thing) ... it's possible. >> (5) have some "permanent" GUC type declarations (maybe editing the >> config file does that already, by the way?) We have that, but it currently requires a C extension. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
>> Thank you for your work on this topic.
>>
>> Unfortunately, there is significant disagreement between us on this topic. I see schema-based persistent metadata and catalog-based security as fundamental features. Editing the config file is not acceptable in any view.
>
> I generally agree with that. That said, it probably wouldn't be hard to "register" GUCs during backend startup, based on what's in the catalog for the database you're connecting to. There's certainly already a place in the code to do this, since you can set per-database values for GUCs. That said, IIRC GUCs are set up in such a way that you could just create a new stack upon connection. Actually, I think that'd need to happen anyway, otherwise these variables are going to look like GUCs even though they're not.

Registration at every session start can be a little expensive - it is more practical to initialize variables on demand, when they are first used.
Regards
Pavel
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2017-01-06 7:01 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
>>> Thank you for your work on this topic.
>>>
>>> Unfortunately, there is significant disagreement between us on this topic. I see schema-based persistent metadata and catalog-based security as fundamental features. Editing the config file is not acceptable in any view.
>>
>> I generally agree with that. That said, it probably wouldn't be hard to "register" GUCs during backend startup, based on what's in the catalog for the database you're connecting to. There's certainly already a place in the code to do this, since you can set per-database values for GUCs. That said, IIRC GUCs are set up in such a way that you could just create a new stack upon connection. Actually, I think that'd need to happen anyway, otherwise these variables are going to look like GUCs even though they're not.
>
> Registration at every session start can be a little expensive - it is more practical to initialize variables on demand, when they are first used.

And if you have a catalog entry, then it is not necessary - self-managing variables in memory is not too complex.

Regards
Pavel
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Thu, Jan 5, 2017 at 5:39 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > Half-persistence (in definition, not in value) is not a key feature needed > by the use-case. Well, you don't get to decide that. You've been told by at least three or four people that they don't want variables to be transactional, you've been pointed to documentation links showing that in other database systems including Oracle variables are not transactional, and you still insist that this proposal is senseless unless variables are transactional. You have every right to decide what you think is useful, but you don't have a right to decide what other people think is useful. You don't get veto power over what Pavel wants to implement, even if you personally would not choose to implement it that way, and especially not when multiple people are agreeing with Pavel and disagreeing with you. On the substance of this issue, I would note that Pavel's idea of entering variables in pg_class has a number of advantages: 1. The code can control permissions using the same system that we use to control permissions on all other objects, instead of having to create a completely new one. Similarly for dependencies. This wouldn't matter if it were possible to get by with no system for privileges on variables or with a very simple system such as you proposed upthread, but I think that's somewhat unrealistic in the face of security-definer functions and row-level security. Execution in multiple privilege contexts within the same session is a fact of life, and whatever design gets chosen has to somehow cope with that; using the existing infrastructure is one reasonable choice. Trying to do something excessively simple here will result in security bugs. 2. The user doesn't need to re-declare the variables you want to use at the beginning of every session. This is also the reason why many people want global temporary tables. 
They don't do anything that can't be done with local temporary tables; they're just more convenient to use. 3. If somebody goes to drop a type, they'll have to use CASCADE to get rid of the dependent variable. That might tip them off to conduct an investigation into whether that variable is being used. If the variables are established entirely on a per-session basis, there will be no such warning at DROP time. You'll only find out about the problem when the application starts failing. 4. As Pavel already said, it makes things easier for a static checker. I'm not certain that anyone, including you, has correctly understood Pavel's point here, which is perhaps because Pavel's first language is not English. But I'm pretty sure I understand it, so let me try to restate it more straightforwardly. Suppose a static checker encounters the statement SET VARIABLE flumpf = 1 (assume for purposes of this bullet point that this is the syntax for setting a variable). If variables have to be catalogued, and flumpf is not in the catalog, this is probably a bug. If flumpf is in the catalog but is of a type that cannot contain the value 1, this is also probably a bug. But if variables do not have to be catalogued, then the static checker will have a hard time inferring anything about the correctness of this statement. It is not impossible; for example, the static checker could have a requirement that the user running it set up the session with all required variables before running the checker. But that is not particularly convenient. IIUC, Pavel's point is that a user who creates a function which uses a certain variable would probably also put code in that same function to create the variable if it is not yet present. And a static checker probably can't see through that. You've argued that there is no problem here, but it seems absolutely unarguable to me that static checkers benefit from having more things statically configured and less stuff figured out at runtime.
That's why it's easier to statically check programming languages with strongly typed variables than it is to statically check languages with run-time typing. Now, there are certainly advantages to NOT entering variables in pg_class, too. For example, it makes the operation much more light-weight. If the code uses basically the same variables over and over again, entering them in the catalog doesn't cost anything meaningful. If it's constantly creating new variables, that's going to create catalog bloat if those variables have to be added to the catalogs, and that's going to stink. I don't mean to pass judgement here in favor of Pavel's proposal and against other proposals. But it seems clear to me that Pavel's proposal does have certain fairly compelling advantages and a good deal of community support; and your replies don't seem to be acknowledging any of that. I think they should. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
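Robert's static-checker argument can be made concrete with his own illustrative `SET VARIABLE` syntax (hypothetical, not implemented; `flumpf` is his example name):

```sql
-- With catalogued, typed variables a checker can judge statements offline:
CREATE VARIABLE myschema.flumpf AS integer;  -- declared once, stored in the catalog

SET VARIABLE flumpf = 1;       -- exists and the type fits       -> OK
SET VARIABLE flumpf = 'abc';   -- integer cannot hold this value -> probable bug
SET VARIABLE flumpg = 1;       -- not in the catalog             -> probable bug
```

Without catalogued declarations, none of the three judgements above can be made without first reconstructing the session state at check time.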
On 10 January 2017 at 05:14, Robert Haas <robertmhaas@gmail.com> wrote: > 2. The user doesn't need to re-declare the variables you want to use > at the beginning of every session. This is also the reason why many > people want global temporary tables. They don't do anything that > can't be done with local temporary tables; they're just more > convenient to use. They'll also help a lot with pg_attribute and pg_class bloat. I don't feel strongly either way about catalog use, but definitely think declare-before-use is crucial. Pretty much every language that implicitly declares variables has landed up adding a way to require them to be declared for a reason. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hello Robert, >> Half-persistence (in definition, not in value) is not a key feature >> needed by the use-case. > > Well, you don't get to decide that. I do not think that your reprimand is deserved about this point: I did not decide a subjective opinion, I noted an objective fact. > You've been told by at least three or four people that they don't want > variables to be transactional, you've been pointed to documentation > links showing that in other database systems including Oracle variables > are not transactional, and you still insist that this proposal is > senseless unless variables are transactional. Indeed. I have submitted a proof of this fact in the form of a counter example which (1) (pseudo) implements the use-case by logging into an audit table the fact that a user accesses the secure level (2) shows that the status of a non transactional session variable used for keeping this status is wrong for the use case in some cases (it says that all is well while appending to the audit table failed). I have also recognized that the use-case could be implemented safely, although not correctly, if pg provides nested/autonomous transactions like Oracle, DB2 or MS SQL does, but I think that having such a useful feature is quite far away... > You have every right to decide what you think is useful, but you don't > have a right to decide what other people think is useful. Hmmm. I feel entitled to point out to other people that their belief that a feature as described provides a correct solution to a particular use case is wrong, if it is factually the case. If they persist in this belief despite the submitted proof, I can only be sad about it, because if pg provides a feature for a security-related use case which does not work correctly it is just shooting one's foot. I do not like Pavel's feature, this is a subjective opinion. This feature does not provide a correct solution for the use case, this is an objective fact.
The presented feature does not have a real use case, this is too bad. Finally, I did not "veto" this feature, I reviewed it in depth and concluded negatively. You are a committer and I'm just a "silly academic", you do not have to listen to anything I say and can take majority votes against proofs if you want. -- Fabien.
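Fabien's counter-example can be sketched as follows (hypothetical `SET VARIABLE` syntax; `audit_log` and `secure_ok` are illustrative names):

```sql
BEGIN;
-- the use case requires that every access to the secure level is logged
INSERT INTO audit_log (usr, at) VALUES (current_user, now());
SET VARIABLE secure_ok = true;   -- untransactional: survives an abort
COMMIT;
-- If the COMMIT fails, the audit row is rolled back, but secure_ok
-- remains true: the session claims access was granted although the
-- mandatory audit record was never written.
```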
2017-01-10 7:31 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
> Hello Robert,
>
>>> Half-persistence (in definition, not in value) is not a key feature needed by the use-case.
>>
>> Well, you don't get to decide that.
>
> I do not think that your reprimand is deserved about this point: I did not decide a subjective opinion, I noted an objective fact.
>
>> You've been told by at least three or four people that they don't want variables to be transactional, you've been pointed to documentation links showing that in other database systems including Oracle variables are not transactional, and you still insist that this proposal is senseless unless variables are transactional.
>
> Indeed.
>
> I have submitted a proof of this fact in the form of a counter example which (1) (pseudo) implements the use-case by logging into an audit table the fact that a user accesses the secure level (2) shows that the status of a non transactional session variable used for keeping this status is wrong for the use case in some cases (it says that all is well while appending to the audit table failed).
>
> I have also recognized that the use-case could be implemented safely, although not correctly, if pg provides nested/autonomous transactions like Oracle, DB2 or MS SQL does, but I think that having such a useful feature is quite far away...
>
>> You have every right to decide what you think is useful, but you don't have a right to decide what other people think is useful.
>
> Hmmm.
>
> I feel entitled to point out to other people that their belief that a feature as described provides a correct solution to a particular use case is wrong, if it is factually the case. If they persist in this belief despite the submitted proof, I can only be sad about it, because if pg provides a feature for a security-related use case which does not work correctly it is just shooting one's foot.
>
> I do not like Pavel's feature, this is a subjective opinion. This feature does not provide a correct solution for the use case, this is an objective fact. The presented feature does not have a real use case, this is too bad.
I have written several times that transactional and temporal support can be optional features. The people who use package or module variables in other RDBMSes probably don't agree with you that there is no real use case.

The transaction support is not the main point of my proposal - the main points are:

1. catalog based - created only once, persistent metadata
2. can be schema organized - like our PL functions
3. no identifier conflicts - the name is unique in the catalog
4. allows declared secure access

After this discussion I add the points:

5. can be temporary - the metadata are cleaned up at the end of the life scope
6. can be transactional where the content is sensitive to a possible rollback
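Points 1-4 correspond to the syntax sketched at the top of the thread (hypothetical, nothing here is implemented):

```sql
-- 1./2. declared once in the catalog, schema-qualified like PL functions
CREATE VARIABLE auth.user_status AS boolean;

-- 3. a second CREATE VARIABLE auth.user_status would fail: the name is unique

-- 4. declared secure access: only a definer role may write,
--    ordinary application roles may only read
GRANT UPDATE ON VARIABLE auth.user_status TO auth_definer;
GRANT SELECT ON VARIABLE auth.user_status TO app_user;
```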
Regards
Pavel
> Finally, I did not "veto" this feature, I reviewed it in depth and concluded negatively. You are a committer and I'm just a "silly academic", you do not have to listen to anything I say and can take majority votes against proofs if you want.
--
Fabien.
On 10 January 2017 at 14:31, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > I have submitted a proof of this fact in the form of a counter example which > (1) (pseudo) implements the use-case by logging into an audit table the fact > a user accesses the secure level (2) shows that the status of a non > transactional session variable used for keeping this status is wrong for the > use case in some cases (it says that all is well while appending to the > audit table failed). You've been assuming everyone else cares about auditing such access into a table. Personally I tend to agree with you that it's useful enough to justify transactional vars. But you're fixated on the idea that without that use case satisfied the rest is useless, and that's simply not the case. Transactional vars are only needed if you make _write_ changes to the DB that must be committed atomically with the var change. If you're only doing (maybe expensive) lookups, it doesn't matter. > I feel entitled to point out to other people that their belief that a > feature as described provides a correct solution to a particular use case is > wrong, if it is factually the case. If they persist in this belief despite > the submitted proof, I can only be sad about it, because if pg provides a > feature for a security-relared use case which does not work correctly it is > just shooting one's foot. Again ... I think you've assumed everyone else is talking about the same security-related case you are. I haven't seen Pavel talking about access audit logging. If he has been, then my mistake and you're quite correct. But my reading is that you've been _assuming_ that. > I do not like Pavel's feature, this is a subjective opinion. This feature > does not provide a correct solution for the use case, this is an objective > fact. For _your_ use case. > The presented feature does not have a real use case, this is too bad. No, it just doesn't match your idea of what the use case is. 
It does have other uses that are perfectly valid. Look up access rights, set var. It's fine. Now, I think we might as well do it transactionally, but it's not some kind of absolute requirement; if you're not transactional, it just means you can't do reliable audit logging of access into tables. Of course, we can't do that already for anything else. It's kind of like someone coming to you and saying they want to add an engine to their glider, and you explaining that it's totally useless to add an engine unless it can also function as a submarine. Um, that's nice, but not what they asked for. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
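The check-then-set-last pattern Craig describes might look like this (the `SET VARIABLE` line is hypothetical syntax, shown as a comment; `access_rules` is an illustrative table):

```sql
CREATE FUNCTION approve_access() RETURNS void
LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
  -- 1. possibly expensive, read-only checks
  IF NOT EXISTS (SELECT 1 FROM access_rules
                 WHERE role_name = current_user AND level = 'secure') THEN
    RAISE EXCEPTION 'access denied';
  END IF;
  -- 2. only after every check has passed, cache the verdict:
  -- SET VARIABLE access_approved = true;   -- hypothetical syntax
END;
$$;
```

Because nothing was written, a later abort of the surrounding transaction leaves no database change for the cached `true` to contradict.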
On 10 January 2017 at 14:31, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > I do not like Pavel's feature, this is a subjective opinion. This feature > does not provide a correct solution for the use case, this is an objective > fact. The presented feature does not have a real use case, this is too bad. Oh, also, you might want to tell Oracle and the many people who use package variables that. Now, that said, huge numbers of people blindly do all sorts of unsafe things and mostly get away with it. Using MERGE in concurrent OLTP workloads. Racey upserts. Blindly assuming xacts will succeed and not keeping the info around to retry them until confirmation of commit is received. That sort of business. Nonetheless, it's pretty clear they're far from not having a "real use case". I'd like to see transactional vars. I think it's worthwhile and you've made a reasonable argument that they're useful, and should probably even be the default. Your unwillingness to listen to anyone else isn't doing your argument any favours though. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Jan 10, 2017 at 1:31 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > I have submitted a proof of this fact in the form of a counter example which > (1) (pseudo) implements the use-case by logging into an audit table the fact > a user accesses the secure level (2) shows that the status of a non > transactional session variable used for keeping this status is wrong for the > use case in some cases (it says that all is well while appending to the > audit table failed). > > I feel entitled to point out to other people that their belief that a > feature as described provides a correct solution to a particular use case is > wrong, if it is factually the case. If they persist in this belief despite > the submitted proof, I can only be sad about it, because if pg provides a > feature for a security-relared use case which does not work correctly it is > just shooting one's foot. You're just ignoring explanations from other people - Craig in particular - about why it DOES satisfy their use case. And the reason his argument is valid is because he is questioning your premise. You are proving "if A, then B" and he's saying, "yes, but not A". That's not a logical fallacy on his part. That's you proving something that is in his view irrelevant to the desirability of the feature. > I do not like Pavel's feature, this is a subjective opinion. This feature > does not provide a correct solution for the use case, this is an objective > fact. The presented feature does not have a real use case, this is too bad. If the presented feature had no use case, I don't think there would be 3 or 4 people arguing for it. Those people aren't stupid. > Finally, I did not "veto" this feature, I reviewed it in depth and concluded > negatively. Sure, that's pretty fair. Are you also willing to accept other people's differing conclusions? 
> You are a committer and I'm just a "silly academic", you do not > have to listen to anything I say and can take majority votes against proofs > if you want. I believe that the words "silly" and "academic" were used about certain proposals that you made, and you have here pulled them out of the context in which they were written and recast them as general judgements on you rather than statements about certain ideas which you proposed or certain arguments which you made. I think most people on this mailing list, including me, are very careful to avoid "ad hominem" arguments, and I believe that is also the case in the arguments made to you. Everybody's ideas on this mailing list, including mine, come in for criticism from time to time. That doesn't necessarily imply personal disrespect. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hello Craig, >> I have submitted a proof of this fact in the form of a counter example which >> (1) (pseudo) implements the use-case by logging into an audit table the fact >> a user accesses the secure level (2) shows that the status of a non >> transactional session variable used for keeping this status is wrong for the >> use case in some cases (it says that all is well while appending to the >> audit table failed). > > You've been assuming everyone else cares about auditing such access > into a table. No, I have not. For the PostgreSQL product, I'm really assuming that a security feature should work in all cases, not just some cases with implicit uncheckable restrictions, especially restrictions related to transactions, which is all a database is about. I think that providing a misleading feature is a bad idea. Note that my blessing is not required. If a committer wants to add this then they can do it. > But you're fixated on the idea that without that use case satisfied the > rest is useless, and that's simply not the case. Transactional vars are > only needed if you make _write_ changes to the DB that must be committed > atomically with the var change. If you're only doing (maybe expensive) > lookups, it doesn't matter. It does not matter if and only if the transaction does not fail, not because the variable is not transactional. Basically, if it is untransactional, then it works only if it behaves exactly like a transaction... > Again ... I think you've assumed everyone else is talking about the > same security-related case you are. I'm looking forward to seeing any use case which requires untransactional variables with permissions and works correctly without adding un-database constraints such as "please do not use in transactions that change any data because then it may or may not work".
> It's kind of like someone coming to you and saying they want to add an > engine to their glider, and you explaining that it's totally useless > to add an engine unless it can also function as a submarine. Um, > that's nice, but not what they asked for. Hmmm... I think that it is really like adding an engine on a glider which does not work if the glider flies under a cloud. You just have to recall that you should not fly under a cloud when the engine is turned on. -- Fabien.
>> I do not like Pavel's feature, this is a subjective opinion. This feature >> does not provide a correct solution for the use case, this is an objective >> fact. The presented feature does not have a real use case, this is too bad. > > Oh, also, you might want to tell Oracle and the many people who use > package variables that. As it can be used safely with nested transactions, I have no doubt that they do that, and that auditors check that carefully when auditing code:-) > [...] Your unwillingness to listen to anyone else isn't doing your > argument any favours though. Hmmm. I'm far from perfect and I have a limited supply of patience when logic does not always apply in a long discussion. However I think that my review of Pavel's proposal is fair, with a clear separation of objective (proven) facts and subjective but reasoned opinions. I do not think that I can contribute anything more by continuing to argue, so I wish I had not been dragged back into this thread:-( Despite a lot of effort, Pavel's proposal is still about untransactional (by default) session variables. Too bad. Time out for me. I'm deeply against that, I have said it: I think it would harm PostgreSQL to provide such a misleading security feature. Then I'm done. If a committer wants to add untransactional session variables with permissions, it is their privilege, and my blessing is not needed anyway. -- Fabien.
Hello Robert, > You're just ignoring explanations from other people - Craig in > particular - about why it DOES satisfy their use case. I'm not so sure about Craig's precise opinion, but I cannot speak for him. I think that I understood that he points out that there exists a situation where the use case is okay despite an untransactional variable: if the containing transaction is warranted not to fail, and probably (provably?) a read-only transaction is enough for that. Okay, sure... This falls under "the feature works sometimes", which I think is not acceptable for a security thing in pg core. > And the reason his argument is valid is because he is questioning your > premise. [...] Yes. I made the assumption that PostgreSQL is about keeping data safe and secure, and that misleading features which do not comply with this goal should be kept out. This is indeed a subjective opinion, not provable truth. I only assumed that this opinion was implicitly shared, so that providing a counter example with the feature where data is not safe or secure was enough to dismiss the proposal. I'm clearly wrong: some people are okay with a security feature proven not to work in some cases, if it works for their particular (read-only) case. >> I do not like Pavel's feature, this is a subjective opinion. This feature >> does not provide a correct solution for the use case, this is an objective >> fact. The presented feature does not have a real use case, this is too bad. > > If the presented feature had no use case, I don't think there would be > 3 or 4 people arguing for it. Those people aren't stupid. I have not said that, nor thought that. I pointed out my arguments, basically I answer "security must always work" to "the feature can work sometimes". Then it cycles. As I can offer limited time for reviewing features, at some point I do not have any more time to argue constructively and convince people, that is life.
That is when I tried to conclude my contribution by sending my review. > [..] Are you also willing to accept other people's differing > conclusions? I do not have to "accept", or not, differing conclusions. The committer decides in the end, because they have the power, I just have words. All I can say is that as a committer I would not commit such a feature. As a basic contributor, I can hope that the best decision is made in the end, and for that I try to express arguments precisely and objectively; that is the point of reviewing a proposal and giving advice about how it should be amended if I think it should be. > I believe that the words "silly" and "academic" were used about certain > proposals that you made, [..] it doesn't necessarily imply personal > disrespect. Sure. "Silly academic" suits me though, I'm fine with it:-) -- Fabien.
On 11 January 2017 at 06:09, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > I'm not so sure about Craig precise opinion, but I cannot talk in his name. > I think that I understood that he points out that there exists a situation > where the use case is okay despite an untransactional variable: if the > containing transaction is warranted not to fail, and probably (provably?) a > read-only transaction is enough for that. Okay, sure... No. I'm saying that if you do a series of checks, then set the variable last, it does not matter if the xact somehow aborts after setting the variable. You have already done all your checks and are satisfied that the user has the appropriate access level. This does not cover all use cases. It's not suitable if the checks involve writes to the database since you can then have a situation where the writes are lost but the variable setting retained. However, it does cover some common and useful ones like looking up external services, possibly expensive queries, etc, and setting a variable to cache "yup, access approved". > I made the assumption that PostgreSQL is about keeping data safe and secure, > and that misleading features which do not comply with this goal should be > kept out. This is hyperbolic. Every system has rules and restrictions. Appropriately documented, I don't see a problem. Should we also remove sequences because they violate transactional boundaries and users get confused by them? You won't mind that you can only use synthetic keys completely serially, right? Sometimes compromises are a thing. Now, I have already said that I actually agree that transactional vars are probably a good default and something we should have if we do this. But they are not the One True And Only Way. > I'm clearly wrong: some people are okay with a security feature proven not > to work in some case, if it works for their particular (read-only) case. Many normal things work only in some cases. 
COMMIT can be indeterminate if you lose your connection after sending COMMIT and before getting a reply. Did my transaction commit? Um, I dunno. (Yes, I know we have 2PC). That's pretty fundamental... So. I would like transactional variables and think you have made a good case for them. If they prove impractical, I think variables with access controls that are not transactional are also OK though less useful, so long as they are clearly documented. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hello Craig.

>> I'm not so sure about Craig's precise opinion, but I cannot talk in his name.
>> I think that I understood that he points out that there exists a situation
>> where the use case is okay despite an untransactional variable: if the
>> containing transaction is warranted not to fail, and probably (provably?) a
>> read-only transaction is enough for that. Okay, sure...
>
> No.
>
> I'm saying that if you do a series of checks, then set the variable
> last, it does not matter if the xact somehow aborts after setting the
> variable. You have already done all your checks and are satisfied that
> the user has the appropriate access level.

I'm lost. This is precisely what I had in mind above with a "read-only transaction" which is "warranted not to fail". I do not understand about which point you write "No".

>> I made the assumption that PostgreSQL is about keeping data safe and secure,
>> and that misleading features which do not comply with this goal should be
>> kept out.
>
> [...] Sometimes compromises are a thing.

Indeed. Sequence is an interesting compromise with a clear use case and a measurable performance impact resulting from it. Note that sequences do have a key transactional property: it is transactionally warranted that distinct committed transactions, whenever they occur (simultaneously or after a crash), get distinct values (bar cycles, sure).

IMHO, security & compromise do not work well together, so this should be avoided. I recognize that this is an opinion.

As for the particular case, there is no deep problem with session variables being transactional: the available ones already have this property. They are pretty fast (eg 0.186 µs/get + cast to int on my laptop, or 5.3 million retrievals per second). I'm yet to understand how compromising security is worth the added value of the discussed proposal. It is not about performance. The static checkability is moot. I have already argued about all that...
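Fabien's claim that "the available ones already have this property" can be checked on stock PostgreSQL: a custom GUC set via `set_config(..., false)` persists for the session but is rolled back if the setting transaction aborts. The `myapp.access_ok` name below is an arbitrary example:

```sql
-- Stock PostgreSQL; 'myapp.access_ok' is an arbitrary custom GUC name.
SELECT set_config('myapp.access_ok', 'no', false);   -- session-wide

BEGIN;
SELECT set_config('myapp.access_ok', 'yes', false);  -- non-local, but...
ROLLBACK;                                            -- ...rolled back here

SELECT current_setting('myapp.access_ok');           -- back to 'no'
```

This transactional behavior of GUCs is exactly what the thread is arguing the new variables should (or need not) share; what GUCs lack is the `GRANT`/`REVOKE` access control that motivates the proposal.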
> Now, I have already said that I actually agree that transactional vars
> are probably a good default and something we should have if we do this.

Yep, I read that. Pavel's updated proposal does not do that. Are you "voting" for or against the proposal? ISTM that you are currently counted as "for".

> But they are not the One True And Only Way.

Indeed. The idea that security & compromise do not go well together is an opinion, and people may feel that a security feature can be compromised. We could add a "maybe corrupt the database" feature because it provides better performance for some use case: MySQL made the initial choice that performance matters more than data safety, with great popular success. I do not think that pg should follow such a path; the product is not on the same market. I understood that data safety & security are key properties expected of PostgreSQL and that they should remain a goal of the project. You can call this hyperbolic, or a misunderstanding, but that is the position I am defending on this thread.

>> I'm clearly wrong: some people are okay with a security feature proven not
>> to work in some case, if it works for their particular (read-only) case.
>
> Many normal things work only in some cases.
>
> COMMIT can be indeterminate if you lose your connection after sending
> COMMIT and before getting a reply.

The commit *is* determinate on the server. Knowing whether it succeeded from the client is indeed indeterminate, because the message saying so may be lost, as you point out.

> So. I would like transactional variables and think you have made a
> good case for them.

I'm not that sure:-)

> If they prove impractical, [...]

ISTM that I have also shown that they are practical, so there is no good reason to compromise.

-- Fabien.
On 11 Jan. 2017 16:29, "Fabien COELHO" <coelho@cri.ensmp.fr> wrote:
I'm lost. This is precisely what I had in mind above with "read-only transaction" which is "warranted not to fail". I do not understand about which point you write "No".
I misread. We agree.
Are you "voting" for or against [Pavel's] proposal?
ISTM that you are currently counted as "for".
Mixed. We don't really vote anyway.
There is no code yet. Code review and testing is where things get firmer.
My personal stance right now is that I'd like to see catalog-declared typed variables. I would prefer them to be transactional and would at least oppose anything that didn't allow future room for that capability. I'd prefer that non-transactional vars be clearly declared as such.
In the end though... I'm not the one implementing it. I can have some influence through the code review process. But it's whoever steps up with a proposed implementation that has the biggest say. The rest of us can say yes or no to some degree... but nobody can make someone else implement something they don't want.
On Wed, Jan 11, 2017 at 10:42 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

> There is no code yet. Code review and testing is where things get firmer.
>
> My personal stance right now is that I'd like to see catalog-declared typed
> variables. I would prefer them to be transactional and would at least oppose
> anything that didn't allow future room for that capability. I'd prefer that
> non-transactional vars be clearly declared as such.
>
> In the end though... I'm not the one implementing it. I can have some
> influence through the code review process. But it's whoever steps up with a
> proposed implementation that has the biggest say. The rest of us can say yes
> or no to some degree... but nobody can make someone else implement something
> they don't want.

The last patch is from the 6th of December and does not apply anymore:
https://www.postgresql.org/message-id/CAFj8pRA9w_AujBAYdLR0UVfXwwoxhmn%2BFbNHnD3_NL%3DJ9x3y8w%40mail.gmail.com

I don't have a better idea than marking this patch as "returned with feedback" for now, as the thread died 3 weeks ago as well.

-- Michael
2017-02-01 6:05 GMT+01:00 Michael Paquier <michael.paquier@gmail.com>:
On Wed, Jan 11, 2017 at 10:42 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> There is no code yet. Code review and testing is where things get firmer.
>
> My personal stance right now is that I'd like to see catalog-declared typed
> variables. I would prefer them to be transactional and would at least oppose
> anything that didn't allow future room for that capability. I'd prefer that
> non-transactional vars be clearly declared as such.
>
> In the end though... I'm not the one implementing it. I can have some
> influence through the code review process. But it's whoever steps up with a
> proposed implementation that has the biggest say. The rest of us can say yes
> or no to some degree... but nobody can make someone else implement something
> they don't want.
The last patch is from the 6th of December and does not apply anymore:
https://www.postgresql.org/message-id/CAFj8pRA9w_AujBAYdLR0UVfXwwoxhmn%2BFbNHnD3_NL%3DJ9x3y8w%40mail.gmail.com
I don't have a better idea than marking this patch as "returned with
feedback" for now, as the thread has died 3 weeks ago as well.
There is no agreement on the form of session variables.
Regards
Pavel
--
Michael
2017-02-01 6:42 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
2017-02-01 6:05 GMT+01:00 Michael Paquier <michael.paquier@gmail.com>:

On Wed, Jan 11, 2017 at 10:42 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> There is no code yet. Code review and testing is where things get firmer.
>
> My personal stance right now is that I'd like to see catalog-declared typed
> variables. I would prefer them to be transactional and would at least oppose
> anything that didn't allow future room for that capability. I'd prefer that
> non-transactional vars be clearly declared as such.
>
> In the end though... I'm not the one implementing it. I can have some
> influence through the code review process. But it's whoever steps up with a
> proposed implementation that has the biggest say. The rest of us can say yes
> or no to some degree... but nobody can make someone else implement something
> they don't want.
The last patch is from the 6th of December and does not apply anymore:
https://www.postgresql.org/message-id/CAFj8pRA9w_AujBAYdLR0UVfXwwoxhmn%2BFbNHnD3_NL%3DJ9x3y8w%40mail.gmail.com
I don't have a better idea than marking this patch as "returned with
feedback" for now, as the thread has died 3 weeks ago as well.

There is no agreement on the form of session variables.
Today I found on the net documentation for the DB2 "CREATE VARIABLE" command. I had no idea this statement already existed, although it is an old feature - I found a doc from 2007.

The DB2 design is very similar to my proposal - secured access, persistent metadata, unshared untransactional data limited to the session.

They don't use access functions - access is via the notation schemaname.variablename. I proposed this syntax as the next step in the implementation.

The DB2 authors don't provide transactional variables - when the user needs XA behavior, global temporary tables should be used.
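For reference, the DB2 feature Pavel describes looks roughly like this (DB2 syntax, per its documentation; the schema/variable names are made up, and none of this is valid in PostgreSQL):

```sql
-- DB2 (not PostgreSQL): a typed, catalogued, session-scoped variable
CREATE VARIABLE myschema.counter INTEGER DEFAULT 0;
GRANT READ, WRITE ON VARIABLE myschema.counter TO USER appuser;

-- Accessed directly by qualified name, without getter/setter functions:
SET myschema.counter = myschema.counter + 1;
SELECT myschema.counter FROM sysibm.sysdummy1;
```

The metadata (name, type, privileges) is persistent and shared; the value is per-session and untransactional, which is exactly the shape of Pavel's proposal.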
My patch was marked as "returned with feedback". Personally, I had no idea what the next step could be and what the preferred design is, if some preferred design exists. I don't know what I have to change in my proposal.

I understand that there are two different concepts - 1. using variables for ad hoc scripting, like T-SQL or MySQL, or 2. using variables as global session objects for stored procedures.

The @1 area is partially solved by psql session variables or by pgAdmin scripting functionality. @2 is partially solved by GUCs, but without the possibility to set access rights.

I didn't find any implementation of XA variables or persistent variables in the wild.
Regards
Pavel
Regards

Pavel

--
Michael
Here is a link - a comparison of Oracle package variables and DB2 global variables:
https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/
Regards
On Fri, Feb 3, 2017 at 2:56 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> My patch was marked as "returned with feedback". Personally, I had not a
> idea what can be next step and what is preferred design, if some preferred
> design exists. I don't know what I have to change on my proposal.

Perhaps this was not adapted, sorry about that. Now the latest patch is two months old and does not apply. If you think the approach you are taking is worth it, you can of course submit a new version and move the discussion on. Finding a consensus is the difficult part, though.

-- Michael
Hello Pavel,

> The @1 area is partially solved by psql session variables or by pgAdmin
> scripting functionality. @2 is partially solved by GUC but without
> possibility to set a access rights.
>
> I didn't found any implementation of XA variables [...]

I did: GUCs in PostgreSQL are an implementation of transactional session variables.

As I wrote on the thread, given the "security check" use case, the safe alternative to transactional session variables is to have nested transactions. This seems like a far-away prospect for pg, but is a reality for Oracle, DB2 and some others that have untransactional session variables, so at least it is safe in their case, if not elegant.

My "hard" opinion is that providing an unsafe-by-default feature (i.e. one which works in some particular cases but may fail silently if the transaction fails), especially for a security-related use case which motivates the whole feature addition, is a very bad idea for the product. If a committer likes it anyway, good for you.

Other opinions I expressed on the thread are somehow "softer", i.e. even if I think that there are better (simpler, easier) alternatives, these are only alternatives.

-- Fabien.
2017-02-03 7:25 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,

The @1 area is partially solved by psql session variables or by pgAdmin
scripting functionality. @2 is partially solved by GUC but without
possibility to set a access rights.
I didn't found any implementation of XA variables [...]
I did: GUCs in PostgreSQL are an implementation of transactional session variables.
GUC was not designed for usage in stored procedures.
As I wrote on the thread, given the "security check" use case, the safe alternative to transactional session variables is to have nested transactions. This seems like a far away prospect for pg, but is a reality for Oracle, DB2 and some others that have untransactional session variables, so at least it is safe in their case, if not elegant.
Everywhere you have a fence between transactional and untransactional, you can see some unwanted artefacts. The word "secure" just means the possibility to set access rights - nothing more, nothing less.
My "hard" opinion is that providing an unsafe by default feature (i.e. which works as in some particular cases, but may fail silently if the transaction fails), especially for a security related use case which motivates the whole feature addition, is a very bad idea for the product. If a committer likes it anyway, good for you.
I respect your opinion and I understand - I have a similarly strong opinion on packages in Postgres. In this case I prefer a common implementation - and common expectations.

When some feature is a PostgreSQL original, we can design it however we like. But when we implement a feature that already exists elsewhere, we should respect the previous, older major implementations.

A great example is our implementation of OUT parameters in PL. The idea is great - modern languages like Go and Rust use it. But in the PL area it is unique, different. This small feature is one of the significant obstacles to migrations to Postgres and to Postgres adoption.

People working with stored procedures don't expect XA behavior or overhead when they work with objects named "variables". We can implement XA support for variables, but I don't think the default should be XA. Only a few cases where variables are used are XA sensitive.
Regards
Pavel
Other opinions I expressed on the thread are somehow "softer", i.e. even if I think that there are better (simpler, easier) alternatives, these are only alternatives.
--
Fabien.
> We can implement XA support for variables, ale I don't think so default
> should be XA.

I was answering your question, which is what you can do about the feedback: take the one hard/strong point into account in your proposal. You do not want to do that. Too bad.

The argument that you keep repeating, that "other software does it like that, so it is the good way", does not work, because that software (Oracle, DB2, ...) has features unavailable to postgres which mitigate the issue I'm raising, and there is no such mitigation in postgres.

Note that you can proceed and simply ignore my negative opinion, which will stay negative till these "secure" variables are transactional by default, or till nested/autonomous transactions are provided by postgres.

-- Fabien.
The syntax can be designed to be very verbose, so anybody can see, and must explicitly choose, the expected behavior of the variables:
CREATE [ { TEMPORARY | TEMP } ] [ { TRANSACTIONAL | XA } ] VARIABLE [ IF NOT EXISTS ] varname datatype [ DEFAULT default_expression ] ;
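A few instantiations of that grammar might look as follows. This is proposed syntax only - none of it exists in PostgreSQL, and the names are invented for illustration:

```sql
-- Untransactional session variable (the proposed default):
CREATE VARIABLE IF NOT EXISTS myschema.click_count integer DEFAULT 0;

-- Explicitly transactional variant, rolled back with the
-- surrounding transaction:
CREATE TRANSACTIONAL VARIABLE myschema.access_granted boolean DEFAULT false;

-- Temporary variable, whose metadata is dropped at session end:
CREATE TEMP VARIABLE session_note text;
```

Spelling the choice out in the DDL would address Craig's earlier request that non-transactional vars be "clearly declared as such".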
Regards
Pavel
Regards

Pavel
2017-02-03 11:18 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
We can implement XA support for variables, ale I don't think so default should be XA.
I was answering your question, which is what you can do about the feedback: take the one hard/strong point into account in your proposal.
You do not want to do that. Too bad.
The argument that you keep on repeating about "other software do it like that so it is the good way" do not work because these software (Oracle, DB2, ...) have features unavailable to postgres which mitigate the issue I'm raising, and there is no such mitigation in postgres.
Note that you can proceed and simply ignore my negative opinion, which will stay negative till these "secure" variables are transactional by default, or till nested/autonomous transactions are provided by postgres.
ok
Regards
Pavel
--
Fabien.
2017-02-03 11:18 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
We can implement XA support for variables, ale I don't think so default should be XA.
I was answering your question, which is what you can do about the feedback: take the one hard/strong point into account in your proposal.
You do not want to do that. Too bad.
The argument that you keep on repeating about "other software do it like that so it is the good way" do not work because these software (Oracle, DB2, ...) have features unavailable to postgres which mitigate the issue I'm raising, and there is no such mitigation in postgres.
Note that you can proceed and simply ignore my negative opinion, which will stay negative till these "secure" variables are transactional by default, or till nested/autonomous transactions are provided by postgres.
I'll work on my proposal during the v11 cycle. Maybe by that time Postgres will support autonomous transactions.
The variables syntax should be better integrated into core - it should be implemented without getter/setter functions. I am not sure if the SET statement can be enhanced to allow working with session variables without some conflicts, but we will see.
Regards
Pavel
--
Fabien.
Hello,

> I'll work on my proposal in v11 time. Maybe in this time Postgres will
> support autonomous transactions.

Maybe.

> The variables syntax should be better integrated to core - it should be
> implemented without getter/setter functions.

Yes, a nicer syntax would be great. Note that setter/getter could be useful for some use case, eg with queries built dynamically?

> I am not sure If statement SET can be enhanced to allows the work with
> session variables without some conflicts, but we will see.

If so, maybe some kind of prefix could provide a workaround.

-- Fabien.
2017-02-06 21:36 GMT+01:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello,

I'll work on my proposal in v11 time. Maybe in this time Postgres will support autonomous transactions.

Maybe.

The variables syntax should be better integrated to core - it should be
implemented without getter/setter functions.
Yes, a nicer syntax would be great.
Note that setter/getter could be useful for some use case, eg with queries built dynamically?
There isn't any problem with usage in dynamic SQL. Some generic access is already available.
I am not sure If statement SET can be enhanced to allows the work with session variables without some conflicts, but we will see.
If so, maybe some kind of prefix could provide a workaround.
No other database objects have a prefix. But we can identify ambiguous situations and in those cases require a qualified identifier.
Regards
Pavel
--
Fabien.