Thread: BUG #16006: Update queries fail on a table having any policy with a function that takes a whole-row var as arg

The following bug has been logged on the website:

Bug reference:      16006
Logged by:          Vinay Banakar
Email address:      vinay.s.banakar@gmail.com
PostgreSQL version: 9.5.19
Operating system:   Ubuntu 16.04.6 LTS
Description:

Update queries fail on a table having any policy with a function that takes
a whole-row var as argument:

I stumbled upon this after I had enabled RLS on a table and have a policy
that basically raises every row read to log at INFO level (a quick hack for
benchmarking).

Table:
create table usertable(key VARCHAR PRIMARY KEY,field0 text,field1
text,field2 text,field3 text,field4 text,field5 text,field6 text,field7
text,field8 text,field9 text,timestamp timestamp NOT NULL DEFAULT NOW());

# This policy is to raise a row to INFO log when an operation is made on
that.
ALTER TABLE usertable enable ROW LEVEL SECURITY;
ALTER TABLE usertable force ROW LEVEL SECURITY;
create function log_record(_tbl usertable) returns boolean language plpgsql
immutable as $$ begin raise info 'log: %', $1; return true; end; $$;
CREATE POLICY loggerPolicy ON usertable USING (log_record(usertable));

Now any update queries on this table fail. 

This was working correctly on 9.5.15.

Thanks to RhodiumToad from #postgresql  for verifying the bug.


PG Bug reporting form <noreply@postgresql.org> writes:
> Update queries fail on a table having any policy with a function that takes
> a whole-row var as argument.

Hm.  You really should have shown the failure you were seeing, but
for the archives' sake: I can reproduce this on 9.5 and 9.6 (if I
run the queries as non-superuser!), and it looks like

regression=> insert into usertable
values('key','field0','field1','field2','field3','field4','field5','field6','field7','field8','field9');
INFO:  log: (key,field0,field1,field2,field3,field4,field5,field6,field7,field8,field9,"2019-09-12 16:29:37.511329")
INSERT 0 1
regression=> update usertable set field0 = 'f0';
INFO:  log: (key,field0,field1,field2,field3,field4,field5,field6,field7,field8,field9,"2019-09-12 16:29:37.511329")
ERROR:  table row type and query-specified row type do not match
DETAIL:  Table has type tid at ordinal position 1, but query expects character varying.

Digging into this, it seems the short answer is "Andres should have
back-patched 148e632c0".  The plan shape in 9.6.x is

 Update on usertable usertable_1  (cost=0.00..66.00 rows=70 width=366)
   ->  Subquery Scan on usertable  (cost=0.00..66.00 rows=70 width=366)
         ->  LockRows  (cost=0.00..65.30 rows=70 width=340)
               ->  Seq Scan on usertable usertable_2  (cost=0.00..64.60 rows=70 width=340)
                     Filter: log_record(usertable_2.*)

and because ExecInitModifyTable incorrectly passes the SubqueryScan
as parent of the WCO expressions, the horrible examine-the-parent
kluge in ExecEvalWholeRowVar fires, causing it to apply a completely
inappropriate junkfilter to the scan tuple.  After which, of course,
the tuple rowtype is wrong.

I do not see the bug in v10 and up, but I think that's accidental in
v10/v11, because they produce a plan without the not-really-necessary
SubqueryScan:

 Update on usertable  (cost=0.00..64.60 rows=70 width=366)
   ->  Seq Scan on usertable  (cost=0.00..64.60 rows=70 width=366)
         Filter: log_record(usertable.*)

The whole-row var is still being initialized with respect to the wrong
parent, but it doesn't do anything funny when it's pointed at a plain
SeqScan, so all is well.  I suspect it's possible to develop a test
case that will fail in v10/v11, and will go look for one.

Thanks for the report!

            regards, tom lane



Hi,

On 2019-09-12 16:33:55 -0400, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
> > Update queries fail on a table having any policy with a function that takes
> > a whole-row var as argument.
> 
> Hm.  You really should have shown the failure you were seeing, but
> for the archives' sake: I can reproduce this on 9.5 and 9.6 (if I
> run the queries as non-superuser!), and it looks like
> 
> regression=> insert into usertable
values('key','field0','field1','field2','field3','field4','field5','field6','field7','field8','field9');
> INFO:  log: (key,field0,field1,field2,field3,field4,field5,field6,field7,field8,field9,"2019-09-12 16:29:37.511329")
> INSERT 0 1
> regression=> update usertable set field0 = 'f0';
> INFO:  log: (key,field0,field1,field2,field3,field4,field5,field6,field7,field8,field9,"2019-09-12 16:29:37.511329")
> ERROR:  table row type and query-specified row type do not match
> DETAIL:  Table has type tid at ordinal position 1, but query expects character varying.
> 
> Digging into this, it seems the short answer is "Andres should have
> back-patched 148e632c0".  The plan shape in 9.6.x is

Heh :/. I was wondering (and asking for input) at the time... The whole
business with parent nodes is somewhat fragile (cf 2abd7ae9b2 /
5f32b29c1819), which makes me generally a bit hesitant to backpatch.
But obviously it's needed here.


>  Update on usertable usertable_1  (cost=0.00..66.00 rows=70 width=366)
>    ->  Subquery Scan on usertable  (cost=0.00..66.00 rows=70 width=366)
>          ->  LockRows  (cost=0.00..65.30 rows=70 width=340)
>                ->  Seq Scan on usertable usertable_2  (cost=0.00..64.60 rows=70 width=340)
>                      Filter: log_record(usertable_2.*)
> 
> and because ExecInitModifyTable incorrectly passes the SubqueryScan
> as parent of the WCO expressions, the horrible examine-the-parent
> kluge in ExecEvalWholeRowVar fires, causing it to apply a completely
> inappropriate junkfilter to the scan tuple.  After which, of course,
> the tuple rowtype is wrong.

I don't quite understand how this "visibly" broke between 9.5.15 and
9.5.19 as the op reports? Haven't dug into it yet though.

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> I don't quite understand how this "visibly" broke between 9.5.15 and
> 9.5.19 as the op reports? Haven't dug into it yet though.

I'm confused about that too, since in my hands the submitted test
case fails in both of those branches.  I've also been able to
adapt it to fail in v10/v11.  Will commit in a moment.

            regards, tom lane



Hello,

Thank you for finding the root cause, I just checked on 9.15.17 and it is still failing here also.
But I distinctly remember it working earlier, not sure which version of pg (at least <9.5.16) though as I have lost that earlier setup.

Regards
Vinay

On Fri, Sep 13, 2019 at 3:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> I don't quite understand how this "visibly" broke between 9.5.15 and
> 9.5.19 as the op reports? Haven't dug into it yet though.

I'm confused about that too, since in my hands the submitted test
case fails in both of those branches.  I've also been able to
adapt it to fail in v10/v11.  Will commit in a moment.

                        regards, tom lane
Vinay Banakar <vinay.s.banakar@gmail.com> writes:
> Thank you for finding the root cause, I just checked on 9.15.17 and it is
> still failing here also.
> But I distinctly remember it working earlier, not sure which version of pg
> (at least <9.5.16) though as I have lost that earlier setup.

Oh!  I misread your report as being that it worked on 9.5.x and failed
on 9.6.x.  If it really did work on some earlier 9.5.x version, then
we must've busted it with a patch ... but what?  That bizarre code in
ExecEvalWholeRowVar is much older than 9.5 (looks like it dates to
8e617e29a), and the mistake proper goes back to the introduction of
WITH CHECK OPTION in 9.4 (4cbe3ac3e).  I'm not seeing any other
moving parts here.

            regards, tom lane



Right, I might as well have mistaken as it was quite sometime ago.
Sorry for the confusion.

is checked in. However, do you suggest I build pg from this master branch for my bench marking (benchmarking pg with a particular
dataset) or should I wait until the next STABLE release is created (if so do you have any information on when the next one will be released)?
I will be running this on Ubuntu 16.04 so hope that does not have any
implications.

Thanks
Vinay

On Fri, Sep 13, 2019, 9:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vinay Banakar <vinay.s.banakar@gmail.com> writes:
> Thank you for finding the root cause, I just checked on 9.15.17 and it is
> still failing here also.
> But I distinctly remember it working earlier, not sure which version of pg
> (at least <9.5.16) though as I have lost that earlier setup.

Oh!  I misread your report as being that it worked on 9.5.x and failed
on 9.6.x.  If it really did work on some earlier 9.5.x version, then
we must've busted it with a patch ... but what?  That bizarre code in
ExecEvalWholeRowVar is much older than 9.5 (looks like it dates to
8e617e29a), and the mistake proper goes back to the introduction of
WITH CHECK OPTION in 9.4 (4cbe3ac3e).  I'm not seeing any other
moving parts here.

                        regards, tom lane
Vinay Banakar <vinay.s.banakar@gmail.com> writes:
> I see
> https://github.com/postgres/postgres/commit/7f1f72c44400e6fef3436b05f1ad0f6bacd03d96
> is checked in. However, do you suggest I build pg from this master branch
> for my bench marking (benchmarking pg with a particular
> dataset) or should I wait until the next STABLE release is created (if so
> do you have any information on when the next one will be released)?

The next scheduled releases are in November.

https://www.postgresql.org/developer/roadmap/

            regards, tom lane