Thread: [BUGS] BUG #14870: wrong query results when using WITH with UPDATE

[BUGS] BUG #14870: wrong query results when using WITH with UPDATE

From
andreigorita@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      14870
Logged by:          Andrei Gorita
Email address:      andreigorita@gmail.com
PostgreSQL version: 9.6.1
Operating system:   CentOS
Description:

when updating a table with unique index within a WITH part of a query, in
certain conditions the query reports that updated more than one row.

A simple way to reproduce:

test=> create table tmp_test(id int not null, test text not null);
CREATE TABLE
test=> create unique index on tmp_test(id);
CREATE INDEX
test=> INSERT INTO tmp_test VALUES (1, 'test');
INSERT 0 1
test=> create table tmp_test2(id int not null, test text not null);
CREATE TABLE

test=> WITH updated AS (UPDATE tmp_test SET test = 'test' WHERE id = 1
RETURNING id), inserted AS (INSERT INTO tmp_test2 (id, test) SELECT 1,
'test' WHERE NOT EXISTS (SELECT 1 FROM updated) RETURNING id) SELECT * FROM
updated;id 
---- 1
(1 row)

This is the expected result, but when another session is executing in
parallel:
test=> begin;
BEGIN
test=> UPDATE tmp_test SET test = 'test' WHERE id = 1;
UPDATE 1
test=> commit;
COMMIT


the result is:

test=> WITH updated AS (UPDATE tmp_test SET test = 'test' WHERE id = 1
RETURNING id), inserted AS (INSERT INTO tmp_test2 (id, test) SELECT 1,
'test' WHERE NOT EXISTS (SELECT 1 FROM updated) RETURNING id) SELECT * FROM
updated;id 
---- 1 1
(2 rows)

which is at least strange.


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14870: wrong query results when using WITH with UPDATE

From
Marko Tiikkaja
Date:
Hi Andrei,

On Tue, Oct 24, 2017 at 6:53 PM, <andreigorita@gmail.com> wrote:
test=> WITH updated AS (UPDATE tmp_test SET test = 'test' WHERE id = 1
RETURNING id), inserted AS (INSERT INTO tmp_test2 (id, test) SELECT 1,
'test' WHERE NOT EXISTS (SELECT 1 FROM updated) RETURNING id) SELECT * FROM
updated;
 id
----
  1
(1 row)

This is the expected result, but when another session is executing in
parallel:

the result is:

 id
----
  1
  1
(2 rows)

Right.  We don't actually need tmp_test2 to trigger the bug, we simply need an InitPlan on the CTE.  This query:

  WITH updated AS (
        UPDATE tmp_test SET test = 'test' WHERE id = 1
        RETURNING id
  )
  SELECT (SELECT 1 FROM updated), * FROM updated

has the same problem.  The query plan looks like this:

  CTE Scan on updated (actual rows=2 loops=1)
    CTE updated
      ->  Update on tmp_test (actual rows=1 loops=1)
            ->  Index Scan using tmp_test_id_idx on tmp_test (actual rows=1 loops=1)
                  Index Cond: (id = 1)
    InitPlan 2 (returns $2)
      ->  CTE Scan on updated updated_1 (actual rows=1 loops=1)

As you said, this problem only occurs when the row changed by the CTE is concurrently UPDATEd, so I'm guessing something goes wrong with EPQ here.


.m

Re: [BUGS] BUG #14870: wrong query results when using WITH with UPDATE

From
Marko Tiikkaja
Date:
Does anyone care to care about this?  We've had two report of what I think is the same problem, yet nobody's properly analyzed the problem.


.m

Re: [BUGS] BUG #14870: wrong query results when using WITH with UPDATE

From
Peter Geoghegan
Date:
On Sun, Feb 18, 2018 at 8:23 AM, Marko Tiikkaja <marko@joh.to> wrote:
> Does anyone care to care about this?  We've had two report of what I think
> is the same problem, yet nobody's properly analyzed the problem.

I didn't notice this until now. I'd say that it's certainly a bug, and
should be fixed.

-- 
Peter Geoghegan


Re: [BUGS] BUG #14870: wrong query results when using WITH with UPDATE

From
Tom Lane
Date:
Marko Tiikkaja <marko@joh.to> writes:
> Does anyone care to care about this?  We've had two report of what I think
> is the same problem, yet nobody's properly analyzed the problem.

Hm, what's the other report that you think is related?

I got around to tracing through this finally, and there seem to be a
couple of interrelated issues.  The real core of the problem is that
it didn't occur to me while writing CteScanNext that calling the input
plan could result in invoking a different CteScan on the same worktable.
This means that when we come back from the ExecProcNode call, the CTE's
shared tuplestore might now have somebody else's read pointer selected
as the active read pointer.  Since tuplestore_puttupleslot has different
effects on the active read pointer than on inactive ones, the wrong things
happen.  The fix is trivial: just reselect our read pointer as active
before storing the fetched tuple, viz

        /*
         * Append a copy of the returned tuple to tuplestore.  NOTE: because
         * our read pointer is certainly in EOF state, its read position will
         * move forward over the added tuple.  This is what we want.  Also,
         * any other readers will *not* move past the new tuple, which is what
         * they want.
         */
+       tuplestore_select_read_pointer(tuplestorestate, node->readptr);
        tuplestore_puttupleslot(tuplestorestate, cteslot);

This could, conceivably, account for all sorts of weird misbehavior if
the same CTE is read in different levels of a plan.  However, it wasn't
immediately obvious to me why the given examples involving EPQ would
cause it, because in fact these queries *don't* cause recursion of
CteScanNext.  There is recursion of ExecCteScan, but the recursive
call happens while projecting the result tuple of the outer CteScan
node, not while it's fetching an input tuple.  In fact, since what
we're calling is supposedly just the CTE subplan, it's not real
clear how it could contain another scan of itself.

The answer turns out to be that if EPQ is fired, then we instantiate a new
copy of the whole outer plan tree (though apparently not the InitPlan),
resulting in an additional ExecInitCteScan call that allocates an
additional read pointer on the same tuplestore, and it's that one being
the active one that causes the observed misbehavior.

This is kind of annoying because the EPQ-instantiated CteScan will never
actually read the tuplestore at all, so the extra read pointer it creates
is useless and just adds overhead to the tuplestore.  Since we can't
currently truncate a CteScan's tuplestore anyway, it might not be worth
worrying about today, and I certainly wouldn't back-patch a fix for it.
But if we ever get smarter about that, we should try to avoid touching
the tuplestore in EPQ subplans.  This seems related to the indexscan EPQ
bug we fixed the other day in 2e668c522...

            regards, tom lane


Re: [BUGS] BUG #14870: wrong query results when using WITH with UPDATE

From
Tom Lane
Date:
I wrote:
> The answer turns out to be that if EPQ is fired, then we instantiate a new
> copy of the whole outer plan tree (though apparently not the InitPlan),
> resulting in an additional ExecInitCteScan call that allocates an
> additional read pointer on the same tuplestore, and it's that one being
> the active one that causes the observed misbehavior.

Oh, on closer inspection I had that backwards: the EPQ plan tree would
normally contain only the child node(s) of the ModifyTable node, but
EvalPlanQualStart also duplicates all InitPlan or SubPlan node trees,
because it can't tell which of them might get used within the EPQ subtree.
So we get one extra ExecInitCteScan call, even though no actual access
to the tuplestore could happen.

So we can now characterize the problem case as "CTE is read within an
InitPlan or SubPlan, and the query as a whole encounters an EPQ recheck".

            regards, tom lane