Thread: [HACKERS] PG10 partitioning - odd behavior/possible bug

[HACKERS] PG10 partitioning - odd behavior/possible bug

From
Joe Conway
Date:
I was playing around with partitioning and found an oddity that is best
described with the following reasonably minimal test case:

8<---------
CREATE TABLE timetravel
( id int8, f1 text not null, tr tstzrange not null default tstzrange(now(), 'infinity', '[]')
) PARTITION BY RANGE (upper(tr));

CREATE TABLE timetravel_current PARTITION OF timetravel
( primary key (id, tr) DEFERRABLE
) FOR VALUES FROM ('infinity') TO (MAXVALUE);
CREATE INDEX timetravel_current_tr_idx ON timetravel_current USING GIST
(tr);

CREATE TABLE timetravel_history PARTITION OF timetravel
( primary key (id, tr) DEFERRABLE
) FOR VALUES FROM (MINVALUE) TO ('infinity');
CREATE INDEX timetravel_history_tr_idx ON timetravel_history USING GIST
(tr);

CREATE OR REPLACE FUNCTION modify_timetravel()
RETURNS TRIGGER AS $$ DECLARE   tsr tstzrange; BEGIN   RAISE NOTICE 'OLD.tr = %', OLD.tr;
   tsr := tstzrange(lower(OLD.tr), now(), '[)');   RAISE NOTICE 'tsr = %', tsr;
   OLD.tr = tsr;   INSERT INTO timetravel VALUES (OLD.*);   IF (TG_OP = 'UPDATE') THEN     tsr := tstzrange(now(),
'infinity','[]');     RAISE NOTICE 'NEW.tr = %', tsr;     NEW.tr = tsr;     RETURN NEW;   ELSIF (TG_OP = 'DELETE') THEN
   RETURN OLD;   END IF; END; 
$$ LANGUAGE plpgsql;

CREATE TRIGGER timetravel_audit BEFORE DELETE OR UPDATE
ON timetravel_current FOR EACH ROW EXECUTE PROCEDURE modify_timetravel();

INSERT INTO timetravel(id, f1)
SELECT g.i, 'row-' || g.i::text
FROM generate_series(1,10) AS g(i);

DO $$ DECLARE   i int; BEGIN   FOR i IN 1..2 LOOP     RAISE NOTICE 'loop = %', i;     UPDATE timetravel SET f1 = f1 ||
'-r'|| i where id < 2;   END LOOP; END 
$$;
NOTICE:  loop = 1
NOTICE:  OLD.tr = ["2017-09-03 14:15:08.800811-07",infinity]
NOTICE:  tsr = ["2017-09-03 14:15:08.800811-07","2017-09-03
14:18:48.270274-07")
NOTICE:  NEW.tr = ["2017-09-03 14:18:48.270274-07",infinity]
NOTICE:  loop = 2
NOTICE:  OLD.tr = ["2017-09-03 14:18:48.270274-07",infinity]
NOTICE:  tsr = empty
ERROR:  no partition of relation "timetravel" found for row
DETAIL:  Partition key of the failing row contains (upper(tr)) = (null).
CONTEXT:  SQL statement "INSERT INTO timetravel VALUES (OLD.*)"
PL/pgSQL function modify_timetravel() line 11 at SQL statement
SQL statement "UPDATE timetravel SET f1 = f1 || '-r' || i where id < 2"
PL/pgSQL function inline_code_block line 7 at SQL statement
8<---------

Notice that in the first loop iteration tsr is calculated from OLD.tr
correctly. But in the second loop iteration it is not, and therefore no
partition can be found for the insert.

I have not dug too deeply into this yet, but was wondering if this
behavior is sane/expected for some reason I am missing?

Thanks,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


On 09/03/2017 02:28 PM, Joe Conway wrote:
> I was playing around with partitioning and found an oddity that is best
> described with the following reasonably minimal test case:

<snip>

> Notice that in the first loop iteration tsr is calculated from OLD.tr
> correctly. But in the second loop iteration it is not, and therefore no
> partition can be found for the insert.
>
> I have not dug too deeply into this yet, but was wondering if this
> behavior is sane/expected for some reason I am missing?

This does not require partitioning to reproduce -- sorry for the false
accusations ;-)

8<---------------
CREATE TABLE timetravel
( id int8, f1 text not null, tr tstzrange not null default tstzrange(now(), 'infinity', '[]')
);

CREATE OR REPLACE FUNCTION modify_timetravel()
RETURNS TRIGGER AS $$ DECLARE   tsr tstzrange; BEGIN   RAISE NOTICE 'OLD.tr = %', OLD.tr;
   tsr := tstzrange(lower(OLD.tr), now(), '[)');   RAISE NOTICE 'tsr = %', tsr;
   OLD.tr = tsr;   INSERT INTO timetravel VALUES (OLD.*);   IF (TG_OP = 'UPDATE') THEN     tsr := tstzrange(now(),
'infinity','[]');     RAISE NOTICE 'NEW.tr = %', tsr;     NEW.tr = tsr;     RETURN NEW;   ELSIF (TG_OP = 'DELETE') THEN
   RETURN OLD;   END IF; END; 
$$ LANGUAGE plpgsql;

CREATE TRIGGER timetravel_audit BEFORE DELETE OR UPDATE
ON timetravel FOR EACH ROW EXECUTE PROCEDURE modify_timetravel();

INSERT INTO timetravel(id, f1)
SELECT g.i, 'row-' || g.i::text
FROM generate_series(1,10) AS g(i);

DO $$ DECLARE   i int; BEGIN   FOR i IN 1..2 LOOP     RAISE NOTICE 'loop = %', i;     UPDATE timetravel SET f1 = f1 ||
'-r'|| i where id < 4;   END LOOP; END 
$$;
NOTICE:  loop = 1
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.598734-07",infinity]
NOTICE:  tsr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.598734-07",infinity]
NOTICE:  tsr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.598734-07",infinity]
NOTICE:  tsr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  loop = 2
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  tsr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  tsr = empty
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  tsr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  tsr = empty
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  tsr = ["2017-09-03 15:17:31.598734-07","2017-09-03
15:17:31.608018-07")
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  OLD.tr = ["2017-09-03 15:17:31.608018-07",infinity]
NOTICE:  tsr = empty
NOTICE:  NEW.tr = ["2017-09-03 15:17:31.608018-07",infinity]
DO
8<---------------

Notice that tsr is not empty at all on the first loop, but on the second
loop it is empty every second time the trigger fires.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Joe Conway <mail@joeconway.com> writes:
> Notice that tsr is not empty at all on the first loop, but on the second
> loop it is empty every second time the trigger fires.

I think the issue is that now() isn't changing within the transaction,
so when you construct "tstzrange(lower(OLD.tr), now(), '[)')" using an
old row whose "lower(OLD.tr)" is already "now()", you get an empty range.
Probably using '[]' bounds would avoid the oddness.
        regards, tom lane



On 09/03/2017 03:34 PM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> Notice that tsr is not empty at all on the first loop, but on the second
>> loop it is empty every second time the trigger fires.
>
> I think the issue is that now() isn't changing within the transaction,
> so when you construct "tstzrange(lower(OLD.tr), now(), '[)')" using an
> old row whose "lower(OLD.tr)" is already "now()", you get an empty range.
> Probably using '[]' bounds would avoid the oddness.

Hmmm, good point. Sorry for the noise.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Re: [HACKERS] PG10 partitioning - odd behavior/possible bug

From
Robert Haas
Date:
On Sun, Sep 3, 2017 at 5:28 PM, Joe Conway <mail@joeconway.com> wrote:
> I was playing around with partitioning and found an oddity that is best
> described with the following reasonably minimal test case:

I can reproduce this without partitioning, just by creating two
independent tables with the same schema and tweaking a few things from
your test case to refer to the correct table rather than relying on
tuple routing:

create table timetravel_current (id int8, f1 text not null, tr
tstzrange not null default tstzrange(now(), 'infinity', '[]'), primary
key (id, tr) deferrable);
create table timetravel_history (id int8, f1 text not null, tr
tstzrange not null default tstzrange(now(), 'infinity', '[]'), primary
key (id, tr) deferrable);
create function modify_timetravel() RETURNS trigger   LANGUAGE plpgsql   AS $$ DECLARE   tsr tstzrange; BEGIN   RAISE
NOTICE'OLD.tr = %', OLD.tr;
 
   tsr := tstzrange(lower(OLD.tr), now(), '[)');   RAISE NOTICE 'tsr = %', tsr;
   OLD.tr = tsr;   INSERT INTO timetravel_history VALUES (OLD.*);   IF (TG_OP = 'UPDATE') THEN     tsr :=
tstzrange(now(),'infinity', '[]');     RAISE NOTICE 'NEW.tr = %', tsr;     NEW.tr = tsr;     RETURN NEW;   ELSIF (TG_OP
='DELETE') THEN     RETURN OLD;   END IF; END;
 
$$;

CREATE TRIGGER timetravel_audit BEFORE DELETE OR UPDATE
ON timetravel_current FOR EACH ROW EXECUTE PROCEDURE modify_timetravel();

INSERT INTO timetravel_current(id, f1)
SELECT g.i, 'row-' || g.i::text
FROM generate_series(1,10) AS g(i);

Then:

rhaas=# DO $$
DECLARE   i int; BEGIN   FOR i IN 1..2 LOOP     RAISE NOTICE 'loop = %', i;     UPDATE timetravel_current SET f1 = f1
||'-r' || i where id < 2;   END LOOP; END
 
$$;
NOTICE:  loop = 1
NOTICE:  OLD.tr = ["2017-11-28 16:28:46.117239-05",infinity]
NOTICE:  tsr = ["2017-11-28 16:28:46.117239-05","2017-11-28 16:28:50.700763-05")
NOTICE:  NEW.tr = ["2017-11-28 16:28:50.700763-05",infinity]
NOTICE:  loop = 2
NOTICE:  OLD.tr = ["2017-11-28 16:28:50.700763-05",infinity]
NOTICE:  tsr = empty
NOTICE:  NEW.tr = ["2017-11-28 16:28:50.700763-05",infinity]
DO

There's no error here because I didn't bother putting constraints on
the table, but that tsr = empty bit is still happening.  I think the
problem is that you're updating the same row twice in the same
transaction, and now() returns the same value both times because
that's how now() works, so the second time the range ends up with the
lower and endpoints that are equal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] PG10 partitioning - odd behavior/possible bug

From
Joe Conway
Date:
On 11/28/2017 04:40 PM, Robert Haas wrote:
> On Sun, Sep 3, 2017 at 5:28 PM, Joe Conway <mail@joeconway.com> wrote:
>> I was playing around with partitioning and found an oddity that is best
>> described with the following reasonably minimal test case:
>
> I can reproduce this without partitioning, just by creating two
> independent tables with the same schema and tweaking a few things from
> your test case to refer to the correct table rather than relying on
> tuple routing:

[snip]

> There's no error here because I didn't bother putting constraints on
> the table, but that tsr = empty bit is still happening.  I think the
> problem is that you're updating the same row twice in the same
> transaction, and now() returns the same value both times because
> that's how now() works, so the second time the range ends up with the
> lower and endpoints that are equal.

Yeah, Tom already pointed that out a while back:

https://www.postgresql.org/message-id/20986.1504478066%40sss.pgh.pa.us

FWIW, I have working version of this now (using clock_timestamp()) here
(see last part of the appendix):

https://www.joeconway.com/presentations/SecurePostgreSQL-PGOpen2017.pdf

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: [HACKERS] PG10 partitioning - odd behavior/possible bug

From
Robert Haas
Date:
On Tue, Nov 28, 2017 at 6:42 PM, Joe Conway <mail@joeconway.com> wrote:
> Yeah, Tom already pointed that out a while back:
>
> https://www.postgresql.org/message-id/20986.1504478066%40sss.pgh.pa.us

Ah, sorry, hadn't seen that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company