Home > mailing lists

Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY - Mailing list pgsql-bugs

From	surya poondla
Subject	Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY
Date	March 26 01:45:11
Msg-id	CAOVWO5o46gqX_=BkrzuTRyPijDBWaYAGFVqn7UXrww_4owhJ0w@mail.gmail.com Whole thread
In response to	Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY ("cca5507" <cca5507@qq.com>)
List	pgsql-bugs

Tree view

Hi ChangAo,

Thank you for the continued review, your test case helped me find a real issue in v3 patch, and I updated the fix accordingly.

You were correct that v3 was wrong. Removing IS NOT NULL and using *= alone in the pre-check was too aggressive: *= treats NULL=NULL=true, which caused

(NULL,NULL)×2 with a unique index on a nullable column to be flagged as duplicates even though the unique index allows multiple NULLs.

The updated patch takes a different approach. Instead of removing the guard and relying on *= alone, the pre-check now uses the same per-column equality
operators as the FULL OUTER JOIN accumulated during the same index-scanning loop. These operators treat NULL=NULL as false, which is consistent with how
unique indexes actually work (NULLs are distinct).

For (test, NULL)×2 with a single index on a (non-null):
newdata2.a = newdata.a i.e 'test'='test' that is TRUE
newdata2.* *= newdata.* that is TRUE
Thus the duplicate is caught and error is raised

For (NULL, NULL)×2 with a single index on a (nullable):
newdata2.a = newdata.a i.e NULL=NULL that is NULL (false)
Thus no duplicate is caught, the refresh correctly succeeds, MV gets 2 rows

For (test, NULL)×2 with a composite index on (a, b):
newdata2.a = newdata.a i.e TRUE
newdata2.b = newdata.b i.e NULL=NULL that is NULL (false)
Combined: NULL is not caught, refresh correctly succeeds

Regarding your record_image_eq_variant approach: it correctly handles the NULL-in-indexed-column case, but it introduces a performance regression for
unchanged rows where any non-indexed column contains NULL.
For example, an unchanged row (1, NULL) with a unique index on non-null index on a would require a DELETE+INSERT on every CONCURRENTLY refresh because
record_image_eq_variant((1,NULL),(1,NULL)) returns false.

This makes CONCURRENTLY impractical for any table where rows contain NULLs in non-indexed columns.

The updated patch for bug 1.

Here is some additional tests I did:
postgres=# SET client_min_messages = WARNING;
SET
postgres=# -- Test 1: (test,NULL)×2, single index on 'a' should ERROR
postgres=# CREATE TABLE t(a text, b text);
INSERT INTO t VALUES('test', NULL);
CREATE MATERIALIZED VCREATE TABLE
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
CREATE UNIQUE INDEX ON m(a);
INSERT INTO t VALUES('test', NULL);
REFSELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must error
ERROR: new data for materialized view "m" contains duplicate rows
DETAIL: Row: (test,)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# -- Test 2: (NULL,NULL)×2, single index on 'a' should SUCCEED
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(NULL, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# INSERT INTO t VALUES(NULL, NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
SELECT COUNT(*) FROM m;
DROP TABLE t CASCADE;REFRESH MATERIALIZED VIEW
postgres=# SELECT COUNT(*) FROM m; -- should be 2
count
-------
2
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# --Test 3: (test,NULL)×2, composite index (a,b) should SUCCEED
postgres=# CREATE TABLE t(a text, b text);
CREATE TABLE
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a, b);
INSERT INTO t VALUES('test', NUL CREATE UNIQUE INDEX ON m(a, b);
CREATE INDEX
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
REFRESH MATERIALIZED VIEW
postgres=# SELECT COUNT(*) FROM m;
DROP TABLE t CASCADE;
SELECT COUNT(*) FROM m; -- must be 2
count
-------
2
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# -- Test 4: unchanged (1,NULL), index on 'a' should SUCCEED.
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
REFRESH MATERIALIZED VIEW
postgres=# SELECT * FROM m; -- must still show (1,)
a | b
---+---
1 |
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# -- Test 5: (1,NULL)×2, separate index on a AND b, should ERROR
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# CREATE UNIQUE INDEX ON m(b);
CREATE INDEX
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must error
ERROR: duplicate key value violates unique constraint "m_a_idx"
DETAIL: Key (a)=(1) already exists.
CONTEXT: SQL statement "INSERT INTO public.m SELECT (diff.newdata).* FROM pg_temp_2.pg_temp_16535_2 diff WHERE tid IS NULL"
postgres=# DROP TABLE t CASCADE;
DROP TABLE
postgres=#

Made some minor changes to bug2 patch too.

Regards,
Surya Poondla

Attachment

pgsql-bugs by date:

From: Dmitriy Kuzmin
Date: 25 March, 17:30:38
Subject: Re: BUG #19437: temp_tablespaces doesn't work inside a cursor?

From: Andrei Lepikhov
Date: 26 March, 22:59:13
Subject: Re: BUG #19435: Error: "No relation entry for relid 2" Triggered by Complex Join with Self-Referencing Tables

Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY - Mailing list pgsql-bugs

Attachment

Previous

Next