Thread: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

UNION removes almost all rows (not duplicates) - in fresh build of pg17!

From
hubert depesz lubaczewski
Date:
So, test case is trivial:

#v+
$ select count(*) from pg_class where relkind = 'r';
 count
───────
    68
(1 row)

$ select count(*) from pg_class where relkind = 'i';
 count
───────
   164
(1 row)

$ select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i'
);
 count
───────
     1
(1 row)
#v-

explain shows unexpected:

#v+
$ explain (analyze) select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class
whererelkind = 'i' );
 
                                                              QUERY PLAN

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Aggregate  (cost=42.43..42.44 rows=1 width=8) (actual time=0.108..0.109 rows=1 loops=1)
   ->  Unique  (cost=0.00..39.53 rows=232 width=236) (actual time=0.013..0.105 rows=1 loops=1)
         ->  Append  (cost=0.00..39.53 rows=232 width=236) (actual time=0.012..0.099 rows=232 loops=1)
               ->  Seq Scan on pg_class  (cost=0.00..19.19 rows=68 width=263) (actual time=0.012..0.058 rows=68
loops=1)
                     Filter: (relkind = 'r'::"char")
                     Rows Removed by Filter: 348
               ->  Seq Scan on pg_class pg_class_1  (cost=0.00..19.19 rows=164 width=263) (actual time=0.002..0.030
rows=164loops=1)
 
                     Filter: (relkind = 'i'::"char")
                     Rows Removed by Filter: 252
 Planning Time: 0.161 ms
 Execution Time: 0.129 ms
(11 rows)
#v-

It seems that I get 232 correct rows from Append, but then Unique
removes all of them, except for one?

For whatever it's worth returned row is:

#v+
$ select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' \gx
─[ RECORD 1 ]───────┬───────────────────────
oid                 │ 2619
relname             │ pg_statistic
relnamespace        │ 11
reltype             │ 10029
reloftype           │ 0
relowner            │ 10
relam               │ 2
relfilenode         │ 2619
reltablespace       │ 0
relpages            │ 19
reltuples           │ 410
relallvisible       │ 19
reltoastrelid       │ 2840
relhasindex         │ t
relisshared         │ f
relpersistence      │ p
relkind             │ r
relnatts            │ 31
relchecks           │ 0
relhasrules         │ f
relhastriggers      │ f
relhassubclass      │ f
relrowsecurity      │ f
relforcerowsecurity │ f
relispopulated      │ t
relreplident        │ n
relispartition      │ f
relrewrite          │ 0
relfrozenxid        │ 730
relminmxid          │ 1
relacl              │ {pgdba=arwdDxtm/pgdba}
reloptions          │ [null]
relpartbound        │ [null]
#v-

As for test environment:

Debian testing on amd64, self-compiled from git HEAD at d2a04470aa6401c1938cc107e0b2c56c22a2321f

Did I do something wrong?

Best regards,

depesz




hubert depesz lubaczewski <depesz@depesz.com> writes:
> $ select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i'
);
>  count
> ───────
>      1
> (1 row)

Ugh, that is awful.  Bisecting shows it broke at

commit 66c0185a3d14bbbf51d0fc9d267093ffec735231
Author: David Rowley <drowley@postgresql.org>
Date:   Mon Mar 25 14:31:14 2024 +1300

    Allow planner to use Merge Append to efficiently implement UNION

I've not looked at the patch, but somehow it's totally forgetting
what it's supposed to be unique-ifying on.

            regards, tom lane



On Tue, 21 May 2024 at 04:00, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> $ explain (analyze) select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class
whererelkind = 'i' );
 
>                                                               QUERY PLAN
>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>  Aggregate  (cost=42.43..42.44 rows=1 width=8) (actual time=0.108..0.109 rows=1 loops=1)
>    ->  Unique  (cost=0.00..39.53 rows=232 width=236) (actual time=0.013..0.105 rows=1 loops=1)
>          ->  Append  (cost=0.00..39.53 rows=232 width=236) (actual time=0.012..0.099 rows=232 loops=1)
>                ->  Seq Scan on pg_class  (cost=0.00..19.19 rows=68 width=263) (actual time=0.012..0.058 rows=68
loops=1)
>                      Filter: (relkind = 'r'::"char")
>                      Rows Removed by Filter: 348
>                ->  Seq Scan on pg_class pg_class_1  (cost=0.00..19.19 rows=164 width=263) (actual time=0.002..0.030
rows=164loops=1)
 
>                      Filter: (relkind = 'i'::"char")
>                      Rows Removed by Filter: 252

Thanks for the report.

It looks like it's a very simple fix.  The problem is I wasn't setting
groupList when the grouping_is_sortable() returned false.  In the
prior version, make_union_unique() always set that.

The attached should apply cleanly up to d2a04470a.

David

Attachment

Re: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

From
hubert depesz lubaczewski
Date:
On Tue, May 21, 2024 at 09:54:25AM +1200, David Rowley wrote:
> Thanks for the report.
> It looks like it's a very simple fix.  The problem is I wasn't setting
> groupList when the grouping_is_sortable() returned false.  In the
> prior version, make_union_unique() always set that.
> The attached should apply cleanly up to d2a04470a.

Hi,
tested, and can confirm that applying this patch on top of
d2a04470aa6401c1938cc107e0b2c56c22a2321f solved the problem.

Best regards,

depesz