Re: BUG #16031: Group by returns duplicate groups - Mailing list pgsql-bugs

From Tomas Vondra
Subject Re: BUG #16031: Group by returns duplicate groups
Date
Msg-id 20191007205341.koxjl5etfoqg4y7l@development
Whole thread Raw
In response to RE: BUG #16031: Group by returns duplicate groups  (David Raymond <David.Raymond@tomtom.com>)
Responses RE: BUG #16031: Group by returns duplicate groups
List pgsql-bugs
Hi,

On Mon, Oct 07, 2019 at 06:29:04PM +0000, David Raymond wrote:
>Downloaded and installed 12.0, created a nice shiny new cluster, and
>confirmed that it's still doing it. Now in 12 you have to force it to
>materialize the CTE, which was why I had used a CTE in 11 in the first
>place.
>

I've tried running this on a 12.0 cluster too (on Linux though), using
the data set linked by Andrew, and I haven't observed any duplicate rows
either. I do have some observations, though ...

>
>testing=> select version();
>                          version
>------------------------------------------------------------
> PostgreSQL 12.0, compiled by Visual C++ build 1914, 64-bit
>(1 row)
>
>Time: 0.148 ms
>testing=> \d+ weird_grouping
>                                    Table "name_stuff.weird_grouping"
> Column |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
>--------+------------------------+-----------+----------+---------+----------+--------------+-------------
> name   | character varying(254) |           | not null |         | extended |              |
>Access method: heap
>
>testing=> analyze verbose weird_grouping;
>INFO:  analyzing "name_stuff.weird_grouping"
>INFO:  "weird_grouping": scanned 8316 of 8316 pages, containing 1297265 live rows and 0 dead rows; 30000 rows in
sample,1297265 estimated total rows
 
>ANALYZE
>Time: 206.577 ms
>testing=> select count(*), count(distinct name) from weird_grouping;
>   count   |   count
>-----------+-----------
> 1,297,265 | 1,176,103
>(1 row)
>

I do get this:

  count  |  count
---------+---------
 1297265 | 1176101
(1 row)

>Time: 6729.011 ms (00:06.729)
>testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having
count(*)> 1;
 
> name
>------
>(0 rows)
>
>Time: 7289.128 ms (00:07.289)
>testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as (select name from
weird_groupinggroup by name) select name from foo group by name having count(*) > 1;
 
>                                                                       QUERY PLAN

>--------------------------------------------------------------------------------------------------------------------------------------------------------
> GroupAggregate  (cost=179613.72..200268.04 rows=67 width=20) (actual time=6203.868..6203.868 rows=0 loops=1)
>   Output: weird_grouping.name
>   Group Key: weird_grouping.name
>   Filter: (count(*) > 1)
>   Rows Removed by Filter: 1176103
>   Buffers: shared hit=2464 read=5852, temp read=7800 written=7830
>   ->  Group  (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4769.781..5985.111 rows=1176103 loops=1)
>         Output: weird_grouping.name
>         Group Key: weird_grouping.name
>         Buffers: shared hit=2464 read=5852, temp read=7800 written=7830
>         ->  Sort  (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4769.779..5844.350 rows=1297265
loops=1)
>               Output: weird_grouping.name
>               Sort Key: weird_grouping.name
>               Sort Method: external merge  Disk: 39048kB
>               Buffers: shared hit=2464 read=5852, temp read=7800 written=7830
>               ->  Seq Scan on name_stuff.weird_grouping  (cost=0.00..21288.65 rows=1297265 width=20) (actual
time=0.059..102.772rows=1297265 loops=1)
 
>                     Output: weird_grouping.name
>                     Buffers: shared hit=2464 read=5852
> Settings: search_path = 'name_stuff'
> Planning Time: 0.048 ms
> Execution Time: 7115.761 ms
>(21 rows)
>

I'd like to point this may not be actually correct either. It does not
produce any rows, i.e. it does not see any "duplicate groups", but it
does this:

  ->  Group  (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4769.781..5985.111 rows=1176103 loops=1)

Notice that it has 1176103 groups, but is that actually correct? On my
machine I only see 1176101 groups, and the difference (2) would match
the DCL and DELTA duplicates with the next query.


>Time: 7116.170 ms (00:07.116)
>testing=> with foo as materialized (select name from weird_grouping group by name) select name from foo group by name
havingcount(*) > 1;
 
> name
>-------
> DCT
> DELTA
>(2 rows)
>
>Time: 8850.833 ms (00:08.851)
>testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as materialized (select name
fromweird_grouping group by name) select name from foo group by name having count(*) > 1;
 
>                                                                        QUERY PLAN

>----------------------------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate  (cost=209709.20..209711.70 rows=67 width=516) (actual time=6676.811..6768.094 rows=2 loops=1)
>   Output: foo.name
>   Group Key: foo.name
>   Filter: (count(*) > 1)
>   Rows Removed by Filter: 1176099
>   Buffers: shared hit=2528 read=5788, temp read=7800 written=12363
>   CTE foo
>     ->  Group  (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4774.681..6004.725 rows=1176103
loops=1)
>           Output: weird_grouping.name
>           Group Key: weird_grouping.name
>           Buffers: shared hit=2528 read=5788, temp read=7800 written=7830
>           ->  Sort  (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4774.678..5860.270 rows=1297265
loops=1)
>                 Output: weird_grouping.name
>                 Sort Key: weird_grouping.name
>                 Sort Method: external merge  Disk: 39048kB
>                 Buffers: shared hit=2528 read=5788, temp read=7800 written=7830
>                 ->  Seq Scan on name_stuff.weird_grouping  (cost=0.00..21288.65 rows=1297265 width=20) (actual
time=0.065..101.141rows=1297265 loops=1)
 
>                       Output: weird_grouping.name
>                       Buffers: shared hit=2528 read=5788
>   ->  CTE Scan on foo  (cost=0.00..18887.32 rows=944366 width=516) (actual time=4774.683..6228.002 rows=1176103
loops=1)
>         Output: foo.name
>         Buffers: shared hit=2528 read=5788, temp read=7800 written=12363
> Settings: search_path = 'name_stuff'
> Planning Time: 0.054 ms
> Execution Time: 8786.597 ms
>(25 rows)
>
>Time: 8787.011 ms (00:08.787)
>testing=>
>

Can you try running this with enable_hashagg = off? That should give you
another Sort on the CTE Scan, and a GroupAggregate at the top. I wonder
if that makes the issue go away ...


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-bugs by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: BUG #16043: Duplicate Entry Key Message on postgres
Next
From: Michael Paquier
Date:
Subject: Re: BUG #16039: PANIC when activating replication slots in Postgres12.0 64bit under Windows