Thread: BUG #16031: Group by returns duplicate groups
The following bug has been logged on the website:

Bug reference:      16031
Logged by:          David Raymond
Email address:      david.raymond@tomtom.com
PostgreSQL version: 11.5
Operating system:   Windows 10
Description:

I have a large table which I'm running a query on, grouping on a single field, and returning only that 1 field, filtered with a HAVING clause. The results that I'm getting back however contain a duplicate entry for one of the returned values, which I believe should never be able to happen. I'm working on dropping as many extra fields and records as I can to still get the bad result from something small enough to send as a test case, but figured I'd post this while I'm at it to get any advice.

I've done a dump of the table and then restored it to a new table, and still get the same weird results. On the original version of the table I had clustered it on the index that starts with the field being grouped on, but it's still giving the bad groups after a dump and restore.

I'm running 11.5 on Windows 10 (Enterprise DB installer) and unfortunately don't have the resources to build a new version myself if you come up with a patch (also why I'm hoping to shrink it down to where others can test it).

Here's output from psql:

testing=> select version();
                          version
------------------------------------------------------------
 PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit
(1 row)

Time: 0.272 ms
testing=> \d+ big_table
                                           Table "name_stuff.big_table"
   Column    |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
-------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 id_1        | uuid                   |           | not null |         | plain    |              |
 field_2     | uuid                   |           |          |         | plain    |              |
 name        | character varying(254) |           | not null |         | extended |              |
 field_4     | character varying(254) |           |          |         | extended |              |
 field_5     | numeric(2,0)           |           | not null |         | main     |              |
 field_6    | character varying(4)   |           |          |         | extended |              |
 field_7     | character varying(3)   |           | not null |         | extended |              |
 field_8     | character varying(3)   |           |          |         | extended |              |
 arr_field_1 | character varying(254) |           |          |         | extended |              |
 arr_field_2 | character varying(254) |           |          |         | extended |              |
 arr_field_3 | character varying(254) |           |          |         | extended |              |
 arr_field_4 | character varying(254) |           |          |         | extended |              |
 arr_field_5 | character varying(254) |           |          |         | extended |              |
 arr_field_6 | character varying(254) |           |          |         | extended |              |
 field_15    | boolean                |           |          |         | plain    |              |
 field_16    | boolean                |           |          |         | plain    |              |
 id_2        | text                   |           | not null |         | extended |              |
Indexes:
    "big_table_pkey" PRIMARY KEY, btree (id_1, id_2)
    "big_table_name_id_1_id_2_idx" btree (name, id_1, id_2)

testing=> select count(*) from big_table;
    count
-------------
 108,565,086
(1 row)

Time: 273770.205 ms (04:33.770)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) create table bad_groups_1 as select name from big_table group by name having count(distinct array[arr_field_1, arr_field_2, arr_field_3, arr_field_4, arr_field_5, arr_field_6]) > 1;
                                                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=29317810.81..30149735.19 rows=487417 width=22) (actual time=758501.326..895894.082 rows=745024 loops=1)
   Output: name
   Group Key: big_table.name
   Filter: (count(DISTINCT ARRAY[big_table.arr_field_1, big_table.arr_field_2, big_table.arr_field_3, big_table.arr_field_4, big_table.arr_field_5, big_table.arr_field_6]) > 1)
   Rows Removed by Filter: 80610652
   Buffers: shared hit=2325 read=1515081, temp read=2464481 written=2467410
   ->  Sort  (cost=29317810.81..29589026.23 rows=108486168 width=57) (actual time=758493.476..819035.136 rows=108565086 loops=1)
         Output: name, arr_field_1, arr_field_2, arr_field_3, arr_field_4, arr_field_5, arr_field_6
         Sort Key: big_table.name
         Sort Method: external merge  Disk: 4174488kB
         Buffers: shared hit=2317 read=1515081, temp read=2464481 written=2467410
         ->  Seq Scan on name_stuff.big_table  (cost=0.00..2602259.68 rows=108486168 width=57) (actual time=23.216..119113.708 rows=108565086 loops=1)
               Output: name, arr_field_1, arr_field_2, arr_field_3, arr_field_4, arr_field_5, arr_field_6
               Buffers: shared hit=2317 read=1515081
 Planning Time: 0.196 ms
 Execution Time: 897276.109 ms
(16 rows)

Time: 897285.808 ms (14:57.286)
testing=> \d+ bad_groups_1
                                  Table "name_stuff.bad_groups_1"
 Column |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
--------+------------------------+-----------+----------+---------+----------+--------------+-------------
 name   | character varying(254) |           |          |         | extended |              |

testing=> select count(*), count(distinct name) from bad_groups_1;
  count  |  count
---------+---------
 745,024 | 745,023
(1 row)

Time: 899.273 ms
testing=> select name from bad_groups_1 group by name having count(*) > 1;
 name
------
 DK
(1 row)

Time: 337.663 ms
testing=>
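A grouped result should never contain the same key twice, so the check at the end of the report can be expressed as a self-contained sanity test. A minimal sketch, assuming the same big_table; the grouped_names temp table name is illustrative, and the intermediate result is materialized on purpose because, as becomes clear later in the thread, whether the first grouping is materialized affects whether the duplicates are visible:

-- Materialize the grouped keys, then group them again; any row returned
-- here means the first GROUP BY emitted the same key more than once.
create temp table grouped_names as
  select name from big_table group by name;

select name, count(*) as appearances
  from grouped_names
 group by name
having count(*) > 1;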
PG Bug reporting form <noreply@postgresql.org> writes:
> I have a large table which I'm running a query on, grouping on a single
> field, and returning only that 1 field, filtered with a HAVING clause. The
> results that I'm getting back however contain a duplicate entry for one of
> the returned values, which I believe should never be able to happen.

That seems quite weird.  What LC_COLLATE/LC_CTYPE settings are you using?

			regards, tom lane
All the lc_* settings are en-US, and server_encoding is UTF8.
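Those settings can be confirmed from psql itself. A minimal sketch, assuming the database in question is named testing:

-- Session/cluster-level locale settings
show lc_collate;
show lc_ctype;
show server_encoding;

-- Per-database collation and ctype, as recorded when the database was created
select datname, datcollate, datctype, pg_encoding_to_char(encoding) as encoding
  from pg_database
 where datname = 'testing';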
It's looking like indexing may be part of the issue at the moment. When I added an array_agg(id_1) to the group by, one of the 'DK' groups had 9 records and the other had 16. I wrote a script to scan the text dump and counted 25 records with the "name" field value of 'DK'.

After the restore, which includes the indexes, "select count(*) from big_table where name = 'DK';" uses the index that starts with "name" and returns 9. Dropping that index and running it again returns 25. I re-created the index...

create index on big_table (name, id_1, id_2);

...and count(*) goes back to returning 9 again, and group by sees those 9 as one group and the other 16 as a different group.

Will get back to this sometime tomorrow as my brain has now melted.
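One way to compare what the index returns against what the heap actually contains is to force the planner away from the index for one of the two counts (Andrew uses the same enable_* toggles further down the thread). A minimal sketch, assuming the suspect index is still in place:

-- Count via whatever index the planner picks (likely big_table_name_id_1_id_2_idx)
select count(*) from big_table where name = 'DK';

-- Force a heap scan and count again; a mismatch points at the index, not the data
set enable_indexscan = off;
set enable_bitmapscan = off;
set enable_indexonlyscan = off;
select count(*) from big_table where name = 'DK';

reset enable_indexscan;
reset enable_bitmapscan;
reset enable_indexonlyscan;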
On Mon, Sep 30, 2019 at 2:49 PM David Raymond <David.Raymond@tomtom.com> wrote:
> I re-created the index...
> create index on big_table (name, id_1, id_2);
>
> ...and count(*) goes back to returning 9 again.
>
> and group by sees those 9 as one group and the other 16 as a different group.

You should try running contrib/amcheck, which should be able to confirm
index corruption and give you a specific complaint. You may then be able
to inspect the exact index page with the problem using contrib/pageinspect.

Something like this ought to do it on Postgres 11:

CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT bt_index_check('my_index', true);

If that doesn't show any errors, then perhaps try this:

SELECT bt_index_parent_check('my_index', true);

Let us know what you see.

-- 
Peter Geoghegan
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:

 David> Looking possibly like indexing is part of the issue at the
 David> moment.

Your original EXPLAIN didn't show any index scans being used...?

I can think of a possible explanation if there's some other value in the
big table which, due to some collation bug, is not consistently being
compared as < 'DK' or > 'DK'.

Unfortunately, we have two code paths for comparison, and one of them
can't easily be exercised directly from SQL, since it is only used for
sorts (and therefore also index builds). Can you try these queries:

set enable_indexscan=off;
set enable_bitmapscan=off;

select count(*) c0,
       count(*) filter (where bttextcmp(name,'DK') < 0) c1_lt,
       count(*) filter (where bttextcmp(name,'DK') = 0) c1_eq,
       count(*) filter (where bttextcmp(name,'DK') > 0) c1_gt,
       count(*) filter (where bttextcmp('DK',name) > 0) c2_lt,
       count(*) filter (where bttextcmp('DK',name) = 0) c2_eq,
       count(*) filter (where bttextcmp('DK',name) < 0) c2_gt
  from big_table;

with sd as (select name, row_number() over (order by name) rnum
              from big_table)
select name from sd
 where rnum >= (select min(rnum) from sd where name='DK')
   and rnum <= (select max(rnum) from sd where name='DK')
   and name <> 'DK';

-- 
Andrew (irc:RhodiumToad)
psql output for those two queries, along with explain output, pasted below. testing=> set enable_indexscan = off; SET Time: 0.265 ms testing=> set enable_bitmapscan = off; SET Time: 0.236 ms testing=> select count(*) c0, testing-> count(*) filter (where bttextcmp(name,'DK') < 0) c1_lt, testing-> count(*) filter (where bttextcmp(name,'DK') = 0) c1_eq, testing-> count(*) filter (where bttextcmp(name,'DK') > 0) c1_gt, testing-> count(*) filter (where bttextcmp('DK',name) > 0) c2_lt, testing-> count(*) filter (where bttextcmp('DK',name) = 0) c2_eq, testing-> count(*) filter (where bttextcmp('DK',name) < 0) c2_gt testing-> from big_table; c0 | c1_lt | c1_eq | c1_gt | c2_lt | c2_eq | c2_gt -------------+------------+-------+------------+------------+-------+------------ 108,565,086 | 27,900,023 | 25 | 80,665,038 | 27,900,023 | 25 | 80,665,038 (1 row) Time: 311710.895 ms (05:11.711) testing=> explain (analyze, verbose, costs, buffers, timing, summary) select count(*) c0, testing-> count(*) filter (where bttextcmp(name,'DK') < 0) c1_lt, testing-> count(*) filter (where bttextcmp(name,'DK') = 0) c1_eq, testing-> count(*) filter (where bttextcmp(name,'DK') > 0) c1_gt, testing-> count(*) filter (where bttextcmp('DK',name) > 0) c2_lt, testing-> count(*) filter (where bttextcmp('DK',name) = 0) c2_eq, testing-> count(*) filter (where bttextcmp('DK',name) < 0) c2_gt testing-> from big_table; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Finalize Aggregate (cost=4119436.80..4119436.81 rows=1 width=56) (actual time=315815.338..315815.338 rows=1 loops=1) Output: count(*), count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) < 0)), count(*) FILTER (WHERE (bttextcmp((name)::text,'DK'::text) = 0)), count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) > 0)), count(*) FILTER(WHERE (bttextcmp('DK'::text, (name)::text) > 0)), count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) = 0)),count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) < 0)) Buffers: shared hit=258 read=1517140 -> Gather (cost=4119436.55..4119436.76 rows=2 width=56) (actual time=315814.594..315834.986 rows=3 loops=1) Output: (PARTIAL count(*)), (PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) < 0))), (PARTIALcount(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) = 0))), (PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text,'DK'::text) > 0))), (PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) > 0))),(PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) = 0))), (PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text,(name)::text) < 0))) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=258 read=1517140 -> Partial Aggregate (cost=4118436.55..4118436.56 rows=1 width=56) (actual time=315571.326..315571.326 rows=1loops=3) Output: PARTIAL count(*), PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) < 0)), PARTIALcount(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) = 0)), PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text,'DK'::text) > 0)), PARTIAL count(*) FILTER (WHERE 
(bttextcmp('DK'::text, (name)::text) > 0)), PARTIALcount(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) = 0)), PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text,(name)::text) < 0)) Buffers: shared hit=258 read=1517140 Worker 0: actual time=315491.284..315491.284 rows=1 loops=1 Buffers: shared hit=80 read=324165 Worker 1: actual time=315411.542..315411.542 rows=1 loops=1 Buffers: shared hit=81 read=592546 -> Parallel Seq Scan on name_stuff.big_table (cost=0.00..1969752.53 rows=45235453 width=22) (actual time=8.834..245716.269rows=36188362 loops=3) Output: id_1, field_2, name, field_4, field_5, field_6, field_7, field_8, arr_field_1, arr_field_2,arr_field_3, arr_field_4, arr_field_5, arr_field_6, field_15, field_16, id_2 Buffers: shared hit=258 read=1517140 Worker 0: actual time=13.873..213354.668 rows=23076339 loops=1 Buffers: shared hit=80 read=324165 Worker 1: actual time=0.232..260652.945 rows=42461913 loops=1 Buffers: shared hit=81 read=592546 Planning Time: 1.746 ms Execution Time: 315835.098 ms (24 rows) Time: 315860.795 ms (05:15.861) testing=> with sd as (select name, row_number() over (order by name) rnum testing(> from big_table) testing-> select name from sd testing-> where rnum >= (select min(rnum) from sd where name='DK') testing-> and rnum <= (select max(rnum) from sd where name='DK') testing-> and name <> 'DK'; name ----------------------------------- Dk'bus Marine Dk's Auto's Dk's Bar & Grill Dk's Barbers & Stylist Dk's Beach Boutique Dk's Cabinets & Countertops Dk's Cleaning Service Dk's Clothing Dk's Communications Dk's Dancewear & Fitnesswear Dk's Dancewear Boutique Dk's Discount Dance & Fitnesswear DK's Drywall Service DK'S DUSTBUSTERS Dk's Family Five Star Trophies DK's Family Five Star Trophies Dk's Food Mart Dk'S Group Pte. Ltd. Dk's Hair Designs Dk's Hair Happenings Dk's Hair Supply Dk's Home Decor DK's Informática Dk's Janitorial DK's Liquors Dk's Market Dk's Moda Masculina Dk's Nails And Spa DK's Pawn Shop Dk's Pet Grooming DK's Quality Service DK's Restoration Dk's Sports Center Dk's Statuary Dk's Style Hut Dk's Temiskaming Shore Taxi Dk's Towing DK's Travel Dk'Style DK'Z Car Wash Dk- (41 rows) Time: 848889.096 ms (14:08.889) testing=> explain (analyze, verbose, costs, buffers, timing, summary) with sd as (select name, row_number() over (order byname) rnum testing(> from big_table) testing-> select name from sd testing-> where rnum >= (select min(rnum) from sd where name='DK') testing-> and rnum <= (select max(rnum) from sd where name='DK') testing-> and name <> 'DK'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on sd (cost=30560557.18..33546097.10 rows=540111 width=516) (actual time=794958.814..805397.008 rows=41 loops=1) Output: sd.name Filter: ((sd.rnum >= $1) AND (sd.rnum <= $2) AND ((sd.name)::text <> 'DK'::text)) Rows Removed by Filter: 108565045 Buffers: shared hit=386 read=1517012, temp read=3202642 written=2621799 CTE sd -> WindowAgg (cost=23772525.03..25672414.07 rows=108565088 width=30) (actual time=668354.380..753149.449 rows=108565086loops=1) Output: big_table.name, row_number() OVER (?) 
Buffers: shared hit=386 read=1517012, temp read=2035717 written=2038337 -> Sort (cost=23772525.03..24043937.75 rows=108565088 width=22) (actual time=668354.361..724059.209 rows=108565086loops=1) Output: big_table.name Sort Key: big_table.name Sort Method: external merge Disk: 3453888kB Buffers: shared hit=386 read=1517012, temp read=2035717 written=2038337 -> Seq Scan on name_stuff.big_table (cost=0.00..2603048.88 rows=108565088 width=22) (actual time=18.886..106731.751rows=108565086 loops=1) Output: big_table.name Buffers: shared hit=386 read=1517012 InitPlan 2 (returns $1) -> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=111513.923..111513.923 rows=1 loops=1) Output: min(sd_1.rnum) Buffers: temp read=431506 written=583461 -> CTE Scan on sd sd_1 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=28796.843..111513.916 rows=25loops=1) Output: sd_1.name, sd_1.rnum Filter: ((sd_1.name)::text = 'DK'::text) Rows Removed by Filter: 108565061 Buffers: temp read=431506 written=583461 InitPlan 3 (returns $2) -> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=12135.464..12135.464 rows=1 loops=1) Output: max(sd_2.rnum) Buffers: temp read=583462 -> CTE Scan on sd sd_2 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=2785.283..12135.457 rows=25loops=1) Output: sd_2.name, sd_2.rnum Filter: ((sd_2.name)::text = 'DK'::text) Rows Removed by Filter: 108565061 Buffers: temp read=583462 Planning Time: 0.159 ms Execution Time: 807095.184 ms (37 rows) Time: 807103.763 ms (13:27.104) testing=> -----Original Message----- From: Andrew Gierth <andrew@tao11.riddles.org.uk> Sent: Tuesday, October 1, 2019 1:41 AM To: David Raymond <David.Raymond@tomtom.com> Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-bugs@lists.postgresql.org Subject: Re: BUG #16031: Group by returns duplicate groups >>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes: David> Looking possibly like indexing is part of the issue at the David> moment. Your original EXPLAIN didn't show any index scans being used...? I can think of a possible explanation if there's some other value in the big table which, due to some collation bug, is not consistently being compared as < 'DK' or > 'DK'. Unfortunately, we have two code paths for comparison, and one of them can't easily be exercised directly from SQL, since it is only used for sorts (and therefore also index builds). Can you try these queries: set enable_indexscan=off; set enable_bitmapscan=off; select count(*) c0, count(*) filter (where bttextcmp(name,'DK') < 0) c1_lt, count(*) filter (where bttextcmp(name,'DK') = 0) c1_eq, count(*) filter (where bttextcmp(name,'DK') > 0) c1_gt, count(*) filter (where bttextcmp('DK',name) > 0) c2_lt, count(*) filter (where bttextcmp('DK',name) = 0) c2_eq, count(*) filter (where bttextcmp('DK',name) < 0) c2_gt from big_table; with sd as (select name, row_number() over (order by name) rnum from big_table) select name from sd where rnum >= (select min(rnum) from sd where name='DK') and rnum <= (select max(rnum) from sd where name='DK') and name <> 'DK'; -- Andrew (irc:RhodiumToad)
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes: David> c0 | c1_lt | c1_eq | c1_gt | c2_lt | c2_eq | c2_gt David> -------------+------------+-------+------------+------------+-------+------------ David> 108,565,086 | 27,900,023 | 25 | 80,665,038 | 27,900,023 | 25 | 80,665,038 David> (1 row) Well those counts look consistent to me. But this is all kinds of messed up: David> testing=> with sd as (select name, row_number() over (order by name) rnum David> testing(> from big_table) David> testing-> select name from sd David> testing-> where rnum >= (select min(rnum) from sd where name='DK') David> testing-> and rnum <= (select max(rnum) from sd where name='DK') David> testing-> and name <> 'DK'; David> name David> ----------------------------------- David> Dk'bus Marine David> Dk's Auto's David> Dk's Bar & Grill David> Dk's Barbers & Stylist David> Dk's Beach Boutique David> Dk's Cabinets & Countertops David> Dk's Cleaning Service David> Dk's Clothing David> Dk's Communications David> Dk's Dancewear & Fitnesswear David> Dk's Dancewear Boutique David> Dk's Discount Dance & Fitnesswear David> DK's Drywall Service David> DK'S DUSTBUSTERS David> Dk's Family Five Star Trophies David> DK's Family Five Star Trophies David> Dk's Food Mart David> Dk'S Group Pte. Ltd. David> Dk's Hair Designs David> Dk's Hair Happenings David> Dk's Hair Supply David> Dk's Home Decor David> DK's Informática David> Dk's Janitorial David> DK's Liquors David> Dk's Market David> Dk's Moda Masculina David> Dk's Nails And Spa David> DK's Pawn Shop David> Dk's Pet Grooming David> DK's Quality Service David> DK's Restoration David> Dk's Sports Center David> Dk's Statuary David> Dk's Style Hut David> Dk's Temiskaming Shore Taxi David> Dk's Towing David> DK's Travel David> Dk'Style David> DK'Z Car Wash David> Dk- David> (41 rows) Let's see some more data from that. Do this query: with sd as (select name, row_number() over (order by name) rnum from big_table) select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'), name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt from sd where rnum >= (select min(rnum) from sd where name='DK') and rnum <= (select max(rnum) from sd where name='DK'); -- Andrew (irc:RhodiumToad)
Here's what it's reporting for an error:

testing=# select bt_index_check('big_table_name_id_1_id_2_idx', true);
ERROR:  high key invariant violated for index "big_table_name_id_1_id_2_idx"
DETAIL:  Index tid=(86990,140) points to heap tid=(139313,57) page lsn=0/0.
Time: 65695.359 ms (01:05.695)

testing=# select bt_index_parent_check('big_table_name_id_1_id_2_idx', true);
ERROR:  down-link lower bound invariant violated for index "big_table_name_id_1_id_2_idx"
DETAIL:  Parent block=43131 child index tid=(43197,9) parent page lsn=0/0.
Time: 1697.205 ms (00:01.697)
testing=#
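Peter's earlier suggestion of contrib/pageinspect can be used to look at the offending leaf page directly. A minimal sketch, assuming the extension can be installed and using block 86990 from the bt_index_check error above; the columns selected are illustrative:

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Summary information for the suspect B-tree page
SELECT * FROM bt_page_stats('big_table_name_id_1_id_2_idx', 86990);

-- Individual index tuples on that page, including tid (86990,140) from the error
SELECT itemoffset, ctid, itemlen, data
  FROM bt_page_items('big_table_name_id_1_id_2_idx', 86990);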
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:

 David> Here's what it's reporting for an error:

 David> testing=# select bt_index_check('big_table_name_id_1_id_2_idx', true);
 David> ERROR:  high key invariant violated for index "big_table_name_id_1_id_2_idx"
 David> DETAIL:  Index tid=(86990,140) points to heap tid=(139313,57) page lsn=0/0.
 David> Time: 65695.359 ms (01:05.695)

 David> testing=# select bt_index_parent_check('big_table_name_id_1_id_2_idx', true);
 David> ERROR:  down-link lower bound invariant violated for index "big_table_name_id_1_id_2_idx"
 David> DETAIL:  Parent block=43131 child index tid=(43197,9) parent page lsn=0/0.
 David> Time: 1697.205 ms (00:01.697)
 David> testing=#

Based on the other response, it looks like something is fundamentally
broken with regard to sorting this dataset, and since btree index build
works by sorting, the index corruption is probably just another symptom
of the real problem rather than a cause in itself.

-- 
Andrew (irc:RhodiumToad)
"But this is all kinds of messed up" Sounds about right :) Output below. As asked earlier it's Windows 10, all the lc_* settings are "en-US", and the server encoding is UTF8. Throughout the table there are records with pretty much every notation alphabet. Latin, Cyrillic, Greek, Arabic, Hebrew,Japanese, Mandarin, etc. etc. Which "shouldn't" matter, but I figured I'd mention it. testing=> set enable_indexscan = off; SET Time: 0.536 ms testing=> set enable_bitmapscan = off; SET Time: 0.225 ms testing=> with sd as (select name, row_number() over (order by name) rnum testing(> from big_table) testing-> select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'), testing-> name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt testing-> from sd testing-> where rnum >= (select min(rnum) from sd where name='DK') testing-> and rnum <= (select max(rnum) from sd where name='DK'); rnum | name | encode | lt | eq | gt ------------+-----------------------------------+-----------------------------------+----+----+---- 27,900,023 | DK | DK | f | t | f 27,900,024 | DK | DK | f | t | f 27,900,025 | DK | DK | f | t | f 27,900,026 | DK | DK | f | t | f 27,900,027 | DK | DK | f | t | f 27,900,028 | DK | DK | f | t | f 27,900,029 | DK | DK | f | t | f 27,900,030 | DK | DK | f | t | f 27,900,031 | DK | DK | f | t | f 27,900,032 | Dk'bus Marine | Dk'bus Marine | f | f | t 27,900,033 | Dk's Auto's | Dk's Auto's | f | f | t 27,900,034 | Dk's Bar & Grill | Dk's Bar & Grill | f | f | t 27,900,035 | Dk's Barbers & Stylist | Dk's Barbers & Stylist | f | f | t 27,900,036 | Dk's Beach Boutique | Dk's Beach Boutique | f | f | t 27,900,037 | Dk's Cabinets & Countertops | Dk's Cabinets & Countertops | f | f | t 27,900,038 | Dk's Cleaning Service | Dk's Cleaning Service | f | f | t 27,900,039 | Dk's Clothing | Dk's Clothing | f | f | t 27,900,040 | Dk's Communications | Dk's Communications | f | f | t 27,900,041 | Dk's Dancewear & Fitnesswear | Dk's Dancewear & Fitnesswear | f | f | t 27,900,042 | Dk's Dancewear Boutique | Dk's Dancewear Boutique | f | f | t 27,900,043 | Dk's Discount Dance & Fitnesswear | Dk's Discount Dance & Fitnesswear | f | f | t 27,900,044 | DK's Drywall Service | DK's Drywall Service | f | f | t 27,900,045 | DK'S DUSTBUSTERS | DK'S DUSTBUSTERS | f | f | t 27,900,046 | Dk's Family Five Star Trophies | Dk's Family Five Star Trophies | f | f | t 27,900,047 | DK's Family Five Star Trophies | DK's Family Five Star Trophies | f | f | t 27,900,048 | Dk's Food Mart | Dk's Food Mart | f | f | t 27,900,049 | Dk'S Group Pte. Ltd. | Dk'S Group Pte. Ltd. 
| f | f | t 27,900,050 | Dk's Hair Designs | Dk's Hair Designs | f | f | t 27,900,051 | Dk's Hair Happenings | Dk's Hair Happenings | f | f | t 27,900,052 | Dk's Hair Supply | Dk's Hair Supply | f | f | t 27,900,053 | Dk's Home Decor | Dk's Home Decor | f | f | t 27,900,054 | DK's Informática | DK's Inform\303\241tica | f | f | t 27,900,055 | Dk's Janitorial | Dk's Janitorial | f | f | t 27,900,056 | DK's Liquors | DK's Liquors | f | f | t 27,900,057 | Dk's Market | Dk's Market | f | f | t 27,900,058 | Dk's Moda Masculina | Dk's Moda Masculina | f | f | t 27,900,059 | Dk's Nails And Spa | Dk's Nails And Spa | f | f | t 27,900,060 | DK's Pawn Shop | DK's Pawn Shop | f | f | t 27,900,061 | Dk's Pet Grooming | Dk's Pet Grooming | f | f | t 27,900,062 | DK's Quality Service | DK's Quality Service | f | f | t 27,900,063 | DK's Restoration | DK's Restoration | f | f | t 27,900,064 | Dk's Sports Center | Dk's Sports Center | f | f | t 27,900,065 | Dk's Statuary | Dk's Statuary | f | f | t 27,900,066 | Dk's Style Hut | Dk's Style Hut | f | f | t 27,900,067 | Dk's Temiskaming Shore Taxi | Dk's Temiskaming Shore Taxi | f | f | t 27,900,068 | Dk's Towing | Dk's Towing | f | f | t 27,900,069 | DK's Travel | DK's Travel | f | f | t 27,900,070 | Dk'Style | Dk'Style | f | f | t 27,900,071 | DK'Z Car Wash | DK'Z Car Wash | f | f | t 27,900,072 | Dk- | Dk- | t | f | f 27,900,073 | DK | DK | f | t | f 27,900,074 | DK | DK | f | t | f 27,900,075 | DK | DK | f | t | f 27,900,076 | DK | DK | f | t | f 27,900,077 | DK | DK | f | t | f 27,900,078 | DK | DK | f | t | f 27,900,079 | DK | DK | f | t | f 27,900,080 | DK | DK | f | t | f 27,900,081 | DK | DK | f | t | f 27,900,082 | DK | DK | f | t | f 27,900,083 | DK | DK | f | t | f 27,900,084 | DK | DK | f | t | f 27,900,085 | DK | DK | f | t | f 27,900,086 | DK | DK | f | t | f 27,900,087 | DK | DK | f | t | f 27,900,088 | DK | DK | f | t | f (66 rows) Time: 821796.036 ms (13:41.796) testing=> explain (analyze, verbose, costs, buffers, timing, summary) testing-> with sd as (select name, row_number() over (order by name) rnum testing(> from big_table) testing-> select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'), testing-> name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt testing-> from sd testing-> where rnum >= (select min(rnum) from sd where name='DK') testing-> and rnum <= (select max(rnum) from sd where name='DK'); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on sd (cost=30560557.18..33281469.69 rows=542825 width=559) (actual time=886056.646..898585.621 rows=66 loops=1) Output: sd.rnum, sd.name, encode(convert_to((sd.name)::text, 'SQL_ASCII'::name), 'escape'::text), ((sd.name)::text < 'DK'::text),((sd.name)::text = 'DK'::text), ((sd.name)::text > 'DK'::text) Filter: ((sd.rnum >= $1) AND (sd.rnum <= $2)) Rows Removed by Filter: 108565020 Buffers: shared hit=482 read=1516916, temp read=3200588 written=2619846 CTE sd -> WindowAgg (cost=23772525.03..25672414.07 rows=108565088 width=30) (actual time=719781.561..832442.189 rows=108565086loops=1) Output: big_table.name, row_number() OVER (?) 
Buffers: shared hit=482 read=1516916, temp read=2033663 written=2036384 -> Sort (cost=23772525.03..24043937.75 rows=108565088 width=22) (actual time=719781.549..793477.044 rows=108565086loops=1) Output: big_table.name Sort Key: big_table.name Sort Method: external merge Disk: 3453888kB Buffers: shared hit=482 read=1516916, temp read=2033663 written=2036384 -> Seq Scan on name_stuff.big_table (cost=0.00..2603048.88 rows=108565088 width=22) (actual time=0.010..105238.261rows=108565086 loops=1) Output: big_table.name Buffers: shared hit=482 read=1516916 InitPlan 2 (returns $1) -> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=148300.335..148300.336 rows=1 loops=1) Output: min(sd_1.rnum) Buffers: temp read=431489 written=583461 -> CTE Scan on sd sd_1 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=34105.882..148300.327 rows=25loops=1) Output: sd_1.name, sd_1.rnum Filter: ((sd_1.name)::text = 'DK'::text) Rows Removed by Filter: 108565061 Buffers: temp read=431489 written=583461 InitPlan 3 (returns $2) -> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=14707.032..14707.032 rows=1 loops=1) Output: max(sd_2.rnum) Buffers: temp read=583462 -> CTE Scan on sd sd_2 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=3729.712..14707.025 rows=25loops=1) Output: sd_2.name, sd_2.rnum Filter: ((sd_2.name)::text = 'DK'::text) Rows Removed by Filter: 108565061 Buffers: temp read=583462 Planning Time: 0.099 ms Execution Time: 899881.036 ms (37 rows) Time: 899900.240 ms (14:59.900) testing=> -----Original Message----- From: Andrew Gierth <andrew@tao11.riddles.org.uk> Sent: Tuesday, October 1, 2019 11:21 AM To: David Raymond <David.Raymond@tomtom.com> Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-bugs@lists.postgresql.org Subject: Re: BUG #16031: Group by returns duplicate groups But this is all kinds of messed up: ... Let's see some more data from that. Do this query: with sd as (select name, row_number() over (order by name) rnum from big_table) select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'), name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt from sd where rnum >= (select min(rnum) from sd where name='DK') and rnum <= (select max(rnum) from sd where name='DK'); -- Andrew (irc:RhodiumToad)
I checked for other places where there was weirdness going on, and there are more.

testing=> create temp table weird as with foo as (select name from big_table group by name) select name from foo group by name having count(*) > 1;
SELECT 23
Time: 700304.130 ms (11:40.304)

Hilariously enough, 'DK' isn't in these 23. Did initdb let me pick a locale that doesn't actually exist? I'm beginning to feel like I need an exorcist.
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:

 David> I checked for other places where there was weirdness going on
 David> and there are more.

You could try something like,

select name, lname
  from (select name, lag(name) over (order by name) as lname
          from big_table) s
 where name < lname;

That should show all cases where the sort order was inconsistent with
the < operator (which obviously should never happen).

However, there's not much more I can do to help with this, since I don't
use Windows myself and have no useful access to any Windows system. You
might try and cut down the data to the smallest set that shows
inconsistencies using the above; particularly relevant is whether the
problem only shows up for external merge sorts or whether it happens for
in-memory sorts too.

-- 
Andrew (irc:RhodiumToad)
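Whether the sort spills to disk can be controlled from the session by adjusting work_mem, so the in-memory versus external-merge question can be tested directly. A minimal sketch, assuming a cut-down copy of the table that fits in memory at the larger setting; the work_mem values are illustrative, and EXPLAIN's "Sort Method" line shows which path was taken:

-- Small work_mem: a large input should report "Sort Method: external merge"
set work_mem = '4MB';
explain (analyze, verbose)
select name, lname
  from (select name, lag(name) over (order by name) as lname
          from big_table) s
 where name < lname;

-- Large work_mem: the same sort should report "quicksort, Memory: ..." if it fits
set work_mem = '1GB';
explain (analyze, verbose)
select name, lname
  from (select name, lag(name) over (order by name) as lname
          from big_table) s
 where name < lname;

reset work_mem;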
As an update, I've currently got a dump that consistently shows weirdness when loaded. It's just the "name" field, has 1.3million records, is 15 MB zipped, and has things garbled enough that I don't mind sending it. How small does it need to be before it's good to send to someone? Output after re-loading the table: testing=> \d+ weird_grouping Table "name_stuff.weird_grouping" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+------------------------+-----------+----------+---------+----------+--------------+------------- name | character varying(254) | | not null | | extended | | testing=> select count(*), count(distinct name) from weird_grouping; count | count -----------+----------- 1,297,265 | 1,176,103 (1 row) Time: 7616.186 ms (00:07.616) testing=> explain (analyze, verbose, costs, buffers, timing, summary) select count(*), count(distinct name) from weird_grouping; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=10187.10..10187.11 rows=1 width=16) (actual time=7805.463..7805.463 rows=1 loops=1) Output: count(*), count(DISTINCT name) Buffers: shared hit=2080 read=6236, temp read=5484 written=5506 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..9563.40 rows=124740 width=516) (actual time=0.031..128.203 rows=1297265loops=1) Output: name Buffers: shared hit=2080 read=6236 Planning Time: 0.027 ms Execution Time: 7805.483 ms (8 rows) Time: 7805.822 ms (00:07.806) testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*)> 1; name ------ (0 rows) Time: 1950.401 ms (00:01.950) testing=> explain (analyze, verbose, costs, buffers, timing, summary) with foo as (select name from weird_grouping groupby name) select name from foo group by name having count(*) > 1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=9882.25..9884.75 rows=67 width=516) (actual time=1466.738..1466.738 rows=0 loops=1) Output: foo.name Group Key: foo.name Filter: (count(*) > 1) Rows Removed by Filter: 1176101 Buffers: shared hit=2144 read=6172, temp written=4533 CTE foo -> HashAggregate (cost=9875.25..9877.25 rows=200 width=516) (actual time=494.343..734.005 rows=1176101 loops=1) Output: weird_grouping.name Group Key: weird_grouping.name Buffers: shared hit=2144 read=6172 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..9563.40 rows=124740 width=516) (actual time=0.283..120.898rows=1297265 loops=1) Output: weird_grouping.name Buffers: shared hit=2144 read=6172 -> CTE Scan on foo (cost=0.00..4.00 rows=200 width=516) (actual time=494.346..1035.185 rows=1176101 loops=1) Output: foo.name Buffers: shared hit=2144 read=6172, temp written=4533 Planning Time: 0.464 ms Execution Time: 2230.238 ms (19 rows) Time: 2231.291 ms (00:02.231) testing=> analyze verbose weird_grouping; INFO: analyzing "name_stuff.weird_grouping" INFO: "weird_grouping": scanned 8316 of 8316 pages, containing 1297265 live rows and 0 dead rows; 30000 rows in sample,1297265 estimated total rows ANALYZE Time: 474.963 ms testing=> select count(*), count(distinct name) from weird_grouping; count | count -----------+----------- 1,297,265 | 1,176,103 (1 row) Time: 7449.983 ms (00:07.450) testing=> explain (analyze, verbose, costs, buffers, timing, summary) select count(*), count(distinct name) from 
weird_grouping; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=27774.98..27774.99 rows=1 width=16) (actual time=7754.578..7754.578 rows=1 loops=1) Output: count(*), count(DISTINCT name) Buffers: shared hit=2240 read=6076, temp read=5484 written=5506 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.036..102.144 rows=1297265loops=1) Output: name Buffers: shared hit=2240 read=6076 Planning Time: 0.031 ms Execution Time: 7754.598 ms (8 rows) Time: 7754.902 ms (00:07.755) testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*)> 1; name ------- DCT DELTA (2 rows) Time: 9561.382 ms (00:09.561) testing=> explain (analyze, verbose, costs, buffers, timing, summary) with foo as (select name from weird_grouping groupby name) select name from foo group by name having count(*) > 1; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=209783.95..209786.45 rows=67 width=516) (actual time=8591.210..8686.132 rows=2 loops=1) Output: foo.name Group Key: foo.name Filter: (count(*) > 1) Rows Removed by Filter: 1176099 Buffers: shared hit=2304 read=6012, temp read=7800 written=12363 CTE foo -> Group (cost=179613.72..186100.05 rows=947356 width=20) (actual time=6416.900..7842.634 rows=1176103 loops=1) Output: weird_grouping.name Group Key: weird_grouping.name Buffers: shared hit=2304 read=6012, temp read=7800 written=7830 -> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=6416.899..7665.158 rows=1297265 loops=1) Output: weird_grouping.name Sort Key: weird_grouping.name Sort Method: external merge Disk: 39048kB Buffers: shared hit=2304 read=6012, temp read=7800 written=7830 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.294..137.486rows=1297265 loops=1) Output: weird_grouping.name Buffers: shared hit=2304 read=6012 -> CTE Scan on foo (cost=0.00..18947.12 rows=947356 width=516) (actual time=6416.902..8105.771 rows=1176103 loops=1) Output: foo.name Buffers: shared hit=2304 read=6012, temp read=7800 written=12363 Planning Time: 0.258 ms Execution Time: 10305.891 ms (24 rows) Time: 10306.990 ms (00:10.307) testing=> -----Original Message----- From: Andrew Gierth <andrew@tao11.riddles.org.uk> Sent: Tuesday, October 1, 2019 3:27 PM To: David Raymond <David.Raymond@tomtom.com> Cc: pgsql-bugs@lists.postgresql.org Subject: Re: BUG #16031: Group by returns duplicate groups >>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes: David> I checked for other places where there was weirdness going on David> and there are more. You could try something like, select name, lname from (select name, lag(name) over (order by name) as lname from big_table) s where name < lname; That should show all cases where the sort order was inconsistent with the < operator (which obviously should never happen). However, there's not much more I can do to help with this, since I don't use Windows myself and have no useful access to any Windows system. You might try and cut down the data to the smallest set that shows inconsistencies using the above; particularly relevant is whether the problem only shows up for external merge sorts or whether it happens for in-memory sorts too. -- Andrew (irc:RhodiumToad)
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:

 David> As an update, I've currently got a dump that consistently shows
 David> weirdness when loaded. It's just the "name" field, has 1.3
 David> million records, is 15 MB zipped, and has things garbled enough
 David> that I don't mind sending it.

 David> How small does it need to be before it's good to send to
 David> someone?

That's small enough for me, though since I don't use Windows all I'll
be able to do is check if you're exposing some general PG bug. If not
I'll see if I can find someone to test on Windows.

-- 
Andrew (irc:RhodiumToad)
Downloaded and installed 12.0, created a nice shiny new cluster, and confirmed that it's still doing it. Now in 12 you haveto force it to materialize the CTE, which was why I had used a CTE in 11 in the first place. testing=> select version(); version ------------------------------------------------------------ PostgreSQL 12.0, compiled by Visual C++ build 1914, 64-bit (1 row) Time: 0.148 ms testing=> \d+ weird_grouping Table "name_stuff.weird_grouping" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+------------------------+-----------+----------+---------+----------+--------------+------------- name | character varying(254) | | not null | | extended | | Access method: heap testing=> analyze verbose weird_grouping; INFO: analyzing "name_stuff.weird_grouping" INFO: "weird_grouping": scanned 8316 of 8316 pages, containing 1297265 live rows and 0 dead rows; 30000 rows in sample,1297265 estimated total rows ANALYZE Time: 206.577 ms testing=> select count(*), count(distinct name) from weird_grouping; count | count -----------+----------- 1,297,265 | 1,176,103 (1 row) Time: 6729.011 ms (00:06.729) testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*)> 1; name ------ (0 rows) Time: 7289.128 ms (00:07.289) testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as (select name from weird_groupinggroup by name) select name from foo group by name having count(*) > 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- GroupAggregate (cost=179613.72..200268.04 rows=67 width=20) (actual time=6203.868..6203.868 rows=0 loops=1) Output: weird_grouping.name Group Key: weird_grouping.name Filter: (count(*) > 1) Rows Removed by Filter: 1176103 Buffers: shared hit=2464 read=5852, temp read=7800 written=7830 -> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4769.781..5985.111 rows=1176103 loops=1) Output: weird_grouping.name Group Key: weird_grouping.name Buffers: shared hit=2464 read=5852, temp read=7800 written=7830 -> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4769.779..5844.350 rows=1297265 loops=1) Output: weird_grouping.name Sort Key: weird_grouping.name Sort Method: external merge Disk: 39048kB Buffers: shared hit=2464 read=5852, temp read=7800 written=7830 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.059..102.772rows=1297265 loops=1) Output: weird_grouping.name Buffers: shared hit=2464 read=5852 Settings: search_path = 'name_stuff' Planning Time: 0.048 ms Execution Time: 7115.761 ms (21 rows) Time: 7116.170 ms (00:07.116) testing=> with foo as materialized (select name from weird_grouping group by name) select name from foo group by name havingcount(*) > 1; name ------- DCT DELTA (2 rows) Time: 8850.833 ms (00:08.851) testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as materialized (select name fromweird_grouping group by name) select name from foo group by name having count(*) > 1; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=209709.20..209711.70 rows=67 width=516) (actual time=6676.811..6768.094 rows=2 loops=1) Output: foo.name Group Key: foo.name Filter: (count(*) > 
1) Rows Removed by Filter: 1176099 Buffers: shared hit=2528 read=5788, temp read=7800 written=12363 CTE foo -> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4774.681..6004.725 rows=1176103 loops=1) Output: weird_grouping.name Group Key: weird_grouping.name Buffers: shared hit=2528 read=5788, temp read=7800 written=7830 -> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4774.678..5860.270 rows=1297265 loops=1) Output: weird_grouping.name Sort Key: weird_grouping.name Sort Method: external merge Disk: 39048kB Buffers: shared hit=2528 read=5788, temp read=7800 written=7830 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.065..101.141rows=1297265 loops=1) Output: weird_grouping.name Buffers: shared hit=2528 read=5788 -> CTE Scan on foo (cost=0.00..18887.32 rows=944366 width=516) (actual time=4774.683..6228.002 rows=1176103 loops=1) Output: foo.name Buffers: shared hit=2528 read=5788, temp read=7800 written=12363 Settings: search_path = 'name_stuff' Planning Time: 0.054 ms Execution Time: 8786.597 ms (25 rows) Time: 8787.011 ms (00:08.787) testing=> -----Original Message----- From: Andrew Gierth <andrew@tao11.riddles.org.uk> Sent: Sunday, October 6, 2019 10:29 AM To: David Raymond <David.Raymond@tomtom.com> Cc: pgsql-bugs@lists.postgresql.org Subject: Re: BUG #16031: Group by returns duplicate groups >>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes: David> As an update, I've currently got a dump that consistently shows David> weirdness when loaded. It's just the "name" field, has 1.3 David> million records, is 15 MB zipped, and has things garbled enough David> that I don't mind sending it. David> How small does it need to be before it's good to send to David> someone? That's small enough for me, though since I don't use Windows all I'll be able to do is check if you're exposing some general PG bug. If not I'll see if I can find someone to test on Windows. -- Andrew (irc:RhodiumToad)
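PostgreSQL 12 inlines a CTE that is referenced only once unless it is declared MATERIALIZED, which is why the two spellings above behave differently. A minimal sketch of the two forms against the same weird_grouping table:

-- Default in 12: the CTE is inlined, both grouping steps run over a single
-- sorted pass (the GroupAggregate-over-Group plan above), and no duplicates show up
with foo as (select name from weird_grouping group by name)
select name from foo group by name having count(*) > 1;

-- Forces the CTE to be evaluated as a separate step, matching the 11.x behaviour
-- that exposes the duplicate groups
with foo as materialized (select name from weird_grouping group by name)
select name from foo group by name having count(*) > 1;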
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:

 David> As an update, I've currently got a dump that consistently shows
 David> weirdness when loaded. It's just the "name" field, has 1.3
 David> million records, is 15 MB zipped, and has things garbled enough
 David> that I don't mind sending it. How small does it need to be
 David> before it's good to send to someone?

I got your file, and I put it up here if others want to test this:

http://www.rhodiumtoad.org.uk/junk/Bug_16031-BadGrouping.zip

Unfortunately, I can't reproduce any issue (as expected), and the
Windows guy who tested it for me also wasn't able to reproduce it. :-(

-- 
Andrew (irc:RhodiumToad)
Hi, On Mon, Oct 07, 2019 at 06:29:04PM +0000, David Raymond wrote: >Downloaded and installed 12.0, created a nice shiny new cluster, and >confirmed that it's still doing it. Now in 12 you have to force it to >materialize the CTE, which was why I had used a CTE in 11 in the first >place. > I've tried running this on a 12.0 cluster too (on Linux though), using the data set linked by Andrew, and I haven't observed any duplicate rows either. I do have some observations, though ... > >testing=> select version(); > version >------------------------------------------------------------ > PostgreSQL 12.0, compiled by Visual C++ build 1914, 64-bit >(1 row) > >Time: 0.148 ms >testing=> \d+ weird_grouping > Table "name_stuff.weird_grouping" > Column | Type | Collation | Nullable | Default | Storage | Stats target | Description >--------+------------------------+-----------+----------+---------+----------+--------------+------------- > name | character varying(254) | | not null | | extended | | >Access method: heap > >testing=> analyze verbose weird_grouping; >INFO: analyzing "name_stuff.weird_grouping" >INFO: "weird_grouping": scanned 8316 of 8316 pages, containing 1297265 live rows and 0 dead rows; 30000 rows in sample,1297265 estimated total rows >ANALYZE >Time: 206.577 ms >testing=> select count(*), count(distinct name) from weird_grouping; > count | count >-----------+----------- > 1,297,265 | 1,176,103 >(1 row) > I do get this: count | count ---------+--------- 1297265 | 1176101 (1 row) >Time: 6729.011 ms (00:06.729) >testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*)> 1; > name >------ >(0 rows) > >Time: 7289.128 ms (00:07.289) >testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as (select name from weird_groupinggroup by name) select name from foo group by name having count(*) > 1; > QUERY PLAN >-------------------------------------------------------------------------------------------------------------------------------------------------------- > GroupAggregate (cost=179613.72..200268.04 rows=67 width=20) (actual time=6203.868..6203.868 rows=0 loops=1) > Output: weird_grouping.name > Group Key: weird_grouping.name > Filter: (count(*) > 1) > Rows Removed by Filter: 1176103 > Buffers: shared hit=2464 read=5852, temp read=7800 written=7830 > -> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4769.781..5985.111 rows=1176103 loops=1) > Output: weird_grouping.name > Group Key: weird_grouping.name > Buffers: shared hit=2464 read=5852, temp read=7800 written=7830 > -> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4769.779..5844.350 rows=1297265 loops=1) > Output: weird_grouping.name > Sort Key: weird_grouping.name > Sort Method: external merge Disk: 39048kB > Buffers: shared hit=2464 read=5852, temp read=7800 written=7830 > -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.059..102.772rows=1297265 loops=1) > Output: weird_grouping.name > Buffers: shared hit=2464 read=5852 > Settings: search_path = 'name_stuff' > Planning Time: 0.048 ms > Execution Time: 7115.761 ms >(21 rows) > I'd like to point this may not be actually correct either. It does not produce any rows, i.e. it does not see any "duplicate groups", but it does this: -> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4769.781..5985.111 rows=1176103 loops=1) Notice that it has 1176103 groups, but is that actually correct? 
On my machine I only see 1176101 groups, and the difference (2) would match the DCL and DELTA duplicates with the next query. >Time: 7116.170 ms (00:07.116) >testing=> with foo as materialized (select name from weird_grouping group by name) select name from foo group by name havingcount(*) > 1; > name >------- > DCT > DELTA >(2 rows) > >Time: 8850.833 ms (00:08.851) >testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as materialized (select name fromweird_grouping group by name) select name from foo group by name having count(*) > 1; > QUERY PLAN >---------------------------------------------------------------------------------------------------------------------------------------------------------- > HashAggregate (cost=209709.20..209711.70 rows=67 width=516) (actual time=6676.811..6768.094 rows=2 loops=1) > Output: foo.name > Group Key: foo.name > Filter: (count(*) > 1) > Rows Removed by Filter: 1176099 > Buffers: shared hit=2528 read=5788, temp read=7800 written=12363 > CTE foo > -> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4774.681..6004.725 rows=1176103 loops=1) > Output: weird_grouping.name > Group Key: weird_grouping.name > Buffers: shared hit=2528 read=5788, temp read=7800 written=7830 > -> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4774.678..5860.270 rows=1297265 loops=1) > Output: weird_grouping.name > Sort Key: weird_grouping.name > Sort Method: external merge Disk: 39048kB > Buffers: shared hit=2528 read=5788, temp read=7800 written=7830 > -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.065..101.141rows=1297265 loops=1) > Output: weird_grouping.name > Buffers: shared hit=2528 read=5788 > -> CTE Scan on foo (cost=0.00..18887.32 rows=944366 width=516) (actual time=4774.683..6228.002 rows=1176103 loops=1) > Output: foo.name > Buffers: shared hit=2528 read=5788, temp read=7800 written=12363 > Settings: search_path = 'name_stuff' > Planning Time: 0.054 ms > Execution Time: 8786.597 ms >(25 rows) > >Time: 8787.011 ms (00:08.787) >testing=> > Can you try running this with enable_hashagg = off? That should give you another Sort on the CTE Scan, and a GroupAggregate at the top. I wonder if that makes the issue go away ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
" Can you try running this with enable_hashagg = off? That should give you another Sort on the CTE Scan, and a GroupAggregateat the top. I wonder if that makes the issue go away ..." Here's what I get for that. And extra weirdness below: testing=> set enable_hashagg = off; SET Time: 0.241 ms testing=> with foo as materialized (select name from weird_grouping group by name) select name from foo group by name havingcount(*) > 1; name ------ (0 rows) Time: 10423.486 ms (00:10.423) testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as materialized (select name fromweird_grouping group by name) select name from foo group by name having count(*) > 1; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- GroupAggregate (cost=737694.91..744780.15 rows=67 width=516) (actual time=7650.411..7650.411 rows=0 loops=1) Output: foo.name Group Key: foo.name Filter: (count(*) > 1) Rows Removed by Filter: 1176103 Buffers: shared hit=160 read=8156, temp read=14645 written=19235 CTE foo -> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4811.449..6027.355 rows=1176103 loops=1) Output: weird_grouping.name Group Key: weird_grouping.name Buffers: shared hit=160 read=8156, temp read=7800 written=7830 -> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4811.447..5884.667 rows=1297265 loops=1) Output: weird_grouping.name Sort Key: weird_grouping.name Sort Method: external merge Disk: 39048kB Buffers: shared hit=160 read=8156, temp read=7800 written=7830 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.058..117.833rows=1297265 loops=1) Output: weird_grouping.name Buffers: shared hit=160 read=8156 -> Sort (cost=551594.86..553955.78 rows=944366 width=516) (actual time=6915.562..7418.978 rows=1176103 loops=1) Output: foo.name Sort Key: foo.name Sort Method: external merge Disk: 36368kB Buffers: shared hit=160 read=8156, temp read=14645 written=19235 -> CTE Scan on foo (cost=0.00..18887.32 rows=944366 width=516) (actual time=4811.451..6243.160 rows=1176103 loops=1) Output: foo.name Buffers: shared hit=160 read=8156, temp read=7800 written=12363 Settings: enable_hashagg = 'off', search_path = 'name_stuff' Planning Time: 0.064 ms Execution Time: 10175.478 ms (30 rows) Time: 10175.906 ms (00:10.176) testing=> But now here's another weird bit: testing=> select count(*), count(distinct name) from weird_grouping; count | count -----------+----------- 1,297,265 | 1,176,103 (1 row) Time: 6866.369 ms (00:06.866) testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) select count(*), count(distinct name) fromweird_grouping; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=27774.98..27774.99 rows=1 width=16) (actual time=6642.856..6642.856 rows=1 loops=1) Output: count(*), count(DISTINCT name) Buffers: shared hit=928 read=7388, temp read=5484 written=5506 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.072..110.798 rows=1297265loops=1) Output: name Buffers: shared hit=928 read=7388 Settings: search_path = 'name_stuff' Planning Time: 0.030 ms Execution Time: 6642.875 ms (9 rows) Time: 6643.181 ms (00:06.643) testing=> select count(*), count(distinct name collate "C") from weird_grouping; count | count 
-----------+----------- 1,297,265 | 1,176,101 (1 row) Time: 1655.202 ms (00:01.655) testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) select count(*), count(distinct name collate"C") from weird_grouping; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=27774.98..27774.99 rows=1 width=16) (actual time=1788.276..1788.277 rows=1 loops=1) Output: count(*), count(DISTINCT (name)::character varying(254)) Buffers: shared hit=992 read=7324, temp read=5484 written=5506 -> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.059..112.815 rows=1297265loops=1) Output: name Buffers: shared hit=992 read=7324 Settings: search_path = 'name_stuff' Planning Time: 0.030 ms Execution Time: 1788.295 ms (9 rows) Time: 1788.596 ms (00:01.789) testing=> show lc_collate; lc_collate ------------ en-US (1 row) Time: 0.122 ms testing=> show server_encoding; server_encoding ----------------- UTF8 (1 row) Time: 0.082 ms testing=> select count(*), count(distinct name collate "en-US") from weird_grouping; ERROR: collation "en-US" for encoding "UTF8" does not exist LINE 1: select count(*), count(distinct name collate "en-US") from w... ^ Time: 5.759 ms testing=>
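The final error suggests checking which collation names the server actually knows about, since the Windows locale name reported by show lc_collate ('en-US') is not itself a usable COLLATE identifier under UTF8. A minimal sketch, assuming pg_collation was populated as usual at initdb time:

-- List collations the server knows for English locales; collencoding = -1
-- means the collation works with any server encoding
select collname, collcollate, collctype, collencoding
  from pg_collation
 where collname ilike 'en%' or collcollate ilike 'en%';

-- The "C" collation bypasses the OS locale entirely, which is why
-- count(distinct name collate "C") above gave the consistent answer
select count(*), count(distinct name collate "C") from weird_grouping;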