Re: BUG #13908: Query returns too few rows - Mailing list pgsql-bugs

From David G. Johnston
Subject Re: BUG #13908: Query returns too few rows
Date
Msg-id CAKFQuwZcE4x=2ZvFpiuTagdwLACuTtTVk_NxZed3NUOKe7ziQQ@mail.gmail.com
Whole thread Raw
In response to Fwd: BUG #13908: Query returns too few rows  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: BUG #13908: Query returns too few rows  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: BUG #13908: Query returns too few rows  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-bugs
Tom, we really need you to chime in here.

On Thu, Feb 4, 2016 at 2:58 PM, David G. Johnston <
david.g.johnston@gmail.com> wrote:

> Adding back the list so that everyone can see the latest finding.  Adding
> in the original bug block too...
> =E2=80=8B
>
> The following bug has been logged on the website:
>
> Bug reference:      13908
> Logged by:          Seth
> Email address:      seth-p@outlook.com
> PostgreSQL version: 9.5.0
> Operating system:   Windows 10
> Description:
>
>
> *  ->  Unique  (cost=3D1497322.60..1501340.65 rows=3D267870 width=3D24) (=
actual
> time=3D153375.583..154829.461 rows=3D415874 loops=3D1)*
>
>

=E2=80=8BWithout running any DML I just got this result on the DISTINCT que=
ry...=E2=80=8B

*=E2=80=8B  ->  Unique  (cost=3D1519634.64..1520973.99 rows=3D200 width=3D4=
8) (actual
time=3D161695.425..163174.422 rows=3D416075 loops=3D1)*
=E2=80=8B
There is nothing in the SQL=E2=80=8B itself that would invoke an order depe=
ndency...

The query, the explain analyze on the first pass and the explain analyze on
the second pass.

*The second-level Hash Join combines/sees, in both cases, 7,993 and
1,275,138 records but depending on the LEFT/RIGHT order of the sub-nodes
appears to provide a different result.*

EXPLAIN ANALYZE
select count(*) from
(
SELECT DISTINCT
    rrr
FROM public.rrr
INNER JOIN public.fff
    ON fff.idx =3D rrr.fff_idx
INNER JOIN public.uuu
    ON uuu.universe_hash =3D 5188205190738336870 AND
           uuu.bababa =3D rrr.bababa
WHERE
    fff.file_name_model_ver =3D '100' AND
    fff.file_name_date BETWEEN '2005-03-01' AND '2005-07-30'
) ad

QUERY PLAN
Aggregate  (cost=3D1504689.02..1504689.03 rows=3D1 width=3D0) (actual
time=3D155253.758..155253.759 rows=3D1 loops=3D1)
  ->  Unique  (cost=3D1497322.60..1501340.65 rows=3D267870 width=3D24) *(ac=
tual
time=3D153375.583..154829.461 rows=3D415874 loops=3D1)*
        ->  Sort  (cost=3D1497322.60..1497992.27 rows=3D267870 width=3D24)
(actual time=3D153375.578..153919.245 rows=3D415874 loops=3D1)
              Sort Key: rrr.idx, rrr.row_number, rrr.bababa, rrr.data_date,
rrr.fff_idx
              Sort Method: external merge  Disk: 13784kB
              ->  Hash Join  (cost=3D1456589.92..1467677.54 rows=3D267870
width=3D24) (actual time=3D151982.289..152665.099 rows=3D415874 loops=3D1)
                    Hash Cond: ((uuu.bababa)::text =3D (rrr.bababa)::text)
                    ->  Index Only Scan using pk_uuu on uuu
 (cost=3D0.42..270.01 rows=3D8091 width=3D8) (actual time=3D0.012..9.359 ro=
ws=3D*7993*
loops=3D1)
                          Index Cond: (universe_hash =3D
'5188205190738336870'::bigint)
                          Heap Fetches: 0
                    ->  Hash  (cost=3D1440572.59..1440572.59 rows=3D872393
width=3D24) (actual time=3D151981.919..151981.919 rows=3D*1275138* loops=3D=
1)
                          Buckets: 65536 (originally 65536)  Batches: 32
(originally 16)  Memory Usage: 3585kB
                          ->  Hash Join  (cost=3D819.47..1440572.59
rows=3D872393 width=3D24) (actual time=3D12037.549..150408.756 rows=3D12751=
38
loops=3D1)
                                Hash Cond: (rrr.fff_idx =3D fff.idx)
                                ->  Seq Scan on rrr  (cost=3D0.00..1164409.=
32
rows=3D71098632 width=3D24) (actual time=3D0.004..73673.708 rows=3D71098547=
 loops=3D1)
                                ->  Hash  (cost=3D807.58..807.58 rows=3D951
width=3D4) (actual time=3D2.360..2.360 rows=3D964 loops=3D1)
                                      Buckets: 1024  Batches: 1  Memory
Usage: 42kB
                                      ->  Bitmap Heap Scan on fff
 (cost=3D30.98..807.58 rows=3D951 width=3D4) (actual time=3D0.123..1.293 ro=
ws=3D964
loops=3D1)
                                            Recheck Cond: ((file_name_date
>=3D '2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
                                            Filter:
((file_name_model_ver)::text =3D '100'::text)
                                            Rows Removed by Filter: 540
                                            Heap Blocks: exact=3D30
                                            ->  Bitmap Index Scan on
ix_fff_file_name_date  (cost=3D0.00..30.74 rows=3D1445 width=3D0) (actual
time=3D0.111..0.111 rows=3D1504 loops=3D1)
                                                  Index Cond:
((file_name_date >=3D '2005-03-01'::date) AND (file_name_date <=3D
'2005-07-30'::date))
Planning time: 0.668 ms
Execution time: 155255.933 ms

QUERY PLAN
Aggregate  (cost=3D1520976.49..1520976.50 rows=3D1 width=3D0) (actual
time=3D163603.573..163603.574 rows=3D1 loops=3D1)
  ->  Unique  (cost=3D1519634.64..1520973.99 rows=3D200 width=3D48) *(actua=
l
time=3D161695.425..163174.422 rows=3D416075 loops=3D1)*
        ->  Sort  (cost=3D1519634.64..1520304.32 rows=3D267870 width=3D48)
(actual time=3D161695.393..162157.673 rows=3D416075 loops=3D1)
              Sort Key: rrr.*
              Sort Method: external sort  Disk: 22376kB
              ->  Hash Join  (cost=3D1190.62..1487242.09 rows=3D267870
width=3D48) (actual time=3D12556.229..158035.711 rows=3D416075 loops=3D1)
                    Hash Cond: ((rrr.bababa)::text =3D (uuu.bababa)::text)
                    ->  Hash Join  (cost=3D819.47..1440572.59 rows=3D872393
width=3D56) (actual time=3D12538.212..156077.489 rows=3D*1275138 *loops=3D1=
)
                          Hash Cond: (rrr.fff_idx =3D fff.idx)
                          ->  Seq Scan on rrr  (cost=3D0.00..1164409.32
rows=3D71098632 width=3D60) (actual time=3D0.008..82185.329 rows=3D71098547=
 loops=3D1)
                          ->  Hash  (cost=3D807.58..807.58 rows=3D951 width=
=3D4)
(actual time=3D2.351..2.351 rows=3D964 loops=3D1)
                                Buckets: 1024  Batches: 1  Memory Usage:
42kB
                                ->  Bitmap Heap Scan on fff
 (cost=3D30.98..807.58 rows=3D951 width=3D4) (actual time=3D0.097..1.307 ro=
ws=3D964
loops=3D1)
                                      Recheck Cond: ((file_name_date >=3D
'2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
                                      Filter: ((file_name_model_ver)::text
=3D '100'::text)
                                      Rows Removed by Filter: 540
                                      Heap Blocks: exact=3D30
                                      ->  Bitmap Index Scan on
ix_fff_file_name_date  (cost=3D0.00..30.74 rows=3D1445 width=3D0) (actual
time=3D0.086..0.086 rows=3D1504 loops=3D1)
                                            Index Cond: ((file_name_date >=
=3D
'2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
                    ->  Hash  (cost=3D270.01..270.01 rows=3D8091 width=3D8)
(actual time=3D17.233..17.233 rows=3D7993 loops=3D1)
                          Buckets: 8192  Batches: 1  Memory Usage: 377kB
                          ->  Index Only Scan using pk_uuu on uuu
 (cost=3D0.42..270.01 rows=3D8091 width=3D8) (actual time=3D0.012..8.525 ro=
ws=3D*7993
*loops=3D1)
                                Index Cond: (universe_hash =3D
'5188205190738336870'::bigint)
                                Heap Fetches: 0
Planning time: 0.642 ms
Execution time: 163606.571 ms

pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Fwd: BUG #13908: Query returns too few rows
Next
From: "David G. Johnston"
Date:
Subject: Re: BUG #13908: Query returns too few rows