Fwd: BUG #13908: Query returns too few rows - Mailing list pgsql-bugs
| From | David G. Johnston |
|---|---|
| Subject | Fwd: BUG #13908: Query returns too few rows |
| Date | |
| Msg-id | CAKFQuwY8MGr1kKz1DzfdERnVxL-aQo=L3hHY5ytjZFj3TNAxoA@mail.gmail.com Whole thread |
| In response to | BUG #13908: Query returns too few rows (seth-p@outlook.com) |
| Responses |
Re: BUG #13908: Query returns too few rows
|
| List | pgsql-bugs |
Adding back the list so that everyone can see the latest finding. Adding
in the original bug block too...
=E2=80=8B
The following bug has been logged on the website:
Bug reference: 13908
Logged by: Seth
Email address: seth-p@outlook.com
PostgreSQL version: 9.5.0
Operating system: Windows 10
Description:
*=E2=80=8BNext request, Seth*: please restore your dump into a clean databa=
se and
see what results you get. Also, please run:
SELECT version(); and provide the results.=E2=80=8B
=E2=80=8BMy findings
=E2=80=8B- there is=E2=80=8B
a smaller count for
=E2=80=8Bthe =E2=80=8B
DISTINCT than without.=E2=80=8B
=E2=80=8B
I successfully restored your pg_dump file=E2=80=8B, vacuum analyzed it, the=
n ran
the two counting queries. Here are my results.
The DISTINCT plan has an actual count of 415,874 while the non-DISTINCT
plan resulted in 415,967; *thus DISTINCT removed 93 duplicates.*
Looking back at your most recent email your numbers are 415,983 and
416,009; an increase of 26 by adding DISTINCT...and not matching either of
these numbers.
*version*
PostgreSQL 9.5.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit
=E2=80=8BInstalled into the default postgres database which has an encoding=
of
en_US.UTF-8
(NB: had to use pg_database as I couldn't located a psql meta-command to
list databases...am I missing something here?)
=E2=80=8B
The follow are the results of my EXPLAIN ANALYZE confirmed by running the
actual count queries.
QUERY PLAN
Aggregate (cost=3D1464939.22..1464939.23 rows=3D1 width=3D0) (actual
time=3D151609.418..151609.419 rows=3D1 loops=3D1)
* -> Hash Join (cost=3D1454885.92..1464269.54 rows=3D267870 width=3D0) (=
actual
time=3D150567.030..151188.740 rows=3D415967 loops=3D1)*
Hash Cond: ((uuu.bababa)::text =3D (rrr.bababa)::text)
-> Index Only Scan using pk_uuu on uuu (cost=3D0.42..270.01
rows=3D8091 width=3D8) (actual time=3D0.019..9.337 rows=3D7993 loops=3D1)
Index Cond: (universe_hash =3D '5188205190738336870'::bigint)
Heap Fetches: 0
-> Hash (cost=3D1440572.59..1440572.59 rows=3D872393 width=3D8) (=
actual
time=3D150566.397..150566.397 rows=3D1275138 loops=3D1)
Buckets: 131072 (originally 131072) Batches: 32 (originally
16) Memory Usage: 3073kB
-> Hash Join (cost=3D819.47..1440572.59 rows=3D872393 width=
=3D8)
(actual time=3D11865.529..149038.533 rows=3D1275138 loops=3D1)
Hash Cond: (rrr.fff_idx =3D fff.idx)
-> Seq Scan on rrr (cost=3D0.00..1164409.32
rows=3D71098632 width=3D12) (actual time=3D0.006..73136.366 rows=3D71098547=
loops=3D1)
-> Hash (cost=3D807.58..807.58 rows=3D951 width=3D4)
(actual time=3D3.458..3.458 rows=3D964 loops=3D1)
Buckets: 1024 Batches: 1 Memory Usage: 42kB
-> Bitmap Heap Scan on fff (cost=3D30.98..807.5=
8
rows=3D951 width=3D4) (actual time=3D0.133..1.843 rows=3D964 loops=3D1)
Recheck Cond: ((file_name_date >=3D
'2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
Filter: ((file_name_model_ver)::text =3D
'100'::text)
Rows Removed by Filter: 540
Heap Blocks: exact=3D30
-> Bitmap Index Scan on
ix_fff_file_name_date (cost=3D0.00..30.74 rows=3D1445 width=3D0) (actual
time=3D0.117..0.117 rows=3D1504 loops=3D1)
Index Cond: ((file_name_date >=3D
'2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
Planning time: 1.727 ms
Execution time: 151609.494 ms
QUERY PLAN
Aggregate (cost=3D1504689.02..1504689.03 rows=3D1 width=3D0) (actual
time=3D155253.758..155253.759 rows=3D1 loops=3D1)
* -> Unique (cost=3D1497322.60..1501340.65 rows=3D267870 width=3D24) (ac=
tual
time=3D153375.583..154829.461 rows=3D415874 loops=3D1)*
-> Sort (cost=3D1497322.60..1497992.27 rows=3D267870 width=3D24)
(actual time=3D153375.578..153919.245 rows=3D415874 loops=3D1)
Sort Key: rrr.idx, rrr.row_number, rrr.bababa, rrr.data_date,
rrr.fff_idx
Sort Method: external merge Disk: 13784kB
-> Hash Join (cost=3D1456589.92..1467677.54 rows=3D267870
width=3D24) (actual time=3D151982.289..152665.099 rows=3D415874 loops=3D1)
Hash Cond: ((uuu.bababa)::text =3D (rrr.bababa)::text)
-> Index Only Scan using pk_uuu on uuu
(cost=3D0.42..270.01 rows=3D8091 width=3D8) (actual time=3D0.012..9.359 ro=
ws=3D7993
loops=3D1)
Index Cond: (universe_hash =3D
'5188205190738336870'::bigint)
Heap Fetches: 0
-> Hash (cost=3D1440572.59..1440572.59 rows=3D872393
width=3D24) (actual time=3D151981.919..151981.919 rows=3D1275138 loops=3D1)
Buckets: 65536 (originally 65536) Batches: 32
(originally 16) Memory Usage: 3585kB
-> Hash Join (cost=3D819.47..1440572.59
rows=3D872393 width=3D24) (actual time=3D12037.549..150408.756 rows=3D12751=
38
loops=3D1)
Hash Cond: (rrr.fff_idx =3D fff.idx)
-> Seq Scan on rrr (cost=3D0.00..1164409.=
32
rows=3D71098632 width=3D24) (actual time=3D0.004..73673.708 rows=3D71098547=
loops=3D1)
-> Hash (cost=3D807.58..807.58 rows=3D951
width=3D4) (actual time=3D2.360..2.360 rows=3D964 loops=3D1)
Buckets: 1024 Batches: 1 Memory
Usage: 42kB
-> Bitmap Heap Scan on fff
(cost=3D30.98..807.58 rows=3D951 width=3D4) (actual time=3D0.123..1.293 ro=
ws=3D964
loops=3D1)
Recheck Cond: ((file_name_date
>=3D '2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
Filter:
((file_name_model_ver)::text =3D '100'::text)
Rows Removed by Filter: 540
Heap Blocks: exact=3D30
-> Bitmap Index Scan on
ix_fff_file_name_date (cost=3D0.00..30.74 rows=3D1445 width=3D0) (actual
time=3D0.111..0.111 rows=3D1504 loops=3D1)
Index Cond:
((file_name_date >=3D '2005-03-01'::date) AND (file_name_date <=3D
'2005-07-30'::date))
Planning time: 0.668 ms
Execution time: 155255.933 ms
pgsql-bugs by date: