Home > mailing lists

Fwd: BUG #13908: Query returns too few rows - Mailing list pgsql-bugs

From	David G. Johnston
Subject	Fwd: BUG #13908: Query returns too few rows
Date	February 4, 2016 21:58:22
Msg-id	CAKFQuwY8MGr1kKz1DzfdERnVxL-aQo=L3hHY5ytjZFj3TNAxoA@mail.gmail.com Whole thread
In response to	BUG #13908: Query returns too few rows (seth-p@outlook.com)
Responses	Re: BUG #13908: Query returns too few rows
List	pgsql-bugs

Tree view

Adding back the list so that everyone can see the latest finding.  Adding
in the original bug block too...
=E2=80=8B

The following bug has been logged on the website:

Bug reference:      13908
Logged by:          Seth
Email address:      seth-p@outlook.com
PostgreSQL version: 9.5.0
Operating system:   Windows 10
Description:

*=E2=80=8BNext request, Seth*: please restore your dump into a clean databa=
se and
see what results you get.  Also, please run:
SELECT version(); and provide the results.=E2=80=8B

=E2=80=8BMy findings
=E2=80=8B- there is=E2=80=8B
 a smaller count for
=E2=80=8Bthe =E2=80=8B
DISTINCT than without.=E2=80=8B

=E2=80=8B
I successfully restored your pg_dump file=E2=80=8B, vacuum analyzed it, the=
n ran
the two counting queries. Here are my results.


The DISTINCT plan has an actual count of 415,874 while the non-DISTINCT
plan resulted in 415,967; *thus DISTINCT removed 93 duplicates.*

Looking back at your most recent email your numbers are 415,983 and
416,009; an increase of 26 by adding DISTINCT...and not matching either of
these numbers.

*version*
PostgreSQL 9.5.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit

=E2=80=8BInstalled into the default postgres database which has an encoding=
 of
en_US.UTF-8

(NB: had to use pg_database as I couldn't located a psql meta-command to
list databases...am I missing something here?)
=E2=80=8B
The follow are the results of my EXPLAIN ANALYZE confirmed by running the
actual count queries.

QUERY PLAN
Aggregate  (cost=3D1464939.22..1464939.23 rows=3D1 width=3D0) (actual
time=3D151609.418..151609.419 rows=3D1 loops=3D1)
*  ->  Hash Join  (cost=3D1454885.92..1464269.54 rows=3D267870 width=3D0) (=
actual
time=3D150567.030..151188.740 rows=3D415967 loops=3D1)*
        Hash Cond: ((uuu.bababa)::text =3D (rrr.bababa)::text)
        ->  Index Only Scan using pk_uuu on uuu  (cost=3D0.42..270.01
rows=3D8091 width=3D8) (actual time=3D0.019..9.337 rows=3D7993 loops=3D1)
              Index Cond: (universe_hash =3D '5188205190738336870'::bigint)
              Heap Fetches: 0
        ->  Hash  (cost=3D1440572.59..1440572.59 rows=3D872393 width=3D8) (=
actual
time=3D150566.397..150566.397 rows=3D1275138 loops=3D1)
              Buckets: 131072 (originally 131072)  Batches: 32 (originally
16)  Memory Usage: 3073kB
              ->  Hash Join  (cost=3D819.47..1440572.59 rows=3D872393 width=
=3D8)
(actual time=3D11865.529..149038.533 rows=3D1275138 loops=3D1)
                    Hash Cond: (rrr.fff_idx =3D fff.idx)
                    ->  Seq Scan on rrr  (cost=3D0.00..1164409.32
rows=3D71098632 width=3D12) (actual time=3D0.006..73136.366 rows=3D71098547=
 loops=3D1)
                    ->  Hash  (cost=3D807.58..807.58 rows=3D951 width=3D4)
(actual time=3D3.458..3.458 rows=3D964 loops=3D1)
                          Buckets: 1024  Batches: 1  Memory Usage: 42kB
                          ->  Bitmap Heap Scan on fff  (cost=3D30.98..807.5=
8
rows=3D951 width=3D4) (actual time=3D0.133..1.843 rows=3D964 loops=3D1)
                                Recheck Cond: ((file_name_date >=3D
'2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
                                Filter: ((file_name_model_ver)::text =3D
'100'::text)
                                Rows Removed by Filter: 540
                                Heap Blocks: exact=3D30
                                ->  Bitmap Index Scan on
ix_fff_file_name_date  (cost=3D0.00..30.74 rows=3D1445 width=3D0) (actual
time=3D0.117..0.117 rows=3D1504 loops=3D1)
                                      Index Cond: ((file_name_date >=3D
'2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
Planning time: 1.727 ms
Execution time: 151609.494 ms

QUERY PLAN
Aggregate  (cost=3D1504689.02..1504689.03 rows=3D1 width=3D0) (actual
time=3D155253.758..155253.759 rows=3D1 loops=3D1)
*  ->  Unique  (cost=3D1497322.60..1501340.65 rows=3D267870 width=3D24) (ac=
tual
time=3D153375.583..154829.461 rows=3D415874 loops=3D1)*
        ->  Sort  (cost=3D1497322.60..1497992.27 rows=3D267870 width=3D24)
(actual time=3D153375.578..153919.245 rows=3D415874 loops=3D1)
              Sort Key: rrr.idx, rrr.row_number, rrr.bababa, rrr.data_date,
rrr.fff_idx
              Sort Method: external merge  Disk: 13784kB
              ->  Hash Join  (cost=3D1456589.92..1467677.54 rows=3D267870
width=3D24) (actual time=3D151982.289..152665.099 rows=3D415874 loops=3D1)
                    Hash Cond: ((uuu.bababa)::text =3D (rrr.bababa)::text)
                    ->  Index Only Scan using pk_uuu on uuu
 (cost=3D0.42..270.01 rows=3D8091 width=3D8) (actual time=3D0.012..9.359 ro=
ws=3D7993
loops=3D1)
                          Index Cond: (universe_hash =3D
'5188205190738336870'::bigint)
                          Heap Fetches: 0
                    ->  Hash  (cost=3D1440572.59..1440572.59 rows=3D872393
width=3D24) (actual time=3D151981.919..151981.919 rows=3D1275138 loops=3D1)
                          Buckets: 65536 (originally 65536)  Batches: 32
(originally 16)  Memory Usage: 3585kB
                          ->  Hash Join  (cost=3D819.47..1440572.59
rows=3D872393 width=3D24) (actual time=3D12037.549..150408.756 rows=3D12751=
38
loops=3D1)
                                Hash Cond: (rrr.fff_idx =3D fff.idx)
                                ->  Seq Scan on rrr  (cost=3D0.00..1164409.=
32
rows=3D71098632 width=3D24) (actual time=3D0.004..73673.708 rows=3D71098547=
 loops=3D1)
                                ->  Hash  (cost=3D807.58..807.58 rows=3D951
width=3D4) (actual time=3D2.360..2.360 rows=3D964 loops=3D1)
                                      Buckets: 1024  Batches: 1  Memory
Usage: 42kB
                                      ->  Bitmap Heap Scan on fff
 (cost=3D30.98..807.58 rows=3D951 width=3D4) (actual time=3D0.123..1.293 ro=
ws=3D964
loops=3D1)
                                            Recheck Cond: ((file_name_date
>=3D '2005-03-01'::date) AND (file_name_date <=3D '2005-07-30'::date))
                                            Filter:
((file_name_model_ver)::text =3D '100'::text)
                                            Rows Removed by Filter: 540
                                            Heap Blocks: exact=3D30
                                            ->  Bitmap Index Scan on
ix_fff_file_name_date  (cost=3D0.00..30.74 rows=3D1445 width=3D0) (actual
time=3D0.111..0.111 rows=3D1504 loops=3D1)
                                                  Index Cond:
((file_name_date >=3D '2005-03-01'::date) AND (file_name_date <=3D
'2005-07-30'::date))
Planning time: 0.668 ms
Execution time: 155255.933 ms

pgsql-bugs by date:

From: Andres Freund
Date: 04 February 2016, 21:15:39
Subject: Re: BUG #13891: Deparsed arbiter WHERE clauses cannot be parsed by Postgres

From: "David G. Johnston"
Date: 04 February 2016, 22:13:05
Subject: Re: BUG #13908: Query returns too few rows

Fwd: BUG #13908: Query returns too few rows - Mailing list pgsql-bugs

Previous

Next