Re: Hash join on int takes 8..114 seconds - Mailing list pgsql-performance

From Andrus
Subject Re: Hash join on int takes 8..114 seconds
Date
Msg-id 75A918DFCF25470FAF96B3FE3D25A38B@andrusnotebook
Whole thread Raw
In response to Re: Hash join on int takes 8..114 seconds  (PFC <lists@peufeu.com>)
Responses Re: Hash join on int takes 8..114 seconds
Re: Hash join on int takes 8..114 seconds
Re: Hash join on int takes 8..114 seconds
List pgsql-performance
PFC,

thank you.

> OK so vmstat says you are IO-bound, this seems logical if the same plan
> has widely varying timings...
>
> Let's look at the usual suspects :
>
> - how many dead rows in your tables ? are your tables data, or bloat ?
> (check vacuum verbose, etc)

set search_path to firma2,public;
vacuum verbose dok; vacuum verbose rid

INFO:  vacuuming "firma2.dok"
INFO:  index "dok_pkey" now contains 1235086 row versions in 9454 pages
DETAIL:  100 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.16s/0.38u sec elapsed 0.77 sec.
INFO:  index "dok_dokumnr_idx" now contains 1235086 row versions in 9454
pages
DETAIL:  100 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.14s/0.40u sec elapsed 0.78 sec.
INFO:  index "dok_klient_idx" now contains 1235086 row versions in 18147
pages
DETAIL:  887 index row versions were removed.
3265 index pages have been deleted, 3033 are currently reusable.
CPU 0.36s/0.46u sec elapsed 31.87 sec.
INFO:  index "dok_krdokumnr_idx" now contains 1235086 row versions in 11387
pages
DETAIL:  119436 index row versions were removed.
1716 index pages have been deleted, 1582 are currently reusable.
CPU 0.47s/0.55u sec elapsed 63.38 sec.
INFO:  index "dok_kuupaev_idx" now contains 1235101 row versions in 10766
pages
DETAIL:  119436 index row versions were removed.
659 index pages have been deleted, 625 are currently reusable.
CPU 0.62s/0.53u sec elapsed 40.20 sec.
INFO:  index "dok_tasudok_idx" now contains 1235104 row versions in 31348
pages
DETAIL:  119436 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.18s/1.08u sec elapsed 118.97 sec.
INFO:  index "dok_tasudok_unique_idx" now contains 99 row versions in 97
pages
DETAIL:  98 index row versions were removed.
80 index pages have been deleted, 80 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.48 sec.
INFO:  index "dok_tasumata_idx" now contains 1235116 row versions in 11663
pages
DETAIL:  119436 index row versions were removed.
5340 index pages have been deleted, 5131 are currently reusable.
CPU 0.43s/0.56u sec elapsed 53.96 sec.
INFO:  index "dok_tellimus_idx" now contains 1235122 row versions in 11442
pages
DETAIL:  119436 index row versions were removed.
1704 index pages have been deleted, 1569 are currently reusable.
CPU 0.45s/0.59u sec elapsed 76.91 sec.
INFO:  index "dok_yksus_pattern_idx" now contains 1235143 row versions in
5549 pages
DETAIL:  119436 index row versions were removed.
529 index pages have been deleted, 129 are currently reusable.
CPU 0.19s/0.46u sec elapsed 2.72 sec.
INFO:  index "dok_doktyyp" now contains 1235143 row versions in 3899 pages
DETAIL:  119436 index row versions were removed.
188 index pages have been deleted, 13 are currently reusable.
CPU 0.14s/0.44u sec elapsed 1.40 sec.
INFO:  index "dok_sihtyksus_pattern_idx" now contains 1235143 row versions
in 5353 pages
DETAIL:  119436 index row versions were removed.
286 index pages have been deleted, 5 are currently reusable.
CPU 0.13s/0.45u sec elapsed 3.01 sec.
INFO:  "dok": removed 119436 row versions in 13707 pages
DETAIL:  CPU 0.80s/0.37u sec elapsed 14.15 sec.
INFO:  "dok": found 119436 removable, 1235085 nonremovable row versions in
171641 pages
DETAIL:  2 dead row versions cannot be removed yet.
There were 1834279 unused item pointers.
0 pages are entirely empty.
CPU 6.56s/6.88u sec elapsed 450.54 sec.
INFO:  vacuuming "pg_toast.pg_toast_40595"
INFO:  index "pg_toast_40595_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_40595": found 0 removable, 0 nonremovable row versions in 0
pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "firma2.rid"
INFO:  index "rid_pkey" now contains 3275197 row versions in 13959 pages
DETAIL:  38331 index row versions were removed.
262 index pages have been deleted, 262 are currently reusable.
CPU 0.42s/1.05u sec elapsed 58.56 sec.
INFO:  index "rid_dokumnr_idx" now contains 3275200 row versions in 14125
pages
DETAIL:  38331 index row versions were removed.
572 index pages have been deleted, 571 are currently reusable.
CPU 0.49s/1.14u sec elapsed 71.57 sec.
INFO:  index "rid_inpdokumnr_idx" now contains 3275200 row versions in 15103
pages
DETAIL:  38331 index row versions were removed.
579 index pages have been deleted, 579 are currently reusable.
CPU 0.66s/1.03u sec elapsed 68.38 sec.
INFO:  index "rid_toode_idx" now contains 3275224 row versions in 31094
pages
DETAIL:  38331 index row versions were removed.
2290 index pages have been deleted, 2290 are currently reusable.
CPU 1.39s/1.58u sec elapsed 333.82 sec.
INFO:  index "rid_rtellimus_idx" now contains 3275230 row versions in 7390
pages
DETAIL:  18591 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.18s/0.66u sec elapsed 1.78 sec.
INFO:  index "rid_toode_pattern_idx" now contains 3275230 row versions in
16310 pages
DETAIL:  17800 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.44s/1.04u sec elapsed 6.55 sec.
INFO:  "rid": removed 38331 row versions in 3090 pages
DETAIL:  CPU 0.20s/0.10u sec elapsed 5.49 sec.
INFO:  "rid": found 38331 removable, 3275189 nonremovable row versions in
165282 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 1878923 unused item pointers.
0 pages are entirely empty.
CPU 5.06s/7.27u sec elapsed 607.59 sec.

Query returned successfully with no result in 1058319 ms.

> - what's the size of the dataset relative to the RAM ?

Db size is 7417 MB
relevant table sizes in desc by size order:

      1        40595 dok                                 2345 MB
      2         1214 pg_shdepend                         2259 MB
      3        40926 rid                                 2057 MB
      6         1232 pg_shdepend_depender_index          795 MB
      7         1233 pg_shdepend_reference_index         438 MB
      8        44286 dok_tasudok_idx                     245 MB
      9        44299 rid_toode_idx                       243 MB
     10        44283 dok_klient_idx                      142 MB
     11     19103791 rid_toode_pattern_idx               127 MB
     14        44298 rid_inpdokumnr_idx                  118 MB
     15        44297 rid_dokumnr_idx                     110 MB
     16        43573 rid_pkey                            109 MB
     18        44288 dok_tasumata_idx                    91 MB
     19        44289 dok_tellimus_idx                    89 MB
     20        44284 dok_krdokumnr_idx                   89 MB
     21        44285 dok_kuupaev_idx                     84 MB
     23        43479 dok_pkey                            74 MB
     24        44282 dok_dokumnr_idx                     74 MB
     25     19076304 rid_rtellimus_idx                   58 MB
     26     18663923 dok_yksus_pattern_idx               43 MB
     27     18801591 dok_sihtyksus_pattern_idx           42 MB
     29     18774881 dok_doktyyp                         30 MB
     46        40967 toode                               13 MB

server is HP Proliant DL320 G3
http://h18000.www1.hp.com/products/quickspecs/12169_ca/12169_ca.HTML
CPU is 2.93Ghz Celeron 256kb cache.

Server has 2 GB RAM.
It has SATA RAID 0,1 integrated controller (1.5Gbps) and SAMSUNG HD160JJ
mirrored disks.

> Now let's look more closely at the query :
>
> explain analyze
> SELECT sum(1)
>   FROM dok JOIN rid USING (dokumnr)
>   JOIN toode USING (toode)
>   LEFT JOIN artliik using(grupp,liik)
>   WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'
>
>
> I presume doing the query without artliik changes nothing to the runtime,
> yes ?

Yes. After removing artkliik from join I got response times 65 and 50
seconds, so this does not make difference.

> Your problem here is that, no matter what, postgres will have to examine
> - all rows where dok.kuupaev>='2008-09-01',
> - and all rows where rid.toode = 'X05'.
> If you use dok.kuupaev>='2007-09-01' (note : 2007) it will probably have
> to scan many, many more rows.

Probably yes, since then it reads one year more sales data.

> If you perform this query often you could CLUSTER rid on (toode) and dok
> on (kuupaev), but this can screw other queries.

Some reports are by sales date (dok.kuupaev) and customers.
CLUSTER rid on (toode) slows them down. Also autovacuum cannot do
clustering.

> What is the meaning of the columns ?

This is typical sales data:

-- Receipt headers:
DOK ( dokumnr  INT SERIAL PRIMARY KEY,
  kuupaev DATE --- sales date
)

-- Receipt details
RID ( dokumnr INT,
         toode CHAR(20),  -- item code
 CONSTRAINT rid_dokumnr_fkey FOREIGN KEY (dokumnr)   REFERENCES dok
(dokumnr),
  CONSTRAINT rid_toode_fkey FOREIGN KEY (toode)
      REFERENCES firma2.toode (toode)
)

-- Products
TOODE (
  toode CHAR(20) PRIMARY KEY
)

> To make this type of query faster I would tend to think about :

> - denormalization (ie adding a column in one of your tables and a
> multicolumn index)

For this query it is possible to duplicate kuupaev column to rid table.
However most of the this seems to go to scanning rid table, so I suspect
that this will help.

> - materialized views
> - materialized summary tables (ie. summary of sales for last month, for
> instance)

There are about 1000 items and reports are different.

Andrus.


pgsql-performance by date:

Previous
From: "Andrus"
Date:
Subject: Re: Hash join on int takes 8..114 seconds
Next
From: tv@fuzzy.cz
Date:
Subject: Re: Hash join on int takes 8..114 seconds