Thread: Workaround for cross column stats dependency

Workaround for cross column stats dependency

From
"Guillaume Smet"
Date:
Hi -performance,

While testing 8.3, I found this query which is equally slow on 8.1 and
8.3 and seems to be really slow for a not so complex query. The stats
are as good as possible and the behaviour of PostgreSQL seems to be
logical considering the stats but I'm looking for a workaround to
speed up this query.

So here is the original query:
cityvox_prod=# EXPLAIN ANALYZE SELECT vq.codequar, vq.liblong, vq.libcourt
    FROM lieu l, vilquartier vq, rubtylieu rtl, genrelieu gl, lieugelieu lgl
    WHERE l.codequar = vq.codequar AND l.dfinvalidlieu is null AND
vq.codevil = 'MUC' AND lgl.numlieu = l.numlieu AND lgl.codegelieu =
gl.codegelieu
    AND gl.codetylieu = rtl.codetylieu AND rtl.codeth = 'RES' -- the
interesting part is here
    GROUP BY vq.codequar, vq.liblong, vq.libcourt, vq.flagintramuros
ORDER BY vq.flagintramuros, vq.liblong;

              QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=2773.18..2773.25 rows=26 width=43) (actual
time=602.822..602.829 rows=13 loops=1)
   Sort Key: vq.flagintramuros, vq.liblong
   Sort Method:  quicksort  Memory: 26kB
   ->  HashAggregate  (cost=2772.31..2772.57 rows=26 width=43) (actual
time=602.769..602.778 rows=13 loops=1)
         ->  Hash Join  (cost=2737.48..2769.83 rows=248 width=43)
(actual time=601.999..602.580 rows=167 loops=1)
               Hash Cond: ((vq.codequar)::text = (l.codequar)::text)
               ->  Bitmap Heap Scan on vilquartier vq
(cost=1.91..27.50 rows=47 width=43) (actual time=0.032..0.067 rows=47
loops=1)
                     Recheck Cond: ((codevil)::text = 'MUC'::text)
                     ->  Bitmap Index Scan on idx_vilquartier_codevil
(cost=0.00..1.90 rows=47 width=0) (actual time=0.023..0.023 rows=47
loops=1)
                           Index Cond: ((codevil)::text = 'MUC'::text)
               ->  Hash  (cost=2646.25..2646.25 rows=7145 width=5)
(actual time=601.955..601.955 rows=62526 loops=1)
                     ->  Nested Loop  (cost=0.00..2646.25 rows=*7145*
width=5) (actual time=0.058..548.618 rows=*62526* loops=1)
                           ->  Nested Loop  (cost=0.00..349.71
rows=7232 width=4) (actual time=0.049..147.221 rows=66292 loops=1)
                                 ->  Nested Loop  (cost=0.00..8.59
rows=13 width=4) (actual time=0.027..0.254 rows=88 loops=1)
                                       ->  Seq Scan on rubtylieu rtl
(cost=0.00..2.74 rows=1 width=4) (actual time=0.013..0.026 rows=1
loops=1)
                                             Filter: ((codeth)::text =
'RES'::text)
                                       ->  Index Scan using
ind_genrelieu2 on genrelieu gl  (cost=0.00..5.68 rows=*14* width=8)
(actual time=0.013..0.119 rows=*88* loops=1)
                                             Index Cond:
((gl.codetylieu)::text = (rtl.codetylieu)::text)
                                 ->  Index Scan using
idx_lieugelieu_codegelieu on lieugelieu lgl  (cost=0.00..18.33
rows=633 width=8) (actual time=0.014..0.802 rows=753 loops=88)
                                       Index Cond:
((lgl.codegelieu)::text = (gl.codegelieu)::text)
                           ->  Index Scan using pk_lieu on lieu l
(cost=0.00..0.31 rows=1 width=9) (actual time=0.003..0.004 rows=1
loops=66292)
                                 Index Cond: (l.numlieu = lgl.numlieu)
                                 Filter: (l.dfinvalidlieu IS NULL)
 Total runtime: 602.930 ms

The query is looking for parts of the city where we can find
restaurants. The problem of this query is that we have several
categories (here genrelieu) for a given type (here rubtylieu) and we
have several types for a given theme (here the theme is codeth =
'RES'). When the value of the theme is RES (restaurants), we have only
1 type (it's also RES). The fact is that there are a lot of rows for
the value RES in genrelieu (all the types of food and so on) compared
to other values. Considering that PostgreSQL doesn't have the value of
RES when building the stats for genrelieu, 14 rows is an expected
value considering the distribution of the values in genrelieu so
there's nothing really wrong here. But it's really really slow.

If I remove the join on rubtylieu and I inject directly the value
obtained in the query, the stats are exact and I get a far better
plan:

cityvox_prod=# EXPLAIN ANALYZE SELECT vq.codequar, vq.liblong, vq.libcourt
    FROM lieu l, vilquartier vq, genrelieu gl, lieugelieu lgl
    WHERE l.codequar = vq.codequar AND l.dfinvalidlieu is null AND
vq.codevil = 'MUC' AND lgl.numlieu = l.numlieu AND lgl.codegelieu =
gl.codegelieu
    AND gl.codetylieu = 'RES'
    GROUP BY vq.codequar, vq.liblong, vq.libcourt, vq.flagintramuros
ORDER BY vq.flagintramuros, vq.liblong;

         QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=7070.53..7070.59 rows=26 width=43) (actual
time=8.502..8.511 rows=13 loops=1)
   Sort Key: vq.flagintramuros, vq.liblong
   Sort Method:  quicksort  Memory: 26kB
   ->  HashAggregate  (cost=7069.65..7069.91 rows=26 width=43) (actual
time=8.451..8.458 rows=13 loops=1)
         ->  Hash Join  (cost=10.06..7053.22 rows=1643 width=43)
(actual time=0.318..8.267 rows=167 loops=1)
               Hash Cond: ((lgl.codegelieu)::text = (gl.codegelieu)::text)
               ->  Nested Loop  (cost=2.63..7001.77 rows=7358
width=47) (actual time=0.098..7.361 rows=973 loops=1)
                     ->  Nested Loop  (cost=2.63..5527.36 rows=4319
width=47) (actual time=0.084..2.574 rows=630 loops=1)
                           ->  Index Scan using
idx_vilquartier_codevil on vilquartier vq  (cost=0.00..34.06 rows=47
width=43) (actual time=0.023..0.062 rows=47 loops=1)
                                 Index Cond: ((codevil)::text = 'MUC'::text)
                           ->  Bitmap Heap Scan on lieu l
(cost=2.63..115.05 rows=146 width=9) (actual time=0.014..0.035 rows=13
loops=47)
                                 Recheck Cond: ((l.codequar)::text =
(vq.codequar)::text)
                                 Filter: (l.dfinvalidlieu IS NULL)
                                 ->  Bitmap Index Scan on
lieu_i_codequar  (cost=0.00..2.59 rows=146 width=0) (actual
time=0.010..0.010 rows=13 loops=47)
                                       Index Cond: ((l.codequar)::text
= (vq.codequar)::text)
                     ->  Index Scan using
idx_lieugelieu_numlieu_principal on lieugelieu lgl  (cost=0.00..0.32
rows=2 width=8) (actual time=0.003..0.005 rows=2 loops=630)
                           Index Cond: (lgl.numlieu = l.numlieu)
               ->  Hash  (cost=6.33..6.33 rows=88 width=4) (actual
time=0.153..0.153 rows=88 loops=1)
                     ->  Bitmap Heap Scan on genrelieu gl
(cost=2.23..6.33 rows=*88* width=4) (actual time=0.028..0.088
rows=*88* loops=1)
                           Recheck Cond: ((codetylieu)::text = 'RES'::text)
                           ->  Bitmap Index Scan on ind_genrelieu2
(cost=0.00..2.21 rows=88 width=0) (actual time=0.021..0.021 rows=88
loops=1)
                                 Index Cond: ((codetylieu)::text = 'RES'::text)
 Total runtime: 8.589 ms

So the question is: is there any way to improve the results of the
original query, other than doing a first query in the application to
get the list of types and inject them in a second query (the one just
above)?

Thanks.

--
Guillaume

Re: Workaround for cross column stats dependency

From
Tom Lane
Date:
"Guillaume Smet" <guillaume.smet@gmail.com> writes:
> So the question is: is there any way to improve the results of the
> original query, other than doing a first query in the application to
> get the list of types and inject them in a second query (the one just
> above)?

Well, if you're willing to cheat like mad, you can use a phony immutable
function to perform that injection.  Here's a really silly example in
the regression database:

regression=# create or replace function getu2(int) returns int[] as $$
select array(select unique2 from tenk1 where thousand = $1);
$$ language sql immutable;
CREATE FUNCTION
regression=# explain select * from tenk1 where unique1 = any(getu2(42));
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=38.59..73.80 rows=10 width=244)
   Recheck Cond: (unique1 = ANY ('{381,932,2369,4476,5530,6342,6842,6961,7817,7973}'::integer[]))
   ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..38.59 rows=10 width=0)
         Index Cond: (unique1 = ANY ('{381,932,2369,4476,5530,6342,6842,6961,7817,7973}'::integer[]))
(4 rows)

Since the function is marked immutable, it'll be pre-evaluated during
planning and then the constant array result is exposed for statistics
purposes.

Now this method *only* works for interactive queries, or EXECUTE'd
queries in plpgsql, because you don't want the plan containing the
folded constants to get cached.  At least not if you're worried about
responding promptly to changes in the table you're fetching from.
But if that table is essentially constant anyway in your application,
there's little downside to this trick.

            regards, tom lane

Re: Workaround for cross column stats dependency

From
"Guillaume Smet"
Date:
On Jan 23, 2008 2:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> regression=# create or replace function getu2(int) returns int[] as $$
> select array(select unique2 from tenk1 where thousand = $1);
> $$ language sql immutable;
> CREATE FUNCTION
> regression=# explain select * from tenk1 where unique1 = any(getu2(42));
>                                               QUERY PLAN
> ------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on tenk1  (cost=38.59..73.80 rows=10 width=244)
>    Recheck Cond: (unique1 = ANY ('{381,932,2369,4476,5530,6342,6842,6961,7817,7973}'::integer[]))
>    ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..38.59 rows=10 width=0)
>          Index Cond: (unique1 = ANY ('{381,932,2369,4476,5530,6342,6842,6961,7817,7973}'::integer[]))
> (4 rows)

I'll give it a try tomorrow.

> Now this method *only* works for interactive queries, or EXECUTE'd
> queries in plpgsql, because you don't want the plan containing the
> folded constants to get cached.  At least not if you're worried about
> responding promptly to changes in the table you're fetching from.
> But if that table is essentially constant anyway in your application,
> there's little downside to this trick.

Yeah, that sounds like a good idea in our case. We don't use prepared
statements for these queries.

I'll post my results tomorrow morning.

Thanks.

--
Guillaume

Re: Workaround for cross column stats dependency

From
"Guillaume Smet"
Date:
On Jan 23, 2008 3:02 AM, Guillaume Smet <guillaume.smet@gmail.com> wrote:
> I'll post my results tomorrow morning.

It works perfectly well:
cityvox_prod=# CREATE OR REPLACE FUNCTION
getTypesLieuFromTheme(codeTheme text) returns text[] AS
$f$
SELECT ARRAY(SELECT codetylieu::text FROM rubtylieu WHERE codeth = $1);
$f$ LANGUAGE SQL IMMUTABLE;
CREATE FUNCTION

cityvox_prod=# EXPLAIN ANALYZE SELECT vq.codequar, vq.liblong, vq.libcourt
FROM lieu l, vilquartier vq, genrelieu gl, lieugelieu lgl
WHERE l.codequar = vq.codequar AND l.dfinvalidlieu is null AND
vq.codevil = 'MUC' AND lgl.numlieu = l.numlieu AND lgl.codegelieu =
gl.codegelieu
AND gl.codetylieu = ANY(getTypesLieuFromTheme('RES'))
GROUP BY vq.codequar, vq.liblong, vq.libcourt, vq.flagintramuros
ORDER BY vq.flagintramuros, vq.liblong;

         QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=5960.02..5960.08 rows=26 width=43) (actual
time=7.467..7.475 rows=13 loops=1)
   Sort Key: vq.flagintramuros, vq.liblong
   Sort Method:  quicksort  Memory: 26kB
   ->  HashAggregate  (cost=5959.15..5959.41 rows=26 width=43) (actual
time=7.421..7.428 rows=13 loops=1)
         ->  Hash Join  (cost=7.32..5944.52 rows=1463 width=43)
(actual time=0.241..7.212 rows=167 loops=1)
               Hash Cond: ((lgl.codegelieu)::text = (gl.codegelieu)::text)
               ->  Nested Loop  (cost=0.00..5898.00 rows=6552
width=47) (actual time=0.038..6.354 rows=973 loops=1)
                     ->  Nested Loop  (cost=0.00..4585.64 rows=3845
width=47) (actual time=0.031..1.959 rows=630 loops=1)
                           ->  Index Scan using
idx_vilquartier_codevil on vilquartier vq  (cost=0.00..34.06 rows=47
width=43) (actual time=0.015..0.047 rows=47 loops=1)
                                 Index Cond: ((codevil)::text = 'MUC'::text)
                           ->  Index Scan using idx_test on lieu l
(cost=0.00..95.53 rows=105 width=9) (actual time=0.008..0.024 rows=13
loops=47)
                                 Index Cond: ((l.codequar)::text =
(vq.codequar)::text)
                     ->  Index Scan using
idx_lieugelieu_numlieu_principal on lieugelieu lgl  (cost=0.00..0.32
rows=2 width=8) (actual time=0.003..0.004 rows=2 loops=630)
                           Index Cond: (lgl.numlieu = l.numlieu)
               ->  Hash  (cost=6.22..6.22 rows=88 width=4) (actual
time=0.146..0.146 rows=88 loops=1)
                     ->  Bitmap Heap Scan on genrelieu gl
(cost=2.23..6.22 rows=88 width=4) (actual time=0.022..0.075 rows=88
loops=1)
                           Recheck Cond: ((codetylieu)::text = ANY
('{RES}'::text[]))
                           ->  Bitmap Index Scan on ind_genrelieu2
(cost=0.00..2.21 rows=88 width=0) (actual time=0.016..0.016 rows=88
loops=1)
                                 Index Cond: ((codetylieu)::text = ANY
('{RES}'::text[]))
 Total runtime: 7.558 ms

It seems like a good tip to keep in mind.

Thanks for your help.

--
Guillaume