Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION - Mailing list pgsql-performance

From Tom Lane
Subject Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION
Date
Msg-id 18912.1038845624@sss.pgh.pa.us
Whole thread Raw
In response to v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION  ("ir. F.T.M. van Vugt bc." <ftm.van.vugt@foxi.nl>)
Responses Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION
List pgsql-performance
"ir. F.T.M. van Vugt bc." <ftm.van.vugt@foxi.nl> writes:
> The query below runs 10-20 times slower under v7.3 than it did under v7.2.3:

I don't suppose you have explain output for it from 7.2.3?

It seems strange to me that the thing is picking a nestloop join here.
Either merge or hash would make more sense ... oh, but wait:

>     inner join creditor c
>         on foo.dflt_creditor_id = c.old_creditor_id

> * foo.dflt_creditor_id is of type varchar(20)
> * c.old_creditor_id is of type text

IIRC, merge and hash only work on plain Vars --- the implicit type
coercion from varchar to text is what's putting the kibosh on a more
intelligent join plan.  Can you fix your table declarations to agree
on the datatype?  If you don't want to change the tables, another
possibility is something like

 select foo.*, c.id from
    (select *, dflt_creditor_id::text as key, 't' from lijst01_table union all
    select *, dflt_creditor_id::text as key, 't' from lijst02_table union all
    select *, dflt_creditor_id::text as key, 'f' from lijst03_table union all
    select *, dflt_creditor_id::text as key, 'f' from lijst04_table union all
    select *, dflt_creditor_id::text as key, 't' from lijst04b_table ) as foo
    inner join creditor c
        on foo.key = c.old_creditor_id
     order by old_id;

ie, force the type coercion to occur down inside the union, not at the
join.

This doesn't explain the slowdown from 7.2.3, though --- it had the same
deficiency.  (I am hoping to get around to fixing it for 7.4.)

It could easy be that --enable-locale explains the slowdown.  Are you
running 7.4 in C locale, or something else?  Comparisons in locales
like en_US can be *way* slower than in C locale.  You can use
pg_controldata to check this for sure.

            regards, tom lane

pgsql-performance by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION
Next
From: "ir. F.T.M. van Vugt bc."
Date:
Subject: Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION