On 2017-11-28 20:48:24 +0100, Laurenz Albe wrote:
> Peter J. Holzer wrote:
> > I noticed that an update was taking a long time and found this:
> >
> > UPDATE public.facttable_imf_ifs p
> > SET [...lots of columns...]
> > FROM cleansing.cls_imf_ifs_facttable_imf_ifs c, cleansing.cleansing_change_type ct
> > WHERE
> > (p.macrobondtimeseries is not distinct from c.macrobondtimeseries) AND (p.date is not distinct from c.date)
> > AND c.cleansing_change_type_id = ct.cleansing_change_type_id
> > AND ct.cleansing_change_type_desc_short IN ('UPDATED_NEW')
>
> It is kind of ugly, and I didn't test it, but here is an idea:
>
> Suppose we know a value that cannot occur in both p.date and c.date.
>
> Then you could write
>
> WHERE coalesce(p.date, '0044-03-15 BC') = coalesce(c.date, '0044-03-15 BC')
>
> and create an index on the coalesce expressions to facilitate
> a merge join.
Yes, that would work. However in most cases these queries are
automatically generated from the tables so determining "impossible"
values is not generally possible. I don't think it is necessary, though.
Replacing (A is not distinct from B) with (A = B or A is null and B is
null) usually produces an acceptable plan (indeed, I had that before, I
replaced it with is not distinct to clean it up) and when a column has a
not null constraint I can simply use (A = B). (I should probably do this
automatically - currently I need to pass a flag to the query generator,
and that's a bit error prone)
hp
-- _ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp@hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>