Re: BUG #16241: Degraded hash join performance - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #16241: Degraded hash join performance
Date
Msg-id 20200204162901.s5hbfrl2ylb3jjsq@alap3.anarazel.de
Whole thread Raw
In response to Re: BUG #16241: Degraded hash join performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #16241: Degraded hash join performance  (Thomas Butz <tbutz@optitool.de>)
List pgsql-bugs
Hi,

On 2020-02-04 11:00:29 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Interesting! The no-children one clearly shows that a lot of the the
> > time is spent evaluating regular expressions (there's other regex
> > functions in the profile too):
> >     23.36%  postgres  postgres            [.] subcolor
> 
> Huh ...
> 
> > I'm not aware of any relevant regular expression evaluation changes
> > between 11 and 12. Tom, does this trigger anything?
> 
> (1) Nope, I'm not either; the last non-back-patched change in that
> code was c54159d44 in v10.
> 
> (2) subcolor() is part of regex compilation, not execution, which makes
> one wonder why it's showing up at all.  Maybe the regex cache in
> adt/regexp.c is overflowing and preventing useful caching?  But
> that didn't change in v12 either.  Are these test cases really
> 100% equivalent?  I'm wondering if there are a few more "hot"
> regex patterns in the v12 data ...

They are not 100% equivalent, but the part of the plan we see is very
similar rowcount wise. It's possible that the functions differ more
however, there are different postgis versions involved, and apparently
also an "osml10n" extension.


> (3) Where the heck is the regex use coming from at all?  I don't
> see any regex operators in the plan.  Maybe it's inside the
> plpgsql function?

It definitely is. The stack shows at least two levels of plpgsql
functions. And Thomas has since confirmed that removing the functioncall
fixes the issue.

Based on the name I think this is somewhere around this:
https://github.com/giggls/mapnik-german-l10n/blob/master/plpgsql/get_localized_name_from_tags.sql#L120
The callgraph indicates that most of the cost comes from within
textregexreplace_noopt.

Not clear why the cache isn't fixing this - there are no variables in
the regexp_replace calls as far as I can see.

Greetings,

Andres Freund



pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #16243: non super user take pg_restore found some errors.
Next
From: Jehan-Guillaume de Rorthais
Date:
Subject: FK violation in partitioned table after truncating a referencedpartition