Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first - Mailing list pgsql-bugs

From Zeb Burke-Conte
Subject Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first
Date
Msg-id CAMY747_oUyfZGPNhPMkFpVdYUEWNa2M6KGLcc6E7Vbui-tZ08A@mail.gmail.com
Whole thread Raw
In response to Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Thank you so much, Tom--this is a great point. I've tested your index expression suggestion, and you have correctly diagnosed the problem. Even more simply, casting the bpchar to varchar in the JOIN ... ON condition works.

I still find this quite counterintuitive, since nothing about my query is forcing Postgres to cast the varchar column to bpchar instead of the other way around. Is there some arcane standard that requires it? More likely, I'm guessing, casting is determined upfront and indices are not considered. Changing this could be a "nice-to-have" although I see how it's not a bug per se.

Of course, in my actual use case, I've switched away from bpchar.

Best,
Zeb Burke-Conte

On Fri, Apr 16, 2021 at 5:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> I'm seeing a performance issue when joining across two tables on columns
> that require a cast from varchar to bpchar. No matter how selective the
> condition is on the bpchar table, the outer scan will be on the table with
> the varchar column.

I don't think this is a planner problem.  It can't generate the plan
you are hoping for because of index mismatch.  The query's join
condition is effectively "a.varcharcol::bpchar = b.bpcharcol", and
the construct "a.varcharcol::bpchar" doesn't match your index on
a.varcharcol, so it can't use an inner indexscan on that side of
the equality.

Possibly you could work around this by providing an expression index on
"a.varcharcol::bpchar".  But TBH my recommendation would be to nuke the
bpchar columns from orbit.  They're almost never the semantics you want,
especially not if you're sometimes comparing them to non-bpchar
columns.

                        regards, tom lane

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first
Next
From: Tom Lane
Date:
Subject: Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first