Re: Joins on TID - Mailing list pgsql-hackers

From Edmund Horner
Subject Re: Joins on TID
Date
Msg-id CAMyN-kDT=RN-xpZpynbK0NC8VJGLs2odxTPz7=fcCaW0bizr1A@mail.gmail.com
In response to Joins on TID  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Joins on TID  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, 22 Dec 2018 at 12:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I decided to spend an afternoon seeing exactly how much work would be
> needed to support parameterized TID scans, ie nestloop-with-inner-TID-
> scan joins, as has been speculated about before, most recently here:
>
> https://www.postgresql.org/message-id/flat/CAMqTPq%3DhNg0GYFU0X%2BxmuKy8R2ARk1%2BA_uQpS%2BMnf71MYpBKzg%40mail.gmail.com
>
> It turns out it's not that bad, less than 200 net new lines of code
> (all of it in the planner; the executor seems to require no work).
>
> Much of the code churn is because tidpath.c is so ancient and crufty.
> It was mostly ignoring the RestrictInfo infrastructure, in particular
> emitting the list of tidquals as just bare clauses not RestrictInfos.
> I had to change that in order to avoid inefficiencies in some places.

It seems good, and I can see you've committed it now.  (I should have
commented sooner, but it's the big summer holiday period here, which
means I have plenty of time to work on PostgreSQL, but none of my
usual resources.  In any case, I was going to say "this looks useful
and not too complicated, please go ahead".)

I did notice that multiple tidquals are no longer removed from
scan_clauses, so the OR condition is now rechecked as a Filter as well:

EXPLAIN SELECT * FROM pg_class WHERE ctid = '(1,1)' OR ctid = '(2,2)';

 Tid Scan on pg_class  (cost=0.01..8.03 rows=2 width=265)
   TID Cond: ((ctid = '(1,1)'::tid) OR (ctid = '(2,2)'::tid))
   Filter: ((ctid = '(1,1)'::tid) OR (ctid = '(2,2)'::tid))

I guess if we thought it was a big deal we could attempt to recreate
the old logic with RestrictInfos.
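
As a rough illustration of what removing the tidquals from scan_clauses
means, here is a toy Python model. It is not the planner's C code: the
classes TidEq, Or and Other and the function residual_filter are all
hypothetical names made up for this sketch. It assumes the tidquals list
has implicit OR semantics, so several tidquals must be wrapped in a single
OR before they can be matched against the scan clauses.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class TidEq:
    """Hypothetical stand-in for a clause of the form ctid = '(block,offset)'."""
    tid: Tuple[int, int]


@dataclass(frozen=True)
class Or:
    """Hypothetical stand-in for an OR clause over several sub-clauses."""
    args: Tuple["Clause", ...]


@dataclass(frozen=True)
class Other:
    """Any non-TID clause, kept only as text for this sketch."""
    text: str


Clause = Union[TidEq, Or, Other]


def residual_filter(scan_clauses: List[Clause], tidquals: List[Clause]) -> List[Clause]:
    """Return the scan clauses that still need checking as a Filter.

    The tidquals list is treated as an implicit OR, so with more than one
    entry we compare against a single OR clause built from the whole list.
    """
    if not tidquals:
        return list(scan_clauses)
    enforced = tidquals[0] if len(tidquals) == 1 else Or(tuple(tidquals))
    # Drop the clause that the TID Cond already enforces; keep everything else.
    return [clause for clause in scan_clauses if clause != enforced]
```

With the query above, the scan clauses hold one OR of two TidEq clauses and
the tidquals list holds the same two TidEq clauses, so in this model the
residual filter comes out empty and nothing would need rechecking.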

> I haven't really looked at how much of a merge problem there'll be
> with Edmund Horner's work for TID range scans.  My feeling about it
> is that we might be best off treating that as a totally separate
> code path, because the requirements are significantly different (for
> instance, a range scan needs AND semantics not OR semantics for the
> list of quals to apply).

Well, I guess it's up to me to merge it.  I can't quite see which
parts we'd use a separate code path for.  Can you elaborate?
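
For reference, the semantic difference mentioned in the quoted text can be
sketched in a few lines of Python. This is only a toy model under stated
assumptions: TIDs are modelled as (block, offset) tuples compared
lexicographically, and tid_list_matches and tid_range_matches are
hypothetical helper names, not anything in the PostgreSQL source.

```python
import operator
from typing import Callable, List, Tuple

Tid = Tuple[int, int]  # (block number, offset), ordered lexicographically


def tid_list_matches(tid: Tid, tidquals: List[Tid]) -> bool:
    """Plain TID scan: the quals are ORed, so matching any one TID suffices."""
    return any(tid == qual for qual in tidquals)


def tid_range_matches(tid: Tid, bounds: List[Tuple[Callable[[Tid, Tid], bool], Tid]]) -> bool:
    """TID range scan: the quals are ANDed, so every bound must hold."""
    return all(op(tid, bound) for op, bound in bounds)
```

For example, tid_list_matches((2, 2), [(1, 1), (2, 2)]) is true because one
of the listed TIDs matches, while a range such as
[(operator.ge, (1, 0)), (operator.lt, (2, 0))] accepts (1, 5) only because
both bounds hold at once.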

Edmund

