Thread: explain plans for foreign servers

explain plans for foreign servers

From
dinesh salve
Date:

Hi Hackers,

I am working on a feature in postgres_fdw extension to show plans used by remote postgresql servers in the output of the EXPLAIN command.
I think this will help end users understand query execution plans used by remote servers. Sample output for table people where people_1 is local partition and people_2 is remote partition would look like -

postgres:5432> explain select * from "test"."people";
QUERY PLAN
Append (cost=0.00..399.75 rows=2270 width=46)
    → Seq Scan on "people.people_1" people_1 (cost=0.00..21.00 rows=1100 width=46)
    → Foreign Scan on "people.people_2" people_2 (cost=100.00..367.40 rows=1170 width=46)
        Remote Plan
            Seq Scan on "people.people_2" (cost=0.00..21.00 rows=1100 width=46)
(5 rows)

I would like community inputs on below high level thoughts:

1. To enable this feature, either we can introduce a new option in EXPLAIN command e.g. (fetch_remote_plans true) or control this behaviour using a guc defined in postgres_fdw extension.      I am more inclined towards guc as this feature is for extension postgres_fdw. Adding the EXPLAIN command option might force other FDW extensions to handle this.

2. For ANALYZE = false, the idea is that postgres_fdw would create a connection to a remote server, prepare SQL to send over connection and store received plans in ExplainState.

3. For ANALYZE = true, idea is that postgres_fdw would set a new guc over connection to remote server, remote server postgres_fdw would read this guc and send back used query plan as a NOTICE (similar to auto_explain extension does) with custom header which postgres_fdw extension understands. . We also have an opportunity to introduce a new message type in the protocol to send back explain plans but it might look like too much work for this feature. Open to ideas here.

Dinesh Salve
SDE@AWS

Re: explain plans for foreign servers

From
Andy Fan
Date:
dinesh salve <cooltodinesh@gmail.com> writes:

Hi,

>
> I am working on a feature in postgres_fdw extension to show plans used
> by remote postgresql servers in the output of the EXPLAIN command.
> 
> I think this will help end users understand query execution plans used
> by remote servers. Sample output for table people where people_1 is
> local partition and people_2 is remote partition would look like

This looks nice! Especially for the people who want a FDW based sharding
cluster.  

> I would like community inputs on below high level thoughts:
>
> 1. To enable this feature, either we can introduce a new option in
> EXPLAIN command e.g. (fetch_remote_plans true) or control this
> behaviour using a guc defined in postgres_fdw extension. I am more
> inclined towards guc as this feature is for extension
> postgres_fdw. Adding the EXPLAIN command option might force other FDW
> extensions to handle this. 

+1.

> 2. For ANALYZE = false, the idea is that postgres_fdw would create a
> connection to a remote server, prepare SQL to send over connection and
> store received plans in ExplainState. 

> 3. For ANALYZE = true, idea is that postgres_fdw would set a new guc
> over connection to remote server, remote server postgres_fdw would
> read this guc and send back used query plan as a NOTICE (similar to
> auto_explain extension does) with custom header which postgres_fdw
> extension understands. We also have an opportunity to introduce a new
> message type in the protocol to send back explain plans but it might
> look like too much work for this feature. Open to ideas here. 

This generally looks good to me.  Looking forward a patch for the
details. 

-- 
Best Regards
Andy Fan




Re: explain plans for foreign servers

From
Ashutosh Bapat
Date:
On Mon, Nov 11, 2024 at 9:12 PM dinesh salve <cooltodinesh@gmail.com> wrote:
>
>
> Hi Hackers,
>
> I am working on a feature in postgres_fdw extension to show plans used by remote postgresql servers in the output of
theEXPLAIN command. 
> I think this will help end users understand query execution plans used by remote servers. Sample output for table
peoplewhere people_1 is local partition and people_2 is remote partition would look like - 
>
> postgres:5432> explain select * from "test"."people";
> QUERY PLAN
> Append (cost=0.00..399.75 rows=2270 width=46)
>     → Seq Scan on "people.people_1" people_1 (cost=0.00..21.00 rows=1100 width=46)
>     → Foreign Scan on "people.people_2" people_2 (cost=100.00..367.40 rows=1170 width=46)
>         Remote Plan
>             Seq Scan on "people.people_2" (cost=0.00..21.00 rows=1100 width=46)
> (5 rows)
>
> I would like community inputs on below high level thoughts:
>
> 1. To enable this feature, either we can introduce a new option in EXPLAIN command e.g. (fetch_remote_plans true) or
controlthis behaviour using a guc defined in postgres_fdw extension.      I am more inclined towards guc as this
featureis for extension postgres_fdw. Adding the EXPLAIN command option might force other FDW extensions to handle
this.
>
> 2. For ANALYZE = false, the idea is that postgres_fdw would create a connection to a remote server, prepare SQL to
sendover connection and store received plans in ExplainState. 
>
> 3. For ANALYZE = true, idea is that postgres_fdw would set a new guc over connection to remote server, remote server
postgres_fdwwould read this guc and send back used query plan as a NOTICE (similar to auto_explain extension does) with
customheader which postgres_fdw extension understands. . We also have an opportunity to introduce a new message type in
theprotocol to send back explain plans but it might look like too much work for this feature. Open to ideas here. 

If use_remote_estimates is enabled for a given foreign server,
postgres_fdw fetches EXPLAIN output and plugs those costs into the
local plan's costs. You could use that - display the remote plan only
when use_remote_estimates is enabled. However, there's no guarantee
that the plan so fetched will be the plan used by foreign server when
actually executing the query. Mostly likely that is true but no
guarantee. That's also true if the plan is fetched only for the final
query. Of course the EXPLAIN output differences between server
versions need to taken care of.

But the real question is usability. How do you plan to use it?
--
Best Wishes,
Ashutosh Bapat



Re: explain plans for foreign servers

From
dinesh salve
Date:
On Tue, Nov 12, 2024 at 4:16 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Mon, Nov 11, 2024 at 9:12 PM dinesh salve <cooltodinesh@gmail.com> wrote:
>
>
> Hi Hackers,
>
> I am working on a feature in postgres_fdw extension to show plans used by remote postgresql servers in the output of the EXPLAIN command.
> I think this will help end users understand query execution plans used by remote servers. Sample output for table people where people_1 is local partition and people_2 is remote partition would look like -
>
> postgres:5432> explain select * from "test"."people";
> QUERY PLAN
> Append (cost=0.00..399.75 rows=2270 width=46)
>     → Seq Scan on "people.people_1" people_1 (cost=0.00..21.00 rows=1100 width=46)
>     → Foreign Scan on "people.people_2" people_2 (cost=100.00..367.40 rows=1170 width=46)
>         Remote Plan
>             Seq Scan on "people.people_2" (cost=0.00..21.00 rows=1100 width=46)
> (5 rows)
>
> I would like community inputs on below high level thoughts:
>
> 1. To enable this feature, either we can introduce a new option in EXPLAIN command e.g. (fetch_remote_plans true) or control this behaviour using a guc defined in postgres_fdw extension.      I am more inclined towards guc as this feature is for extension postgres_fdw. Adding the EXPLAIN command option might force other FDW extensions to handle this.
>
> 2. For ANALYZE = false, the idea is that postgres_fdw would create a connection to a remote server, prepare SQL to send over connection and store received plans in ExplainState.
>
> 3. For ANALYZE = true, idea is that postgres_fdw would set a new guc over connection to remote server, remote server postgres_fdw would read this guc and send back used query plan as a NOTICE (similar to auto_explain extension does) with custom header which postgres_fdw extension understands. . We also have an opportunity to introduce a new message type in the protocol to send back explain plans but it might look like too much work for this feature. Open to ideas here.

If use_remote_estimates is enabled for a given foreign server,
postgres_fdw fetches EXPLAIN output and plugs those costs into the
local plan's costs. You could use that - display the remote plan only
when use_remote_estimates is enabled. However, there's no guarantee
that the plan so fetched will be the plan used by foreign server when
actually executing the query. Mostly likely that is true but no
guarantee. That's also true if the plan is fetched only for the final
query. Of course the EXPLAIN output differences between server
versions need to taken care of.

But the real question is usability. How do you plan to use it?
--
Best Wishes,
Ashutosh Bapat

Hi Ashutosh,
Thanks for the feedback.

1. As admins and devs need to look at plans from time to time, if a remote plan is displayed only when use_remote_estimates is enabled, either the end user needs to keep it enabled (at table or server level) all the time or enable/disable when they want to view remote plans. I actually wanted to decouple that action for the user and make it easy by just setting a guc -> SET postgres_fdw.show_remote_explain_plans = on;
2. Yeah, plans are not guaranteed but this feature would give a high level idea on overall query execution at one place. This could be pretty useful when postgresql is used for sharding and tables are set up as partitions and the end user wants to view overall query execution.

Dinesh Salve
SDE@AWS

Re: explain plans for foreign servers

From
Anton Shmigirilov
Date:
> Hi Hackers,
>
> I am working on a feature in postgres_fdw extension to show plans used by remote postgresql servers in the output of
theEXPLAIN command. 
> I think this will help end users understand query execution plans used by remote servers. Sample output for table
peoplewhere people_1 is local partition and people_2 is remote partition would look like - 
>
> postgres:5432> explain select * from "test"."people";
> QUERY PLAN
> Append (cost=0.00..399.75 rows=2270 width=46)
>     → Seq Scan on "people.people_1" people_1 (cost=0.00..21.00 rows=1100 width=46)
>     → Foreign Scan on "people.people_2" people_2 (cost=100.00..367.40 rows=1170 width=46)
>         Remote Plan
>             Seq Scan on "people.people_2" (cost=0.00..21.00 rows=1100 width=46)
> (5 rows)
>
> I would like community inputs on below high level thoughts:
>
> 1. To enable this feature, either we can introduce a new option in EXPLAIN command e.g. (fetch_remote_plans true) or
controlthis behaviour using a guc defined in postgres_fdw extension.      I am more inclined towards guc as this
featureis for extension postgres_fdw. Adding the EXPLAIN command option might force other FDW extensions to handle
this.
>
> 2. For ANALYZE = false, the idea is that postgres_fdw would create a connection to a remote server, prepare SQL to
sendover connection and store received plans in ExplainState. 
>
> 3. For ANALYZE = true, idea is that postgres_fdw would set a new guc over connection to remote server, remote server
postgres_fdwwould read this guc and send back used query plan as a NOTICE (similar to auto_explain extension does) with
customheader which postgres_fdw extension understands. . We also have an opportunity to introduce a new message type in
theprotocol to send back explain plans but it might look like too much work for this feature. Open to ideas here. 
>
> Dinesh Salve
> SDE@AWS

Hi Dinesh,

Thank you for your proposal regarding explain for foreign servers.

I have been working on a similar feature and there are several considerations to take into account.

To enable this feature it is preferable to use GUC rather than the EXPLAIN option, as it simplifies regression testing.
Youcan simply set it to off before most tests that involve plan checking, while leaving the rest unchanged. This leads
toa reduction in the size of the differences. 

If it is necessary to provide only the execution plan of the foreign query (without actual timing metrics), you should
sendEXPLAIN with ANALYZE set to OFF, regardless of the initial ANALYZE state. This approach will prevent the
re-executionof the remote query (the SQL part of the ForeignScan node), which could potentially lead to side effects.
It'ssafe to send ANALYZE ON during remote EXPLAIN only if your remote SQL is idempotent, i.e. doesn't change anything.
Thatway you can't sent it for *Modify nodes, but it can be applicable for certain ForeignScans, such as those involving
FunctionScan.In general, it is safer to enforce ANALYZE OFF in all cases. 

Also you can't expose to main EXPLAIN some metrics obtained from remote side through the "remote" explain. For example,
valuessuch as actual time, planning time, execution time, and similar metrics cannot be exposed because they relates to
eventsthat occurred during the "EXPLAIN" communication, rather than during the actual planning and execution phases.
Therefore,these times would likely mislead the user. I suppose it's better to enforce EXPLAIN with TIMING OFF and
SUMMARYOFF when obtaining the remote portion of EXPLAIN. 

While reconstructing (deparsing) the SQL query to send as part of EXPLAIN to the remote server, you can obtain SQL
statementswith placeholders (i.e. $1, $2, etc) instead of actual parameter values. It's syntactically incorrect SQL,
whichwill lead to an error on the remote side. There are two ways to avoid this. You can use GENERIC_PLAN feature
(v16+),which accepts dollar-parameters here. Another option is to use params_list == NULL in the
deparseSelectStmtForRel()function to substitute dummy null values for placeholders, thereby generating syntactically
correctSQL. The downside of this approach is the need to perform an additional deparse stage, which can be redundant. 

However looking forward a patch, it is likely that some (or all) of my thoughts may become irrelevant.

--
Best regards,
Anton Shmigirilov,
Postgres Professional


Re: explain plans for foreign servers

From
Sami Imseih
Date:
Hi,

Thanks for working on this patch!

I looked at the patch from and I have several comments. There are
some others, but wanted to start with the most important I found, in order
of importance.

1/ The use of NOTICE to propagate the explain plan.
I see the message content is checked, but this does not look robust
and could lead to
some strange results if another ExecutorRun hook emits a similar notice message.

+    // We might receive plans per batch of cursor, but we only need
to store one.
+    // do we really need to handle len==0. report warn if we still
recived. have test around this warn.
+    if (strstr(notice, "postgres_fdw_explain_plan") &&
explain_plans->len == 0) {

2/ The current patch requires that the remote side has postgres_fdw
enabled. This seems very much
against the philosophy of FDWs. Only the side which creates the
foreign tables should require
the extension to be installed.

Also, if auto_explain_plan is enabled on the foreign table side, it
seems the explain_ExecutorRun
is exercised. This code path should only be taken on the remote side. right?

3/ ExecutorRun is the wrong place to send the plans from, because
postgres_fdw performs
FETCHES from a SQL declared cursor, and each fetch will hit
ExecutorRun. If you return the
first plan from ExecutorRun and stop consuming the rest of the plans,
you will only get the
results from a single fetch only. The plan should be generated and
sent back at CLOSE cursor time.

4/ As far as presenting the remote plans, I think adding them inline
in the EXPLAIN output
will make the plans hard to read, especially fas the plans become more complex.
What about they get added to a new section called "remote plans" and
the remote plans will
be identified by the plan_node_id, which we can add.

Below is a sketch-up to make it clear what I am thinking.

"""
Sort  (cost=1..10 rows=10 width=120) (actual time=1..10 rows=10 loops=1)
  -> Foreign Scan on prices  (cost=100.00..200.00 rows=10width=59)
(actual time=1..100 rows=10 loops=1)  (node=1)
Planning Time: 10 ms
Execution Time: 100 ms

Remote Plans
---------------
node 1:

Seq Scan on prices (cost=100.00..4576146.22 rows=467980 width=59)
(actual time=8737.505..2258486.086 rows=31752 loops=1)
"""

Here is a thought about how to generate and consume the plans.

What if we do something like a new EXPLAIN option which returns all the rows
back to the client, and then writes out the plan to some local memory. We would
then be able to fetch the last plan through a sql function, i.e.
SELECT pg_last_explain().

This may have applications beyond postgre_fdw; but in the case of
postgres_fdw, it will
call the remote sql using this EXPLAIN option and at the end of
execution, it will be
responsible to fetch the plans from pg_last_explain. I Have not fully
formulated this idea,
but wanted to share it.

Regards,

Sami Imseih
Amazon Web Services (AWS)