[PATCH] Make ReScanForeignScan callback optional for FDWs - Mailing list pgsql-hackers

From Adam Lee
Subject [PATCH] Make ReScanForeignScan callback optional for FDWs
Msg-id aS_4arKxQTisUzm8@pluto
List pgsql-hackers
Hi hackers,

I'd like to propose a patch that makes the ReScanForeignScan callback 
optional for Foreign Data Wrappers, allowing the planner to automatically 
handle non-rescannable FDWs.

# Background

Currently, FDWs are expected to implement ReScanForeignScan to support 
rescanning in scenarios like nested loop joins. However, some FDWs simply cannot
implement rescanning:

- Streaming data sources (Kafka, message queues)
- One-time token-based APIs with expensive re-authentication
- Data sources where re-fetching is prohibitively expensive
- ...

Right now, these FDWs either implement a stub ReScanForeignScan that fails 
at runtime, or they buffer all data in BeginForeignScan to support rescan, 
which wastes memory when rescanning isn't needed.

# Proposed Solution

This patch introduces a 'rescannable' field in the Path structure that 
tracks whether each path supports rescanning. The planner uses this 
information to:

1. Auto-detect FDW rescan capability by checking if ReScanForeignScan is 
   provided
2. Automatically insert Material nodes when non-rescannable paths are used 
   as inner paths in nested loops
3. Reject parameterized foreign scan paths if the FDW doesn't support 
   rescan (preventing planning failures)
4. Raise a clear error for correlated subqueries that cannot be handled 
   (these are planned independently and can't use Material nodes)
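
To make the four rules above concrete, here is a small self-contained toy
model in C. It is only a sketch and is not code from the patch: the Toy*
types and toy_* functions are hypothetical stand-ins for the planner's real
structures, and the only names taken from this proposal are
ReScanForeignScan and the 'rescannable' flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an FDW's callback table. */
typedef struct ToyFdwRoutine {
    /* NULL means the FDW does not provide a rescan callback. */
    void (*ReScanForeignScan)(void *scan_state);
} ToyFdwRoutine;

typedef enum ToyPathKind { TOY_FOREIGN_SCAN, TOY_MATERIAL } ToyPathKind;

/* Hypothetical stand-in for a planner path carrying the proposed flag. */
typedef struct ToyPath {
    ToyPathKind kind;
    bool rescannable;
    struct ToyPath *subpath;    /* child path, used by TOY_MATERIAL only */
} ToyPath;

/* Example callback for an FDW that does support rescanning. */
static void toy_noop_rescan(void *scan_state)
{
    (void) scan_state;          /* nothing to reset in this toy model */
}

/* Rule 1: rescan capability is inferred from the callback's presence. */
static bool toy_fdw_is_rescannable(const ToyFdwRoutine *routine)
{
    assert(routine != NULL);
    return routine->ReScanForeignScan != NULL;
}

/* Rule 3: a parameterized scan is re-executed for each new parameter
 * value, so it is only acceptable when the FDW can rescan. */
static bool toy_accept_foreign_path(const ToyFdwRoutine *routine,
                                    bool parameterized)
{
    return !parameterized || toy_fdw_is_rescannable(routine);
}

/* Rule 2: a non-rescannable inner path of a nested loop is wrapped in a
 * Material node, which replays buffered rows on later rescans. The caller
 * supplies storage for the wrapper to keep the sketch allocation-free. */
static ToyPath *toy_prepare_nestloop_inner(ToyPath *inner,
                                           ToyPath *material_slot)
{
    assert(inner != NULL && material_slot != NULL);
    if (inner->rescannable)
        return inner;
    material_slot->kind = TOY_MATERIAL;
    material_slot->rescannable = true;
    material_slot->subpath = inner;
    return material_slot;
}

/* Rule 4: per the description above, a correlated subquery cannot be
 * shielded by a Material node, so the caller should raise an error when
 * this returns false. */
static bool toy_correlated_subplan_ok(const ToyPath *path)
{
    assert(path != NULL);
    return path->rescannable;
}
```

In this model, an FDW whose callback pointer is NULL yields a foreign scan
path with rescannable = false; using it as a nested loop inner returns the
Material wrapper, while a parameterized variant of the same scan is rejected
up front.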

# Not only FDWs

Beyond enabling non-rescannable FDWs, this mechanism could also be used 
for performance optimization on other nodes. Some operations technically
support rescan, but only at significant cost (MergeAppend redoes all the sorts,
aggregation redoes all the calculations...). In some cases we could mark such
paths as non-rescannable to encourage the planner to materialize the results
instead.

The patch is attached. If reviewers think the idea is worthwhile, or if its
complexity warrants a longer discussion, I'm happy to register it for the
next CommitFest. For now, I'm sharing it here to gather initial feedback on the
approach.

-- 
Adam
