Enabling parallel execution for cursors - Mailing list pgsql-hackers

From Meftun Cincioğlu
Subject Enabling parallel execution for cursors
Date
Msg-id CAAeyror5bd6J6VFj4HR5zcWf_SLyfVNe53RtG4-A0jT+gZxdSA@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hello,

I believe the topic on the link below
https://www.postgresql.org/message-id/flat/98760523-a0d2-8705-38e3-c4602ecf2448%402ndquadrant.com
should be re-discussed, as more than 6 years passed from the last message. Let me summarize the topic first and ask the real question: Can declared cursors be developed that support parallelism?

[1] Tomas Vondra stated that cursors generally do not benefit from parallel query. There are two main reasons why parallelism is disabled for user-defined cursors (or queries that might get suspended):
(a) We can't predict what will happen while the query is suspended (and the transaction is still in "parallel mode"), e.g. the user might run arbitrary DML which is not allowed.
(b) If the cursor gets suspended, the parallel workers would be still assigned to it and could not be used for anything else.
He proposed to add a new cursor option (PARALLEL), which would allow parallel plans for that particular user-defined cursor. He also attached the first version of the patch and indicated that the patch is experimental.

[2] Robert Haas stated that the proposed patch is a small portion of what should be done to cover all the holes.

[3] Tomas Vondra stated that it should be checked at planning time if the transaction is in parallel mode and the query contains unsafe/restricted functions.

[4] Robert Haas said;
The main points I want to make clearly understood is the current design relies on (1) functions being labeled correctly and (2) other dangerous code paths being unreachable because there's nothing that runs between EnterParallelMode and ExitParallelMode which could invoke them, except by calling a mislabeled function.  Your patch expands the vulnerability surface from "executor code that can be reached without calling a mislabeled function" to "any code that can be reached by typing an SQL command". Just rejecting any queries that are parallel-unsafe probably closes a good chunk of the holes, but that still leaves a lot of code that's never been run in parallel mode before potentially now running in parallel mode. Error handling might be a problem, too: what happens if a parallel worker is killed while the query is suspended?
He also proposed to write a test patch that keeps parallel mode active at all times except when running a query that contains something parallel-unsafe.

[5] Tomas Vondra proposed a simpler option - what if we only allow fetches from the PARALLEL cursor while the cursor is open?
    BEGIN;
    ...
    DECLARE x PARALLEL CURSOR FOR SELECT * FROM t2 WHERE ...;
    FETCH 1000 FROM x;
    FETCH 1000 FROM x;
    FETCH 1000 FROM x;
    CLOSE x;
    ...
    COMMIT;
but adding any other command between the OPEN/CLOSE commands would fail. That should close all the holes with parallel-unsafe stuff, right?
Of course, this won't solve the issue with error handling / killing suspended workers (which didn't occur to me before as a possible issue at all, so that's for pointing that out). But that's a significantly more limited issue to fix than all the parallel-unsafe bits.

[6] Robert Haas asked some questions about the suggestion that Tomas stated and also made a comment about the materialization issue that Craig stated. He also said; If you're running a query like "SELECT * FROM bigtable", you don't need parallel query in the first place, because a single backend is quite capable of sending back the rows as fast as a client can read them.  If you're running a query like "SELECT * FROM bigtable WHERE <highly selective predicate>" then that's a good use case for parallel query, but then materializing it isn't that bad because the result set is a lot smaller than the original table.

[7] Simon Riggs said;
At present, one major use of Cursors is in postgres_fdw. In that usage, the query executes and returns all the rows. No other side execution is possible. How do we make parallel query work for Cursors, if not by Tomas' proposal? If a parallel cursor is requested, we could simply prevent other intermediate commands other than FETCH (next).

[8] Robert Haas proposed providing some kind of infrastructure for workers to detach from a parallel query. Let's say the leader is either (a) suspending execution of the query, because it's a cursor, or (b) not able to absorb rows as fast as workers are generating them.
In the former situation, we'd like to get rid of all workers; in the latter situation, some workers. In the former situation, getting all workers to shut down cleanly would let us exit parallel mode (and perhaps re-enter it later when we resume execution of the query). In the latter situation, we could avoid wasting workers on queries where the leader can't keep up so that those worker slots are available to other queries that can benefit from them.
He stated that there were some problems finding a point at which tuples could be safely stopped from being returned.
He considered two approaches for handling parallel workers when a query is suspended:
(a) Keep Workers Running: This would involve keeping workers active during suspension, but it requires complex state synchronization and could tie up resources for long periods.
(b) Restrict Backend Activity: Imposing strict limits on actions while a parallel cursor is open (e.g., only allowing FETCH) could prevent some issues, but it's highly restrictive and doesn't address all problems, such as potential conflicts in transaction state and lock management. Both approaches are challenging but potentially feasible with enough effort.

[9] Robert Haas stated if you restrict operations to only fetching from the cursor and ensure that both protocol messages and SQL commands are locked down, with errors automatically killing the workers, this approach might work. However, challenges remain, such as handling errors when the leader is idle, since reporting these errors is tricky due to protocol limitations. Additionally, if fetching from the cursor is the only action allowed, using `PQsetSingleRowMode` instead of a cursor might be a simpler solution.

[10] Robert Haas stated parallel mode is designed to handle errors during query execution only. Extending parallel mode beyond a query's context introduces significant risks, including crashes or incorrect results. Restricting post-query operations to a very limited set might work, but allowing arbitrary PL/pgSQL code execution is too broad and risky.

This is the summary of the topic. Let me ask the question again: Can declared cursors be developed that support parallelism?

[1] https://www.postgresql.org/message-id/98760523-a0d2-8705-38e3-c4602ecf2448%402ndquadrant.com
[2] https://www.postgresql.org/message-id/CA%2BTgmoaJVMbQbHn3i_Uzz_vSGsbDsavYkPV4Tqz%3DUbg%2Bv8%2BLUQ%40mail.gmail.com
[3] https://www.postgresql.org/message-id/dfe4adec-ad9b-723c-d501-1eedc96837a3%402ndquadrant.com
[4] https://www.postgresql.org/message-id/CA%2BTgmobcUxTPLbjFL4PQa_07RC5wdEr-N_Zi%2BEOGuJ8Ui1Vy8Q%40mail.gmail.com
[5] https://www.postgresql.org/message-id/8bdb6684-09d7-f799-0a6a-362cdc251b31%402ndquadrant.com
[6] https://www.postgresql.org/message-id/CA%2BTgmoYWoqDUViwPTk-rOJbGG8aqEVAQWBgauVbYaUxznSA%3D%3Dg%40mail.gmail.com
[7] https://www.postgresql.org/message-id/CANP8%2BjJoAitOY5uM6L8xYjPHK-RjJtmsZ5TZKGrH8m54U9sfzA%40mail.gmail.com
[7-more] https://www.postgresql.org/message-id/CANP8%2BjJBHDu%2BkgH2Jy34zzx9C5x61LF7JkWODPaRKuXMhD0vKw%40mail.gmail.com
[8] https://www.postgresql.org/message-id/CA%2BTgmobXWEqRUr0k1RS%3Dx4NaAbi%2B9R-0z%2Bpp%2BVEuLYKorjbbYg%40mail.gmail.com
[9] https://www.postgresql.org/message-id/CA%2BTgmoaBYa_WDSFhZA6sPYVnBWH_vwqpXcy3iA93oqaecj2p7A%40mail.gmail.com
[10] https://www.postgresql.org/message-id/CA%2BTgmoZfwTc8DcoNQ_V2%3DUbmY%3DsVuPg%2B%2BMF6eNaWDVjaBczwrQ%40mail.gmail.com

Regards,

--
Meftun Cincioglu

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: Linux likely() unlikely() for PostgreSQL
Next
From: Ashutosh Bapat
Date:
Subject: Re: A problem about partitionwise join