Thread: How best to do parallel query given tens of thousands of iteration of a loop of recursive queries?

How best to do parallel query given tens of thousands of iteration of a loop of recursive queries?

From

Shaozhong SHI

Date:

10 April 2022, 16:50:32

There is a plpgsql script that has a loop to carry out the same recursive queries.

The estimated number of iterations is on the order of tens of thousands.

What is the best way of making use of a parallel query strategy?

Any examples?

Regards,

David

Re: How best to do parallel query given tens of thousands of iteration of a loop of recursive queries?

From

Steve Midgley

Date:

10 April 2022, 19:03:04

On Sun, Apr 10, 2022, 6:50 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:

> There is a plpgsql script that has a loop to carry out the same recursive queries.

> The estimated number of iterations is on the order of tens of thousands.

> What is the best way of making use of a parallel query strategy?

> Any examples?

Can you provide some sample ddl, data and sql to illustrate your question?

You mention "iteration" and "parallel" in one place, but recursion in another. Recursion (by definition?) involves pushing state data forward into each successive iteration. Therefore you can't run the steps of a recursion in parallel.

If you can provide examples, maybe we can see whether parallelism is a possibility.
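To make the point about recursion carrying state forward concrete, here is a minimal sketch. It is not the original poster's query. The table demo_tree and its contents are hypothetical and exist only for illustration.

```sql
-- Hypothetical table for illustration only; not taken from the original post.
CREATE TABLE demo_tree (
    parent_id integer NOT NULL,
    child_id  integer NOT NULL
);

INSERT INTO demo_tree (parent_id, child_id) VALUES
    (1, 2), (2, 3), (3, 4),   -- chain reachable from node 1
    (10, 11), (11, 12);       -- separate chain reachable from node 10

-- Walk downward from node 1. Each pass of the recursive term joins
-- demo_tree against the rows produced by the previous pass, so the
-- passes of this one traversal have to run one after another.
WITH RECURSIVE descendants AS (
    SELECT child_id, 1 AS depth
    FROM demo_tree
    WHERE parent_id = 1
  UNION ALL
    SELECT t.child_id, d.depth + 1
    FROM demo_tree AS t
    JOIN descendants AS d ON t.parent_id = d.child_id
)
SELECT child_id, depth
FROM descendants
ORDER BY depth;
```

With this sample data the traversal from node 1 never reads the rows under node 10. Within one traversal the passes are sequential, but traversals that start from different rows do not depend on each other. Whether that holds for the real script can only be judged from the actual DDL and queries.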

Steve

Re: How best to do parallel query given tens of thousands of iteration of a loop of recursive queries?

From

"David G. Johnston"

Date:

10 April 2022, 19:13:14

On Sun, Apr 10, 2022 at 6:50 AM Shaozhong SHI <shishaozhong@gmail.com> wrote:

> There is a plpgsql script that has a loop to carry out the same recursive queries.

> The estimated number of iterations is on the order of tens of thousands.

> What is the best way of making use of a parallel query strategy?

An example would be helpful.

However, as a general guideline: parallelism is done at the per-row scope. Remove the looping logic from the script and turn the main script logic into one or more functions that each operate on a single row, obeying the rules such functions must follow in order to be marked parallel safe. That opens up the possibility for the server to process different rows using different workers and then append their results together for the next node to consume.
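A minimal sketch of that guideline, under stated assumptions. The tables demo_links and demo_starts and the function count_descendants are hypothetical names invented for illustration, and the recursive query stands in for whatever the real loop body does. The function only reads a regular table, which is what makes the PARALLEL SAFE label plausible here; a function that writes data or uses temporary tables would not qualify.

```sql
-- Hypothetical schema for illustration only; not taken from the original post.
CREATE TABLE demo_links (
    parent_id integer NOT NULL,
    child_id  integer NOT NULL
);

-- One row per former loop iteration (the starting points).
CREATE TABLE demo_starts (
    node_id integer PRIMARY KEY
);

INSERT INTO demo_links (parent_id, child_id) VALUES
    (1, 2), (2, 3), (3, 4),
    (10, 11), (11, 12);

INSERT INTO demo_starts (node_id) VALUES (1), (10);

-- Per-row function: takes one starting node and runs the recursive
-- query for it. It is read-only, so it is declared STABLE and
-- PARALLEL SAFE (an assumption that must be re-checked for real logic).
CREATE FUNCTION count_descendants(start_id integer)
RETURNS bigint
LANGUAGE sql
STABLE
PARALLEL SAFE
AS $$
    WITH RECURSIVE descendants AS (
        SELECT child_id
        FROM demo_links
        WHERE parent_id = start_id
      UNION ALL
        SELECT l.child_id
        FROM demo_links AS l
        JOIN descendants AS d ON l.parent_id = d.child_id
    )
    SELECT count(*) FROM descendants;
$$;

-- A single set-based statement replaces the loop: the function is
-- called once per row of demo_starts, and the planner is free to
-- spread those rows across workers if it judges that worthwhile.
SELECT s.node_id,
       count_descendants(s.node_id) AS n_descendants
FROM demo_starts AS s
ORDER BY s.node_id;
```

Marking the function parallel safe only permits a parallel plan; it does not force one. Whether workers are actually used depends on table size and on settings such as max_parallel_workers_per_gather, and EXPLAIN on the final SELECT shows a Gather node when they are.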

David J.