Thread: parallel-processing multiple similar query tasks - any example?

parallel-processing multiple similar query tasks - any example?

From
Shaozhong SHI
Date:



The multiple similar query tasks are as follows:

select * from a_table where country = 'UK';
select * from a_table where country = 'France';
and so on.

How best to process multiple similar query tasks like these in parallel?

Is there any example available?

Regards,

David

Re: parallel-processing multiple similar query tasks - any example?

From
Steve Midgley
Date:


On Wed, Apr 27, 2022 at 4:34 PM Shaozhong SHI <shishaozhong@gmail.com> wrote:



The multiple similar query tasks are as follows:

select * from a_table where country = 'UK';
select * from a_table where country = 'France';
and so on.

How best to process multiple similar query tasks like these in parallel?


This depends on how you consume the results when the queries return. Let's assume you are running them from a programming environment with an ORM layer. In that case you can run each query in a separate thread, each with its own database connection, and the queries will run concurrently just fine. If you are running at the command line using psql, you can just open multiple shells and run each query from a different terminal.

Postgres is very good at handling concurrent queries on separate connections, so your challenge is really figuring out how you will use the results of each query and setting up the client environment so that it sends the queries asynchronously.

Steve
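
A minimal sketch of the thread-per-query approach described above, offered as an illustration rather than a definitive implementation. To stay runnable without a database server it uses Python's built-in sqlite3 module as a stand-in for Postgres; the demo table contents, the COUNTRIES list, and the function names are all hypothetical. Against Postgres you would presumably swap sqlite3.connect for your driver's connect call (for example psycopg2.connect, which uses %s placeholders instead of ?).

```python
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

COUNTRIES = ["UK", "France", "Germany"]  # hypothetical filter values


def make_demo_db(path):
    """Create a tiny stand-in for a_table so the sketch runs without a server."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE a_table (id INTEGER, country TEXT)")
        conn.executemany(
            "INSERT INTO a_table VALUES (?, ?)",
            [(1, "UK"), (2, "France"), (3, "UK"), (4, "Germany")],
        )
    conn.close()


def fetch_country(db_path, country):
    """Run one filtered query on its own connection and return its rows."""
    # One connection per task: a single connection runs one query at a time.
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "SELECT * FROM a_table WHERE country = ? ORDER BY id", (country,)
        )
        return country, cur.fetchall()
    finally:
        conn.close()


def fetch_all(db_path, countries, max_workers=4):
    """Submit one query per country to a thread pool and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(lambda c: fetch_country(db_path, c), countries))


with tempfile.TemporaryDirectory() as tmp:
    db_path = str(Path(tmp) / "demo.db")
    make_demo_db(db_path)
    results = fetch_all(db_path, COUNTRIES)

for country, rows in results.items():
    print(country, rows)
```

Passing the country as a bound parameter rather than pasting it into the SQL text keeps the queries identical in shape, and max_workers caps how many connections are open at once.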

Re: parallel-processing multiple similar query tasks - any example?

From
Shaozhong SHI
Date:


On Thu, 28 Apr 2022 at 18:15, Steve Midgley <science@misuse.org> wrote:



Hi, Steve,

That is very useful.

All we want to do is process a large amount of data.

I found that loops of recursive queries are very time-consuming and will not finish in time.

Measures like indexing are simply not adequate to address the problem.

I am thinking of making use of Linux's ability to fire off concurrent processes.

So long as it is efficient, we can always work out how to get it to return the results.

Regards,

David 
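
One way the idea of firing off concurrent Linux processes might look, as a rough sketch under stated assumptions: each query is handed to its own psql process, and the output goes to one file per country. The helper names, the database name, and the output directory are hypothetical; the psql options used (-X, -d, -c, -o) are standard ones. The quoting helper assumes standard_conforming_strings is on, which is the default in current PostgreSQL releases.

```python
import subprocess
from pathlib import Path


def sql_literal(value):
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_psql_command(dbname, country, out_dir):
    """Build the argument list for one psql process that writes rows to a file."""
    query = f"SELECT * FROM a_table WHERE country = {sql_literal(country)}"
    out_file = Path(out_dir) / f"{country}.txt"  # hypothetical naming scheme
    # -X skips ~/.psqlrc, -d names the database,
    # -c runs a single command, -o sends query output to a file.
    return ["psql", "-X", "-d", dbname, "-c", query, "-o", str(out_file)]


def run_concurrently(commands):
    """Start every command as its own OS process, then wait for all of them."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Exit codes are returned in the same order as the commands.
    return [proc.wait() for proc in procs]
```

A call such as run_concurrently([build_psql_command("mydb", c, "out") for c in ["UK", "France"]]) would start both queries at once, assuming a database named mydb and an existing out directory. Each psql process opens its own connection, so the number of simultaneous processes has to stay within the server's max_connections setting.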

Re: parallel-processing multiple similar query tasks - any example?

From
Erik Brandsberg
Date:
None of this discussion is really specific to Postgres.

On Thu, Apr 28, 2022 at 1:46 PM Shaozhong SHI <shishaozhong@gmail.com> wrote:

