Re: Parallel queries in single transaction - Mailing list pgsql-hackers

From Paul Muntyanu
Subject Re: Parallel queries in single transaction
Date
Msg-id CACnYr+geQM1RmGArOrt5bimO6qNVk73zEjAhba7QV72Ez38dRQ@mail.gmail.com
In response to Re: Parallel queries in single transaction  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers

> Well, sure. But you could just as well open multiple connections and
> make the queries concurrent that way. Or change the GUC to increase the
> number of workers for the nightly ETL.



This is an option right now for keeping permanent staging tables for a later join. I mistakenly said ETL when I meant ELT, which means that most of the operations happen in the database, so we try to keep all changes in database code instead of changing the execution engine. In PG11 we have parallel CTAS, which is a dramatic improvement for us, but there will still be operations (query plans) which are not parallel.

Having PostgreSQL be fully ACID is an amazing feature. When we have to run an ELT operation outside a single transaction, we must guarantee that the job completed successfully by checking that all steps (multiple transactions with staging tables) succeeded, with graceful rollback and cleanup in case of failure, and that makes things more complex. Still, I agree that it is possible to work around this at the application level.
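
A minimal sketch of that application-level workaround, assuming each step is a plain Python callable that opens its own connection, runs one query (for example a CTAS into a staging table) and commits. The names `run_steps_concurrently`, `steps` and `cleanup` are hypothetical and only illustrate the idea; no particular database driver is implied.

```python
from concurrent.futures import ThreadPoolExecutor


def run_steps_concurrently(steps, cleanup):
    """Run each step in its own thread; call cleanup() if any step fails.

    steps   -- dict mapping a step name to a zero-argument callable. In real
               use each callable would open its own connection, run one
               query and commit its own transaction (an assumption here).
    cleanup -- zero-argument callable that removes whatever staging tables
               were created; called once if at least one step raised.
    Returns a dict of step name -> result when all steps succeed.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max(len(steps), 1)) as pool:
        futures = {name: pool.submit(fn) for name, fn in steps.items()}
        # Wait for every step, even after a failure, so that cleanup never
        # races with a step that is still writing.
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                errors[name] = exc
    if errors:
        cleanup()
        raise RuntimeError("failed steps: " + ", ".join(sorted(errors)))
    return results
```

Because every step commits independently, this only approximates a single transaction: the cleanup callable has to undo the steps that already committed, which is exactly the extra complexity described above.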
-P



On Mon, Jul 16, 2018 at 2:28 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


On 07/16/2018 12:03 PM, Paul Muntyanu wrote:
> Hi Tomas, thanks for looking into this. I am talking more about queries which
> cannot be optimized, e.g.
> * a full scan of one table and heavy calculations for another.
> * both queries going through an FDW (e.g. one query fetches data from
> Kafka and the other fetches from a remote Postgres; neither query is
> bound by anything except local CPU, network and the remote machine)
>
> IO bound is not a problem if you have multiple tablespaces.

But it was you who mentioned "query stuck" not me. I merely pointed out
that in such cases running queries concurrently won't help.

> And CPU bound may not be the case when you have 32 cores and at most 6 workers
> per query. During the nightly ETL I have nothing running except a single
> query, so only 6 cores are occupied. If I could run queries in
> parallel, I would occupy two IO stacks (two tablespaces) and 12 cores
> instead of 6 cores and then 6 again, sequentially.
>

Well, sure. But you could just as well open multiple connections and
make the queries concurrent that way. Or change the GUC to increase the
number of workers for the nightly ETL.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
