Re: Concurrent CTE - Mailing list pgsql-general

From Jeremy Finzel
Subject Re: Concurrent CTE
Date
Msg-id CAMa1XUip+YFz=mNGS5MPiH+1hhRY8iMK9f6qCyoeEWV3b-n7sA@mail.gmail.com
In response to Concurrent CTE  (Artur Formella <a.formella@tme3c.com>)
List pgsql-general

On Wed, Apr 4, 2018 at 3:20 AM Artur Formella <a.formella@tme3c.com> wrote:
Hello!
We have a lot of big CTEs (~40 statements, ~1000 lines) for very dynamic
OLTP content, with an average response time of 50-300 ms. Our setup has
96 threads (Intel Xeon Gold 6128), 256 GB RAM and 12 SSDs (3 tablespaces).
DB size < RAM.
Simplifying the problem:

WITH aa AS (
   SELECT * FROM table1
), bb AS (
   SELECT * FROM table2
), cc AS (
   SELECT * FROM table3
), dd AS (
   SELECT * FROM aa,bb
), ee AS (
   SELECT * FROM aa,bb,cc
), ff AS (
   SELECT * FROM ee,dd
), gg AS (
   SELECT * FROM table4
), hh AS (
   SELECT * FROM aa
)
SELECT * FROM gg,hh,ff; /* primary statement */

Execution now:
time-->
Thread1: aa | bb | cc | dd | ee | ff | gg | hh | primary

And the question: is it possible to get a more concurrent execution
plan to reduce the response time? For example:
Thread1: aa | dd | ff | primary
Thread2: bb | ee | gg
Thread3: cc | -- | hh

Table1, table2 and table3 are located on separate tablespaces and are
independent.
Partial results (aa,bb,cc,dd,ee) are quite big and slow (full text
search, arrays, custom collations, function scans...).

We are considering abandoning the CTE and rewriting this in RxJava, but
we are afraid of downloading partial results and sending them back with
WHERE IN(...).

Thanks!

Artur Formella

It is very difficult from your example to tell just what kind of data you are querying and why you are doing it this way. I will give it a try.

If you filter any of this data later, the CTEs fence off that optimization: a filter applied outside a CTE is not pushed down into it. Also, in your example it makes no sense to have CTE aa when you could just cross join table1 directly in all your other CTEs (and the same goes for bb and cc).
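
As a rough sketch of that second point, the simplified example could be written without the pass-through CTEs. This assumes table1 through table4 are the placeholder names from the quoted query and that aa, bb, cc, gg and hh really are plain SELECT * wrappers, as shown there. It is an illustration of the idea, not a tuned rewrite of the real 40-statement query.

```sql
-- Sketch only: table1..table4 are the placeholders from the quoted example.
-- aa, bb, cc, gg and hh are replaced by direct table references, so each
-- remaining CTE reads the base tables instead of another CTE's result.
WITH dd AS (
   SELECT * FROM table1, table2
), ee AS (
   SELECT * FROM table1, table2, table3
), ff AS (
   SELECT * FROM ee, dd
)
SELECT * FROM table4, table1, ff; /* primary statement */
```

With real join conditions and filters in place of the cross joins, the planner can then consider indexes on the base tables inside dd and ee.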

In my experience, you are not going to get a great query plan with that many CTEs. Are you using functions or prepared statements, or are you paying the price of planning this query every time?
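
For reference, a minimal sketch of a prepared statement in plain SQL. The statement name big_query, the join, and the id column are hypothetical and only for illustration; table1 and table2 are the placeholders from the quoted example.

```sql
-- Hypothetical names: big_query and the id column are assumptions.
-- Parse and analyze the statement once per session.
PREPARE big_query (integer) AS
   SELECT *
   FROM table1 t1
   JOIN table2 t2 ON t2.id = t1.id
   WHERE t1.id = $1;

-- Run it repeatedly with different parameter values.
EXECUTE big_query(42);
EXECUTE big_query(43);

-- Drop it when it is no longer needed (it also ends with the session).
DEALLOCATE big_query;
```

A prepared statement lives only in the session that created it, so this helps most when connections are reused for many executions.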

It is hard to tell, but your example leads me to question whether there are some serious issues in your db design. Where are your joins, and where are you leveraging indexes? Also, it is very easy to misuse arrays and function scans and make performance even worse.

Thanks,
Jeremy 
