Thread: CTE with JOIN of two tables is much faster than a regular query

Running PostgreSQL 9.5 on Windows.

 

The CTE mentioned below completes the query in 4.5 seconds while the regular query takes 66 seconds. I read from EXPLAIN ANALYSE that the regular query starts with a full table scan over “Doc” while the CTE joins the two tables first and applies the filter condition in the 2nd step.

I believe that some rows in “Doc” which are not referenced by “F” contain a large amount of data in the field “szText” and this will slow down the ILIKE operator.

 

What can I do to improve the performance of the regular query without using a CTE?

 

This is a much simplified extract from a larger application:

 

CREATE TABLE Doc (
  oID UUID NOT NULL PRIMARY KEY,
  uDocID UUID NOT NULL UNIQUE,
  szText TEXT
);

 

CREATE TABLE F (
  oID UUID NOT NULL PRIMARY KEY,
  uDocRef UUID,
  CONSTRAINT F_fkey1 FOREIGN KEY (uDocRef) REFERENCES Doc (uDocID)
);

 

-- just in case …
ALTER TABLE Doc ALTER uDocID SET STATISTICS 10000;
ALTER TABLE Doc ALTER szText SET STATISTICS 10000;
VACUUM ANALYSE Doc;

 

SELECT COUNT(*) FROM Doc;
=> 125946 records

 

ALTER TABLE F ALTER uDocRef SET STATISTICS 10000;
VACUUM ANALYSE F;

 

SELECT COUNT(*) FROM F;
=> 32605 records

 

Result with CTE:

 

EXPLAIN ANALYSE
WITH a AS (
  SELECT F.oID, Doc.szText
  FROM F
  JOIN Doc ON F.uDocRef = Doc.udocid
)
SELECT *
FROM a
WHERE szText ILIKE '%480GB%';

 

"CTE Scan on a  (cost=9463.42..10197.03 rows=52 width=48) (actual time=478.770..4551.613 rows=10 loops=1)"

"  Filter: (sztext ~~* '%480GB%'::text)"

"  Rows Removed by Filter: 32595"

"  CTE a"

"    ->  Hash Join  (cost=973.61..9463.42 rows=32605 width=359) (actual time=36.998..100.337 rows=32605 loops=1)"

"          Hash Cond: (doc.udocid = f.udocref)"

"          ->  Seq Scan on doc  (cost=0.00..7691.46 rows=125946 width=359) (actual time=0.008..18.269 rows=125946 loops=1)"

"          ->  Hash  (cost=566.05..566.05 rows=32605 width=32) (actual time=35.825..35.825 rows=32605 loops=1)"

"                Buckets: 32768  Batches: 1  Memory Usage: 2294kB"

"                ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual time=0.005..14.677 rows=32605 loops=1)"

"Planning time: 4.689 ms"

"Execution time: 4554.893 ms"

 

Result with regular query:

 

EXPLAIN ANALYSE
SELECT F.oID, Doc.szText
FROM F
JOIN Doc ON F.uDocRef = Doc.udocid
WHERE szText ILIKE '%480GB%';

 

"Hash Join  (cost=8006.56..8694.93 rows=5 width=359) (actual time=66500.415..66506.978 rows=10 loops=1)"

"  Hash Cond: (f.udocref = doc.udocid)"

"  ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual time=0.002..3.143 rows=32605 loops=1)"

"  ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual time=66500.023..66500.023 rows=16 loops=1)"

"        Buckets: 1024  Batches: 1  Memory Usage: 19kB"

"        ->  Seq Scan on doc  (cost=0.00..8006.32 rows=19 width=359) (actual time=8864.720..66499.991 rows=16 loops=1)"

"              Filter: (sztext ~~* '%480GB%'::text)"

"              Rows Removed by Filter: 125930"

"Planning time: 263.542 ms"

"Execution time: 66507.003 ms"

 

 

 

 

Re: CTE with JOIN of two tables is much faster than a regular query

From: Andreas Kretschmer

On 18.08.2018 at 11:36, kpi6288@gmail.com wrote:
> What can I do to improve the performance of the regular query without 
> using a CTE? 

try to rewrite it to a subselect:

select ... from ... join (select ... from ... where ...) x on ...


Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




> -----Original Message-----
> From: Andreas Kretschmer <andreas@a-kretschmer.de>
> Sent: Saturday, 18 August 2018 12:27

> On 18.08.2018 at 11:36, kpi6288@gmail.com wrote:
> > What can I do to improve the performance of the regular query without
> > using a CTE?
>
> try to rewrite it to a subselect:
>
> select ... from ... join (select ... from ... where ...) x on ...
>

Do you mean like this?

EXPLAIN ANALYSE
SELECT F.oID, D.szText
FROM F
JOIN (SELECT Doc.uDocID, Doc.szText FROM Doc WHERE szText ILIKE '%480GB%')
AS D ON D.uDocID = F.uDocRef;

Just as bad as my regular query:

"Hash Join  (cost=8006.56..8694.93 rows=5 width=359) (actual
time=66777.898..66784.630 rows=10 loops=1)"
"  Hash Cond: (f.udocref = doc.udocid)"
"  ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual
time=0.002..3.563 rows=32605 loops=1)"
"  ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual
time=66777.471..66777.471 rows=16 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 19kB"
"        ->  Seq Scan on doc  (cost=0.00..8006.32 rows=19 width=359) (actual
time=9013.317..66777.438 rows=16 loops=1)"
"              Filter: (sztext ~~* '%480GB%'::text)"
"              Rows Removed by Filter: 125930"
"Planning time: 236.354 ms"
"Execution time: 66784.651 ms"



Re: AW: CTE with JOIN of two tables is much faster than a regular query

From: Adrian Klaver
On 08/18/2018 04:08 AM, kpi6288@gmail.com wrote:
> 
> 
>> -----Original Message-----
>> From: Andreas Kretschmer <andreas@a-kretschmer.de>
>> Sent: Saturday, 18 August 2018 12:27
>
>> On 18.08.2018 at 11:36, kpi6288@gmail.com wrote:
>>> What can I do to improve the performance of the regular query without
>>> using a CTE?
>>
>> try to rewrite it to a subselect:
>>
>> select ... from ... join (select ... from ... where ...) x on ...
>>
> 
> Do you mean like this?
> 
> EXPLAIN ANALYSE
> SELECT F.oID, D.szText
> FROM F
> JOIN (SELECT Doc.uDocID, Doc.szText FROM Doc WHERE szText ILIKE '%480GB%')
> AS D ON D.uDocID = F.uDocRef;

To try to replicate what the CTE is doing I would try:

SELECT
    *
FROM
    Doc
JOIN
(SELECT
    uDocRef, F.oID, Doc.szText
FROM
    F
JOIN
    Doc
ON
    F.uDocRef = Doc.udocid
) AS D
ON
    D.uDocRef = Doc.udocid
WHERE
    D.szText ILIKE '%480GB%'



> 
> Just as bad as my regular query:
> 
> "Hash Join  (cost=8006.56..8694.93 rows=5 width=359) (actual
> time=66777.898..66784.630 rows=10 loops=1)"
> "  Hash Cond: (f.udocref = doc.udocid)"
> "  ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual
> time=0.002..3.563 rows=32605 loops=1)"
> "  ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual
> time=66777.471..66777.471 rows=16 loops=1)"
> "        Buckets: 1024  Batches: 1  Memory Usage: 19kB"
> "        ->  Seq Scan on doc  (cost=0.00..8006.32 rows=19 width=359) (actual
> time=9013.317..66777.438 rows=16 loops=1)"
> "              Filter: (sztext ~~* '%480GB%'::text)"
> "              Rows Removed by Filter: 125930"
> "Planning time: 236.354 ms"
> "Execution time: 66784.651 ms"
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


> -----Original Message-----
> From: Adrian Klaver <adrian.klaver@aklaver.com>
> Sent: Saturday, 18 August 2018 16:24
>
> To try to replicate what the CTE is doing I would try:
> SELECT  *
> FROM    Doc
> JOIN  (SELECT    uDocRef, F.oID, Doc.szText
> FROM F JOIN    Doc ON    F.uDocRef = Doc.udocid) AS D
> ON D.uDocRef = Doc.udocid
> WHERE D.szText ILIKE '%480GB%'

No difference - still starting with the full scan on Doc and lasting 67 seconds:

"Nested Loop  (cost=8006.98..8700.40 rows=5 width=750) (actual time=66845.857..66852.705 rows=10 loops=1)"
"  ->  Hash Join  (cost=8006.56..8694.93 rows=5 width=391) (actual time=66845.838..66852.613 rows=10 loops=1)"
"        Hash Cond: (f.udocref = doc_1.udocid)"
"        ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual time=0.002..3.428 rows=32605 loops=1)"
"        ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual time=66845.431..66845.431 rows=16 loops=1)"
"              Buckets: 1024  Batches: 1  Memory Usage: 19kB"
"              ->  Seq Scan on doc doc_1  (cost=0.00..8006.32 rows=19 width=359) (actual time=9042.984..66845.398
rows=16loops=1)" 
"                    Filter: (sztext ~~* '%480GB%'::text)"
"                    Rows Removed by Filter: 125930"
"  ->  Index Scan using doc_udocid_key on doc  (cost=0.42..1.08 rows=1 width=375) (actual time=0.008..0.008 rows=1
loops=10)"
"        Index Cond: (udocid = f.udocref)"
"Planning time: 252.162 ms"
"Execution time: 66852.737 ms"




Re: CTE with JOIN of two tables is much faster than a regular query

From: Stephen Frost
Greetings,

* kpi6288@gmail.com (kpi6288@gmail.com) wrote:
> The CTE mentioned below completes the query in 4.5 seconds while the regular
> query takes 66 seconds. I read from EXPLAIN ANALYSE that the regular query
> starts with a full table scan over "Doc" while the CTE joins the two tables
> first and applies the filter condition in the 2nd step.
>
> I believe that some rows in "Doc" which are not referenced by "F" contain a
> large amount of data in the field "szText" and this will slow down the ILIKE
> operator.

Yup, that appears to be what's happening.

> What can I do to improve the performance of the regular query without using
> a CTE?

You could possibly build a trigram index on the field you're searching,
which could avoid the full table scan.  Of course, that index could be
quite large, so there's downsides to that.  If these are words you're
looking for then you could use PG's full text indexing to build indexes
on the words and then use that instead.  If you are fine working with
words but are concerned about misspellings then you can extract out the
distinct words, build a trigram index on those, find the most similar
words based on the input and then search for those words using the FTI.
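
For reference, a minimal sketch of the two indexing approaches described above, using the table from this thread (the
index names are invented here, and the pg_trgm extension must be available):

CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- trigram index: can serve unanchored searches such as ILIKE '%480GB%'
CREATE INDEX doc_sztext_trgm_idx ON Doc USING gin (szText gin_trgm_ops);

-- full-text alternative, if whole words are being searched; the query must
-- then use the same expression, e.g.
--   WHERE to_tsvector('english', szText) @@ plainto_tsquery('english', '480GB')
CREATE INDEX doc_sztext_fts_idx ON Doc USING gin (to_tsvector('english', szText));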

Unfortunately, we don't currently pay attention to things like average
string length when considering the cost of performing an 'ilike', so we
figure that doing the filtering first and then the join will be faster,
but that obviously falls over in some cases, like this one.  Using the
CTE forces PG to (today, at least) do the join first, but that isn't
really good to rely on.

Thanks!

Stephen

> -----Original Message-----
> From: Stephen Frost <sfrost@snowman.net>
> Sent: Saturday, 18 August 2018 16:39

Hello,

>
> > What can I do to improve the performance of the regular query without
> > using a CTE?
>
> You could possibly build a trigram index on the field you're searching,
> which could avoid the full table scan.  Of course, that index could be
> quite large, so there's downsides to that.  If these are words you're
> looking for then you could use PG's full text indexing to build indexes
> on the words and then use that instead.  If you are fine working with
> words but are concerned about misspellings then you can extract out the
> distinct words, build a trigram index on those, find the most similar
> words based on the input and then search for those words using the FTI.
>
> Unfortunately, we don't currently pay attention to things like average
> string length when considering the cost of performing an 'ilike', so we
> figure that doing the filtering first and then the join will be faster,
> but that obviously falls over in some cases, like this one.  Using the
> CTE forces PG to (today, at least) do the join first, but that isn't
> really good to rely on.

A trigram index might help in this particular scenario, but its size and
the cost of updating it in other parts of the application would probably
create other issues. I may try it, though.

But thanks for confirming my assumption. I just thought that it should be
obvious to the optimizer to do the join first and filter on that result.
From your post I gather that there is nothing I can do to modify the
behavior of the optimizer. Or is there a way to specify the cost for an
operator (ILIKE in this case) on a specific column?

Thanks
Klaus



Re: CTE with JOIN of two tables is much faster than a regular query

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> * kpi6288@gmail.com (kpi6288@gmail.com) wrote:
>> The CTE mentioned below completes the query in 4.5 seconds while the regular
>> query takes 66 seconds.

> Unfortunately, we don't currently pay attention to things like average
> string length when considering the cost of performing an 'ilike', so we
> figure that doing the filtering first and then the join will be faster,
> but that obviously falls over in some cases, like this one.  Using the
> CTE forces PG to (today, at least) do the join first, but that isn't
> really good to rely on.

Well, it's simpler than that: filter quals are always evaluated at
the lowest possible plan level.  One of the Berkeley PhD theses that
we ripped out ages ago tried to be smarter about that, but the
cost/benefit/complexity ratio just wasn't very good, mainly because
it's so darn hard to estimate the selectivity of quals on subsets
of relations.

It's not very apparent why the results are so bad in this case,
either.  One of the plans has the ILIKE being applied to circa 32600
rows, and the other one runs it on circa 126000 rows.  That should
produce less than a 4x penalty, not 14x.  Do the rows removed by
the join have significantly-longer-on-average sztext fields?
(If so, the odds that the planner would ever recognize such a
correlation seem pretty small.)
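
One way to check that, as a sketch against the schema posted at the top of the thread (this query is not from the
thread itself):

-- compare szText lengths of rows that are / are not referenced by F
SELECT EXISTS (SELECT 1 FROM F WHERE F.uDocRef = Doc.uDocID) AS referenced,
       COUNT(*) AS docs,
       AVG(length(szText))::int AS avg_sztext_len
FROM Doc
GROUP BY 1;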

In any case, given that the ILIKE selects so few rows (and the planner
knows it!), finding a way to index that is clearly the right answer.

            regards, tom lane


> -----Original Message-----
> From: Tom Lane <tgl@sss.pgh.pa.us>
> Sent: Saturday, 18 August 2018 17:29
>
> Well, it's simpler than that: filter quals are always evaluated at the
> lowest possible plan level.

Thank you. This "always" was not clear to me, but it explains a few similar
cases (with not-so-extreme differences) that I could not understand.

Regards
Klaus



Re: CTE with JOIN of two tables is much faster than a regular query

From: Ravi Krishna
> What can I do to improve the performance of the regular query without using a CTE? 

Why do you care?  When I find that I can write an SQL query 3 different ways,
I will go for the most efficient one.  So why not accept the CTE version of
this query?  Just curious.



> -----Original Message-----
> From: Ravi Krishna <sravikrishna@aol.com>
> Sent: Saturday, 18 August 2018 18:25
>
> > What can I do to improve the performance of the regular query without
> using a CTE?
>
> Why do you care?  When I find that I can write an SQL query 3 different ways,
> I will go for the most efficient one.  So why not accept the CTE version of
> this query?  Just curious.

We're using object mapping / entity frameworks (e.g. XPO, Entity Framework Core). These frameworks support regular
queries out-of-the-box; CTEs require additional effort and are more difficult to maintain.

Regards
Klaus



Re: AW: CTE with JOIN of two tables is much faster than a regular query

From: Tim Cross
kpi6288@gmail.com writes:

>> -----Original Message-----
>> From: Ravi Krishna <sravikrishna@aol.com>
>> Sent: Saturday, 18 August 2018 18:25
>>
>> > What can I do to improve the performance of the regular query without
>> using a CTE?
>>
>> Why do you care?  When I find that I can write an SQL query 3 different ways,
>> I will go for the most efficient one.  So why not accept the CTE version of
>> this query?  Just curious.
>
> We're using object mapping / entity frameworks (e.g. XPO, Entity Framework Core). These frameworks support regular
> queries out-of-the-box; CTEs require additional effort and are more difficult to maintain.
>

Ah, another reason to avoid object mapping/entity frameworks! I guess
really the same reason - loss of flexibility and expressive power.

Sorry, having a similar battle with some developers who are insisting on
using a particular framework because it makes maintenance easier as it
'automates' creation of controllers (MVC). However, they are frustrated
by performance and I'm frustrated as the framework also fails to pass
additional information, such as PGAPPNAME, which would make some
analysis easier. Part of the reason for the performance issues is
because the developers are doing things with result sets within the
client that would be far more efficient performed within the database.

One way I have resolved this in the past is to create database
procedures which present a 'mapped' view back to the framework layer
and hide the SQL from the framework. It works well, the main downside
being that you now have SQL in a different (another) place, which can
make some people uncomfortable and can be a maintenance issue if all
your developers are just front-end devs who treat a database as just a
key/value repository.
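
As a rough illustration of that approach, using the tables from this thread (the function name and signature are
invented, and the body reuses the join-first CTE shown earlier in the thread):

-- hypothetical wrapper the framework can call like a plain query
CREATE FUNCTION find_docs(pattern text)
RETURNS TABLE (f_oid uuid, sztext text) AS $$
    WITH a AS (
        SELECT F.oID, Doc.szText
        FROM F
        JOIN Doc ON F.uDocRef = Doc.uDocID
    )
    SELECT a.oID, a.szText FROM a WHERE a.szText ILIKE pattern;
$$ LANGUAGE sql STABLE;

-- usage: SELECT * FROM find_docs('%480GB%');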

Tim
--
Tim Cross


> -----Original Message-----
> From: Tom Lane <tgl@sss.pgh.pa.us>
> Sent: Saturday, 18 August 2018 17:29
>
> In any case, given that the ILIKE selects so few rows (and the planner
> knows it!), finding a way to index that is clearly the right answer.

A trigram index took 9 minutes to build but improved the regular query from
67 seconds down to 500 ms.

Although this is an impressive improvement, I'm afraid that the index might
create delays in other parts of the application (INSERT / UPDATE). We will
probably rework the design of this particular table.

Thanks to everyone who helped me in this matter.

Regards

Klaus



> -----Original Message-----
> From: Tim Cross <theophilusx@gmail.com>
> Sent: Sunday, 19 August 2018 04:57
> >
> > We're using object mapping / entity frameworks (e.g. XPO, Entity
> > Framework Core). These frameworks support regular queries out-of-the-box;
> > CTEs require additional effort and are more difficult to maintain.
> >
>
> Ah, another reason to avoid object mapping/entity frameworks! I guess
> really the same reason - loss of flexibility and expressive power.

While I agree that you lose control over certain details, we are overall quite happy using the frameworks.

The frameworks nowadays provide the ability to call procedures if required - but using the objects directly is more
convenient for the developers. SQL procedures add (just like a CTE) an additional layer to the application design which
needs maintenance. That's fine if it really helps overall but we try to avoid it if it isn't needed.

Regards
Klaus




(was: CTE with JOIN of two tables is much faster than a regular query)

From: Albrecht Dreß
On 18.08.18 at 11:36, kpi6288@gmail.com wrote:
[snip]
> What can I do to improve the performance of the regular query without using a CTE?

Sorry for jumping into this discussion late – I'm facing similar problems with Postgres choosing strange and
inefficient query plans for no (for me) apparent reason.  I use the DEB packages postgresql-10, version
10.5-1.pgdg90+1, on a Debian stretch box.

The relevant part of the database structure is:

--8<-----------------------------------------------------------------------------------------------
mydb=> \d strings
                             Table "public.strings"
  Column |  Type  | Collation | Nullable |               Default
--------+--------+-----------+----------+--------------------------------------
  iid    | bigint |           | not null |
  sid    | bigint |           | not null | nextval('strings_sid_seq'::regclass)
  stype  | text   |           |          |
  string | text   |           |          |
Indexes:
     "strings_pkey" PRIMARY KEY, btree (iid, sid)
     "idx_strings_string_gin" gin (string gin_trgm_ops)
     "idx_stype" btree (stype)
Foreign-key constraints:
     "strings_iid_fkey" FOREIGN KEY (iid) REFERENCES items(iid) ON DELETE CASCADE

mydb=> \d items
                                    Table "public.items"
     Column     |     Type      | Collation | Nullable |              Default
---------------+---------------+-----------+----------+------------------------------------
  dbid          | bigint        |           | not null |
  iid           | bigint        |           | not null | nextval('items_iid_seq'::regclass)
  riid          | integer       |           |          |
[…more columns…]
Indexes:
     "items_pkey" PRIMARY KEY, btree (iid)
     "idx_items_riid" btree (riid)
     "items_dbid" btree (dbid)
     […more indexes…]
Referenced by:
     TABLE "strings" CONSTRAINT "strings_iid_fkey" FOREIGN KEY (iid) REFERENCES items(iid) ON DELETE CASCADE
     […more references…]
--8<-----------------------------------------------------------------------------------------------

The table “strings” contains about 2 * 10e7 active rows, “items” about 10e8.

The “instability” occurs with the following somewhat trivial query.  In the correct (IMO) case, the indexes are used:

--8<-----------------------------------------------------------------------------------------------
mydb=> EXPLAIN ANALYZE SELECT items.iid, stype, string, riid FROM items LEFT JOIN strings USING(iid) WHERE stype ~ E'^tag\\..*(?<\!\\.\\d+)$' AND dbid = 7416000;
                                                             QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=1.13..522716.95 rows=8 width=133) (actual time=0.078..0.715 rows=16 loops=1)
    ->  Index Scan using items_dbid on items  (cost=0.57..1377.96 rows=773 width=12) (actual time=0.021..0.038 rows=19 loops=1)
          Index Cond: (dbid = 7416000)
    ->  Index Scan using strings_pkey on strings  (cost=0.56..674.18 rows=26 width=129) (actual time=0.030..0.035 rows=1 loops=19)
          Index Cond: (iid = items.iid)
          Filter: (stype ~ '^tag\..*(?<!\.\d+)$'::text)
          Rows Removed by Filter: 3
  Planning time: 1.685 ms
  Execution time: 0.762 ms
(9 rows)
--8<-----------------------------------------------------------------------------------------------

However, seemingly at random, Postgres chooses the following plan which is (planning plus execution) ~1500 times slower:

--8<-----------------------------------------------------------------------------------------------
mydb=> EXPLAIN ANALYZE SELECT items.iid, stype, string, riid FROM items LEFT JOIN strings USING(iid) WHERE stype ~ E'^tag\\..*(?<\!\\.\\d+)$' AND dbid = 7416000;
                                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
  Gather  (cost=84945.47..522033.97 rows=9 width=133) (actual time=1401.570..3868.239 rows=16 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Hash Join  (cost=83945.47..521033.07 rows=4 width=133) (actual time=2206.088..3823.982 rows=5 loops=3)
          Hash Cond: (strings.iid = items.iid)
          ->  Parallel Bitmap Heap Scan on strings  (cost=82539.52..518233.10 rows=531057 width=129) (actual time=390.479..3795.902 rows=401149 loops=3)
                Filter: (stype ~ '^tag\..*(?<!\.\d+)$'::text)
                Rows Removed by Filter: 384802
                Heap Blocks: exact=76067
                ->  Bitmap Index Scan on idx_stype  (cost=0.00..82220.88 rows=2334832 width=0) (actual time=340.725..340.725 rows=2357863 loops=1)
                      Index Cond: ((stype >= 'tag.'::text) AND (stype < 'tag/'::text))
          ->  Hash  (cost=1395.77..1395.77 rows=814 width=12) (actual time=0.137..0.137 rows=19 loops=3)
                Buckets: 1024  Batches: 1  Memory Usage: 9kB
                ->  Index Scan using items_dbid on items  (cost=0.57..1395.77 rows=814 width=12) (actual time=0.072..0.126 rows=19 loops=3)
                      Index Cond: (dbid = 7416000)
  Planning time: 2.617 ms
  Execution time: 3868.303 ms
(17 rows)
--8<-----------------------------------------------------------------------------------------------

It looks as if the selection of the plan is more or less random, and does /not/ depend on the statistics state.  I.e.
running “vacuum analyze strings; vacuum analyze items;” immediately before the query does /not/ result in reproducible
behaviour (a /very/ small number of entries may have been added or deleted between the calls in both tables, though).

My solution for a stable (but slower than the query utilising the indexes) response time is also using a CTE, along
the lines of the sketch below.  However, it would be helpful to fix (or at least understand) the behaviour.
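
A possible shape for that CTE workaround, reconstructed from the query above rather than taken from the actual
application (in PostgreSQL 10 a CTE is an optimization fence, so the dbid filter is pinned to run first):

WITH i AS (
    SELECT iid, riid FROM items WHERE dbid = 7416000
)
SELECT i.iid, stype, string, i.riid
FROM i
LEFT JOIN strings USING (iid)
WHERE stype ~ E'^tag\\..*(?<\!\\.\\d+)$';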

Best,
Albrecht.