PG planning randomly ? - Mailing list pgsql-performance

From Laurent Raufaste
Subject PG planning randomly ?
Date
Msg-id 669dc9710802260719h61d846b3hd50b7f547bcb82c3@mail.gmail.com
Whole thread Raw
Responses Re: PG planning randomly ?
Re: PG planning randomly ?
List pgsql-performance
Hi,

I'm having some issues with this simple query:

SELECT
  _comment.*,
  _article.title AS article_title,
  _article.reference AS article_reference
FROM
  _comment
  INNER JOIN _article
    ON _article.id = _comment.parent_id
WHERE
  _comment.path <@ '0.1.3557034'
ORDER BY
  _comment.date_publishing DESC
OFFSET 0
LIMIT 5
;

The varying information here is the ltree path "0.1.3557034"

Usually it's quite fast (<1s) but sometimes after an ANALYZE on the
_comment table it gets so slow it's killing our servers. And it's
really random.
We run our servers with default_statistics_target=100, I tried setting
it up to 1000 (max) but it does not change this wrong behavior.

I executed the same query on our 11 servers, 3 of them executed the
query slowly after the ANALYZE. Sometimes it happens to more,
sometimes to less.
Here is the EXPLAIN ANALYZE data on those 3 servers before and after
the ANALYZE execution.

===== Server 1 =( ======

===== The Query on server 1 before an ANALYZE =====
Limit  (cost=16286.04..16286.06 rows=5 width=567) (actual
time=62.521..62.526 rows=5 loops=1)
   ->  Sort  (cost=16286.04..16289.89 rows=1539 width=567) (actual
time=62.519..62.520 rows=5 loops=1)
         Sort Key: _comment.date_publishing
         ->  Nested Loop  (cost=0.00..16204.57 rows=1539 width=567)
(actual time=2.063..44.517 rows=3606 loops=1)
               ->  Index Scan using gist_idx_comment_path on _comment
(cost=0.00..4736.73 rows=1539 width=534) (actual time=2.038..20.487
rows=3748 loops=1)
                     Index Cond: (path <@ '0.1.14666029'::ltree)
               ->  Index Scan using _article_pkey on _article
(cost=0.00..7.44 rows=1 width=41) (actual time=0.004..0.004 rows=1
loops=3748)
                     Index Cond: (_article.id = _comment.parent_id)
Total runtime: 64.844 ms

===== The Query on server 1 after an ANALYZE =====
 Limit  (cost=0.00..11082.13 rows=5 width=569) (actual
time=313945.051..693805.921 rows=5 loops=1)
   ->  Nested Loop  (cost=0.00..34057601.77 rows=15366 width=569)
(actual time=313945.049..693805.912 rows=5 loops=1)
         ->  Index Scan Backward using idx_comment_date_publishing on
_comment  (cost=0.00..33949736.04 rows=15366 width=536) (actual
time=313923.129..693755.772 rows=5 loops=1)
               Filter: (path <@ '0.1.14666029'::ltree)
         ->  Index Scan using _article_pkey on _article
(cost=0.00..7.01 rows=1 width=41) (actual time=10.016..10.018 rows=1
loops=5)
               Index Cond: (_article.id = _comment.parent_id)
Total runtime: 693806.044 ms

===== Poor Server 2 ='( ======

 ===== The Query on server 2 before an ANALYZE =====
  Limit  (cost=21096.49..21096.51 rows=5 width=586) (actual
time=34.184..34.187 rows=5 loops=1)
   ->  Sort  (cost=21096.49..21100.33 rows=1535 width=586) (actual
time=34.182..34.184 rows=5 loops=1)
         Sort Key: _comment.date_publishing
         ->  Nested Loop  (cost=0.00..21015.26 rows=1535 width=586)
(actual time=0.119..25.232 rows=3606 loops=1)
               ->  Index Scan using gist_idx_comment_path on _comment
(cost=0.00..6325.53 rows=1535 width=553) (actual time=0.100..11.066
rows=3748 loops=1)
                     Index Cond: (path <@ '0.1.14666029'::ltree)
               ->  Index Scan using _article_pkey on _article
(cost=0.00..9.56 rows=1 width=41) (actual time=0.002..0.003 rows=1
loops=3748)
                     Index Cond: (_article.id = _comment.parent_id)
Total runtime: 34.658 ms

===== The Query on server 2 after an ANALYZE =====
   Limit  (cost=0.00..18806.13 rows=5 width=585) (actual
time=363344.748..575823.722 rows=5 loops=1)
   ->  Nested Loop  (cost=0.00..57764897.33 rows=15358 width=585)
(actual time=363344.747..575823.715 rows=5 loops=1)
         ->  Index Scan Backward using idx_comment_date_publishing on
_comment  (cost=0.00..57618270.03 rows=15358 width=552) (actual
time=363344.681..575823.502 rows=5 loops=1)
               Filter: (path <@ '0.1.14666029'::ltree)
         ->  Index Scan using _article_pkey on _article
(cost=0.00..9.53 rows=1 width=41) (actual time=0.036..0.036 rows=1
loops=5)
               Index Cond: (_article.id = _comment.parent_id)
Total runtime: 575823.796 ms

===== Poor Server 3 ='(((( ======

 ===== The Query on server 3 before an ANALYZE =====
   Limit  (cost=20563.80..20563.81 rows=5 width=585) (actual
time=31.424..31.428 rows=5 loops=1)
   ->  Sort  (cost=20563.80..20567.64 rows=1539 width=585) (actual
time=31.423..31.424 rows=5 loops=1)
         Sort Key: _comment.date_publishing
         ->  Nested Loop  (cost=0.00..20482.32 rows=1539 width=585)
(actual time=1.198..22.912 rows=3606 loops=1)
               ->  Index Scan using gist_idx_comment_path on _comment
(cost=0.00..6341.85 rows=1539 width=552) (actual time=1.160..9.641
rows=3748 loops=1)
                     Index Cond: (path <@ '0.1.14666029'::ltree)
               ->  Index Scan using _article_pkey on _article
(cost=0.00..9.18 rows=1 width=41) (actual time=0.002..0.003 rows=1
loops=3748)
                     Index Cond: (_article.id = _comment.parent_id)
Total runtime: 31.850 ms

===== The Query on server 3 after an ANALYZE =====
   Limit  (cost=0.00..18726.66 rows=5 width=585) (actual
time=171378.294..286416.273 rows=5 loops=1)
   ->  Nested Loop  (cost=0.00..57577000.69 rows=15373 width=585)
(actual time=171378.293..286416.269 rows=5 loops=1)
         ->  Index Scan Backward using idx_comment_date_publishing on
_comment  (cost=0.00..57436080.63 rows=15373 width=552) (actual
time=171378.249..286416.062 rows=5 loops=1)
               Filter: (path <@ '0.1.14666029'::ltree)
         ->  Index Scan using _article_pkey on _article
(cost=0.00..9.15 rows=1 width=41) (actual time=0.034..0.034 rows=1
loops=5)
               Index Cond: (_article.id = _comment.parent_id)
Total runtime: 286416.339 ms

How can we stick the planner to the faster execution plan ?

Please help our poor servers, they are tired ;)

--
Laurent Raufaste
<http://www.glop.org/>

pgsql-performance by date:

Previous
From: valgog
Date:
Subject: Re: response time when querying via JDBC and via psql differs
Next
From: "Laurent Raufaste"
Date:
Subject: Re: PG planning randomly ?