Re: [GENERAL] Inconsistent performance with LIKE and bind variable onlong-lived connection - Mailing list pgsql-general

From Steven Grimm
Subject Re: [GENERAL] Inconsistent performance with LIKE and bind variable onlong-lived connection
Date
Msg-id CAOFXwWXa8xeL2oc3k08MHph_tgYyU4fQYnzEyBEcC3keO6cm+Q@mail.gmail.com
Whole thread Raw
In response to Re: [GENERAL] Inconsistent performance with LIKE and bind variable onlong-lived connection  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: [GENERAL] Inconsistent performance with LIKE and bind variable on long-lived connection  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [GENERAL] Inconsistent performance with LIKE and bind variable onlong-lived connection  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-general
On Sat, Jun 10, 2017 at 11:10 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:

Specifically, the notes section.

That seems to fit the behavior. Thanks; I wasn't aware of that feature of prepared statements. I changed the Python code to do EXPLAIN ANALYZE EXECUTE rather than EXECUTE, and I do indeed see a change in plan after the fifth iteration:

(first five iterations)

Index Only Scan using test_pkey on test  (cost=0.29..476.29 rows=9999 width=4) (actual time=0.058..2.439 rows=10000 loops=1)
  Index Cond: (col1 = 'xyz'::text)
  Filter: (col2 ~~ '%'::text)
  Heap Fetches: 0
Execution time: 2.957 ms

(iterations 6+)

Sort  (cost=205.41..205.54 rows=50 width=4) (actual time=104.986..105.784 rows=10000 loops=1)
  Sort Key: col2
  Sort Method: quicksort  Memory: 853kB
  ->  Seq Scan on test  (cost=0.00..204.00 rows=50 width=4) (actual time=0.014..2.100 rows=10000 loops=1)
        Filter: ((col2 ~~ $2) AND (col1 = $1))
Execution time: 106.282 ms

So the problem here may be that the cost estimate for the generic execution plan is way off in the case of a LIKE bind variable that matches a large number of rows. I did make sure to have the Java code do a VACUUM ANALYZE after doing its inserts, just to eliminate lack of statistics as a possible explanation. Maybe the incorrect row count estimate (50 instead of 10000) is causing it to think the quicksort will be a lot cheaper than it ends up being with the actual rows?

Interesting that the Java version switches to the suboptimal plan after 9 iterations rather than 5. I don't know how to get the JDBC driver to do an EXPLAIN on a prepared statement, so I can't confirm that the same thing is happening there, but it seems plausible. Happy to try that if there's a way to do it.

-Steve

pgsql-general by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: [GENERAL] Inconsistent performance with LIKE and bind variable onlong-lived connection
Next
From: Alvaro Herrera
Date:
Subject: Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) fortoast value 76753264 in pg_toast_10920100