Re: Prepared statements and generic plans - Mailing list pgsql-hackers

From 'bruce@momjian.us'
Subject Re: Prepared statements and generic plans
Date
Msg-id 20160610234640.GA27837@momjian.us
Whole thread Raw
In response to Re: Prepared statements and generic plans  (Albe Laurenz <laurenz.albe@wien.gv.at>)
Responses Re: Prepared statements and generic plans  (Albe Laurenz <laurenz.albe@wien.gv.at>)
List pgsql-hackers
On Tue, Jun  7, 2016 at 06:52:15AM +0000, Albe Laurenz wrote:
> Bruce Momjian wrote:
> >> !    distinct column values, a generic plan assumes a column equality
> >> !    comparison will match 33% of processed rows.  Column statistics
> >>
> >> ... assumes *that* a column equality comparison will match 33% of *the* processed rows.
> >
> > Uh, that seems overly wordy.  I think the rule is that if the sentence
> > makes sense without the words, you should not use them, but it is
> > clearly a judgement call in this case.  Do you agree?
>
> My gut feeling is that at least the "the" should be retained, but mine
> are the guts of a German speaker.
> It is clearly a judgement call, so follow your instincts.

I think "that/the" would make sense if this sentence was referencing a
specific result.  The sentence is referencing a hypothetical, so I don't
think "that/the" is needed.

> > One more thing --- there was talk of moving some of this into chapter
> > 66, but as someone already mentioned, there are no subsections there
> > because it is a dedicated topic:
> >
> >     66. How the Planner Uses Statistics.
> >
> > I am not inclined to add a prepare-only section to that chapter.  On the
> > other hand, the issues described apply to PREPARE and to protocol-level
> > prepare, so having it in PREPARE also seems illogical.  However, I am
> > inclined to leave it in PREPARE until we are ready to move all of this
> > to chapter 66.
>
> I think it would be ok to leave it where it is in your patch; while the
> paragraph goes into technical detail, it is still alright in the general
> documentation (but only just).

I researched moving some of this text into chapter 66, but found that
only some of it related to the optimizer.  I also realized that the text
applies to the libpq/wire protocol prepare cases too, so rather than
bounce readers to the PREPARE manual page, and then to chapter 66, I
just kept it all in PREPARE, with a reference from the wire protocol
prepare section.

Also, is it possible to do an EXPLAIN prepare() with the libpq/wire
protocol?  I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE.
However, I don't see any way to inject EXPLAIN into the libpq/wire
prepare case.  Can you specify prepare(EXPLAIN SELECT)?  (PREPARE
EXPLAIN SELECT throws a syntax error.)

Looking at how the code behaves, it seems custom plans that are _more_
expensive (plus planning cost) than the generic plan switch to the
generic plan after five executions, as now documented.  Custom plans
that are significantly _cheaper_ than the generic plan _never_ use the
generic plan.

Here is an example --- first load this SQL:

    DROP TABLE IF EXISTS test;
    CREATE TABLE test (c1 INT, c2 INT);
    INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(1, 10000) AS a(c1);
    INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(10001, 15000) AS a(c1);
    INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(15001, 20000) AS a(c1);
    -- add non-uniformly-distributed values to 'c2'
    INSERT INTO test SELECT 20001, 3;
    INSERT INTO test SELECT 20002, 4;
    CREATE INDEX i_test_c1 ON test (c1);
    CREATE INDEX i_test_c2 ON test (c2);
    ANALYZE test;
    PREPARE prep_c1 AS SELECT * FROM test WHERE c1 = $1;
    PREPARE prep_c2 AS SELECT * FROM test WHERE c2 = $1;

prep_c1 references 'c1', which is a unique column.  Any value used in
the EXECUTE, e.g. EXPLAIN EXECUTE prep_c1(1), existent or non-existent,
generates an index scan, and after five executions a generic index
scan is used.

For prep_c2, if you use the 50% common value '1', the first five
executions use a sequential scan, then the sixth is a generic Bitmap
Heap Scan. For the 25% value of '0' or '2',  the first five runs
generate a Bitmap Heap Scan, and a generic Bitmap Heap Scan on the sixth
and after.

For a prep_c2 value of 3 or any non-existent value, an Index Scan is
used, and a generic plan is never chosen, because the Index Scan is
significantly cheaper than the generic plan.

Updated patch attached.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Perf Benchmarking and regression.
Next
From: Thom Brown
Date:
Subject: Confusing recovery message when target not hit