Thread: Re: [NOVICE] WHERE clause not used when index is used
Tobias Florek <postgres@ibotty.net> writes:
> When creating an index to use for an ORDER BY clause, a simple query
> starts to return more results than expected. See the following detailed
> log.

Ugh. That is *badly* broken. I thought maybe it had something to do with
the "abbreviated keys" work, but the same thing happens if you change the
numeric column to integer, so I'm not very sure where to look. Who's
touched btree key comparison logic lately?

(Problem is reproducible in 9.5 and HEAD, but not 9.4.)

> Create enough test data for the planner to use an index (if it exists)
> for the condition.
>
> CREATE TABLE "index_cond_test" AS
> SELECT
>   (10 + random() * 10)::int AS "final_score",
>   round((10 + random() * 10)::numeric, 5) "time_taken"
> FROM generate_series(1, 10000) s;
>
> Run the control query without an index (will be less than 10000 rows).
> Pay attention to tuples of (20,a) with a > 11.
>
> SELECT *
> FROM "index_cond_test"
> WHERE (final_score, time_taken) < (20, 11)
> ORDER BY final_score DESC, time_taken ASC;
>
> Or wrapped in count(*), to make it even more obvious
>
> SELECT count(*) FROM ( SELECT *
> FROM "index_cond_test"
> WHERE (final_score, time_taken) < (20, 11)
> ORDER BY final_score DESC, time_taken ASC) q;
>
> Create the index
>
> CREATE INDEX "index_cond_test_ranking" ON "index_cond_test"
>   USING btree (final_score DESC, time_taken ASC);
>
> Run the test query (will return all 10000 rows)
>
> SELECT *
> FROM "index_cond_test"
> WHERE (final_score, time_taken) < (20, 11)
> ORDER BY final_score DESC, time_taken ASC;
>
> or wrapped
>
> SELECT count(*) FROM ( SELECT *
> FROM "index_cond_test"
> WHERE (final_score, time_taken) < (20, 11)
> ORDER BY final_score DESC, time_taken ASC) q;

			regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Tobias Florek <postgres@ibotty.net> writes:
> > When creating an index to use for an ORDER BY clause, a simple query
> > starts to return more results than expected. See the following detailed
> > log.
>
> Ugh. That is *badly* broken. I thought maybe it had something to do with
> the "abbreviated keys" work, but the same thing happens if you change the
> numeric column to integer, so I'm not very sure where to look. Who's
> touched btree key comparison logic lately?
>
> (Problem is reproducible in 9.5 and HEAD, but not 9.4.)

Looks to have been introduced in 2ed5b87f. Reverting that gets us back to
results which look correct.

> > Create enough test data for the planner to use an index (if it exists)
> > for the condition.
> >
> > CREATE TABLE "index_cond_test" AS
> > SELECT
> >   (10 + random() * 10)::int AS "final_score",
> >   round((10 + random() * 10)::numeric, 5) "time_taken"
> > FROM generate_series(1, 10000) s;
> >
> > Run the control query without an index (will be less than 10000 rows).
> > Pay attention to tuples of (20,a) with a > 11.
> >
> > SELECT *
> > FROM "index_cond_test"
> > WHERE (final_score, time_taken) < (20, 11)
> > ORDER BY final_score DESC, time_taken ASC;
> >
> > Or wrapped in count(*), to make it even more obvious
> >
> > SELECT count(*) FROM ( SELECT *
> > FROM "index_cond_test"
> > WHERE (final_score, time_taken) < (20, 11)
> > ORDER BY final_score DESC, time_taken ASC) q;
> >
> > Create the index
> >
> > CREATE INDEX "index_cond_test_ranking" ON "index_cond_test"
> >   USING btree (final_score DESC, time_taken ASC);
> >
> > Run the test query (will return all 10000 rows)
> >
> > SELECT *
> > FROM "index_cond_test"
> > WHERE (final_score, time_taken) < (20, 11)
> > ORDER BY final_score DESC, time_taken ASC;
> >
> > or wrapped
> >
> > SELECT count(*) FROM ( SELECT *
> > FROM "index_cond_test"
> > WHERE (final_score, time_taken) < (20, 11)
> > ORDER BY final_score DESC, time_taken ASC) q;

Thanks!

Stephen
On Tue, Mar 1, 2016 at 7:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Tobias Florek <postgres@ibotty.net> writes:
>> When creating an index to use for an ORDER BY clause, a simple query
>> starts to return more results than expected. See the following detailed
>> log.
>
> Ugh. That is *badly* broken. I thought maybe it had something to do with
> the "abbreviated keys" work, but the same thing happens if you change the
> numeric column to integer, so I'm not very sure where to look. Who's
> touched btree key comparison logic lately?
>
> (Problem is reproducible in 9.5 and HEAD, but not 9.4.)

Bisects down to:

606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit
commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f
Author: Simon Riggs <simon@2ndQuadrant.com>
Date:   Tue Nov 18 10:24:55 2014 +0000

    Reduce btree scan overhead for < and > strategies

    For <, <=, > and >= strategies, mark the first scan key
    as already matched if scanning in an appropriate direction.
    If index tuple contains no nulls we can skip the first
    re-check for each tuple.

    Author: Rajeev Rastogi
    Reviewer: Haribabu Kommi
    Rework of the code and comments by Simon Riggs

It is not a part of the code-base I'm familiar with, so it is unlikely
I can find the bug.

Cheers,

Jeff
[ trimming -novice from the cc list ]

Jeff Janes <jeff.janes@gmail.com> writes:
> On Tue, Mar 1, 2016 at 7:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (Problem is reproducible in 9.5 and HEAD, but not 9.4.)

> Bisects down to:

> 606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit
> commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f
> Author: Simon Riggs <simon@2ndQuadrant.com>
> Date: Tue Nov 18 10:24:55 2014 +0000

> Reduce btree scan overhead for < and > strategies

Hmm ... just from the commit message, this seems much more likely to be
the cause than does the buffer locking patch Stephen fingered. Stephen,
how'd you identify 2ed5b87f as being the problem?

			regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> [ trimming -novice from the cc list ]
>
> Jeff Janes <jeff.janes@gmail.com> writes:
> > On Tue, Mar 1, 2016 at 7:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> (Problem is reproducible in 9.5 and HEAD, but not 9.4.)
>
> > Bisects down to:
>
> > 606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit
> > commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f
> > Author: Simon Riggs <simon@2ndQuadrant.com>
> > Date: Tue Nov 18 10:24:55 2014 +0000
>
> > Reduce btree scan overhead for < and > strategies
>
> Hmm ... just from the commit message, this seems much more likely to be
> the cause than does the buffer locking patch Stephen fingered. Stephen,
> how'd you identify 2ed5b87f as being the problem?

Badly. :)

I didn't expect it to be something that far back and was just going
backwards through reverting/testing patches that looked like candidates,
but forgot to create the index when I tested the revert of that, which
made it look like reverting it 'fixed' it.

Apologies for the noise.

Thanks!

Stephen
Jeff Janes <jeff.janes@gmail.com> writes:
> Bisects down to:
> 606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit
> commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f
> Author: Simon Riggs <simon@2ndQuadrant.com>
> Date: Tue Nov 18 10:24:55 2014 +0000
> Reduce btree scan overhead for < and > strategies

On looking at this, the problem seems pretty obvious: that commit simply
fails to consider multi-column keys at all. (For that matter, it also
fails to consider zero-column keys.) I think the logic might be all right
if a test on so->numberOfKeys == 1 were added; I don't see how the
optimization could work in multi-column cases.

However, I'm not sure that's 100% of the issue, because in playing around
with this I was having a harder time reproducing the failure outside of
Tobias' example than I expected. There may be more than one bug, or there
may be other changes that sometimes mask the problem.

			regards, tom lane
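For contrast, here is a sketch of the single-key case that the
numberOfKeys == 1 test would preserve, run against the regression
database's tenk1 table (the particular query and settings are
illustrative, not taken from the thread):

SET enable_seqscan TO off;      -- illustrative: steer the planner to the index
SET enable_bitmapscan TO off;
-- With a single scan key on a single-column ascending index, once a
-- forward scan is positioned at the first entry with unique1 > 9500,
-- every later entry also satisfies the key, so skipping the per-tuple
-- recheck cannot change the result.  No such guarantee exists once a
-- second, differently-ordered column is involved, as in Tobias' example.
SELECT count(*) FROM tenk1 WHERE unique1 > 9500;   -- expect 499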
On 01/03/16 18:37, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> Bisects down to:
>
>> 606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit
>> commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f
>> Author: Simon Riggs <simon@2ndQuadrant.com>
>> Date: Tue Nov 18 10:24:55 2014 +0000
>
>> Reduce btree scan overhead for < and > strategies
>
> On looking at this, the problem seems pretty obvious: that commit simply
> fails to consider multi-column keys at all. (For that matter, it also
> fails to consider zero-column keys.) I think the logic might be all right
> if a test on so->numberOfKeys == 1 were added; I don't see how the
> optimization could work in multi-column cases.

It seems that way to me as well.

> However, I'm not sure that's 100% of the issue, because in playing around
> with this I was having a harder time reproducing the failure outside of
> Tobias' example than I expected. There may be more than one bug, or there
> may be other changes that sometimes mask the problem.

I can only get the issue when the sort order of the individual keys does
not correlate, the operator sorts according to the first column, and there
are duplicate values for the first column.

This makes sense to me, as we switch to SK_BT_MATCHED as soon as we hit the
first match, but because there is no correlation on the second column we
get false positives: while we are only scanning the part of the btree where
the first column matches, that does not necessarily hold for the second
column. I don't know our btree code in sufficient detail to be sure I
didn't miss anything, though.

The quick fix seems to me to be what you said above, although it looks like
we could make this work even for multicolumn indexes if we checked that the
ordering of everything is correct. But I would refrain from adding that
complexity as part of a fix.

--
 Petr Jelinek                  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
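A minimal sketch of the failing shape Petr describes: duplicate values in
the leading column, a second column that does not correlate with it, and an
index whose two columns sort in opposite directions. The table and index
names are illustrative (not from the thread), and whether the index is
actually chosen depends on the planner:

-- Leading column with many duplicates, uncorrelated second column.
CREATE TABLE t AS
SELECT g % 10           AS a,
       (g * 7919) % 100 AS b
FROM generate_series(1, 10000) g;

-- Control, before the index exists: the row comparison filters rows.
SELECT count(*) FROM (
  SELECT * FROM t
  WHERE (a, b) < (5, 50)
  ORDER BY a DESC, b ASC
) q;

-- Mixed-direction index matching the ORDER BY, as in Tobias' report.
CREATE INDEX t_a_desc_b_asc ON t USING btree (a DESC, b ASC);

-- On a build with the broken optimization, this can report a larger count,
-- because rows with a = 5 and b >= 50 are no longer filtered out.
SELECT count(*) FROM (
  SELECT * FROM t
  WHERE (a, b) < (5, 50)
  ORDER BY a DESC, b ASC
) q;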
Petr Jelinek <petr@2ndquadrant.com> writes:
> On 01/03/16 18:37, Tom Lane wrote:
>> However, I'm not sure that's 100% of the issue, because in playing around
>> with this I was having a harder time reproducing the failure outside of
>> Tobias' example than I expected. There may be more than one bug, or there
>> may be other changes that sometimes mask the problem.

> I can only get the issue when the sort order of the individual keys does
> not correlate and the operator sorts according to the first column and
> there are duplicate values for the first column.

Yeah, I think the combination of ASC and DESC columns may be necessary to
break things. It needs closer analysis.

There is another behavioral difference between 9.4 and 9.5, which is that
the planner's costing of scans of this sort seems to have changed. I can
reproduce the problem now in the regression database:

regression=# select count(*) from (select * from tenk1 where (thousand,tenthous) < (9,5000) order by thousand desc, tenthous asc) ss;
 count
-------
    95     -- correct answer
(1 row)

regression=# create index on tenk1(thousand desc, tenthous asc);
CREATE INDEX
regression=# select count(*) from (select * from tenk1 where (thousand,tenthous) < (9,5000) order by thousand desc, tenthous asc) ss;
 count
-------
   100     -- WRONG
(1 row)

What was confusing me is that the plan's changed: HEAD gives

 Aggregate  (cost=7.29..7.29 rows=1 width=0)
   ->  Index Only Scan using tenk1_thousand_tenthous_idx on tenk1  (cost=0.29..6.04 rows=100 width=8)
         Index Cond: (ROW(thousand, tenthous) < ROW(9, 5000))

whereas 9.4 prefers

 Aggregate  (cost=232.50..232.51 rows=1 width=0)
   ->  Sort  (cost=231.00..231.25 rows=100 width=244)
         Sort Key: tenk1.thousand, tenk1.tenthous
         ->  Bitmap Heap Scan on tenk1  (cost=5.06..227.67 rows=100 width=244)
               Recheck Cond: (ROW(thousand, tenthous) < ROW(9, 5000))
               ->  Bitmap Index Scan on tenk1_thousand_tenthous_idx  (cost=0.00..5.04 rows=100 width=0)
                     Index Cond: (ROW(thousand, tenthous) < ROW(9, 5000))

However you can force 9.4 to do it the same as HEAD by setting enable_sort
to zero:

 Aggregate  (cost=359.27..359.28 rows=1 width=0)
   ->  Index Scan using tenk1_thousand_tenthous_idx on tenk1  (cost=0.29..358.02 rows=100 width=244)
         Index Cond: (ROW(thousand, tenthous) < ROW(9, 5000))

But 9.4 correctly answers "95" with either plan, and 9.5 gives the wrong
answer with either plan, so the plan change is not the cause of the bug.

I'm not sure if the costing change is a bug or not --- the non-bitmap scan
does seem to be cheaper in reality, but not by a couple orders of
magnitude as the planner now thinks.

			regards, tom lane
I wrote:
> I'm not sure if the costing change is a bug or not --- the non-bitmap scan
> does seem to be cheaper in reality, but not by a couple orders of
> magnitude as the planner now thinks.

Ah, scratch that, I wasn't looking closely enough. The 9.4 plan is an
IndexScan whereas 9.5+ uses IndexOnlyScan, which accounts for the cost
differential. The reason it changed is the remove_unused_subquery_outputs
patch (55d5b3c08279b487), which allows the subquery to know that it doesn't
actually have to return all columns of tenk1, so that an index-only scan
is legal.

			regards, tom lane
I wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> I can only get the issue when the sort order of the individual keys does
>> not correlate and the operator sorts according to the first column and
>> there are duplicate values for the first column.

> Yeah, I think the combination of ASC and DESC columns may be necessary to
> break things. It needs closer analysis.

Okay, I've now studied the patch more carefully and understood that it
wasn't quite so naive as to not consider multi-column cases at all. It
does work for simple scalar multi-column cases, because the test on
SK_BT_MATCHED is placed so that it only skips checking the first column's
condition, not the conditions on lower-order columns.

Rather, the problem is failure to consider ROW() comparisons, which do not
necessarily have the same simple behavior as scalar comparisons. They sort
of accidentally work, *if* the lower-order columns in the ROW() comparison
match the index's lower-order columns as to position and index direction.
Tobias' example fails because the second column in the ROW() isn't sorted
the same, and you could also make it fail with a 3-or-more-column index in
which the ROW() comparison tested, say, just the first and third columns.

What I think we should do is move the SK_BT_MATCHED flag down to the first
sub-key of the ROW() comparison, which is the one that does behave the same
as in scalar cases.

The fact that it doesn't fail for the case where the lower-order columns of
the ROW() do match the index shows that this still leaves something on the
table: within a ROW() comparison, you could set SK_BT_MATCHED on more than
just the first sub-key, *if* the subsequent columns match the index. This
makes me think that the analysis should be getting done in or near
_bt_mark_scankey_required, not in the rather random place it is now,
because _bt_mark_scankey_required already understands about sub-keys
matching the index or not.

However ... there is an even larger problem, which is that the patch thinks
it can use skey->sk_flags for what's effectively transient state storage ---
not merely transient, but scan-direction-dependent. This means that a
cursor that reads forwards for awhile and then turns around and reads
backwards will get the wrong answer, even without anything as complicated
as multiple key columns. It's a bit harder to exhibit such a bug than you
might guess, because of btree's habit of prefetching an index page at a
time; you have to make the scan cross an index page boundary before you
turn around and back it up.
Bearing that in mind, I was soon able to devise a failure case in the
regression database:

regression=# set enable_seqscan TO 0;
SET
regression=# set enable_bitmapscan TO 0;
SET
regression=# begin;
BEGIN
regression=# declare c cursor for select unique1,unique2 from tenk1 where unique1 > 9500;
DECLARE CURSOR
regression=# fetch 20 from c;
 unique1 | unique2
---------+---------
    9501 |    5916
    9502 |    1812
    9503 |    1144
    9504 |    9202
    9505 |    4300
    9506 |    5551
    9507 |     847
    9508 |    4093
    9509 |    9418
    9510 |    1862
    9511 |     848
    9512 |    4823
    9513 |    1125
    9514 |    9276
    9515 |    1160
    9516 |    5177
    9517 |    3600
    9518 |    9677
    9519 |    5518
    9520 |    1429
(20 rows)

regression=# fetch backward 30 from c;
 unique1 | unique2
---------+---------
    9519 |    5518
    9518 |    9677
    9517 |    3600
    9516 |    5177
    9515 |    1160
    9514 |    9276
    9513 |    1125
    9512 |    4823
    9511 |     848
    9510 |    1862
    9509 |    9418
    9508 |    4093
    9507 |     847
    9506 |    5551
    9505 |    4300
    9504 |    9202
    9503 |    1144
    9502 |    1812
    9501 |    5916
    9500 |    3676
    9499 |     927
    9498 |    2807
    9497 |    6558
    9496 |    1857
    9495 |    1974
    9494 |     560
    9493 |    3531
    9492 |    9875
    9491 |    7237
    9490 |    8928
(30 rows)

Ooops.

I believe the way to fix this would be to stop regarding SK_BT_MATCHED as
state, and instead treat it as a scankey property identified during
_bt_preprocess_keys, analogously to SK_BT_REQFWD/SK_BT_REQBKWD --- and,
like those, you'd need two flags not one since the properties will be
determined independently of knowing which direction you'll be going in.

In any event, I am now of the opinion that this patch needs to be reverted
outright and returned to the authors for redesign. There are too many
things wrong with it and too little reason to think that we have to have
it in 9.5.

			regards, tom lane
I wrote:
> I believe the way to fix this would be to stop regarding SK_BT_MATCHED
> as state, and instead treat it as a scankey property identified during
> _bt_preprocess_keys, analogously to SK_BT_REQFWD/SK_BT_REQBKWD --- and,
> like those, you'd need two flags not one since the properties will be
> determined independently of knowing which direction you'll be going in.

BTW, the analogy to SK_BT_REQFWD/SK_BT_REQBKWD exposes another way in
which the patch leaves money on the table: if the leading key is "=" then
MATCHED behavior can't apply to it, but it might apply to a later key.
I'm imagining a specification like this (in the comments for
_bt_preprocess_keys, after the para starting "The output keys are marked
with flags SK_BT_REQFWD and/or SK_BT_REQBKWD ..."):

 * Another property of the first attribute without an "=" key is that it may
 * not be necessary to recheck its value at each index entry as we scan
 * through the index.  Again considering "x = 1 AND y < 4 AND z < 5", once we
 * have positioned to an entry satisfying those keys, it is unnecessary to
 * recheck "y < 4" as we scan forward, at least so long as the index's y
 * value is not NULL.  Every later row with x=1 must have y>=4; though we
 * can't make any similar statement about z.  Similarly, a key like "y > 4"
 * need not be rechecked in a backwards scan.  We mark appropriate keys with
 * flags SK_BT_NORECHECK_FWD or SK_BT_NORECHECK_BKWD to indicate that
 * _bt_next can skip checking those keys (at non-null index entries) when
 * scanning in the indicated direction.

I'm also wondering whether it'd be worth taking more care about the
handling of index entries containing some null columns. Right now, the
presence of any nulls disables the MATCH improvement, but it would still
apply if the null(s) are in lower-order columns. I'm not sure if that case
comes up often enough to justify checking the flag bit twice per
iteration, but it might.

			regards, tom lane
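The example from the proposed comment, spelled out as SQL (the table and
index names here are hypothetical, for illustration only):

CREATE TABLE demo (x int, y int, z int);
CREATE INDEX demo_x_y_z_idx ON demo USING btree (x, y, z);

-- One equality key plus two inequality keys against a three-column index.
-- Under the proposal quoted above, the key on y -- the first attribute
-- without an "=" key -- could carry SK_BT_NORECHECK_FWD/BKWD so that
-- _bt_next skips re-evaluating it at non-null entries when scanning in the
-- safe direction, while "z < 5" would still be rechecked for every tuple.
SELECT * FROM demo WHERE x = 1 AND y < 4 AND z < 5;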
Sorry to keep coming back to this, but I just realized that the next para
in _bt_preprocess_keys' doco explains yet another way in which this patch
is broken:

 * Note that one reason we need direction-sensitive required-key flags is
 * precisely that we may not be able to eliminate redundant keys.  Suppose
 * we have "x > 4::int AND x > 10::bigint", and we are unable to determine
 * which key is more restrictive for lack of a suitable cross-type operator.
 * _bt_first will arbitrarily pick one of the keys to do the initial
 * positioning with.  If it picks x > 4, then the x > 10 condition will fail
 * until we reach index entries > 10; but we can't stop the scan just because
 * x > 10 is failing.  On the other hand, if we are scanning backwards, then
 * failure of either key is indeed enough to stop the scan.  (In general,
 * when inequality keys are present, the initial-positioning code only
 * promises to position before the first possible match, not exactly at the
 * first match, for a forward scan; or after the last match for a backward
 * scan.)

This means that the patch's basic assumption, that _bt_first() leaves us
positioned on a row that satisfies all the keys, is sometimes wrong. It
may not be possible to demonstrate this bug using any standard opclasses,
because we don't ship any incomplete cross-type operator families in core
Postgres, but it's still wrong.

Offhand the most practical solution seems to be to not mark keys with
MATCH or NORECHECK or whatever you call it unless there is just a single
inequality key for the column after preprocessing. That might add a bit
more complication in bt_preprocess_keys, I fear, because it would be that
much less like the REQFWD/REQBKWD case; but it would be a localized fix.
I doubt we want to get into making _bt_first's API contract tighter.

			regards, tom lane
On 1 March 2016 at 17:22, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Mar 1, 2016 at 7:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Tobias Florek <postgres@ibotty.net> writes:
>> When creating an index to use for an ORDER BY clause, a simple query
>> starts to return more results than expected. See the following detailed
>> log.
>
> Ugh. That is *badly* broken. I thought maybe it had something to do with
> the "abbreviated keys" work, but the same thing happens if you change the
> numeric column to integer, so I'm not very sure where to look. Who's
> touched btree key comparison logic lately?
>
> (Problem is reproducible in 9.5 and HEAD, but not 9.4.)
Bisects down to:
606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit
commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f
Author: Simon Riggs <simon@2ndQuadrant.com>
Date: Tue Nov 18 10:24:55 2014 +0000
Reduce btree scan overhead for < and > strategies
For <, <=, > and >= strategies, mark the first scan key
as already matched if scanning in an appropriate direction.
If index tuple contains no nulls we can skip the first
re-check for each tuple.
Author: Rajeev Rastogi
Reviewer: Haribabu Kommi
Rework of the code and comments by Simon Riggs
Mea culpa.
Looks like we'll need a new release as soon as we can lock down a fix.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1 March 2016 at 20:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
In any event, I am now of the opinion that this patch needs to be reverted
outright and returned to the authors for redesign. There are too many
things wrong with it and too little reason to think that we have to have
it in 9.5.
Agreed; I'd already got to this thought while reading the thread.
I'll get on with that.
The patch is not of great importance to us. Since we are past the 9.6 deadline, it seems just worth reverting it and leaving a more isolated optimization for the next release. I don't want to add to the last CF's workload with this.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2 March 2016 at 10:57, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 1 March 2016 at 20:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In any event, I am now of the opinion that this patch needs to be reverted
>> outright and returned to the authors for redesign. There are too many
>> things wrong with it and too little reason to think that we have to have
>> it in 9.5.
>
> Agreed; I'd already got to this thought while reading the thread.
>
> I'll get on with that.
>
> The patch is not of great importance to us. Since we are past the 9.6
> deadline, it seems just worth reverting it and leaving a more isolated
> optimization for the next release. I don't want to add to the last CF's
> workload with this.
Reverted patch in HEAD and 9.5
Later, I will add tests for the cases we discovered here to the index scan tests, so that further optimization work is more easily possible.
Thanks Tom, Petr for analysis; thanks Jeff for bisecting.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
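A sketch of the kind of coverage this thread suggests (not the tests Simon
actually commits), built from the tenk1 examples posted earlier:

-- Mixed-direction index plus a row comparison: expect 95, per the control
-- run without the index; the broken optimization returned 100.
CREATE INDEX tenk1_thousand_tenthous_idx ON tenk1 (thousand DESC, tenthous ASC);
SELECT count(*) FROM (
  SELECT * FROM tenk1
  WHERE (thousand, tenthous) < (9, 5000)
  ORDER BY thousand DESC, tenthous ASC
) ss;

-- Direction change within a cursor, from the second failure case: the
-- backward fetch must stop at unique1 = 9501 instead of walking past the
-- qual into 9500 and below.
SET enable_seqscan TO off;
SET enable_bitmapscan TO off;
BEGIN;
DECLARE c CURSOR FOR
  SELECT unique1, unique2 FROM tenk1 WHERE unique1 > 9500;
FETCH 20 FROM c;
FETCH BACKWARD 30 FROM c;
COMMIT;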
Hi,

> Reverted patch in HEAD and 9.5

Is there an ETA? We can't easily go back to 9.4, and while adding
additional conditions so as not to trigger this index-only scan is
possible, it is a little fragile.

Thank you for your work, btw. I was very surprised to find a bug in
PostgreSQL! I honestly still am.

Cheers,
Tobias Florek
On 3 March 2016 at 10:11, Tobias Florek <postgres@ibotty.net> wrote:
> Hi,
>
>> Reverted patch in HEAD and 9.5
>
> Is there an ETA?
I just committed the fix to the repo.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>> Reverted patch in HEAD and 9.5
>>
>> Is there an ETA?
>
> I just committed the fix to the repo.

Sorry for being unclear, is there an ETA for a new point-release?

Thank you,
Tobias Florek
On Thu, Mar 3, 2016 at 7:49 PM, Tobias Florek <postgres@ibotty.net> wrote:
>>>> Reverted patch in HEAD and 9.5
>>>
>>> Is there an ETA?
>>
>> I just committed the fix to the repo.
>
> Sorry for being unclear, is there an ETA for a new point-release?

Nothing concrete yet.
--
Michael
I recently joined the product management team for AWS RDS Postgres (after years at Oracle in their database team), and we are very interested in confirming (or not) that the fix for the problem below will be included in 9.5.2, and in the community’s plans (likely date) for releasing 9.5.2.
Is there an email list other than hackers where we can follow discussions on release plans for 9.5.2 (and future releases)?
Thanks,
-Kevin Jernigan
From: <pgsql-novice-owner@postgresql.org> on behalf of Simon Riggs <simon@2ndQuadrant.com>
Date: Wednesday, March 2, 2016 at 2:53 AM
To: Jeff Janes <jeff.janes@gmail.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, Tobias Florek <postgres@ibotty.net>, pgsql-hackers <pgsql-hackers@postgresql.org>, PGSQL-Novice <pgsql-novice@postgresql.org>
Subject: Re: [HACKERS] [NOVICE] WHERE clause not used when index is used
On Thu, Mar 3, 2016 at 2:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Later, I will add tests for the cases we discovered here to the index
> scan tests, so that further optimization work is more easily possible.

Please do. I would like to start testing the B-Tree code more exhaustively
by adding a test suite to amcheck. This test suite would indirectly test
external sorting, B-Tree page deletion, edge-cases with very large
IndexTuples, etc.

Ideas for good areas of the B-Tree code to add tests for are welcome.

--
Peter Geoghegan
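A sketch of how such a check might be driven, assuming the bt_index_check()
interface of the amcheck tool Peter describes (amcheck was not part of core
PostgreSQL at the time of this thread, so this is illustrative only):

-- Verify structural invariants of the index from the earlier repro;
-- bt_index_check() raises an error if the B-Tree invariants it tests are
-- violated, and returns quietly otherwise.
CREATE EXTENSION amcheck;
SELECT bt_index_check('tenk1_thousand_tenthous_idx'::regclass);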
On 3/15/16 2:28 PM, Jernigan, Kevin wrote:
> I recently joined the product management team for AWS RDS Postgres
> (after years at Oracle in their database team), and we are very
> interested in confirming (or not) that the fix for the problem below
> will be included in 9.5.2, and in the community’s plans (likely date)
> for releasing 9.5.2.

The patch was reverted in the 9.5 branch, so assuming that that is the end
of this investigation (which it appears to be), then it will be part of the
9.5.2 release.

> Is there an email list other than hackers where we can follow
> discussions on release plans for 9.5.2 (and future releases)?

This is a good list to follow to know about release schedules.