Thread: Please correct/improve wiki page about abbreviated keys bug
Hackers, For Thursday's release, I've added a wiki page to give users more information about the strxfrm() issue, especially since we're going to ask them to do a bunch of REINDEXing. Please help me improve this. Particularly, I need help on the following: * is my explanation of the issue correct? * how does this affect non-Linux platforms? Windows? * can someone write a search query for columns with indexes in non-C collation? Thanks! http://wiki.postgresql.org/wiki/Abbreviatedkeys_issue (if you have editing rights, please just edit the wiki page instead of commenting here) -- -- Josh Berkus Red Hat OSAS (any opinions are my own)
On Tue, Mar 29, 2016 at 5:31 PM, Josh berkus <josh@agliodbs.com> wrote: > For Thursday's release, I've added a wiki page to give users more > information about the strxfrm() issue, especially since we're going to > ask them to do a bunch of REINDEXing. I made a pass over this, and changed some things. I noticed it said something about incorrect unique violations on affected systems. Is that really possible? I can't imagine how it could be, although that is left in place largely unchanged for now. -- Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes: > I made a pass over this, and changed some things. I noticed it said > something about incorrect unique violations on affected systems. Is > that really possible? A corrupt index could easily fail to detect uniqueness violations (because searches fail to find entries they should find). Not sure I believe that it would make false reports of a uniqueness conflict that's not really there. regards, tom lane
On Tue, Mar 29, 2016 at 7:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > A corrupt index could easily fail to detect uniqueness violations (because > searches fail to find entries they should find). Not sure I believe that > it would make false reports of a uniqueness conflict that's not really > there. Sure. But looking at how texteq() is implemented, it certainly seems impossible that that could happen. Must have been a miscommunication somewhere. I'll fix it. Do you think it would be okay if the SQL query to detect potentially affected indexes only considered the leading attribute? Since that's the only attribute that could use abbreviated keys, it ought to be safe to not require users to REINDEX indexes that happen to have a second-or-subsequent text/varchar(n) attribute that doesn't use the C locale. Maybe it's not worth worrying about. -- Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes: > Do you think it would be okay if the SQL query to detect potentially > affected indexes only considered the leading attribute? Since that's > the only attribute that could use abbreviated keys, it ought to be > safe to not require users to REINDEX indexes that happen to have a > second-or-subsequent text/varchar(n) attribute that doesn't use the C > locale. Maybe it's not worth worrying about. Sure, if that's the case it seems like a useful thing to know. I'm sure there are lots of examples of mixed int-and-text columns in indexes, for example. regards, tom lane
On 03/29/2016 07:43 PM, Peter Geoghegan wrote: > On Tue, Mar 29, 2016 at 7:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A corrupt index could easily fail to detect uniqueness violations (because >> searches fail to find entries they should find). Not sure I believe that >> it would make false reports of a uniqueness conflict that's not really >> there. I meant failing to detect a violation, and thus letting the user insert a duplicate key. > Sure. But looking at how texteq() is implemented, it certainly seems > impossible that that could happen. Must have been a miscommunication > somewhere. I'll fix it. There was speculation on this in the -bugs thread, and nobody contradicted it. If you're fairly sure that it wouldn't happen, your knowledge of the issue is definitely superior to mine. > Do you think it would be okay if the SQL query to detect potentially > affected indexes only considered the leading attribute? Since that's > the only attribute that could use abbreviated keys, it ought to be > safe to not require users to REINDEX indexes that happen to have a > second-or-subsequent text/varchar(n) attribute that doesn't use the C > locale. Maybe it's not worth worrying about. I think that's a great idea. -- -- Josh Berkus Red Hat OSAS (any opinions are my own)
On 03/30/2016 02:47 PM, Josh berkus wrote: > On 03/29/2016 07:43 PM, Peter Geoghegan wrote: >> Do you think it would be okay if the SQL query to detect potentially >> affected indexes only considered the leading attribute? Since that's >> the only attribute that could use abbreviated keys, it ought to be >> safe to not require users to REINDEX indexes that happen to have a >> second-or-subsequent text/varchar(n) attribute that doesn't use the C >> locale. Maybe it's not worth worrying about. > > I think that's a great idea. Based on that concept, I wrote a query which is now on the wiki page. Please fix it if it's not showing what we want it to show. -- -- Josh Berkus Red Hat OSAS (any opinions are my own)
<p dir="ltr">Okay. I'll look at it later today. <p dir="ltr">--<br /> Peter Geoghegan
On Wed, Mar 30, 2016 at 6:23 PM, Josh berkus <josh@agliodbs.com> wrote: > Based on that concept, I wrote a query which is now on the wiki page. > Please fix it if it's not showing what we want it to show. Sorry, I got caught up with something last night. I did this this morning Pacific time. I've made a number of passes over this. I've listed more non-affected cases, since there are a number of non-obvious cases that are not at risk, and fixed-up the SQL query, which didn't seem to be doing the right thing about default collations or domains. I also made it explicit that non-leading column indexes are not affected (although that's just mentioned in passing when discussing the SQL query, as it's a bit esoteric). -- Peter Geoghegan