Thread: Please correct/improve wiki page about abbreviated keys bug

Please correct/improve wiki page about abbreviated keys bug

From
Josh berkus
Date:
Hackers,

For Thursday's release, I've added a wiki page to give users more
information about the strxfrm() issue, especially since we're going to
ask them to do a bunch of REINDEXing.

Please help me improve this.  Particularly, I need help on the following:

* is my explanation of the issue correct?
* how does this affect non-Linux platforms?  Windows?
* can someone write a search query for columns with indexes in non-C
collation?

Thanks!

http://wiki.postgresql.org/wiki/Abbreviatedkeys_issue

(if you have editing rights, please just edit the wiki page instead of
commenting here)

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: Please correct/improve wiki page about abbreviated keys bug

From
Peter Geoghegan
Date:
On Tue, Mar 29, 2016 at 5:31 PM, Josh berkus <josh@agliodbs.com> wrote:
> For Thursday's release, I've added a wiki page to give users more
> information about the strxfrm() issue, especially since we're going to
> ask them to do a bunch of REINDEXing.

I made a pass over this, and changed some things. I noticed it said
something about incorrect unique violations on affected systems. Is
that really possible? I can't imagine how it could be, although that
is left in place largely unchanged for now.

-- 
Peter Geoghegan



Re: Please correct/improve wiki page about abbreviated keys bug

From
Tom Lane
Date:
Peter Geoghegan <pg@heroku.com> writes:
> I made a pass over this, and changed some things. I noticed it said
> something about incorrect unique violations on affected systems. Is
> that really possible?

A corrupt index could easily fail to detect uniqueness violations (because
searches fail to find entries they should find).  Not sure I believe that
it would make false reports of a uniqueness conflict that's not really
there.
        regards, tom lane



Re: Please correct/improve wiki page about abbreviated keys bug

From
Peter Geoghegan
Date:
On Tue, Mar 29, 2016 at 7:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A corrupt index could easily fail to detect uniqueness violations (because
> searches fail to find entries they should find).  Not sure I believe that
> it would make false reports of a uniqueness conflict that's not really
> there.

Sure. But looking at how texteq() is implemented, it certainly seems
impossible that that could happen. Must have been a miscommunication
somewhere. I'll fix it.

Do you think it would be okay if the SQL query to detect potentially
affected indexes only considered the leading attribute? Since that's
the only attribute that could use abbreviated keys, it ought to be
safe to not require users to REINDEX indexes that happen to have a
second-or-subsequent text/varchar(n) attribute that doesn't use the C
locale. Maybe it's not worth worrying about.

-- 
Peter Geoghegan



Re: Please correct/improve wiki page about abbreviated keys bug

From
Tom Lane
Date:
Peter Geoghegan <pg@heroku.com> writes:
> Do you think it would be okay if the SQL query to detect potentially
> affected indexes only considered the leading attribute? Since that's
> the only attribute that could use abbreviated keys, it ought to be
> safe to not require users to REINDEX indexes that happen to have a
> second-or-subsequent text/varchar(n) attribute that doesn't use the C
> locale. Maybe it's not worth worrying about.

Sure, if that's the case it seems like a useful thing to know.  I'm
sure there are lots of examples of mixed int-and-text columns in
indexes, for example.
        regards, tom lane



Re: Please correct/improve wiki page about abbreviated keys bug

From
Josh berkus
Date:
On 03/29/2016 07:43 PM, Peter Geoghegan wrote:
> On Tue, Mar 29, 2016 at 7:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A corrupt index could easily fail to detect uniqueness violations (because
>> searches fail to find entries they should find).  Not sure I believe that
>> it would make false reports of a uniqueness conflict that's not really
>> there.

I meant failing to detect a violation, and thus letting the user insert
a duplicate key.

> Sure. But looking at how texteq() is implemented, it certainly seems
> impossible that that could happen. Must have been a miscommunication
> somewhere. I'll fix it.

There was speculation on this in the -bugs thread, and nobody
contradicted it.  If you're fairly sure that it wouldn't happen, your
knowledge of the issue is definitely superior to mine.

> Do you think it would be okay if the SQL query to detect potentially
> affected indexes only considered the leading attribute? Since that's
> the only attribute that could use abbreviated keys, it ought to be
> safe to not require users to REINDEX indexes that happen to have a
> second-or-subsequent text/varchar(n) attribute that doesn't use the C
> locale. Maybe it's not worth worrying about.

I think that's a great idea.

--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: Please correct/improve wiki page about abbreviated keys bug

From
Josh berkus
Date:
On 03/30/2016 02:47 PM, Josh berkus wrote:
> On 03/29/2016 07:43 PM, Peter Geoghegan wrote:
>> Do you think it would be okay if the SQL query to detect potentially
>> affected indexes only considered the leading attribute? Since that's
>> the only attribute that could use abbreviated keys, it ought to be
>> safe to not require users to REINDEX indexes that happen to have a
>> second-or-subsequent text/varchar(n) attribute that doesn't use the C
>> locale. Maybe it's not worth worrying about.
> 
> I think that's a great idea.

Based on that concept, I wrote a query which is now on the wiki page.
Please fix it if it's not showing what we want it to show.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: Please correct/improve wiki page about abbreviated keys bug

From
Peter Geoghegan
Date:
<p dir="ltr">Okay. I'll look at it later today. <p dir="ltr">--<br /> Peter Geoghegan

Re: Please correct/improve wiki page about abbreviated keys bug

From
Peter Geoghegan
Date:
On Wed, Mar 30, 2016 at 6:23 PM, Josh berkus <josh@agliodbs.com> wrote:
> Based on that concept, I wrote a query which is now on the wiki page.
> Please fix it if it's not showing what we want it to show.

Sorry, I got caught up with something last night. I did this this
morning Pacific time.

I've made a number of passes over this. I've listed more non-affected
cases, since there are a number of non-obvious cases that are not at
risk, and fixed-up the SQL query, which didn't seem to be doing the
right thing about default collations or domains. I also made it
explicit that non-leading column indexes are not affected (although
that's just mentioned in passing when discussing the SQL query, as
it's a bit esoteric).

-- 
Peter Geoghegan