Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Date
Msg-id CAM3SWZTijoBPpqFF7mN3021Vvtu+5Fd1ymABQ8tLoV4zhfAqxA@mail.gmail.com
Whole thread Raw
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
There is an interesting thread about strcoll() overhead over on -general:

http://www.postgresql.org/message-id/CAB25XEXNONdRmC1_cy3jvmB0TMyDm38eF9q2D7xLa0rbnCJ5pQ@mail.gmail.com

My guess was that this person experienced a rather unexpected downside
of spilling to disk when sorting on a text attribute: System
throughput becomes very CPU bound, because tapesort tends to result in
more comparisons [1].  With abbreviated keys, tapesort can actually
compete with quicksort in certain cases [2]. Tapesorts of text
attributes are especially bad on released versions of Postgres, and
will perform very little actual I/O.

In all seriousness, I wonder if we should add a release note item
stating that when using Postgres 9.5, due to the abbreviated key
optimization, external sorts can be much more I/O bound than in
previous releases...

[1] http://www.postgresql.org/message-id/20140806035512.GA91137@tornado.leadboat.com
[2] http://www.postgresql.org/message-id/CAM3SWZQiGvGhMB4TMbEWoNjO17=ySB5b5Y5MGqJsaNq4uWTryA@mail.gmail.com
-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Small TRUNCATE glitch
Next
From: Petr Jelinek
Date:
Subject: Re: Sequence Access Method WIP