Re: Abbreviated keys for Numeric - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Abbreviated keys for Numeric
Date
Msg-id 54E8E775.7070408@2ndquadrant.com
In response to Re: Abbreviated keys for Numeric  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Abbreviated keys for Numeric  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 21.2.2015 20:33, Peter Geoghegan wrote:
> On Sat, Feb 21, 2015 at 10:57 AM, Peter Geoghegan <pg@heroku.com> wrote:
>
>> That's odd. I have a hard time thinking of why the datum sort
>> patch could be at fault, though.
> 
> Oh, wait. For queries like this, which I now see in your 
> spreadsheet:
> 
> select * from (select * from stuff_text order by randtxt offset 
> 100000000000) foo
> 
> There is no reason to think that either patch will improve things 
> over master branch tip's performance. This doesn't use a datum 
> tuplesort. So that explains it, I think.

Really? Because those are the queries that you posted on 26/1 to
demonstrate that this patch makes sorting Numeric even faster than
sorting float8.

And for the Numeric data type this query actually gets a significant
speedup with the numeric_sortsupp.patch (~4x).

But maybe for text that works differently?

> Although I cannot easily explain the disparity in performance between
> 1M and 5M sized sets for this query:
> 
> select count(distinct randtxt) from stuff_text
> 
> You did make sure that the queries didn't spill to disk, right? Or 
> that they did so consistently, at least.

All the queries were run with work_mem=1GB, and I don't think they
spilled to disk. Actually, I don't have the 'merge' patch applied, so a
spill to disk would probably have crashed with a SIGSEGV.
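
For reference, one way to confirm whether a sort spilled is to run the
query under EXPLAIN ANALYZE and look at the "Sort Method" line, which
reports either "Memory:" or "Disk:". Below is a minimal sketch of a
checker for that output. The helper names (sort_nodes, spilled_to_disk)
and the sample plan text are hypothetical and made up for illustration;
they are not part of the benchmark scripts used in this thread. Note also
that for the count(distinct ...) query the sort runs inside the aggregate
node, so it probably does not appear as a separate Sort node in the plan;
log_temp_files or trace_sort would likely be the better tools there.

```python
import re

# Matches EXPLAIN ANALYZE lines such as
#   "Sort Method: quicksort  Memory: 25kB"
#   "Sort Method: external merge  Disk: 102400kB"
SORT_LINE = re.compile(
    r"Sort Method:\s+(?P<method>.+?)\s+(?P<where>Memory|Disk):\s+(?P<kb>\d+)kB"
)


def sort_nodes(explain_text):
    """Return (method, where, kilobytes) for every Sort Method line found."""
    return [
        (m.group("method"), m.group("where"), int(m.group("kb")))
        for m in SORT_LINE.finditer(explain_text)
    ]


def spilled_to_disk(explain_text):
    """Return True if any sort in the plan reported disk usage."""
    return any(where == "Disk" for _, where, _ in sort_nodes(explain_text))


# Hypothetical plan fragment, invented for illustration (not a measurement).
sample = """\
Sort  (actual time=... rows=... loops=1)
  Sort Key: randtxt
  Sort Method: external merge  Disk: 102400kB
"""

if __name__ == "__main__":
    print(sort_nodes(sample))
    print(spilled_to_disk(sample))
```

Running every benchmark query through such a check once, with the same
work_mem as the timed runs, would show whether the 1M and 5M data sets
take the same code path.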

> There is also no reason to think that integer or float datum sorts 
> will be accelerated, because they could always use sortsupport - the 
> datum sort patch is only about making it possible to also use 
> abbreviation for opclasses that support it, like text (and, 
> eventually, numeric).

Yes, I'm aware of that. I used those as a control group, to get an idea
of how noisy the results are, and to check whether the patches affect
them for some unexpected reason.

-- 
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


