Re: hash agg is slower on wide tables? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: hash agg is slower on wide tables?
Date
Msg-id 20150222122211.GD6093@alap3.anarazel.de
In response to Re: hash agg is slower on wide tables?  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: hash agg is slower on wide tables?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: hash agg is slower on wide tables?  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
On 2015-02-22 10:33:16 +0000, Andrew Gierth wrote:
> This is, if I'm understanding the planner logic right, physical-tlist
> optimization; it's faster for a table scan to simply return the whole
> row (copying nothing, just pointing to the on-disk tuple) and let
> hashagg pick out the columns it needs, rather than for the scan to run a
> projection step just to select specific columns.
> 
> If there's a Sort step, this isn't done because Sort neither evaluates
> its input nor projects new tuples on its output, it simply accepts the
> tuples it receives and returns them with the same structure. So now it's
> important to have the node providing input to the Sort projecting out
> only the minimum required set of columns.
> 
> Why it's slower on the wider table... that's less obvious.

It's likely to just be tuple deforming. I've not tried it, but I'd bet
you'll see slot_deform* very high in the profile. For the narrow table
only two attributes need to be extracted; for the wider one, everything
up to a11 will get extracted.
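
To make the cost difference concrete, here is a toy model, not
PostgreSQL's actual heap tuple format: a row of length-prefixed fields,
where reaching field N means walking over every field before it. The
names pack_row and deform_upto are made up for this illustration.

```python
import struct

def pack_row(values):
    """Pack byte strings as 2-byte-length-prefixed fields (toy row layout)."""
    return b"".join(struct.pack("<H", len(v)) + v for v in values)

def deform_upto(row, natts):
    """Extract the first `natts` fields; return (values, fields_walked)."""
    values = []
    off = 0
    for _ in range(natts):
        # The offset of a field is only known after reading every earlier length.
        (length,) = struct.unpack_from("<H", row, off)
        off += 2
        values.append(row[off:off + length])
        off += length
    return values, len(values)

# Narrow row: the two fields we need are the whole row.
narrow = pack_row([b"k1", b"v1"])
# Wide row: 20 fields, and the one we need is the 11th.
wide = pack_row([b"c%d" % i for i in range(1, 21)])

_, narrow_cost = deform_upto(narrow, 2)
wide_vals, wide_cost = deform_upto(wide, 11)
```

In this model the narrow row costs 2 field extractions and the wide row
costs 11, even though only one value from the wide row is of interest.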

I've wondered before if we shouldn't use the caching via
slot->tts_values so freely - if you only use a couple values from a wide
tuple the current implementation really sucks if those few aren't at the
beginning of the tuple.
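
A rough sketch of the caching behaviour I mean, under the assumption
that a slot deforms strictly left to right and keeps every value it
passes. ToySlot and get_attr are hypothetical names; this is not the
real TupleTableSlot code.

```python
class ToySlot:
    """Hypothetical slot that caches deformed fields, left to right."""

    def __init__(self, fields):
        self._fields = fields   # stand-in for the packed on-disk row
        self.values = []        # cache of fields deformed so far
        self.deformed = 0       # total number of fields ever extracted

    def get_attr(self, attnum):
        """Return 1-based field `attnum`, extending the cache if needed."""
        while len(self.values) < attnum:
            # Every field before attnum is extracted and stored as well.
            self.values.append(self._fields[len(self.values)])
            self.deformed += 1
        return self.values[attnum - 1]

slot = ToySlot(["c%d" % i for i in range(1, 21)])
first = slot.get_attr(11)   # walks and caches c1..c11
again = slot.get_attr(3)    # already cached, no further extraction
```

The second lookup is free, which is the upside of the cache; the
downside is that the first lookup stored ten values nobody asked for.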

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


