Re: Is it possible and worthy to optimize scanRTEForColumn()? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Is it possible and worthy to optimize scanRTEForColumn()?
Date
Msg-id 20171208194854.43azdxqsaj3pg33b@alap3.anarazel.de
Whole thread Raw
In response to Re: Is it possible and worthy to optimize scanRTEForColumn()?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2017-12-08 14:41:14 -0500, Tom Lane wrote:
> Yeah, if someone were holding a gun on me and saying "make that particular
> function faster", I'd think about a hash table rather than scanning a
> list.  Perhaps a hash table with all the column names exposed by FROM,
> not one hash per RTE.

That sounds right.


> However, if you have a FROM that exposes a lot of column names, and
> then the query only looks up a few of them, you might come out behind
> by building a hash table :-(

Hm, I don't think that's that big of a deal - you don't need many
lookups to make a hashtable worthwhile if the alternative is exhaustive
scans through linked lists.  I'd be more concerned about the pretty
common case we're most of the time hitting now, where there's just a
handfull of columns selected from about as many available columns, the
additional allocations and such might show up.


> I'm still unconvinced that this is the first place to improve for
> wide tables, anyway.

I've run a few profiles with wide columns lately, and on the read-mostly
side without prepared statements this is very commonly the biggest entry
by a lot. Over 90% of the time in one of them.

If the queries select a large number of the underlying rows tupledesc
computations, and their syscache lookups, become the bottleneck. That's
partially what lead me to microoptimize syscache ~two months back.  The
real solution there is going to be to move tupledesc computations to the
planner, but that's a bigger piece of work than I can take on right now.

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Is it possible and worthy to optimize scanRTEForColumn()?
Next
From: Robert Haas
Date:
Subject: Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com