Re: Why does ExecComputeStoredGenerated() form a heap tuple - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Why does ExecComputeStoredGenerated() form a heap tuple
Date
Msg-id 20190405215855.eaasabjnutl2xy6r@alap3.anarazel.de
Whole thread Raw
In response to Re: Why does ExecComputeStoredGenerated() form a heap tuple  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2019-04-01 11:25:46 +0200, Peter Eisentraut wrote:
> On 2019-03-31 05:00, Andres Freund wrote:
> > Also, have you actually benchmarked this code? ISTM that adding a
> > stored generated column would cause quite noticable slowdowns in the
> > COPY path based on this code.
> 
> Yes, it'll be slower than not having it, but it's much faster than the
> equivalent trigger.

It at the moment is quite noticably slower than directly inserting the
generated column.

postgres[11993][1]=# CREATE TABLE foo_without_generated(id int, copy_of_int int);
CREATE TABLE
Time: 0.625 ms
postgres[11993][1]=# CREATE TABLE foo_with_generated(id int, copy_of_int int generated always as (id) stored);
CREATE TABLE
Time: 0.771 ms
postgres[11993][1]=# INSERT INTO foo_without_generated SELECT g.i, g.i FROM generate_series(1, 1000000) g(i);
INSERT 0 1000000
Time: 691.533 ms
postgres[11993][1]=# INSERT INTO foo_with_generated SELECT g.i FROM generate_series(1, 1000000) g(i);
INSERT 0 1000000
Time: 825.471 ms
postgres[11993][1]=# COPY foo_without_generated TO '/tmp/foo_without_generated';
COPY 1000000
Time: 194.051 ms
postgres[11993][1]=# COPY foo_with_generated TO '/tmp/foo_with_generated';
COPY 1000000
Time: 153.146 ms
postgres[11993][1]=# ;TRUNCATE foo_without_generated ;COPY foo_without_generated FROM '/tmp/foo_without_generated';
Time: 0.178 ms
TRUNCATE TABLE
Time: 8.456 ms
COPY 1000000
Time: 394.990 ms
postgres[11993][1]=# ;TRUNCATE foo_with_generated ;COPY foo_with_generated FROM '/tmp/foo_with_generated';
Time: 0.147 ms
TRUNCATE TABLE
Time: 8.043 ms
COPY 1000000
Time: 508.918 ms

From a quick profile that's indeed largely because
ExecComputeStoredGenerated() is really inefficient - and it seems
largely unnecessarily so.  I think this should at least be roughly as
efficient as getting the additional data from the client.


Minor other point: I'm not a fan of defining more general infrastructure
like ExecComputedStoredGenerated() in nodeModifyTable.c - it's already
large and confusing, and it's not obvious that e.g. COPY would call into
it.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: New vacuum option to do only freezing
Next
From: Thomas Munro
Date:
Subject: Intermittent failure in InstallCheck-C "stat" test