Thread: Slow update on column that is part of exclusion constraint

Slow update on column that is part of exclusion constraint

From
Adam Brusselback
Date:
Hey all, been running into some performance issues with one of my tables, and it seems to be centered around index maintenance.

I have a table to store aggregated prices that are derived from sale data over a configurable period, and a function that runs periodically that inserts new prices if necessary, or "inactivates" the old record and inserts new ones.  We use that price in calculations, and store the price we used for a specific calculation for audit purposes, so the old record cannot just be updated or deleted. We need a new record every time.

So to do that we have an "active_range" column on the table, and that is  used to set the periods that each specific price was in use for (important for audit).

The issue seems to be the exclusion constraint we have on the table to ensure data consistency... It is ridiculously slow to insert / update.  To update 200 records, it takes 45 seconds. To update 1500 rows takes about 3.5 min. To build the constraint/index fresh on the table (1,392,085 rows) it takes about 4 min.

Looking at the schema, is there just a clearly better way to do this? I know uuid is not supported by GiST indexes yet, so casting to text isn't great, but necessary (as far as I know...) at the moment.

The table definition looks like so:
CREATE TABLE price_generated
(
  price_generated_id uuid NOT NULL DEFAULT gen_random_uuid(),
  product_id uuid NOT NULL,
  company_id uuid NOT NULL,
  date_range daterange NOT NULL,
  average_price numeric NOT NULL,
  average_price_delivered numeric NOT NULL,
  low_price numeric NOT NULL,
  low_price_delivered numeric NOT NULL,
  high_price numeric NOT NULL,
  high_price_delivered numeric NOT NULL,
  uom_type_id uuid NOT NULL,
  active_range tstzrange NOT NULL DEFAULT tstzrange(now(), NULL::timestamp with time zone),
  CONSTRAINT price_generated_pkey PRIMARY KEY (price_generated_id),
  CONSTRAINT price_generated_company_id_fkey FOREIGN KEY (company_id)
      REFERENCES public.company (company_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT price_generated_product_id_fkey FOREIGN KEY (product_id)
      REFERENCES public.product (product_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT price_generated_company_product_date_active_excl EXCLUDE 
  USING gist ((company_id::text) WITH =, (product_id::text) WITH =, date_range WITH &&, active_range WITH &&)
);

CREATE INDEX idx_price_generated_prod_comp_date
  ON price_generated
  USING btree
  (product_id, company_id, date_range);

Re: Slow update on column that is part of exclusion constraint

From
Adam Brusselback
Date:
Sorry, brain stopped working and I forgot to include the normal info.

Postgres version: 9.5.1
Hardware: 2 core, 4gb Digital Ocean virtual server
OS: Debian 

explain analyze for an example update:
'Update on price_generated  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.614..29329.614 rows=0 loops=1)'
'  ->  Nested Loop  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.608..29329.608 rows=0 loops=1)'
'        ->  HashAggregate  (cost=32.04..34.35 rows=231 width=52) (actual time=1.137..2.090 rows=231 loops=1)'
'              Group Key: pti.product_id, pti.company_id, pti.date_range'
'              ->  Seq Scan on _prices_to_insert pti  (cost=0.00..30.31 rows=231 width=52) (actual time=0.060..0.678 rows=231 loops=1)'
'        ->  Index Scan using price_generated_company_product_date_active_excl on price_generated  (cost=0.41..2.63 rows=1 width=151) (actual time=126.949..126.949 rows=0 loops=231)'
'              Index Cond: (date_range = pti.date_range)'
'              Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id = company_id))'
'              Rows Removed by Filter: 29460'
'Planning time: 3.134 ms'
'Execution time: 29406.717 ms'

Re: Slow update on column that is part of exclusion constraint

From
Evgeniy Shishkin
Date:
> On 13 Apr 2016, at 20:14, Adam Brusselback <adambrusselback@gmail.com> wrote:
>
> Sorry, brain stopped working and I forgot to include the normal info.
>
> Postgres version: 9.5.1
> Hardware: 2 core, 4gb Digital Ocean virtual server
> OS: Debian
>
> explain analyze for an example update:
> 'Update on price_generated  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.614..29329.614 rows=0 loops=1)'
> '  ->  Nested Loop  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.608..29329.608 rows=0 loops=1)'
> '        ->  HashAggregate  (cost=32.04..34.35 rows=231 width=52) (actual time=1.137..2.090 rows=231 loops=1)'
> '              Group Key: pti.product_id, pti.company_id, pti.date_range'
> '              ->  Seq Scan on _prices_to_insert pti  (cost=0.00..30.31 rows=231 width=52) (actual time=0.060..0.678
rows=231loops=1)' 
> '        ->  Index Scan using price_generated_company_product_date_active_excl on price_generated  (cost=0.41..2.63
rows=1width=151) (actual time=126.949..126.949 rows=0 loops=231)' 
> '              Index Cond: (date_range = pti.date_range)'
> '              Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id =
company_id))'
> '              Rows Removed by Filter: 29460'
> 'Planning time: 3.134 ms'
> 'Execution time: 29406.717 ms'

Well, you see execution time of 30 seconds because there are 231 index lookups,
each taking 126 ms.

And that lookup is slow because of
Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id = company_id))'

Can you provide self-containing example of update?
I don't see there (upper(active_range) IS NULL condition is coming from.

Re: Slow update on column that is part of exclusion constraint

From
Adam Brusselback
Date:
So fair enough, it does seem to be related to the lookup rather than maintenance on the index. I was misguided in my initial assumption.  

Spent quite a bit of time trying to come up with a self contained test, and it seems like I can't make it choose the GiST index unless I remove the regular btree index in my test case, though the opposite is true for my table in production.  Not really sure what that means as far as what I need to do though. I've tried a vacuum full, analyze, rebuild index, drop and re-add the constraint... It still uses that GiST index for this query.

Hell, a sequential scan is a ton faster even.

On Wed, Apr 13, 2016 at 2:54 PM, Evgeniy Shishkin <itparanoia@gmail.com> wrote:

> On 13 Apr 2016, at 20:14, Adam Brusselback <adambrusselback@gmail.com> wrote:
>
> Sorry, brain stopped working and I forgot to include the normal info.
>
> Postgres version: 9.5.1
> Hardware: 2 core, 4gb Digital Ocean virtual server
> OS: Debian
>
> explain analyze for an example update:
> 'Update on price_generated  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.614..29329.614 rows=0 loops=1)'
> '  ->  Nested Loop  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.608..29329.608 rows=0 loops=1)'
> '        ->  HashAggregate  (cost=32.04..34.35 rows=231 width=52) (actual time=1.137..2.090 rows=231 loops=1)'
> '              Group Key: pti.product_id, pti.company_id, pti.date_range'
> '              ->  Seq Scan on _prices_to_insert pti  (cost=0.00..30.31 rows=231 width=52) (actual time=0.060..0.678 rows=231 loops=1)'
> '        ->  Index Scan using price_generated_company_product_date_active_excl on price_generated  (cost=0.41..2.63 rows=1 width=151) (actual time=126.949..126.949 rows=0 loops=231)'
> '              Index Cond: (date_range = pti.date_range)'
> '              Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id = company_id))'
> '              Rows Removed by Filter: 29460'
> 'Planning time: 3.134 ms'
> 'Execution time: 29406.717 ms'

Well, you see execution time of 30 seconds because there are 231 index lookups,
each taking 126 ms.

And that lookup is slow because of
Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id = company_id))'

Can you provide self-containing example of update?
I don't see there (upper(active_range) IS NULL condition is coming from.

Attachment

Re: Slow update on column that is part of exclusion constraint

From
Evgeniy Shishkin
Date:
> On 14 Apr 2016, at 07:17, Adam Brusselback <adambrusselback@gmail.com> wrote:
>
> So fair enough, it does seem to be related to the lookup rather than maintenance on the index. I was misguided in my
initialassumption.   
>
> Spent quite a bit of time trying to come up with a self contained test, and it seems like I can't make it choose the
GiSTindex unless I remove the regular btree index in my test case, though the opposite is true for my table in
production. Not really sure what that means as far as what I need to do though. I've tried a vacuum full, analyze,
rebuildindex, drop and re-add the constraint... It still uses that GiST index for this query. 
>
> Hell, a sequential scan is a ton faster even.
>

As i understand it, postgres needs a way to find rows for update.
In explain analyze you provided, we see that it chose gist index for that.
And that is a poor chose. I think you need a proper btree index for update
query to work properly fast. Like index on (product_id, company_id, date_range) WHERE
upper(price_generated_test.active_range)IS NULL.  



> On Wed, Apr 13, 2016 at 2:54 PM, Evgeniy Shishkin <itparanoia@gmail.com> wrote:
>
> > On 13 Apr 2016, at 20:14, Adam Brusselback <adambrusselback@gmail.com> wrote:
> >
> > Sorry, brain stopped working and I forgot to include the normal info.
> >
> > Postgres version: 9.5.1
> > Hardware: 2 core, 4gb Digital Ocean virtual server
> > OS: Debian
> >
> > explain analyze for an example update:
> > 'Update on price_generated  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.614..29329.614 rows=0
loops=1)'
> > '  ->  Nested Loop  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.608..29329.608 rows=0 loops=1)'
> > '        ->  HashAggregate  (cost=32.04..34.35 rows=231 width=52) (actual time=1.137..2.090 rows=231 loops=1)'
> > '              Group Key: pti.product_id, pti.company_id, pti.date_range'
> > '              ->  Seq Scan on _prices_to_insert pti  (cost=0.00..30.31 rows=231 width=52) (actual
time=0.060..0.678rows=231 loops=1)' 
> > '        ->  Index Scan using price_generated_company_product_date_active_excl on price_generated  (cost=0.41..2.63
rows=1width=151) (actual time=126.949..126.949 rows=0 loops=231)' 
> > '              Index Cond: (date_range = pti.date_range)'
> > '              Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id =
company_id))'
> > '              Rows Removed by Filter: 29460'
> > 'Planning time: 3.134 ms'
> > 'Execution time: 29406.717 ms'
>
> Well, you see execution time of 30 seconds because there are 231 index lookups,
> each taking 126 ms.
>
> And that lookup is slow because of
> Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id = company_id))'
>
> Can you provide self-containing example of update?
> I don't see there (upper(active_range) IS NULL condition is coming from.
>
> <excl constraint test case.sql>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance