Thread: Small patch to fix an O(N^2) problem in foreign keys

Small patch to fix an O(N^2) problem in foreign keys

From
Jan Wieck
Date:
Attached is a small patch and a script to reproduce the issue.

The problem is a cache introduced in commit 45ba4247 that improves
foreign key lookups during bulk updates when the FK value does not
change. When restoring a schema dump from a database with many (say
100,000) foreign keys, this cache is growing very big and every ALTER
TABLE command is causing a InvalidateConstraintCacheCallBack(), which
does a sequential hash table scan.

The patch uses a heuristic method of detecting when the hash table
should be destroyed and recreated. InvalidateConstraintCacheCallBack()
adds the current size of the hash table to a counter. When that sum
reaches 1,000,000, the hash table is flushed. This improves the schema
restore of a database with 100,000 foreign keys by factor 3.

According to my tests the patch does not interfere with the bulk
updates, the original feature was supposed to improve.


Regards, Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info

Attachment

Re: Small patch to fix an O(N^2) problem in foreign keys

From
Kevin Grittner
Date:
Jan Wieck <jan@wi3ck.info> wrote:

> The problem is a cache introduced in commit 45ba4247 that improves

That's a bit off; 45ba424f seems to be what you mean.

> foreign key lookups during bulk updates when the FK value does not
> change. When restoring a schema dump from a database with many (say
> 100,000) foreign keys, this cache is growing very big and every ALTER
> TABLE command is causing a InvalidateConstraintCacheCallBack(), which
> does a sequential hash table scan.
>
> The patch uses a heuristic method of detecting when the hash table
> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
> adds the current size of the hash table to a counter. When that sum
> reaches 1,000,000, the hash table is flushed. This improves the schema
> restore of a database with 100,000 foreign keys by factor 3.
>
> According to my tests the patch does not interfere with the bulk
> updates, the original feature was supposed to improve.

In the real-world customer case that caused you to look into this,
I thought 45ba424f drove schema-only restore time from 2 hours to
about 25 hours, and that this patch takes it back down to 2 hours.
Am I remembering right?  And this came about because it added
20-some hours to a pg_upgrade run?

If there are no objections, I will push this as a bug fix back to
9.3, where the performance regression was introduced.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Small patch to fix an O(N^2) problem in foreign keys

From
Jan Wieck
Date:
On 09/08/2015 08:49 AM, Kevin Grittner wrote:
> Jan Wieck <jan@wi3ck.info> wrote:
>
>> The problem is a cache introduced in commit 45ba4247 that improves
>
> That's a bit off; 45ba424f seems to be what you mean.

Copy paste over paper.


>
>> foreign key lookups during bulk updates when the FK value does not
>> change. When restoring a schema dump from a database with many (say
>> 100,000) foreign keys, this cache is growing very big and every ALTER
>> TABLE command is causing a InvalidateConstraintCacheCallBack(), which
>> does a sequential hash table scan.
>>
>> The patch uses a heuristic method of detecting when the hash table
>> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
>> adds the current size of the hash table to a counter. When that sum
>> reaches 1,000,000, the hash table is flushed. This improves the schema
>> restore of a database with 100,000 foreign keys by factor 3.
>>
>> According to my tests the patch does not interfere with the bulk
>> updates, the original feature was supposed to improve.
>
> In the real-world customer case that caused you to look into this,
> I thought 45ba424f drove schema-only restore time from 2 hours to
> about 25 hours, and that this patch takes it back down to 2 hours.
> Am I remembering right?  And this came about because it added
> 20-some hours to a pg_upgrade run?
From 2 hours to >50, but yes, this is that case.

>
> If there are no objections, I will push this as a bug fix back to
> 9.3, where the performance regression was introduced.


Regards, Jan


-- 
Jan Wieck
Senior Software Engineer
http://slony.info



Re: Small patch to fix an O(N^2) problem in foreign keys

From
Kevin Grittner
Date:
Jan Wieck <jan@wi3ck.info> wrote:

>>> The patch uses a heuristic method of detecting when the hash table
>>> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
>>> adds the current size of the hash table to a counter. When that sum
>>> reaches 1,000,000, the hash table is flushed. This improves the schema
>>> restore of a database with 100,000 foreign keys by factor 3.
>>>
>>> According to my tests the patch does not interfere with the bulk
>>> updates, the original feature was supposed to improve.
>>
>> In the real-world customer case that caused you to look into this,
>> I thought 45ba424f drove schema-only restore time from 2 hours to
>> about 25 hours, and that this patch takes it back down to 2 hours.
>> Am I remembering right?  And this came about because it added
>> 20-some hours to a pg_upgrade run?
>
> From 2 hours to >50, but yes, this is that case.
>
>> If there are no objections, I will push this as a bug fix back to
>> 9.3, where the performance regression was introduced.

Hearing no objections, pushed with one minor tweak Jan and I
discussed off-list -- the original patch duplicated a bit of code
that we agreed should be split into a static function and called
from both places, to protect against someone later changing one and
missing the other.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company