Home > mailing lists

Re: init_sequence spill to hash table - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: init_sequence spill to hash table
Date	November 14, 2013 12:38:33
Msg-id	CAApHDvpWsc72i7bce-Xvf9YdXPFDCJs-yUkBikTzBE5XA31rCA@mail.gmail.com Whole thread Raw
In response to	Re: init_sequence spill to hash table (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses	Re: init_sequence spill to hash table
List	pgsql-hackers

Tree view

On Thu, Nov 14, 2013 at 12:15 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 13.11.2013 11:55, David Rowley wrote:
I thought I would post the patch early to see if this is actually wanted
before I do too much more work on it.

Seems reasonable.

My implementation maintains using the linear list for sequences up to a
defined threshold (currently 32) then it moves everything over to a
hashtable and free's off the list.

Did you check how it would perform if you just always used the hash table? Or if you just have a single entry before you move to hash table, ie. set the threshold to 2? That would be slightly simpler.

I've just completed some more benchmarking of this. I didn't try dropping the threshold down to 2 or 0 but I did tests at the cut over point and really don't see much difference in performance between the list at 32 and the hashtable at 33 sequences. The hash table version excels in the 16000 sequence test in comparison to the unpatched version.

Times are in milliseconds of the time it took to call currval() 100000 times for 1 sequence.

	Patched	Unpatched	increased by
1 in cache	1856.452	1844.11	-1%
32 in cache	1841.84	1802.433	-2%
33 in cache	1861.558	not tested	N/A
16000 in cache	1963.711	10329.22	426%

There was not much point in testing 33 sequences with the unpatched version as there is no switching to hash table. Most likely the 1 and 2% drop in speed is noise as the only instructions that have been added here are adding test to check if the hashtable has been created in init_sequence and also a counter to count the number of sequences in the list while still in list mode.

I also tested filling the cache with 30000 sequences then inserting 100000 records into a table which used currval() for the default of a column. The sequence I used would have been half way up the linked list in the unpatched version, so init_sequence would have had to loop 15000 times before finding the sequence.

Here is the average and median over 8 runs of the INSERT statement

	Patched	Unpatched	increased by
AVERAGE	482.2626	597.66375	24%
MEDIAN	471.2205	567.949	21%

Just for clarity, during testing I added the following line to the switch_to_hashtable() function

elog(NOTICE, "moved sequences into hash table");

I've attached seqeuence.sql which includes all of my tests, the functions I used for benchmarking and comments with the results that I got during running the tests.

I've also attached the results formatted a little better in a spreadsheet.

Please let me know if there is something else you think I should test.

Regards

David Rowley

- Heikki

Attachment

pgsql-hackers by date:

From: KONDO Mitsumasa
Date: 14 November 2013, 12:24:02
Subject: Re: Add min and max execute statement time in pg_stat_statement

From: Andrew Dunstan
Date: 14 November 2013, 12:47:47
Subject: Re: nested hstore patch

Re: init_sequence spill to hash table - Mailing list pgsql-hackers

Attachment

Previous

Next