Thread: init_sequence spill to hash table

init_sequence spill to hash table

From
David Rowley
Date:
Here http://www.postgresql.org/message-id/24278.1352922571@sss.pgh.pa.us there was some talk about init_sequence being a bottleneck when many sequences are used in a single backend.

The attached I think implements what was talked about in the above link which for me seems to double the speed of a currval() loop over 30000 sequences. It goes from about 7 seconds to 3.5 on my laptop.

I thought I would post the patch early to see if this is actually wanted before I do too much more work on it.

My implementation keeps using the linear list for sequences up to a defined threshold (currently 32), then it moves everything over to a hashtable and frees the list.
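
(For illustration, a rough sketch of that spill logic in the context of sequence.c; the threshold constant, seqtab_count and the exact moving code are assumptions, not necessarily what the patch does. It assumes SeqTableData keeps relid as its first field so it can double as the dynahash key:)

    #define SEQ_LIST_THRESHOLD 32

    static SeqTable seqtab = NULL;     /* linear list, used below threshold */
    static HTAB *seqhashtab = NULL;    /* non-NULL once we have spilled */
    static int  seqtab_count = 0;      /* list length while in list mode */

    /*
     * Move every list entry into a newly created hash table, then free
     * the list cells.  Called once, when the threshold is exceeded.
     */
    static void
    switch_to_hashtable(void)
    {
        HASHCTL     ctl;
        SeqTable    elm;
        SeqTable    next;

        memset(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(Oid);
        ctl.entrysize = sizeof(SeqTableData);
        ctl.hash = oid_hash;

        seqhashtab = hash_create("Sequence values", 64, &ctl,
                                 HASH_ELEM | HASH_FUNCTION);

        for (elm = seqtab; elm != NULL; elm = next)
        {
            SeqTable    hentry;
            bool        found;

            next = elm->next;
            hentry = (SeqTable) hash_search(seqhashtab, &elm->relid,
                                            HASH_ENTER, &found);
            memcpy(hentry, elm, sizeof(SeqTableData));
            hentry->next = NULL;   /* the link is unused in hashtable mode */
            free(elm);             /* the pre-patch code malloc'd the cells */
        }
        seqtab = NULL;
    }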

A more complete solution would contain regression tests to exercise the hash table code. 

I know there is a desire to move sequences over to a single table still, but I see this as a separate patch and storing current values in a hash table for each backend should still be a win even if/when the single table stuff gets implemented.

Regards

David Rowley

Attachment

Re: init_sequence spill to hash table

From
Heikki Linnakangas
Date:
On 13.11.2013 11:55, David Rowley wrote:
> I thought I would post the patch early to see if this is actually wanted
> before I do too much more work on it.

Seems reasonable.

> My implementation keeps using the linear list for sequences up to a
> defined threshold (currently 32), then it moves everything over to a
> hashtable and frees the list.

Did you check how it would perform if you just always used the hash 
table? Or if you just have a single entry before you move to hash table, 
ie. set the threshold to 2? That would be slightly simpler.

- Heikki



Re: init_sequence spill to hash table

From
David Rowley
Date:
On Thu, Nov 14, 2013 at 12:15 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 13.11.2013 11:55, David Rowley wrote:
I thought I would post the patch early to see if this is actually wanted
before I do too much more work on it.

Seems reasonable.


My implementation keeps using the linear list for sequences up to a
defined threshold (currently 32), then it moves everything over to a
hashtable and frees the list.

Did you check how it would perform if you just always used the hash table? Or if you just have a single entry before you move to hash table, ie. set the threshold to 2? That would be slightly simpler.


I've just completed some more benchmarking of this. I didn't try dropping the threshold down to 2 or 0 but I did tests at the cut over point and really don't see much difference in performance between the list at 32 and the hashtable at 33 sequences. The hash table version excels in the 16000 sequence test in comparison to the unpatched version.

Times are in milliseconds of the time it took to call currval() 100000 times for 1 sequence.
                 Patched    Unpatched   increased by
 1 in cache      1856.452   1844.11     -1%
 32 in cache     1841.84    1802.433    -2%
 33 in cache     1861.558   not tested  N/A
 16000 in cache  1963.711   10329.22    426%

There was not much point in testing 33 sequences with the unpatched version, as there is no switching to a hash table. Most likely the 1 and 2% drops in speed are noise, as the only instructions added are a test in init_sequence to check whether the hashtable has been created, and a counter to track the number of sequences in the list while still in list mode.

I also tested filling the cache with 30000 sequences then inserting 100000 records into a table which used currval() for the default of a column. The sequence I used would have been half way up the linked list in the unpatched version, so init_sequence would have had to loop 15000 times before finding the sequence.

Here is the average and median over 8 runs of the INSERT statement:
          Patched    Unpatched   increased by
 AVERAGE  482.2626   597.66375   24%
 MEDIAN   471.2205   567.949     21%


Just for clarity, during testing I added the following line to the switch_to_hashtable() function

elog(NOTICE, "moved sequences into hash table");

I've attached sequence.sql which includes all of my tests, the functions I used for benchmarking, and comments with the results that I got while running the tests.

I've also attached the results formatted a little better in a spreadsheet.

Please let me know if there is something else you think I should test.

Regards

David Rowley
 
- Heikki

Attachment

Re: init_sequence spill to hash table

From
Heikki Linnakangas
Date:
On 14.11.2013 14:38, David Rowley wrote:
> I've just completed some more benchmarking of this. I didn't try dropping
> the threshold down to 2 or 0 but I did tests at the cut over point and
> really don't see much difference in performance between the list at 32 and
> the hashtable at 33 sequences. The hash table version excels in the 16000
> sequence test in comparison to the unpatched version.
>
> Times are in milliseconds of the time it took to call currval() 100000
> times for 1 sequence.
>                  Patched    Unpatched   increased by
>  1 in cache      1856.452   1844.11     -1%
>  32 in cache     1841.84    1802.433    -2%
>  33 in cache     1861.558   not tested  N/A
>  16000 in cache  1963.711   10329.22    426%

If I understand those results correctly, the best case scenario with the 
current code takes about 1800 ms. There's practically no difference with 
N <= 32, where N is the number of sequences touched. The hash table 
method also takes about 1800 ms when N=33. The performance of the hash 
table is O(1), so presumably we can extrapolate from that that it's the 
same for any N.

I think that means that we should just completely replace the list with 
the hash table. The difference with a small N is lost in noise, so 
there's no point in keeping the list as a fast path for small N. That'll 
make the patch somewhat simpler.
- Heikki



Re: init_sequence spill to hash table

From
Andres Freund
Date:
Hi,

On 2013-11-13 22:55:43 +1300, David Rowley wrote:
> Here http://www.postgresql.org/message-id/24278.1352922571@sss.pgh.pa.us there
> was some talk about init_sequence being a bottleneck when many sequences
> are used in a single backend.
> 
> The attached I think implements what was talked about in the above link
> which for me seems to double the speed of a currval() loop over 30000
> sequences. It goes from about 7 seconds to 3.5 on my laptop.

I think it'd be a better idea to integrate the sequence caching logic
into the relcache. There's a comment about it:
 * (We can't
 * rely on the relcache, since it's only, well, a cache, and may decide to
 * discard entries.)
 
but that's not really accurate anymore. We have the infrastructure for
keeping values across resets and we don't discard entries.

Since we already do a relcache lookup for every sequence manipulation
(c.f. init_sequence()), relying on it won't increase, but rather
decrease, the overhead.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: init_sequence spill to hash table

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> I think it'd be a better idea to integrate the sequence caching logic
> into the relcache. There's a comment about it:
>  * (We can't
>  * rely on the relcache, since it's only, well, a cache, and may decide to
>  * discard entries.)
> but that's not really accurate anymore. We have the infrastructure for
> keeping values across resets and we don't discard entries.

We most certainly *do* discard entries, if they're not open when a cache
flush event comes along.

I suppose it'd be possible to mark a relcache entry for a sequence
as locked-in-core, but that doesn't attract me at all.  A relcache
entry is a whole lot larger than the amount of state we really need
to keep for a sequence.

One idea is to have a hashtable for the sequence-specific data,
but to add a link field to the relcache entry that points to the
non-flushable sequence hashtable entry.  That would save the second
hashtable lookup as long as the relcache entry hadn't been flushed
since last use, while not requiring any violence to the lifespan
semantics of relcache entries.  (Actually, if we did that, it might
not even be worth converting the list to a hashtable?  Searches would
become a lot less frequent.)
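
(For illustration, roughly what that could look like; rd_seqstate and get_seq_state() are hypothetical names, and the fast/slow paths here are a sketch of the idea rather than working code for any particular PostgreSQL version:)

    /*
     * Per-sequence state lives in a backend-local hash table that is never
     * flushed; the relcache entry merely caches a pointer into it.
     */
    static HTAB *seqhashtab = NULL;

    static SeqTable
    get_seq_state(Relation seqrel)
    {
        SeqTable    state;
        bool        found;

        /* Fast path: the relcache entry still carries the link. */
        if (seqrel->rd_seqstate != NULL)    /* hypothetical field */
            return seqrel->rd_seqstate;

        /*
         * Slow path: the relcache entry was rebuilt after a flush.  Look
         * the state up again, creating it on first use, and re-link it.
         */
        state = (SeqTable) hash_search(seqhashtab, &seqrel->rd_id,
                                       HASH_ENTER, &found);
        if (!found)
            state->last_valid = false;   /* fresh entry: no cached values */

        seqrel->rd_seqstate = state;
        return state;
    }
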
        regards, tom lane



Re: init_sequence spill to hash table

From
Andres Freund
Date:
On 2013-11-14 09:23:20 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > I think it'd be a better idea to integrate the sequence caching logic
> > into the relcache. There's a comment about it:
> >  * (We can't
> >  * rely on the relcache, since it's only, well, a cache, and may decide to
> >  * discard entries.)
> > but that's not really accurate anymore. We have the infrastructure for
> > keeping values across resets and we don't discard entries.
> 
> We most certainly *do* discard entries, if they're not open when a cache
> flush event comes along.

What I was aiming at is that we don't discard them because of a limited
cache size. I don't think it means much that we flush the entry when
it's changed but not referenced.

> I suppose it'd be possible to mark a relcache entry for a sequence
> as locked-in-core, but that doesn't attract me at all.  A relcache
> entry is a whole lot larger than the amount of state we really need
> to keep for a sequence.

But effectively that's what already happens unless somebody else does
an ALTER SEQUENCE or a sinval overflow happens, right? So in production
workloads we will already keep both around, since hopefully neither
altering a sequence nor sinval overflows should be a frequent thing.

> One idea is to have a hashtable for the sequence-specific data,
> but to add a link field to the relcache entry that points to the
> non-flushable sequence hashtable entry.  That would save the second
> hashtable lookup as long as the relcache entry hadn't been flushed
> since last use, while not requiring any violence to the lifespan
> semantics of relcache entries.  (Actually, if we did that, it might
> not even be worth converting the list to a hashtable?  Searches would
> become a lot less frequent.)

That's not a bad idea if we decide not to integrate them. And I agree,
there's not much need to have a separate hashtable in that case.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: init_sequence spill to hash table

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-11-14 09:23:20 -0500, Tom Lane wrote:
>> We most certainly *do* discard entries, if they're not open when a cache
>> flush event comes along.

> What I was aiming at is that we don't discard them because of a limited
> cache size. I don't think it means much that we flush the entry when
> it's changed but not referenced.

Well, I don't want non-user-significant events (such as an sinval queue
overrun) causing sequence state to get discarded.  We would get bug
reports about lost sequence values.
        regards, tom lane



Re: init_sequence spill to hash table

From
Andres Freund
Date:
On 2013-11-14 09:47:18 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2013-11-14 09:23:20 -0500, Tom Lane wrote:
> >> We most certainly *do* discard entries, if they're not open when a cache
> >> flush event comes along.
> 
> > What I was aiming at is that we don't discard them because of a limited
> > cache size. I don't think it means much that we flush the entry when
> > it's changed but not referenced.
> 
> Well, I don't want non-user-significant events (such as an sinval queue
> overrun) causing sequence state to get discarded.  We would get bug
> reports about lost sequence values.

But we can easily do as you suggest and simply retain the entry in that
case. I am just not seeing the memory overhead argument as counting much
since we don't protect against it in normal operation.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: init_sequence spill to hash table

From
David Rowley
Date:
<div dir="ltr"><br /><div class="gmail_extra"><br /><br /><div class="gmail_quote">On Fri, Nov 15, 2013 at 3:03 AM,
HeikkiLinnakangas <span dir="ltr"><<a href="mailto:hlinnakangas@vmware.com"
target="_blank">hlinnakangas@vmware.com</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0px 0px
0px0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div
class="im">On14.11.2013 14:38, David Rowley wrote:<br /><blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I've just
completedsome more benchmarking of this. I didn't try dropping<br /> the threshold down to 2 or 0 but I did tests at
thecut over point and<br /> really don't see much difference in performance between the list at 32 and<br /> the
hashtableat 33 sequences. The hash table version excels in the 16000<br /> sequence test in comparison to the unpatched
version.<br/><br /> Times are in milliseconds of the time it took to call currval() 100000<br /> times for 1
sequence.<br/>       Patched Unpatched increased by  1 in cache 1856.452 1844.11 -1%  32 in<br /> cache 1841.84
1802.433-2%  33 in cache 1861.558  not tested N/A  16000 in<br /> cache 1963.711 10329.22 426%<br /></blockquote><br
/></div>If I understand those results correctly, the best case scenario with the current code takes about 1800 ms.
There'spractically no difference with N <= 32, where N is the number of sequences touched. The hash table method
alsotakes about 1800 ms when N=33. The performance of the hash table is O(1), so presumably we can extrapolate from
thatthat it's the same for any N.<br /><br /></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Ithink that
meansthat we should just completely replace the list with the hash table. The difference with a small N is lost in
noise,so there's no point in keeping the list as a fast path for small N. That'll make the patch somewhat simpler.<span
class=""><fontcolor="#888888"><br /> - Heikki<br /></font></span></blockquote></div><br /></div><div
class="gmail_extra">Ihad thought that maybe the biggest type of workloads might only touch 1 or 2 sequences, though it
maybe small but I had thought there would be an overhead in both cycles and memory usage in creating a hash table for
theselight usages of sequence backends. It would certainly make the patch more simple by removing this and it would
alsomean that I could remove the sometimes unused ->next member from the SeqTableData struct which is just now set
toNULL when in hash table mode. If you think it's the way to go then I can make the change, though maybe I'll hold off
therefactor for now as it looks like other ideas have come up around rel cache.</div><div class="gmail_extra"><br
/></div><divclass="gmail_extra">Regards</div><div class="gmail_extra"><br /></div><div class="gmail_extra">David
Rowley</div></div>

Re: init_sequence spill to hash table

From
David Rowley
Date:
On Fri, Nov 15, 2013 at 3:12 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Hi,

On 2013-11-13 22:55:43 +1300, David Rowley wrote:
> Here http://www.postgresql.org/message-id/24278.1352922571@sss.pgh.pa.us there
> was some talk about init_sequence being a bottleneck when many sequences
> are used in a single backend.
>
> The attached I think implements what was talked about in the above link
> which for me seems to double the speed of a currval() loop over 30000
> sequences. It goes from about 7 seconds to 3.5 on my laptop.

I think it'd be a better idea to integrate the sequence caching logic
into the relcache. There's a comment about it:
 * (We can't
 * rely on the relcache, since it's only, well, a cache, and may decide to
 * discard entries.)
but that's not really accurate anymore. We have the infrastructure for
keeping values across resets and we don't discard entries.


I just want to check this idea against an existing todo item to move sequences into a single table, as by the sounds of it this ties sequences even more closely to being relations.

The todo item reads:

"Consider placing all sequences in a single table, or create a system view"

This had been on the back of my mind while implementing the hash table stuff for init_sequence and again when doing my benchmarks where I created 30000 sequences and went through the pain of having a path on my file system with 30000 8k files.
It sounds like your idea overlaps with this todo a little, so maybe this is a good time to decide which would be best, though the more I think about it, the more I think that moving sequences into a single table is a no-go.

So for implementing moving sequences into a single system table:

1. The search_path stuff makes this a bit more complex. It sounds like this would require some duplication of the search_path logic.

2. There is also the problem with tracking object dependency.

    Currently:
    create sequence t_a_seq;
    create table t (a int not null default nextval('t_a_seq'));
    alter sequence t_a_seq owned by t.a;
    drop table t;
    drop sequence t_a_seq; -- already deleted by drop table t
    ERROR:  sequence "t_a_seq" does not exist

    Moving sequences to a single table sounds like a special case for this logic.


3. Would moving sequences to a table still have to check that no duplicate object existed in pg_class?
    Currently you can't have a sequence with the same name as a table

    create sequence a;
    create table a (a int);
    ERROR:  relation "a" already exists


It's not that I'm trying to shoot holes in moving sequences to a single table; really I'd like to find a way to reduce the wastefulness of having one file per sequence lying around my file system. But if changing this is a no-go then it would be better to take it off the todo list, so that we don't run into as many issues pouring more concrete into the sequences-being-relations mould.

Regards

David Rowley

Re: init_sequence spill to hash table

From
David Rowley
Date:
On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote
I think that means that we should just completely replace the list with the hash table. The difference with a small N is lost in noise, so there's no point in keeping the list as a fast path for small N. That'll make the patch somewhat simpler.
- Heikki

Attached is a much more simple patch which gets rid of the initial linked list. 

Attachment

Re: init_sequence spill to hash table

From
David Rowley
Date:
On Fri, Nov 15, 2013 at 3:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@2ndquadrant.com> writes:
> I think it'd be a better idea to integrate the sequence caching logic
> into the relcache. There's a comment about it:
>  * (We can't
>  * rely on the relcache, since it's only, well, a cache, and may decide to
>  * discard entries.)
> but that's not really accurate anymore. We have the infrastructure for
> keeping values across resets and we don't discard entries.

We most certainly *do* discard entries, if they're not open when a cache
flush event comes along.

I suppose it'd be possible to mark a relcache entry for a sequence
as locked-in-core, but that doesn't attract me at all.  A relcache
entry is a whole lot larger than the amount of state we really need
to keep for a sequence.

One idea is to have a hashtable for the sequence-specific data,
but to add a link field to the relcache entry that points to the
non-flushable sequence hashtable entry.  That would save the second
hashtable lookup as long as the relcache entry hadn't been flushed
since last use, while not requiring any violence to the lifespan
semantics of relcache entries.  (Actually, if we did that, it might
not even be worth converting the list to a hashtable?  Searches would
become a lot less frequent.)


Unless I've misunderstood something, it looks like this would mean giving heapam.c and relcache.c knowledge of sequences.
Currently relation_open is called from open_share_lock in sequence.c. The only way I can see to do this would be to add something like relation_open_sequence() in heapam.c which means we'd need to invent RelationIdGetSequenceRelation() and use that instead of RelationIdGetRelation() and somewhere along the line have it pass back the SeqTableData struct which would be tagged onto RelIdCacheEnt.

I think it can be done but I don't think it will look pretty.
Perhaps if there was a more generic way... Would tagging some void *rd_extra onto the RelationData be a better way, and just have sequence.c make use of that for storing the SeqTableData?

Also I'm wondering what we'd do with all these pointers when someone does DISCARD SEQUENCES; would we have to invalidate the relcache, or would it just be a matter of looping over it, freeing the sequence data and setting the pointers to NULL?

Regards

David Rowley

Re: init_sequence spill to hash table

From
Andres Freund
Date:
On 2013-11-15 14:22:30 +1300, David Rowley wrote:
> On Fri, Nov 15, 2013 at 3:12 AM, Andres Freund <andres@2ndquadrant.com>wrote:
> 
> > Hi,
> >
> > On 2013-11-13 22:55:43 +1300, David Rowley wrote:
> > > Here http://www.postgresql.org/message-id/24278.1352922571@sss.pgh.pa.us there
> > > was some talk about init_sequence being a bottleneck when many sequences
> > > are used in a single backend.
> > >
> > > The attached I think implements what was talked about in the above link
> > > which for me seems to double the speed of a currval() loop over 30000
> > > sequences. It goes from about 7 seconds to 3.5 on my laptop.
> >
> > I think it'd be a better idea to integrate the sequence caching logic
> > into the relcache. There's a comment about it:
> >  * (We can't
> >  * rely on the relcache, since it's only, well, a cache, and may decide to
> >  * discard entries.)
> > but that's not really accurate anymore. We have the infrastructure for
> > keeping values across resets and we don't discard entries.
> >
> >
> I just want to check this idea against an existing todo item to move
> sequences into a single table, as by the sounds of it this ties
> sequences even more closely to being relations.

> This had been on the back of my mind while implementing the hash table
> stuff for init_sequence and again when doing my benchmarks where I created
> 30000 sequences and went through the pain of having a path on my file
> system with 30000 8k files.

Well. But in which real-world use cases is that actually the bottleneck?

> 1. The search_path stuff makes this a bit more complex. It sounds like this
> would require some duplication of the search_path logic.

I'd assumed that if we were to do this, the sequences themselves would
still continue to live in pg_class. Just instead of a relfilenode
containing their state it would be stored in an extra table.

> 2. There is also the problem with tracking object dependency.
> 
>     Currently:
>     create sequence t_a_seq;
>     create table t (a int not null default nextval('t_a_seq'));
>     alter sequence t_a_seq owned by t.a;
>     drop table t;
>     drop sequence t_a_seq; -- already deleted by drop table t
>     ERROR:  sequence "t_a_seq" does not exist
> 
>     Moving sequences to a single table sounds like a special case for this
> logic.

There should already be code handling such dependencies.

4) Scalability problems: the single block each sequence uses can already
be a major contention issue when you have parallel inserts into the same
table; that is a workload which, unlike a couple thousand unrelated
sequences, I actually think is more realistic. So we'd need to force one
sequence tuple per block, which we currently cannot do.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: init_sequence spill to hash table

From
Andres Freund
Date:
On 2013-11-15 19:12:15 +1300, David Rowley wrote:
> On Fri, Nov 15, 2013 at 3:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > Andres Freund <andres@2ndquadrant.com> writes:
> > > I think it'd be a better idea to integrate the sequence caching logic
> > > into the relcache. There's a comment about it:
> > >  * (We can't
> > >  * rely on the relcache, since it's only, well, a cache, and may decide
> > to
> > >  * discard entries.)
> > > but that's not really accurate anymore. We have the infrastructure for
> > > keeping values across resets and we don't discard entries.
> >
> > We most certainly *do* discard entries, if they're not open when a cache
> > flush event comes along.
> >
> > I suppose it'd be possible to mark a relcache entry for a sequence
> > as locked-in-core, but that doesn't attract me at all.  A relcache
> > entry is a whole lot larger than the amount of state we really need
> > to keep for a sequence.
> >
> > One idea is to have a hashtable for the sequence-specific data,
> > but to add a link field to the relcache entry that points to the
> > non-flushable sequence hashtable entry.  That would save the second
> > hashtable lookup as long as the relcache entry hadn't been flushed
> > since last use, while not requiring any violence to the lifespan
> > semantics of relcache entries.  (Actually, if we did that, it might
> > not even be worth converting the list to a hashtable?  Searches would
> > become a lot less frequent.)
> >
> >
> Unless I've misunderstood something, it looks like this would mean giving
> heapam.c and relcache.c knowledge of sequences.
> Currently relation_open is called from open_share_lock in sequence.c. The
> only way I can see to do this would be to add something like
> relation_open_sequence() in heapam.c which means we'd need to invent
> RelationIdGetSequenceRelation() and use that instead
> of RelationIdGetRelation() and somewhere along the line have it pass back
> the SeqTableData struct which would be tagged onto RelIdCacheEnt.

Looks like indexes already go through that path, and store data in the
relcache, without requiring heapam.c to know all that much.

> I think it can be done but I don't think it will look pretty.
> Perhaps if there was a more generic way... Would tagging some void
> *rd_extra onto the RelationData be a better way, and just have sequence.c
> make use of that for storing the SeqTableData?

I'd rather have it properly typed in ->rd_sequence, maybe in a union
against the index members.
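
(As a sketch of the shape only, not the actual rel.h layout; only one arm of the union would be meaningful for any given relkind:)

    struct SeqTableData;            /* backend-local sequence state */

    typedef struct RelationDataSketch
    {
        Oid         rd_id;          /* relation OID */
        char        rd_relkind;     /* RELKIND_SEQUENCE, RELKIND_INDEX, ... */
        union
        {
            void       *rd_indexinfo;   /* stand-in for the index-only fields */
            struct SeqTableData *rd_sequence;   /* sequence-only state */
        }           rd_kind;
    } RelationDataSketch;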

> Also I'm wondering what we'd do with all these pointers when someone does
> DISCARD SEQUENCES; would we have to invalidate the relcache, or would it
> just be a matter of looping over it, freeing the sequence data and setting
> the pointers to NULL?

Good question. Have a 'discard_count' member and global variable? That
starts at zero and gets incremented every time somebody discards?
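
(A sketch of that counter idea; every name here is made up:)

    /* Bumped once per DISCARD SEQUENCES in this backend. */
    static uint32 seq_discard_count = 0;

    /* Stored in the relcache entry alongside the cached pointer. */
    typedef struct SeqStateLink
    {
        struct SeqTableData *state;     /* cached pointer, or NULL */
        uint32      discard_count;      /* seq_discard_count at link time */
    } SeqStateLink;

    /*
     * The cached pointer is trusted only if no DISCARD has happened since
     * it was established; otherwise the caller does a fresh lookup.
     */
    static inline bool
    seq_link_is_valid(SeqStateLink *link)
    {
        return link->state != NULL &&
               link->discard_count == seq_discard_count;
    }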

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: init_sequence spill to hash table

From
Heikki Linnakangas
Date:
On 15.11.2013 07:47, David Rowley wrote:
> On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas <hlinnakangas@vmware.com
>> wrote
>>
>> I think that means that we should just completely replace the list with
>> the hash table. The difference with a small N is lost in noise, so there's
>> no point in keeping the list as a fast path for small N. That'll make the
>> patch somewhat simpler.
>
> Attached is a much more simple patch which gets rid of the initial linked
> list.

Thanks, committed with minor copy-editing. I dialed down the initial
size of the hash table from 1000 to 16, that ought to be enough.

I ran a quick performance test of my own, based on the script you sent.
I modified it a bit to eliminate the PL/pgSQL overhead, making it more
heavily bottlenecked by the nextval/currval overhead. Results:

                       unpatched   patched
nextval, 10000 seqs    36772       2426
currval, 1 seq         1176        1069
currval, 2 seqs        865         857
currval, 4 seqs        742         759
currval, 5 seqs        718         711
currval, 10 seqs       680         668
currval, 100 seqs      871         656
currval, 1000 seqs     3507        700
currval, 10000 seqs    34742       1224

The performance when you touch only a few sequences is unchanged. When
you touch a lot of them, you gain. Just as you would expect.

Attached is the test script I used. After running the test, I realized
that there's a little flaw in the test methodology, but it doesn't
invalidate the results. I used the same backend for all the test runs,
so even when currval() is called repeatedly for a single sequence, the
linked list (or hash table, with the patch) nevertheless contains
entries for all 10000 sequences. However, the sequences actually used by
the test are always at the front of the list, because the nextval()
calls were made in the same order. But with the unpatched version, if
you called currval() repeatedly on the last-initialized sequence instead
of the first-initialized one, you would get horrible performance, even
though you touch only a single sequence.

Regarding the more grandiose ideas of using the relcache or rewriting
the way sequences are stored altogether: this patch might become
obsolete if we do any of that stuff, but that's ok. The immediate
performance problem has been solved now, but those other ideas might be
worth pursuing for other reasons.

- Heikki

Attachment

Re: init_sequence spill to hash table

From
Andres Freund
Date:
On 2013-11-15 12:31:54 +0200, Heikki Linnakangas wrote:
> On 15.11.2013 07:47, David Rowley wrote:
> >On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas <hlinnakangas@vmware.com
> >>wrote
> >>
> >>I think that means that we should just completely replace the list with
> >>the hash table. The difference with a small N is lost in noise, so there's
> >>no point in keeping the list as a fast path for small N. That'll make the
> >>patch somewhat simpler.
> >
> >Attached is a much more simple patch which gets rid of the initial linked
> >list.
> 
> Thanks, committed with minor copy-editing. I dialed down the initial size of
> the hash table from 1000 to 16, that ought to be enough.

I am slightly confused by the use of HASH_CONTEXT without setting
ctl->hcxt; it works, but seems more like an accident.
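
(For reference, HASH_CONTEXT makes dynahash create the table's private memory context as a child of ctl->hcxt, so the flag is normally paired with an explicit parent. A sketch of the usual pattern, with CacheMemoryContext as an assumed choice of parent:)

    HASHCTL     ctl;

    memset(&ctl, 0, sizeof(ctl));
    ctl.keysize = sizeof(Oid);
    ctl.entrysize = sizeof(SeqTableData);
    ctl.hash = oid_hash;
    ctl.hcxt = CacheMemoryContext;      /* parent for the table's context */

    seqhashtab = hash_create("Sequence values", 16, &ctl,
                             HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);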

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: init_sequence spill to hash table

From
David Rowley
Date:
On Fri, Nov 15, 2013 at 11:31 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
Thanks, committed with minor copy-editing. I dialed down the initial size of the hash table from 1000 to 16, that ought to be enough.


Great. Thanks for committing.
 
Regards

David Rowley
 
- Heikki


Re: init_sequence spill to hash table

From
Heikki Linnakangas
Date:
On 15.11.2013 12:44, Andres Freund wrote:
> On 2013-11-15 12:31:54 +0200, Heikki Linnakangas wrote:
>> On 15.11.2013 07:47, David Rowley wrote:
>>> On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas <hlinnakangas@vmware.com
>>>> wrote
>>>>
>>>> I think that means that we should just completely replace the list with
>>>> the hash table. The difference with a small N is lost in noise, so there's
>>>> no point in keeping the list as a fast path for small N. That'll make the
>>>> patch somewhat simpler.
>>>
>>> Attached is a much more simple patch which gets rid of the initial linked
>>> list.
>>
>> Thanks, committed with minor copy-editing. I dialed down the initial size of
>> the hash table from 1000 to 16, that ought to be enough.
>
> I am slightly confused by the use of HASH_CONTEXT without setting
> ctl->hcxt; it works, but seems more like an accident.

Ugh. Accident indeed. Thanks, fixed!

- Heikki