Re: init_sequence spill to hash table - Mailing list pgsql-hackers

From David Rowley
Subject Re: init_sequence spill to hash table
Msg-id CAApHDvoduD3jON1-4ZoY5-RahbE_=tPABM02WYSrwz79XUcuPQ@mail.gmail.com
In response to Re: init_sequence spill to hash table  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: init_sequence spill to hash table  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Fri, Nov 15, 2013 at 3:12 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Hi,

On 2013-11-13 22:55:43 +1300, David Rowley wrote:
> Here http://www.postgresql.org/message-id/24278.1352922571@sss.pgh.pa.us there
> was some talk about init_sequence being a bottleneck when many sequences
> are used in a single backend.
>
> The attached I think implements what was talked about in the above link
> which for me seems to double the speed of a currval() loop over 30000
> sequences. It goes from about 7 seconds to 3.5 on my laptop.

I think it'd be a better idea to integrate the sequence caching logic
into the relcache. There's a comment about it:
 * (We can't
 * rely on the relcache, since it's only, well, a cache, and may decide to
 * discard entries.)
but that's not really accurate anymore. We have the infrastructure for
keeping values across resets and we don't discard entries.


I just want to check this idea against an existing todo item to move sequences into a single table, since by the sounds of it this would tie sequences even more closely to being relations.

The todo item reads:

"Consider placing all sequences in a single table, or create a system view"

This had been at the back of my mind while implementing the hash table stuff for init_sequence, and again while doing my benchmarks, where I created 30000 sequences and went through the pain of having a directory on my file system with 30000 8k files.
It sounds like your idea overlaps with this todo a little, so maybe this is a good time to decide which would be best, though the more I think about it, the more I think that moving sequences into a single table is a no-go.

So, the issues I see with moving sequences into a single system table:

1. The search_path stuff makes this a bit more complex. It sounds like this would require some duplication of the search_path logic.

2. There is also the problem of tracking object dependencies.

    Currently:
    create sequence t_a_seq;
    create table t (a int not null default nextval('t_a_seq'));
    alter sequence t_a_seq owned by t.a;
    drop table t;
    drop sequence t_a_seq; -- already deleted by drop table t
    ERROR:  sequence "t_a_seq" does not exist

    Moving sequences to a single table sounds like it would need a special case in this logic.


3. Would sequences in a single table still have to be checked against pg_class for duplicate object names?
    Currently you can't have a sequence with the same name as a table:

    create sequence a;
    create table a (a int);
    ERROR:  relation "a" already exists
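
As a rough illustration of point 1 (the schema and sequence names here are made up): today an unqualified sequence name is resolved against the search_path like any other relation name, so a single sequence table would presumably have to reproduce behaviour along these lines itself.

```sql
-- Hypothetical schemas and sequences, for illustration only.
create schema s1;
create schema s2;
create sequence s1.seq start 100;
create sequence s2.seq start 200;

-- The unqualified name picks the first match on the search_path.
set search_path = s2, s1;
select nextval('seq');  -- resolves to s2.seq, returns 200

-- Reordering the path changes which sequence the same name refers to.
set search_path = s1, s2;
select nextval('seq');  -- resolves to s1.seq, returns 100
```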


It's not that I'm trying to shoot holes in moving sequences to a single table; really, I'd like to find a way to reduce the wastefulness of having one file per sequence lying around my file system. But if changing this is a no-go, then it would be better for it to come off the todo list, and then we shouldn't have as many issues pouring more concrete into the sequences-being-relations mould.

Regards

David Rowley
