On 12/03/2014 11:24 AM, Andres Freund wrote:
> On 2014-12-03 10:59:50 +0100, José Luis Tallón wrote:
>> [snip]
> I don't think the WAL logging would need to change much in comparison to
> the current solution. We'd just add the page number to the WAL record.
>
> The biggest advantage would be to require fewer heavyweight locks,
> because the pg_sequence one could be a single fastpath lock. Currently
> we have to take the sequence's relation lock (heavyweight) and then the
> page level lock (lwlock) for every single sequence used.
Got it, thank you for the explanation.
>> May I possibly suggest a file-per-schema model instead? This approach would
>> certainly solve the excessive i-node consumption problem that --I guess--
>> Andres is trying to address here.
> I don't think that really has any advantages.
Just spreading the I/O load, nothing more, it seems. To elaborate a bit
on the reasoning, for completeness' sake:
Given that a relation segment's maximum size is 1GB, we'd have
(1048576 kB / 8 kB per page) = 128k sequences per relation segment.
Arguably, not many real use cases will have that many sequences... save
for *massively* multi-tenant databases.
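Spelling the arithmetic out (assuming the default 8 kB block size and each
sequence still occupying a full page of its own, as it effectively does today):

    1 GB segment / 8 kB per page = 131,072 pages per segment
    1 sequence per page          => ~128k sequences per segment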
The downside is that all that random I/O (in general, it can't really be
sequential unless there are very few sequences) can't be spread to other
spindles. Maybe create a "sequence_default_tablespace" GUC plus
ALTER SEQUENCE ... SET TABLESPACE, so that an SSD can be used for this
purpose? (I could take a shot at the patch, if deemed worthwhile.)
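Roughly what I have in mind, usage-wise (the GUC and the ALTER SEQUENCE
form are the proposed additions; neither exists today, and all names are
just placeholders):

    -- dedicated (e.g. SSD-backed) tablespace for sequence data
    CREATE TABLESPACE seq_fast LOCATION '/mnt/ssd/pg_seq';

    -- proposed GUC: newly created sequences default to that tablespace
    SET sequence_default_tablespace = 'seq_fast';
    CREATE SEQUENCE tenant_1234_id_seq;

    -- proposed command: move an already existing sequence
    ALTER SEQUENCE tenant_1234_id_seq SET TABLESPACE seq_fast;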
>> [snip]
> The overhead of a file is much more than wasting the remainder of a
> page. The requirement of separate fsyncs alone is pretty
> bothersome. The generated IO patterns are also much worse...
Yes, you are right. I stand corrected.
> [snip]
> I think there's pretty much no chance of accepting more than one page
> per sequence
Definitely.
Thanks,
J.L.