Re: Sequence Access Method WIP - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Sequence Access Method WIP
Date
Msg-id 20141203102425.GT2456@alap3.anarazel.de
In response to Re: Sequence Access Method WIP  (José Luis Tallón<jltallon@adv-solutions.net>)
Responses Re: Sequence Access Method WIP  (José Luis Tallón<jltallon@adv-solutions.net>)
List pgsql-hackers
On 2014-12-03 10:59:50 +0100, José Luis Tallón wrote:
> On 12/02/2014 08:21 PM, Andres Freund wrote:
> >[snip]
> >>2. Instead of the single amdata field, make it possible for the
> >>implementation to define any number of fields with any datatype in the
> >>tuple. That would make debugging, monitoring etc. easier.
> >My main problem with that approach is that it pretty much nails the door
> >shut for moving sequences into a catalog table instead of the current,
> >pretty insane, approach of a physical file per sequence.
> 
> Hmm...  having done my fair bit of testing, I can say that this isn't
> actually that bad (though depends heavily on the underlying filesystem and
> workload, of course)
> With this approach, I fear extreme I/O contention with an update-heavy
> workload... unless all sequence activity is finally WAL-logged and hence
> writes to the actual files become mostly sequential and asynchronous.

I don't think the WAL logging would need to change much in comparison to
the current solution. We'd just add the page number to the WAL record.
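
To illustrate the page-number point, here is a minimal sketch in C. All
struct and function names are hypothetical (this is not PostgreSQL's actual
WAL record layout or redo code), and it assumes one sequence tuple per page
in a single shared relation, with the value stored at the start of the page:

```c
#include <stdint.h>
#include <string.h>

#define SEQ_PAGE_SIZE 8192   /* assumed page size for this sketch */

/* Hypothetical record for the file-per-sequence layout: the file
 * identifier alone locates the (single) page to update. */
typedef struct SeqWalRecordPerFile
{
    uint32_t relfilenode;    /* identifies the one-page sequence file */
    int64_t  last_value;     /* sequence state to restore on replay */
} SeqWalRecordPerFile;

/* Hypothetical record for the shared-relation layout: same payload,
 * plus the page number that holds this sequence's tuple. */
typedef struct SeqWalRecordCatalog
{
    uint32_t relfilenode;    /* identifies the shared relation */
    uint32_t page_no;        /* the added field: which page to update */
    int64_t  last_value;     /* sequence state to restore on replay */
} SeqWalRecordCatalog;

/* Replay a record against an in-memory array of pages. */
static void
seq_redo(unsigned char *pages, const SeqWalRecordCatalog *rec)
{
    memcpy(pages + (size_t) rec->page_no * SEQ_PAGE_SIZE,
           &rec->last_value, sizeof(rec->last_value));
}

/* Read back the value stored on a given page. */
static int64_t
seq_read(const unsigned char *pages, uint32_t page_no)
{
    int64_t value;

    memcpy(&value, pages + (size_t) page_no * SEQ_PAGE_SIZE, sizeof(value));
    return value;
}
```

The only difference between the two hypothetical records is the page_no
field; the replay logic otherwise stays the same.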

The biggest advantage would be to require fewer heavyweight locks,
because the pg_sequence one could be a single fastpath lock. Currently
we have to take the sequence's relation lock (heavyweight) and then the
page level lock (lwlock) for every single sequence used.

> May I possibly suggest a file-per-schema model instead? This approach would
> certainly solve the excessive i-node consumption problem that --I guess--
> Andres is trying to address here.

I don't think that really has any advantages.

> >Currently, with
> >or without seqam, it'd not be all that hard to force it into a catalog,
> >taking care to force each tuple onto a separate page...
> 
> IMHO, this is just as wasteful as the current approach (one-page file per
> sequence) in terms of disk usage and complicates the code a bit .... but I
> really don't see how we can have more than one sequence state per page
> without severe (page) locking problems.

The overhead of a file is much more than wasting the remainder of a
page. The requirement of separate fsyncs and everything alone is pretty
bothersome. The generated IO patterns are also much worse...

> However, someone with deeper knowledge of page pinning and buffer manager
> internals could certainly devise a better solution...

I think there's pretty much no chance of accepting more than one page
per

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


