Thread: alternative back-end block formats

From
Christian Convey
Date:
Hi all,

I'm playing around with Postgres, and I thought it might be fun to experiment with alternative formats for relation blocks, to see if I can get smaller files and/or faster server performance.

Does anyone know if this has been done before with Postgres?  I would have assumed yes, but I'm not finding anything in Google about people having done this.

Thanks,
Christian

Re: alternative back-end block formats

From
Bruce Momjian
Date:
On Tue, Jan 21, 2014 at 06:43:54AM -0500, Christian Convey wrote:
> Hi all,
> 
> I'm playing around with Postgres, and I thought it might be fun to experiment
> with alternative formats for relation blocks, to see if I can get smaller files
> and/or faster server performance.
> 
> Does anyone know if this has been done before with Postgres?  I would have
> assumed yes, but I'm not finding anything in Google about people having done
> this.

Not that I know of.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +



Re: alternative back-end block formats

From
Craig Ringer
Date:
On 01/21/2014 07:43 PM, Christian Convey wrote:
> Hi all,
> 
> I'm playing around with Postgres, and I thought it might be fun to
> experiment with alternative formats for relation blocks, to see if I can
> get smaller files and/or faster server performance.

It's not clear how you'd do this without massively rewriting the guts of Pg.

Per the docs on internal structure, Pg has a block header, then tuples
within the blocks, each with a tuple header and list of Datum values for
the tuple. Each Datum has a generic Datum header (handling varlena vs
fixed length values etc) then a type-specific on-disk representation
controlled by the type output function for that type.

At least, that's my understanding - I haven't had cause to delve into
the on-disk format yet.
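
The block-header-plus-tuples layout described above can be sketched as a toy slotted page in C. This is a simplified illustration under stated assumptions, not PostgreSQL's actual format: `ToyPageHeader`, `ToyLinePointer`, and the `toy_page_*` functions are hypothetical names, and the real definitions (`PageHeaderData` in src/include/storage/bufpage.h, `HeapTupleHeaderData` in src/include/access/htup_details.h) carry considerably more state. The sketch only shows the general idea of item pointers growing from the front of the block while tuple data grows from the back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192

/* Hypothetical, heavily simplified page header (not PageHeaderData). */
typedef struct ToyPageHeader {
    uint16_t lower;   /* offset of first free byte after the line-pointer array */
    uint16_t upper;   /* offset of the first byte of tuple data */
} ToyPageHeader;

/* Hypothetical line pointer: where one tuple lives inside the page. */
typedef struct ToyLinePointer {
    uint16_t off;
    uint16_t len;
} ToyLinePointer;

typedef union ToyPage {
    ToyPageHeader hdr;
    uint8_t bytes[TOY_PAGE_SIZE];
} ToyPage;

static void toy_page_init(ToyPage *p)
{
    assert(p != NULL);
    memset(p->bytes, 0, TOY_PAGE_SIZE);
    p->hdr.lower = sizeof(ToyPageHeader);
    p->hdr.upper = TOY_PAGE_SIZE;
}

static size_t toy_page_free_space(const ToyPage *p)
{
    return (size_t) (p->hdr.upper - p->hdr.lower);
}

/* Store a tuple; returns its 0-based slot number, or -1 if it does not fit. */
static int toy_page_add_tuple(ToyPage *p, const void *data, uint16_t len)
{
    ToyLinePointer lp;
    int slot;

    if (toy_page_free_space(p) < (size_t) len + sizeof(ToyLinePointer))
        return -1;

    /* Tuple data is placed at the end of the free space... */
    p->hdr.upper -= len;
    memcpy(p->bytes + p->hdr.upper, data, len);

    /* ...and its line pointer is appended right after the header. */
    lp.off = p->hdr.upper;
    lp.len = len;
    memcpy(p->bytes + p->hdr.lower, &lp, sizeof lp);
    slot = (int) ((p->hdr.lower - sizeof(ToyPageHeader)) / sizeof(ToyLinePointer));
    p->hdr.lower += sizeof lp;
    return slot;
}

/* Look up a tuple by slot; returns NULL if the slot does not exist. */
static const void *toy_page_get_tuple(const ToyPage *p, int slot, uint16_t *len)
{
    ToyLinePointer lp;
    size_t pos = sizeof(ToyPageHeader) + (size_t) slot * sizeof(ToyLinePointer);

    if (slot < 0 || pos + sizeof lp > p->hdr.lower)
        return NULL;
    memcpy(&lp, p->bytes + pos, sizeof lp);
    *len = lp.len;
    return p->bytes + lp.off;
}
```

An alternative block format, in the sense discussed in this thread, would amount to changing what these structures and accessors look like while keeping callers unaware of the difference.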

What concrete problem do you mean to tackle? What idea do you want to
explore or implement?

> Does anyone know if this has been done before with Postgres?  I would
> have assumed yes, but I'm not finding anything in Google about people
> having done this.

AFAIK (and I don't know much in this area) the storage manager isn't
very pluggable compared to the rest of Pg.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: alternative back-end block formats

From
Christian Convey
Date:
Hi Craig,

On Sun, Jan 26, 2014 at 5:47 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 01/21/2014 07:43 PM, Christian Convey wrote:
> > Hi all,
> >
> > I'm playing around with Postgres, and I thought it might be fun to
> > experiment with alternative formats for relation blocks, to see if I can
> > get smaller files and/or faster server performance.
>
> It's not clear how you'd do this without massively rewriting the guts of Pg.
>
> Per the docs on internal structure, Pg has a block header, then tuples
> within the blocks, each with a tuple header and list of Datum values for
> the tuple. Each Datum has a generic Datum header (handling varlena vs
> fixed length values etc) then a type-specific on-disk representation
> controlled by the type output function for that type.

I'm still in the process of getting familiar with the pg backend code, so I don't have a concrete plan yet.  However, I'm working on the assumption that some set of macros and functions encapsulates the page layout.  

If/when I tackle this, I expect to add a layer of indirection somewhere around that boundary, so that some non-catalog tables, whose schemas meet certain simplifying assumptions, are read and modified using specialized code.
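
A minimal sketch of what such a layer of indirection might look like, assuming a per-table table of function pointers. Everything here is hypothetical: `PageFormatOps`, `ToyTable`, `ops_for_table`, the 8-byte per-row header of the "generic" format, and the "packed" format itself are invented for illustration and do not correspond to anything in the PostgreSQL source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical set of operations that encapsulates one page format. */
typedef struct PageFormatOps {
    const char *name;
    void    (*put_int32)(uint8_t *page, size_t row, int32_t v);
    int32_t (*get_int32)(const uint8_t *page, size_t row);
    size_t  (*capacity)(size_t page_size);   /* rows that fit in one page */
} PageFormatOps;

/* "Generic" toy format: every row carries an (assumed) 8-byte header. */
#define GENERIC_HDR 8
#define GENERIC_STRIDE (GENERIC_HDR + sizeof(int32_t))

static void generic_put(uint8_t *page, size_t row, int32_t v)
{
    memcpy(page + row * GENERIC_STRIDE + GENERIC_HDR, &v, sizeof v);
}

static int32_t generic_get(const uint8_t *page, size_t row)
{
    int32_t v;
    memcpy(&v, page + row * GENERIC_STRIDE + GENERIC_HDR, sizeof v);
    return v;
}

static size_t generic_capacity(size_t page_size)
{
    return page_size / GENERIC_STRIDE;
}

/* "Packed" toy format: fixed-width values stored back to back. */
static void packed_put(uint8_t *page, size_t row, int32_t v)
{
    memcpy(page + row * sizeof v, &v, sizeof v);
}

static int32_t packed_get(const uint8_t *page, size_t row)
{
    int32_t v;
    memcpy(&v, page + row * sizeof v, sizeof v);
    return v;
}

static size_t packed_capacity(size_t page_size)
{
    return page_size / sizeof(int32_t);
}

static const PageFormatOps generic_ops = { "generic", generic_put, generic_get, generic_capacity };
static const PageFormatOps packed_ops  = { "packed",  packed_put,  packed_get,  packed_capacity };

/* Hypothetical table descriptor carrying the "simplifying assumptions". */
typedef struct ToyTable {
    const char *relname;
    int is_catalog;          /* catalog tables always use the generic path */
    int fixed_width_only;    /* schema contains only fixed-width columns */
} ToyTable;

/* The indirection point: choose the page code based on the owning table. */
static const PageFormatOps *ops_for_table(const ToyTable *t)
{
    assert(t != NULL);
    if (!t->is_catalog && t->fixed_width_only)
        return &packed_ops;
    return &generic_ops;
}
```

Callers would go through `ops_for_table(&t)->get_int32(page, row)` rather than touching the page bytes directly, so a specialized format only has to supply its own set of callbacks.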
 
I don't want to get into the specific optimizations I'd like to try, only because I haven't fully studied the code yet, so I don't want to put my foot in my mouth.

> What concrete problem do you mean to tackle? What idea do you want to
> explore or implement?

My real motivation is that I'd like to get more familiar with the pg backend codebase, and tilting at this windmill seemed like an interesting way to accomplish that.

If I was focused on really solving a real-world problem, I'd say that this lays the groundwork for table-schema-specific storage optimizations and optimized record-filtering code.  But I'd only make that argument if I planned to (a) perform a careful study with statistically significant benchmarks, and/or (b) produce a merge-worthy patch.  At this point I have no intentions of doing so.  My main goal really is just to have fun with the code.


> > Does anyone know if this has been done before with Postgres?  I would
> > have assumed yes, but I'm not finding anything in Google about people
> > having done this.
>
> AFAIK (and I don't know much in this area) the storage manager isn't
> very pluggable compared to the rest of Pg.

Thanks for the warning.  Duly noted.

Kind regards,
Christian

Re: alternative back-end block formats

From
Cédric Villemain
Date:
Le lundi 27 janvier 2014 13:42:29 Christian Convey a écrit :
> On Sun, Jan 26, 2014 at 5:47 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> > On 01/21/2014 07:43 PM, Christian Convey wrote:
> > > Does anyone know if this has been done before with Postgres?  I would
> > > have assumed yes, but I'm not finding anything in Google about people
> > > having done this.
> >
> > AFAIK (and I don't know much in this area) the storage manager isn't
> > very pluggable compared to the rest of Pg.
>
> Thanks for the warning.  Duly noted.

As written in the meeting notes, Tom Lane revealed an interest in
pluggable storage. So it might be interesting to check that.

https://wiki.postgresql.org/wiki/PgCon_2013_Developer_Meeting


--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

Re: alternative back-end block formats

From
Christian Convey
Date:
On Tue, Jan 28, 2014 at 5:42 AM, Cédric Villemain <cedric@2ndquadrant.com> wrote:
... 
> As written in the meeting notes, Tom Lane revealed an interest in
> pluggable storage. So it might be interesting to check that.
>
> https://wiki.postgresql.org/wiki/PgCon_2013_Developer_Meeting

Thanks.  I just read those meeting notes, and also Josh Berkus' blog on the topic:

http://www.databasesoup.com/2013/05/postgresql-new-development-priorities-2.html

I was only thinking to enable pluggable operations on a single, specified heap page, probably as a function of which table owned the page.  Josh's blog seems to describe something a little broader in scope, although I can't tell from that post exactly what functionality that entails.

Either way, this sounds like something I'd enjoy pitching in on, to whatever extent I could be useful.  Has anyone started work on this yet?

Re: alternative back-end block formats

From
Tom Lane
Date:
Christian Convey <christian.convey@gmail.com> writes:
> On Tue, Jan 28, 2014 at 5:42 AM, Cédric Villemain <cedric@2ndquadrant.com> wrote:
>> As written in the meeting notes, Tom Lane revealed an interest in
>> pluggable storage. So it might be interesting to check that.
>> https://wiki.postgresql.org/wiki/PgCon_2013_Developer_Meeting

> Thanks.  I just read those meeting notes, and also Josh Berkus' blog on the
> topic:
> http://www.databasesoup.com/2013/05/postgresql-new-development-priorities-2.html

> I was only thinking to enable pluggable operations on a single, specified
> heap page, probably as a function of which table owned the page.  Josh's
> blog seems to describe something a little broader in scope, although I
> can't tell from that post exactly what functionality that entails.

> Either way, this sounds like something I'd enjoy pitching in on, to
> whatever extent I could be useful.  Has anyone started work on this yet?

Nope, but it's still on the radar screen.

There are a couple of really huge issues that would have to be argued out
before any progress could be made.

One is that tuple layout (particularly tuple header format) is something
known in detail throughout large parts of the system.  This is a PITA if
the storage layer would like to use some other tuple format, but is it
worthwhile or even possible to abstract it?

Another is that we've got whole *classes* of utility commands that are
specifically targeted to the storage engine we've got.  VACUUM, CLUSTER,
ALTER TABLE SET TABLESPACE for example.  Not to mention autovacuum.
It's not clear where these would fit if we tried to define a storage
engine API layer.
        regards, tom lane
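
The second issue raised above can be made concrete with a small sketch, assuming a storage engine API expressed as a struct of callbacks. All names here (`ToyStorageEngine`, `run_vacuum`, the two example engines) are hypothetical and nothing like this existed in PostgreSQL at the time of this thread. The point is only to show the open question: a command such as VACUUM has an obvious meaning for the existing heap, but an API layer has to decide what happens when an engine has no such operation.

```c
#include <assert.h>
#include <stddef.h>

typedef enum MaintResult {
    MAINT_DONE,
    MAINT_NOT_SUPPORTED
} MaintResult;

/* Hypothetical storage engine API: maintenance commands become optional hooks. */
typedef struct ToyStorageEngine {
    const char *name;
    /* NULL means the engine has no equivalent of this command. */
    int (*vacuum)(const char *relname);
    int (*cluster)(const char *relname);
} ToyStorageEngine;

/* Stand-ins for real work; each returns 0 on success. */
static int heap_like_vacuum(const char *relname)  { (void) relname; return 0; }
static int heap_like_cluster(const char *relname) { (void) relname; return 0; }

/* An engine modeled on the current heap supports both commands. */
static const ToyStorageEngine heap_like = { "heap_like", heap_like_vacuum, heap_like_cluster };

/* An assumed append-only engine that never leaves dead tuples behind. */
static const ToyStorageEngine append_only = { "append_only", NULL, NULL };

/* What a utility command might have to do once storage is pluggable. */
static MaintResult run_vacuum(const ToyStorageEngine *engine, const char *relname)
{
    assert(engine != NULL && relname != NULL);
    if (engine->vacuum == NULL)
        return MAINT_NOT_SUPPORTED;   /* error? no-op? skipped by autovacuum? */
    return engine->vacuum(relname) == 0 ? MAINT_DONE : MAINT_NOT_SUPPORTED;
}
```

Whether a missing hook should be an error, a silent no-op, or something autovacuum simply skips is exactly the kind of design decision that would need to be argued out.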



Re: alternative back-end block formats

From
Christian Convey
Date:
> There are a couple of really huge issues that would have to be argued out
> before any progress could be made.

Is this something that people want to spend time on right now?  As I mentioned earlier, I'm game.  But it doesn't sound like I'll get very far without adult supervision.

Re: alternative back-end block formats

From
Tom Lane
Date:
Christian Convey <christian.convey@gmail.com> writes:
>> There are a couple of really huge issues that would have to be argued out
>> before any progress could be made.

> Is this something that people want to spend time on right now?  As I
> mentioned earlier, I'm game.  But it doesn't sound like I'll get very far
> without adult supervision.

TBH, I'd rather we waited till the commitfest is over.  This is certainly
material for 9.5, if not even further out, so there's no pressing need for
a debate right now; and we have plenty of stuff we do need to deal with
right now.
        regards, tom lane



Re: alternative back-end block formats

From
Christian Convey
Date:
On Tue, Jan 28, 2014 at 12:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> TBH, I'd rather we waited till the commitfest is over.  This is certainly
> material for 9.5, if not even further out, so there's no pressing need for
> a debate right now; and we have plenty of stuff we do need to deal with
> right now.

Works for me.  I'll just lurk in the meantime, and see what I can figure out.  Thanks.

- Christian