Re: Extensible storage manager API - smgr hooks - Mailing list pgsql-hackers

From Kirill Reshke
Subject Re: Extensible storage manager API - smgr hooks
Msg-id CADVKa1VvBzSg=07Hw13QoKpd4b=A1nqcvHPKR=UVmSt0KKdbRw@mail.gmail.com
In response to Extensible storage manager API - smgr hooks  (Anastasia Lubennikova <lubennikovaav@gmail.com>)
Responses Re: Extensible storage manager API - smgr hooks
List pgsql-hackers
Hello Yura and Anastasia.

I have tried to implement the per-relation SMGR approach and ran into a serious problem with redo.

To implement the per-relation SMGR feature, I tried something similar to the custom table AM approach: we define a custom SMGR in an extension (which defines the smgr handler) and then use that SMGR in the relation definition, like this:

```
postgres=# create extension proxy_smgr ;
CREATE EXTENSION
postgres=# select * from pg_smgr ;
  oid  |  smgrname  |    smgrhandler
-------+------------+--------------------
  4646 | md         | smgr_md_handler
 16386 | proxy_smgr | proxy_smgr_handler
(2 rows)

postgres=# create table tt(i int) storage manager proxy_smgr_handler;
ERROR:  storage manager "proxy_smgr_handler" does not exist
postgres=# create table tt(i int) storage manager proxy_smgr;
INFO:  proxy open 1663 5 16391
INFO:  proxy create 16391
INFO:  proxy close, 16391
INFO:  proxy close, 16391
INFO:  proxy close, 16391
INFO:  proxy close, 16391
CREATE TABLE
postgres=# select * from tt;
INFO:  proxy open 1663 5 16391
INFO:  proxy nblocks 16391
INFO:  proxy nblocks 16391
 i
---
(0 rows)

postgres=# insert into tt values(1);
INFO:  proxy exists 16391
INFO:  proxy nblocks 16391
INFO:  proxy nblocks 16391
INFO:  proxcy extend 16391
INSERT 0 1
postgres=# select * from tt;
INFO:  proxy nblocks 16391
INFO:  proxy nblocks 16391
 i
---
 1
(1 row)
```

The extension's SQL file looks like this:

```
CREATE FUNCTION proxy_smgr_handler(internal)
RETURNS table_smgr_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Storage manager
CREATE STORAGE MANAGER proxy_smgr HANDLER proxy_smgr_handler;
```
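For readers unfamiliar with the idea, here is a minimal, self-contained sketch of what a "proxy" storage manager does conceptually: it logs each call and forwards it to another manager. This is not the code from the patch. `ToySmgr`, `toy_md`, `toy_proxy`, and the three callbacks are hypothetical stand-ins for PostgreSQL's `f_smgr` table, and the assumption that the proxy forwards to md is mine.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for PostgreSQL's f_smgr callback table. */
typedef struct ToySmgr
{
    const char *name;
    void      (*create)(unsigned relnode);
    unsigned  (*nblocks)(unsigned relnode);
    void      (*extend)(unsigned relnode);
} ToySmgr;

/* Toy "md" backend: keeps the block count of a single relation in memory. */
static unsigned toy_md_blocks = 0;

static void toy_md_create(unsigned relnode) { (void) relnode; toy_md_blocks = 0; }
static unsigned toy_md_nblocks(unsigned relnode) { (void) relnode; return toy_md_blocks; }
static void toy_md_extend(unsigned relnode) { (void) relnode; toy_md_blocks++; }

static const ToySmgr toy_md = {"md", toy_md_create, toy_md_nblocks, toy_md_extend};

/* Proxy backend: log every call, then forward it to the wrapped manager. */
static const ToySmgr *proxy_target = &toy_md;
static unsigned proxy_calls = 0;

/* Shared bookkeeping: print a log line and count the call. */
static void proxy_note(const char *what, unsigned relnode)
{
    assert(proxy_target != NULL);
    printf("INFO:  proxy %s %u\n", what, relnode);
    proxy_calls++;
}

static void proxy_create(unsigned relnode)
{
    proxy_note("create", relnode);
    proxy_target->create(relnode);
}

static unsigned proxy_nblocks(unsigned relnode)
{
    proxy_note("nblocks", relnode);
    return proxy_target->nblocks(relnode);
}

static void proxy_extend(unsigned relnode)
{
    proxy_note("extend", relnode);
    proxy_target->extend(relnode);
}

static const ToySmgr toy_proxy = {"proxy_smgr", proxy_create, proxy_nblocks, proxy_extend};
```

Calling `toy_proxy.create(16391)` followed by `toy_proxy.extend(16391)` prints log lines in the same spirit as the INFO messages in the session above, while the actual work happens in the wrapped toy manager.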

To do this, I defined a catalog relation, pg_smgr, where I store SMGR handlers, and I use this relation in the smgropen function when we need to open other (non-catalog) relations. The patch almost passes the regression tests (8 of 214 tests fail), but it fails on the first checkpoint or during crash recovery. I also changed the WAL format, adding the SMGR OID to each WAL record that contains a RelFileNode structure. Why do we need WAL changes? I was trying to solve the following issue.

As I mentioned, there is a problem with redo: we cannot do a syscache search to get the relation's SMGR when applying WAL, because the syscache is not initialized during redo (crash recovery). As I understand it, the syscache is not initialized because the system catalogs are not consistent until crash recovery is done.
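To make the shape of the problem concrete, here is a small standalone sketch of a redo-time lookup that uses only the SMGR OID carried in the WAL record and an in-memory table, with no catalog or syscache access. It is an illustration under assumptions, not the patch: `ToyWalBlockRef`, `toy_register_smgr`, and `toy_redo_lookup` are hypothetical names, and how such a table would be filled before redo starts is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int ToyOid;

/* Hypothetical in-memory registry entry: SMGR OID plus a handler name. */
typedef struct ToySmgrEntry
{
    ToyOid      oid;
    const char *name;
} ToySmgrEntry;

#define TOY_MAX_SMGRS 8

static ToySmgrEntry toy_registry[TOY_MAX_SMGRS];
static int toy_nregistered = 0;

/* Register a storage manager; assumed to happen before redo begins. */
static void toy_register_smgr(ToyOid oid, const char *name)
{
    assert(toy_nregistered < TOY_MAX_SMGRS);
    toy_registry[toy_nregistered].oid = oid;
    toy_registry[toy_nregistered].name = name;
    toy_nregistered++;
}

/* Simplified WAL block reference: relation identity plus the SMGR OID. */
typedef struct ToyWalBlockRef
{
    ToyOid spcNode;
    ToyOid dbNode;
    ToyOid relNode;
    ToyOid smgrOid;     /* the extra field added to the WAL format */
} ToyWalBlockRef;

/*
 * Resolve the storage manager for a WAL block reference using only the
 * OID stored in the record. Returns NULL when the OID is unknown.
 */
static const ToySmgrEntry *toy_redo_lookup(const ToyWalBlockRef *ref)
{
    for (int i = 0; i < toy_nregistered; i++)
    {
        if (toy_registry[i].oid == ref->smgrOid)
            return &toy_registry[i];
    }
    return NULL;
}
```

With OIDs 4646 (md) and 16386 (proxy_smgr) registered, a record for relation 1663/5/16391 carrying smgrOid 16386 resolves to the proxy entry, and an unregistered OID yields NULL, which redo would have to treat as an error.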


So, that's it. I decided to write to this thread to get feedback and understand how best to solve the problem with redo.

What do you think?

On Thu, Jun 16, 2022 at 1:38 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2021-06-30 05:36:11 +0300, Yura Sokolov wrote:
> Anastasia Lubennikova wrote 2021-06-30 00:49:
> > Hi, hackers!
> >
> > Many recently discussed features can make use of an extensible storage
> > manager API. Namely, storage level compression and encryption [1],
> > [2], [3], disk quota feature [4], SLRU storage changes [5], and any
> > other features that may want to substitute PostgreSQL storage layer
> > with their implementation (i.e. lazy_restore [6]).
> >
> > Attached is a proposal to change smgr API to make it extensible.  The
> > idea is to add a hook for plugins to get control in smgr and define
> > custom storage managers. The patch replaces smgrsw[] array and smgr_sw
> > selector with smgr() function that loads f_smgr implementation.
> >
> > As before it has only one implementation - smgr_md, which is wrapped
> > into smgr_standard().
> >
> > To create custom implementation, a developer needs to implement smgr
> > API functions
> >     static const struct f_smgr smgr_custom =
> >     {
> >         .smgr_init = custominit,
> >         ...
> >     }
> >
> > create a hook function
> >
> >    const f_smgr * smgr_custom(BackendId backend, RelFileNode rnode)
> >   {
> >       //Here we can also add some logic and chose which smgr to use
> > based on rnode and backend
> >       return &smgr_custom;
> >   }
> >
> > and finally set the hook:
> >     smgr_hook = smgr_custom;
> >
> > [1]
> > https://www.postgresql.org/message-id/flat/11996861554042351@iva4-dd95b404a60b.qloud-c.yandex.net
> > [2]
> > https://www.postgresql.org/message-id/flat/272dd2d9.e52a.17235f2c050.Coremail.chjischj%40163.com
> > [3] https://postgrespro.com/docs/enterprise/9.6/cfs
> > [4]
> > https://www.postgresql.org/message-id/flat/CAB0yre%3DRP_ho6Bq4cV23ELKxRcfhV2Yqrb1zHp0RfUPEWCnBRw%40mail.gmail.com
> > [5]
> > https://www.postgresql.org/message-id/flat/20180814213500.GA74618%4060f81dc409fc.ant.amazon.com
> > [6]
> > https://wiki.postgresql.org/wiki/PGCon_2021_Fun_With_WAL#Lazy_Restore
> >
> > --
> >
> > Best regards,
> > Lubennikova Anastasia
>
> Good day, Anastasia.
>
> I also think smgr should be extended with different implementations aside of
> md.
> But which way concrete implementation will be chosen for particular
> relation?
> I believe it should be (immutable!) property of tablespace, and should be
> passed
> to smgropen. Patch in current state doesn't show clear way to distinct
> different
> implementations per relation.
>
> I don't think patch should be that invasive. smgrsw could pointer to
> array instead of static array as it is of now, and then reln->smgr_which
> will remain with same meaning. Yep it then will need a way to select
> specific
> implementation, but something like `char smgr_name[NAMEDATALEN]` field with
> linear search in (i believe) small smgrsw array should be enough.
>
> Maybe I'm missing something?

There has been no activity on this thread for > 6 months. Therefore I'm
marking it as returned with feedback. Anastasia, if you want to work on this,
please do, but there's obviously no way it can be merged into 15...

Greetings,

Andres



