Thread: Control your disk usage in PG: Introduction to Disk Quota Extension
Overview
Diskquota is an extension that provides disk usage enforcement for database objects in Postgresql. Currently it supports to set quota limit on schema and role in a given database and limit the amount of disk space that a schema or a role can use.
This project is inspired by Heikki's pg_quota project (link: https://github.com/hlinnaka/pg_quota) and enhance it to support different kinds of DDL and DML which may change the disk usage of database objects.
Diskquota is a soft limit of disk uages. It has some delay to detect the schemas or roles whose quota limit is exceeded. Here 'soft limit' supports two kinds of encforcement: Query loading data into out-of-quota schema/role will be forbidden before query is running. Query loading data into schema/role with rooms will be cancelled when the quota limit is reached dynamically during the query is running.
Design
Diskquota extension is based on background worker framework in Postgresql. There are two kinds of background workers: diskquota launcher and diskquota worker.
There is only one laucher process per database cluster(i.e. one laucher per postmaster). Launcher process is reponsible for manage worker processes: Calling RegisterDynamicBackgroundWorker() to create new workers and keep their handle. Calling TerminateBackgroundWorker() to terminate workers which are disabled when DBA modify diskquota.monitor_databases
There are many worker processes, one for each database which is listed in diskquota.monitor_databases. Currently, we support to monitor at most 10 databases at the same time. Worker processes are responsible for monitoring the disk usage of schemas and roles for the target database, and do quota enfocement. It will periodically (can be set via diskquota.naptime) recalcualte the table size of active tables, and update their corresponding schema or owner's disk usage. Then compare with quota limit for those schemas or roles. If exceeds the limit, put the corresponding schemas or roles into the blacklist in shared memory. Schemas or roles in blacklist are used to do query enforcement to cancel queries which plan to load data into these schemas or roles.
Active table
Active tables are the tables whose table size may change in the last quota check interval. We use hooks in smgecreate(), smgrextend() and smgrtruncate() to detect active tables and store them(currently relfilenode) in the shared memory. Diskquota worker process will periodically consuming active table in shared memories, convert relfilenode to relaton oid, and calcualte table size by calling pg_total_relation_size(), which will sum the size of table(including: base, vm, fsm, toast and index).
Enforcement
Enforcement is implemented as hooks. There are two kinds of enforcement hooks: enforcement before query is running and enforcement during query is running. The 'before query' one is implemented at ExecutorCheckPerms_hook in function ExecCheckRTPerms() The 'during query' one is implemented at BufferExtendCheckPerms_hook in function ReadBufferExtended(). Note that the implementation of BufferExtendCheckPerms_hook will firstly check whether function request a new block, if not skip directyly.
Quota setting store
Quota limit of a schema or a role is stored in table 'quota_config' in 'diskquota' schema in monitored database. So each database stores and manages its own disk quota configuration. Note that although role is a db object in cluster level, we limit the diskquota of a role to be database specific. That is to say, a role may has different quota limit on different databases and their disk usage is isolated between databases.
Install
- Add hook functions to Postgres by applying patch. It's required since disk quota need to add some new hook functions in postgres core. This step would be skipped after patch is merged into postgres in future.
# install patch into postgres_src and rebuild postgres.
cd postgres_src;
git apply $diskquota_src/patch/pg_hooks.patch;
make;
make install;
- Compile and install disk quota.
cd $diskquota_src;
make;
make install;
- Config postgresql.conf
# enable diskquota in preload library.
shared_preload_libraries = 'diskquota'
# set monitored databases
diskquota.monitor_databases = 'postgres'
# set naptime (second) to refresh the disk quota stats periodically
diskquota.naptime = 2
# restart database to load preload library.
pg_ctl restart
- Create diskquota extension in monitored database.
create extension diskquota;
- Reload database configuraion
# reset monitored database list in postgresql.conf
diskquota.monitor_databases = 'postgres, postgres2'
# reload configuration
pg_ctl reload
Usage
- Set/update/delete schema quota limit using diskquota.set_schema_quota
create schema s1;
select diskquota.set_schema_quota('s1', '1 MB');
set search_path to s1;
create table a(i int);
# insert small data succeeded
insert into a select generate_series(1,100);
# insert large data failed
insert into a select generate_series(1,10000000);
# insert small data failed
insert into a select generate_series(1,100);
# delete quota configuration
select diskquota.set_schema_quota('s1', '-1');
# insert small data succeed
select pg_sleep(5);
insert into a select generate_series(1,100);
reset search_path;
- Set/update/delete role quota limit using diskquota.set_role_quota
create role u1 nologin;
create table b (i int);
alter table b owner to u1;
select diskquota.set_role_quota('u1', '1 MB');
# insert small data succeeded
insert into b select generate_series(1,100);
# insert large data failed
insert into b select generate_series(1,10000000);
# insert small data failed
insert into b select generate_series(1,100);
# delete quota configuration
select diskquota.set_role_quota('u1', '-1');
# insert small data succeed
select pg_sleep(5);
insert into a select generate_series(1,100);
reset search_path;
- Show schema quota limit and current usage
select * from diskquota.show_schema_quota_view;
Test
Run regression tests.
cd contrib/diskquota;
make installcheck
Benchmark & Performence Test
Cost of diskquota worker
During each refresh interval, the disk quota worker need to refresh the disk quota model.
It take less than 100ms under 100K user tables with no avtive tables.
It take less than 200ms under 100K user tables with 1K active tables.
Impact on OLTP queries
We test OLTP queries to measure the impact of enabling diskquota feature. The range is from 2k tables to 10k tables. Each connection will insert 100 rows into each table. And the parallel connections range is from 5 to 25. Number of active tables will be around 1k.
Without diskquota enabled (seconds)
2k | 4k | 6k | 8k | 10k | |
---|---|---|---|---|---|
5 | 4.002 | 11.356 | 18.460 | 28.591 | 41.123 |
10 | 4.832 | 11.988 | 21.113 | 32.829 | 45.832 |
15 | 6.238 | 16.896 | 28.722 | 45.375 | 64.642 |
20 | 8.036 | 21.711 | 38.499 | 61.763 | 87.875 |
25 | 9.909 | 27.175 | 47.996 | 75.688 | 106.648 |
With diskquota enabled (seconds)
2k | 4k | 6k | 8k | 10k | |
---|---|---|---|---|---|
5 | 4.135 | 10.641 | 18.776 | 28.804 | 41.740 |
10 | 4.773 | 12.407 | 22.351 | 34.243 | 47.568 |
15 | 6.355 | 17.305 | 30.941 | 46.967 | 66.216 |
20 | 9.451 | 22.231 | 40.645 | 61.758 | 88.309 |
25 | 10.096 | 26.844 | 48.910 | 76.537 | 108.025 |
The performance difference between with/without diskquota enabled are less then 2-3% in most case. Therefore, there is no significant performance downgrade when diskquota is enabled.
Notes
- Drop database with diskquota enabled.
If DBA enable monitoring diskquota on a database, there will be a connection to this database from diskquota worker process. DBA need to first remove this database from diskquota.monitor_databases in postgres.conf, and reload configuration by call pg_ctl reload
. Then database could be dropped successfully.
- Temp table.
Diskquota supports to limit the disk usage of temp table as well. But schema and role are different. For role, i.e. the owner of the temp table, diakquota will treat it the same as normal tables and sum its table size to its owner's quota. While for schema, temp table is located under namespace 'pg_temp_backend_id', so temp table size will not sum to the current schema's qouta.
On Tue, 2018-11-13 at 16:47 +0800, Hubert Zhang wrote: > Hi all, > > We implement disk quota feature on Postgresql as an extension(link: > https://github.com/greenplum-db/diskquota), > If you are interested, try and use it to limit the amount of disk > space that > a schema or a role can use in Postgresql. > Any feedback or question are appreciated. > It's not clear to me what's the goal of this thread? I understand what quotas are about, but are you sharing it because (a) it's a useful extension, (b) you propose adding a couple of new hooks (and keep the extension separate) or (c) you propose adding both the hooks and the extension (into contrib)? I assume it's either (b) and (c), in which case you should add it to 2019-01 CF: https://commitfest.postgresql.org/21/ In either case, it might be useful to add a LICENSE to the github repository, it's not clear what's the situation in this respect. That probably matters especially for (b), because for (c) it'd end up with PostgreSQL license automatically. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, 2018-11-13 at 16:47 +0800, Hubert Zhang wrote:
> Hi all,
>
> We implement disk quota feature on Postgresql as an extension(link:
> https://github.com/greenplum-db/diskquota),
> If you are interested, try and use it to limit the amount of disk
> space that
> a schema or a role can use in Postgresql.
> Any feedback or question are appreciated.
>
It's not clear to me what's the goal of this thread? I understand what
quotas are about, but are you sharing it because (a) it's a useful
extension, (b) you propose adding a couple of new hooks (and keep the
extension separate) or (c) you propose adding both the hooks and the
extension (into contrib)?
I assume it's either (b) and (c), in which case you should add it to
2019-01 CF: https://commitfest.postgresql.org/21/
In either case, it might be useful to add a LICENSE to the github
repository, it's not clear what's the situation in this respect. That
probably matters especially for (b), because for (c) it'd end up with
PostgreSQL license automatically.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thanks, Tomas,Yes, we want to add the hooks into postgres repo.But before that, we plan to firstly get some feedbacks from community about the diskquota extension implementation and usage?Later, we'll modify our license and submit the hooks into CF.ThanksHubertOn Wed, Nov 14, 2018 at 3:54 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:On Tue, 2018-11-13 at 16:47 +0800, Hubert Zhang wrote:
> Hi all,
>
> We implement disk quota feature on Postgresql as an extension(link:
> https://github.com/greenplum-db/diskquota),
> If you are interested, try and use it to limit the amount of disk
> space that
> a schema or a role can use in Postgresql.
> Any feedback or question are appreciated.
>
It's not clear to me what's the goal of this thread? I understand what
quotas are about, but are you sharing it because (a) it's a useful
extension, (b) you propose adding a couple of new hooks (and keep the
extension separate) or (c) you propose adding both the hooks and the
extension (into contrib)?
I assume it's either (b) and (c), in which case you should add it to
2019-01 CF: https://commitfest.postgresql.org/21/
In either case, it might be useful to add a LICENSE to the github
repository, it's not clear what's the situation in this respect. That
probably matters especially for (b), because for (c) it'd end up with
PostgreSQL license automatically.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services--ThanksHubert Zhang
Attachment
On Tue, Nov 20, 2018 at 2:20 AM Haozhou Wang <hawang@pivotal.io> wrote: > We prepared a patch that includes the hook points. And such hook points are needed for disk quota extension. > There are two hooks. > One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when doing smgr create/extend/truncate in general. Asfor disk quota extension, this hook is used to detect active tables(new created tables, tables extending new blocks, ortables being truncated) > The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc logic when buffer extend a new block. Since ReadBufferExtendedis a hot function, we call this hook only when blockNum == P_NEW. As for disk quota extension, this hookis used to do query enforcement during the query is loading data. > > Any comments are appreciated. +1 for adding some hooks to support this kind of thing, but I think the names you've chosen are not very good. The hook name should describe the place from which it is called, not the purpose for which one imagines that it will be used, because somebody else might imagine another use. Both BufferExtendCheckPerms_hook_type and SmgrStat_hook_type are imagining that they know what the hook does - CheckPerms in the first case and Stat in the second case. For this particular purpose, I don't immediately see why you need a hook in both places. If ReadBuffer is called with P_NEW, aren't we guaranteed to end up in smgrextend()? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 20, 2018 at 2:20 AM Haozhou Wang <hawang@pivotal.io> wrote:
> We prepared a patch that includes the hook points. And such hook points are needed for disk quota extension.
> There are two hooks.
> One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when doing smgr create/extend/truncate in general. As for disk quota extension, this hook is used to detect active tables(new created tables, tables extending new blocks, or tables being truncated)
> The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc logic when buffer extend a new block. Since ReadBufferExtended is a hot function, we call this hook only when blockNum == P_NEW. As for disk quota extension, this hook is used to do query enforcement during the query is loading data.
>
> Any comments are appreciated.
+1 for adding some hooks to support this kind of thing, but I think
the names you've chosen are not very good. The hook name should
describe the place from which it is called, not the purpose for which
one imagines that it will be used, because somebody else might imagine
another use. Both BufferExtendCheckPerms_hook_type and
SmgrStat_hook_type are imagining that they know what the hook does -
CheckPerms in the first case and Stat in the second case.
For this particular purpose, I don't immediately see why you need a
hook in both places. If ReadBuffer is called with P_NEW, aren't we
guaranteed to end up in smgrextend()?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment
For this particular purpose, I don't immediately see why you need a
hook in both places. If ReadBuffer is called with P_NEW, aren't we
guaranteed to end up in smgrextend()?
Thank you very much for your review.We refactored our patch with new names and comments.For ReadBufferExtended hook, yes, Readbuffer with P_NEW will then call smgrextend.But in smgrextend, we cannot get the oid of a relation, and it will take some time to get the oid via smgrrelation.We would like to add a hook just before the smgrextend to get the oid and avoid use RelidByRelfilenode().New patch is attached in the attachment.Thank a lot!Regards,HaozhouOn Wed, Nov 21, 2018 at 10:48 PM Robert Haas <robertmhaas@gmail.com> wrote:On Tue, Nov 20, 2018 at 2:20 AM Haozhou Wang <hawang@pivotal.io> wrote:
> We prepared a patch that includes the hook points. And such hook points are needed for disk quota extension.
> There are two hooks.
> One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when doing smgr create/extend/truncate in general. As for disk quota extension, this hook is used to detect active tables(new created tables, tables extending new blocks, or tables being truncated)
> The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc logic when buffer extend a new block. Since ReadBufferExtended is a hot function, we call this hook only when blockNum == P_NEW. As for disk quota extension, this hook is used to do query enforcement during the query is loading data.
>
> Any comments are appreciated.
+1 for adding some hooks to support this kind of thing, but I think
the names you've chosen are not very good. The hook name should
describe the place from which it is called, not the purpose for which
one imagines that it will be used, because somebody else might imagine
another use. Both BufferExtendCheckPerms_hook_type and
SmgrStat_hook_type are imagining that they know what the hook does -
CheckPerms in the first case and Stat in the second case.
For this particular purpose, I don't immediately see why you need a
hook in both places. If ReadBuffer is called with P_NEW, aren't we
guaranteed to end up in smgrextend()?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote: > +1 for adding some hooks to support this kind of thing, but I think > the names you've chosen are not very good. The hook name should > describe the place from which it is called, not the purpose for which > one imagines that it will be used, because somebody else might imagine > another use. Both BufferExtendCheckPerms_hook_type and > SmgrStat_hook_type are imagining that they know what the hook does - > CheckPerms in the first case and Stat in the second case. I personally don't mind making Postgres more pluggable, but I don't think that we actually need the extra ones proposed here at the layer of smgr, as smgr is already a layer designed to call an underlying set of APIs able to extend, unlink, etc. depending on the storage type. > For this particular purpose, I don't immediately see why you need a > hook in both places. If ReadBuffer is called with P_NEW, aren't we > guaranteed to end up in smgrextend()? Yes, that's a bit awkward. -- Michael
Attachment
On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote:
> +1 for adding some hooks to support this kind of thing, but I think
> the names you've chosen are not very good. The hook name should
> describe the place from which it is called, not the purpose for which
> one imagines that it will be used, because somebody else might imagine
> another use. Both BufferExtendCheckPerms_hook_type and
> SmgrStat_hook_type are imagining that they know what the hook does -
> CheckPerms in the first case and Stat in the second case.
I personally don't mind making Postgres more pluggable, but I don't
think that we actually need the extra ones proposed here at the layer
of smgr, as smgr is already a layer designed to call an underlying set
of APIs able to extend, unlink, etc. depending on the storage type.
> For this particular purpose, I don't immediately see why you need a
> hook in both places. If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
Yes, that's a bit awkward.
--
Michael
> For this particular purpose, I don't immediately see why you need a
> hook in both places. If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
Yes, that's a bit awkward.
Thanks very much for your comments.To the best of my knowledge, smgr is a layer that abstract the storage operations. Therefore, it is a good place to control or collect information the storage operations without touching the physical storage layer.Moreover, smgr is coming with actual disk IO operation (not consider the OS cache) for postgres. So we do not need to worry about the buffer management in postgres.It will make the purpose of hook is pure: a hook for actual disk IO.Regards,HaozhouOn Wed, Dec 26, 2018 at 1:56 PM Michael Paquier <michael@paquier.xyz> wrote:On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote:
> +1 for adding some hooks to support this kind of thing, but I think
> the names you've chosen are not very good. The hook name should
> describe the place from which it is called, not the purpose for which
> one imagines that it will be used, because somebody else might imagine
> another use. Both BufferExtendCheckPerms_hook_type and
> SmgrStat_hook_type are imagining that they know what the hook does -
> CheckPerms in the first case and Stat in the second case.
I personally don't mind making Postgres more pluggable, but I don't
think that we actually need the extra ones proposed here at the layer
of smgr, as smgr is already a layer designed to call an underlying set
of APIs able to extend, unlink, etc. depending on the storage type.
> For this particular purpose, I don't immediately see why you need a
> hook in both places. If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
Yes, that's a bit awkward.
--
Michael
Attachment
> For this particular purpose, I don't immediately see why you need a
> hook in both places. If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
Yes, that's a bit awkward.Hi Michael, we revisit the ReadBuffer hook and remove it in the latest patch.ReadBuffer hook is original used to do enforcement(e.g. out of diskquota limit) when query is loading data.We plan to put the enforcement work of running query to separate diskquota worker process.Let worker process to detect the backends to be cancelled and send SIGINT to these backends.So there is no need for ReadBuffer hook anymore.Our patch currently only contains smgr related hooks to catch the file change and get the Active Table list for diskquota extension.Thanks Hubert.On Mon, Jan 7, 2019 at 6:56 PM Haozhou Wang <hawang@pivotal.io> wrote:Thanks very much for your comments.To the best of my knowledge, smgr is a layer that abstract the storage operations. Therefore, it is a good place to control or collect information the storage operations without touching the physical storage layer.Moreover, smgr is coming with actual disk IO operation (not consider the OS cache) for postgres. So we do not need to worry about the buffer management in postgres.It will make the purpose of hook is pure: a hook for actual disk IO.Regards,HaozhouOn Wed, Dec 26, 2018 at 1:56 PM Michael Paquier <michael@paquier.xyz> wrote:On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote:
> +1 for adding some hooks to support this kind of thing, but I think
> the names you've chosen are not very good. The hook name should
> describe the place from which it is called, not the purpose for which
> one imagines that it will be used, because somebody else might imagine
> another use. Both BufferExtendCheckPerms_hook_type and
> SmgrStat_hook_type are imagining that they know what the hook does -
> CheckPerms in the first case and Stat in the second case.
I personally don't mind making Postgres more pluggable, but I don't
think that we actually need the extra ones proposed here at the layer
of smgr, as smgr is already a layer designed to call an underlying set
of APIs able to extend, unlink, etc. depending on the storage type.
> For this particular purpose, I don't immediately see why you need a
> hook in both places. If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
Yes, that's a bit awkward.
--
Michael--ThanksHubert Zhang
void smgrextend
```
```
typedef void (*smgrextend_hook_type)
```
Hi Michael, RobertFor you question about the hook position, I want to explain more about the background why we want to introduce these hooks.We wrote a diskquota extension for Postgresql(which is inspired by Heikki's pg_quota). Diskquota extension is used to control the disk usage in Postgresql in a fine-grained way, which means:1. You could set disk quota limit at schema level or role level.2. A background worker will gather the current disk usage for each schema/role in realtime.3. A background worker will generate the blacklist for schema/role whose quota limit is exceeded.4. New transaction want to insert data into the schema/role in the blacklist will be cancelled.In step 2, gathering the current disk usage for each schema needs to sum disk size of all the tables in this schema. This is a time consuming operation. We want to use hooks in SMGR to detect the Active Table, and only recalculate the disk size of all the Active Tables.For example, the smgrextend hook indicates that you allocate a new block and the table need to be treated as Active Table.Do you have some better hook positions recommend to solve the above user case?Thanks in advance.HubertOn Tue, Jan 22, 2019 at 12:08 PM Hubert Zhang <hzhang@pivotal.io> wrote:> For this particular purpose, I don't immediately see why you need a
> hook in both places. If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
Yes, that's a bit awkward.Hi Michael, we revisit the ReadBuffer hook and remove it in the latest patch.ReadBuffer hook is original used to do enforcement(e.g. out of diskquota limit) when query is loading data.We plan to put the enforcement work of running query to separate diskquota worker process.Let worker process to detect the backends to be cancelled and send SIGINT to these backends.So there is no need for ReadBuffer hook anymore.Our patch currently only contains smgr related hooks to catch the file change and get the Active Table list for diskquota extension.Thanks Hubert.On Mon, Jan 7, 2019 at 6:56 PM Haozhou Wang <hawang@pivotal.io> wrote:Thanks very much for your comments.To the best of my knowledge, smgr is a layer that abstract the storage operations. Therefore, it is a good place to control or collect information the storage operations without touching the physical storage layer.Moreover, smgr is coming with actual disk IO operation (not consider the OS cache) for postgres. So we do not need to worry about the buffer management in postgres.It will make the purpose of hook is pure: a hook for actual disk IO.Regards,HaozhouOn Wed, Dec 26, 2018 at 1:56 PM Michael Paquier <michael@paquier.xyz> wrote:On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote:
> +1 for adding some hooks to support this kind of thing, but I think
> the names you've chosen are not very good. The hook name should
> describe the place from which it is called, not the purpose for which
> one imagines that it will be used, because somebody else might imagine
> another use. Both BufferExtendCheckPerms_hook_type and
> SmgrStat_hook_type are imagining that they know what the hook does -
> CheckPerms in the first case and Stat in the second case.
I personally don't mind making Postgres more pluggable, but I don't
think that we actually need the extra ones proposed here at the layer
of smgr, as smgr is already a layer designed to call an underlying set
of APIs able to extend, unlink, etc. depending on the storage type.
> For this particular purpose, I don't immediately see why you need a
> hook in both places. If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
Yes, that's a bit awkward.
--
Michael--ThanksHubert Zhang--ThanksHubert Zhang
Hi, On 2019-01-30 10:26:52 +0800, Hubert Zhang wrote: > Hi Michael, Robert > For you question about the hook position, I want to explain more about the > background why we want to introduce these hooks. > We wrote a diskquota extension <https://github.com/greenplum-db/diskquota> > [ ...] > On Tue, Jan 22, 2019 at 12:08 PM Hubert Zhang <hzhang@pivotal.io> wrote: > > > > For this particular purpose, I don't immediately see why you need a > >> > hook in both places. If ReadBuffer is called with P_NEW, aren't we > >> > guaranteed to end up in smgrextend()? > >> Yes, that's a bit awkward. Please note that on PG lists the customary and desired style is to quote inline in messages rather than top-quote. Greetings, Andres Freund
Hi,
On 2019-01-30 10:26:52 +0800, Hubert Zhang wrote:
> Hi Michael, Robert
> For you question about the hook position, I want to explain more about the
> background why we want to introduce these hooks.
> We wrote a diskquota extension <https://github.com/greenplum-db/diskquota>
> [ ...]
> On Tue, Jan 22, 2019 at 12:08 PM Hubert Zhang <hzhang@pivotal.io> wrote:
>
> > > For this particular purpose, I don't immediately see why you need a
> >> > hook in both places. If ReadBuffer is called with P_NEW, aren't we
> >> > guaranteed to end up in smgrextend()?
> >> Yes, that's a bit awkward.
Please note that on PG lists the customary and desired style is to quote
inline in messages rather than top-quote.
Greetings,
Andres Freund
On Tue, Nov 20, 2018 at 2:20 AM Haozhou Wang <hawang@pivotal.io> wrote:
> We prepared a patch that includes the hook points. And such hook points are needed for disk quota extension.
> There are two hooks.
> One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when doing smgr create/extend/truncate in general. As for disk quota extension, this hook is used to detect active tables(new created tables, tables extending new blocks, or tables being truncated)
> The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc logic when buffer extend a new block. Since ReadBufferExtended is a hot function, we call this hook only when blockNum == P_NEW. As for disk quota extension, this hook is used to do query enforcement during the query is loading data.
>
> Any comments are appreciated.
+1 for adding some hooks to support this kind of thing, but I think
the names you've chosen are not very good. The hook name should
describe the place from which it is called, not the purpose for which
one imagines that it will be used, because somebody else might imagine
another use. Both BufferExtendCheckPerms_hook_type and
SmgrStat_hook_type are imagining that they know what the hook does -
CheckPerms in the first case and Stat in the second case.
For this particular purpose, I don't immediately see why you need a
hook in both places. If ReadBuffer is called with P_NEW, aren't we
guaranteed to end up in smgrextend()?
On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote:
> +1 for adding some hooks to support this kind of thing, but I think
> the names you've chosen are not very good. The hook name should
> describe the place from which it is called, not the purpose for which
> one imagines that it will be used, because somebody else might imagine
> another use. Both BufferExtendCheckPerms_hook_type and
> SmgrStat_hook_type are imagining that they know what the hook does -
> CheckPerms in the first case and Stat in the second case.
I personally don't mind making Postgres more pluggable, but I don't
think that we actually need the extra ones proposed here at the layer
of smgr, as smgr is already a layer designed to call an underlying set
of APIs able to extend, unlink, etc. depending on the storage type.
void smgrextend
```
```
typedef void (*smgrextend_hook_type)
```
On Mon, Feb 18, 2019 at 7:39 PM Hubert Zhang <hzhang@pivotal.io> wrote: > Based on the assumption we use smgr as hook position, hook API option1 or option2 which is better? > Or we could find some balanced API between option1 and option2? > > Again comments on other better hook positions are also appreciated! Hi Hubert, The July Commitfest is now running, and this entry is in "needs review" state. Could you please post a rebased patch? I have questions about how disk quotas should work and I think we'll probably eventually want more hooks than these, but simply adding these hooks so extensions can do whatever they want doesn't seem very risky for core. I think it's highly likely that the hook signatures will have to change in future releases too, but that seems OK for such detailed internal hooks. As for your question, my first reaction was that I preferred your option 1, because SMgrRelation seems quite private and there are no existing examples of that object being exposed to extensions. But on reflection, other callbacks don't take such a mollycoddling approach. So my vote is for option 2 "just pass all the arguments to the callback", which I understand to be the approach of patch you have posted. + if (smgrcreate_hook) + { + (*smgrcreate_hook)(reln, forknum, isRedo); + } Usually we don't use curlies for single line if branches. -- Thomas Munro https://enterprisedb.com
On Mon, Feb 18, 2019 at 7:39 PM Hubert Zhang <hzhang@pivotal.io> wrote:
> Based on the assumption we use smgr as hook position, hook API option1 or option2 which is better?
> Or we could find some balanced API between option1 and option2?
>
> Again comments on other better hook positions are also appreciated!
Hi Hubert,
The July Commitfest is now running, and this entry is in "needs
review" state. Could you please post a rebased patch?
I have questions about how disk quotas should work and I think we'll
probably eventually want more hooks than these, but simply adding
these hooks so extensions can do whatever they want doesn't seem very
risky for core. I think it's highly likely that the hook signatures
will have to change in future releases too, but that seems OK for such
detailed internal hooks. As for your question, my first reaction was
that I preferred your option 1, because SMgrRelation seems quite
private and there are no existing examples of that object being
exposed to extensions. But on reflection, other callbacks don't take
such a mollycoddling approach. So my vote is for option 2 "just pass
all the arguments to the callback", which I understand to be the
approach of patch you have posted.
+ if (smgrcreate_hook)
+ {
+ (*smgrcreate_hook)(reln, forknum, isRedo);
+ }
Usually we don't use curlies for single line if branches.
Attachment
This patch no longer applies. Can you please rebase? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
This patch no longer applies. Can you please rebase?
--
Álvaro Herrera https://urldefense.proofpoint.com/v2/url?u=https-3A__www.2ndQuadrant.com_&d=DwIDAw&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0&m=aNzGoEI15bAE-vAivY34BtG2WdgrVojH-B-kjvuXsYA&s=sT6zyiq4s8meelNuFw-lGD_mdvmUzv9zpVYWbFWusI0&e=
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Fri, Sep 27, 2019 at 11:30:08AM +0800, Haozhou Wang wrote: > I rebased this patch with the newest master branch. Attached the new > patch disk_quota_hooks_v5.patch in the attachment. This again needs a rebase, so I have switched it as waiting on author. -- Michael
Attachment
On Fri, Sep 27, 2019 at 11:30:08AM +0800, Haozhou Wang wrote:
> I rebased this patch with the newest master branch. Attached the new
> patch disk_quota_hooks_v5.patch in the attachment.
This again needs a rebase, so I have switched it as waiting on
author.
--
Michael
Attachment
On 12/2/19 1:39 AM, Haozhou Wang wrote: > > Thank you very much for your email. I rebased the code with the newest > master and attached it in the attachment. This patch applies but no longer builds on Linux or Windows: https://travis-ci.org/github/postgresql-cfbot/postgresql/builds/666113036 https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.85284 The CF entry has been updated to Waiting on Author. Regards, -- -David david@pgmasters.net
On 12/2/19 1:39 AM, Haozhou Wang wrote:
>
> Thank you very much for your email. I rebased the code with the newest
> master and attached it in the attachment.
This patch applies but no longer builds on Linux or Windows:
https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_github_postgresql-2Dcfbot_postgresql_builds_666113036&d=DwICaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0&m=UfgeYnaIkXdE7XmY-eCIHV_ZCTEZqCAjXkPausd5qZI&s=zHJ5pgH_P4mMhZAuXtfTCGUB7lyo7dJ4VeoapRhIN4g&e=
https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.appveyor.com_project_postgresql-2Dcfbot_postgresql_build_1.0.85284&d=DwICaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0&m=UfgeYnaIkXdE7XmY-eCIHV_ZCTEZqCAjXkPausd5qZI&s=WmyVNF-rItz10LoeLtY-Xu6l-iV3uwI_U-R9lR7cKjc&e=
The CF entry has been updated to Waiting on Author.
Regards,
--
-David
david@pgmasters.net
Attachment
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
> On 27 Mar 2020, at 11:22, Haozhou Wang <hawang@pivotal.io> wrote: > We rebased this patch with the newest master. This patch now fails to build according to Travis: smgr.c: In function ‘smgrtruncate’: smgr.c:578:47: error: passing argument 4 of ‘smgrtruncate_hook’ makes integer from pointer without a cast [-Werror=int-conversion] (*smgrtruncate_hook)(reln, forknum, nforks, nblocks); ^ smgr.c:578:47: note: expected ‘BlockNumber {aka unsigned int}’ but argument is of type ‘BlockNumber * {aka unsigned int *}’ The warning is also present in the Windows build: src/backend/storage/smgr/smgr.c(578): warning C4047: 'function' : 'BlockNumber' differs in levels of indirection from 'BlockNumber*' [C:\projects\postgresql\postgres.vcxproj] src/backend/storage/smgr/smgr.c(578): warning C4024: 'smgrtruncate_hook' : different types for formal and actual parameter4 [C:\projects\postgresql\postgres.vcxproj] 2 Warning(s) Marking the patch as Waiting for Author. cheers ./daniel
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
> On 1 Jul 2020, at 10:36, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 27 Mar 2020, at 11:22, Haozhou Wang <hawang@pivotal.io> wrote: > >> We rebased this patch with the newest master. > > This patch now fails to build according to Travis: > > smgr.c: In function ‘smgrtruncate’: > smgr.c:578:47: error: passing argument 4 of ‘smgrtruncate_hook’ makes integer from pointer without a cast [-Werror=int-conversion] > (*smgrtruncate_hook)(reln, forknum, nforks, nblocks); > ^ > smgr.c:578:47: note: expected ‘BlockNumber {aka unsigned int}’ but argument is of type ‘BlockNumber * {aka unsigned int*}’ > > > The warning is also present in the Windows build: > > src/backend/storage/smgr/smgr.c(578): warning C4047: 'function' : 'BlockNumber' differs in levels of indirection from 'BlockNumber*' [C:\projects\postgresql\postgres.vcxproj] > src/backend/storage/smgr/smgr.c(578): warning C4024: 'smgrtruncate_hook' : different types for formal and actual parameter4 [C:\projects\postgresql\postgres.vcxproj] > 2 Warning(s) > > Marking the patch as Waiting for Author. As the thread has stalled and the above compilation issue hasn't been addressed, I'm marking this entry Returned with Feedback. Feel free to open a new entry when there is a fixed patch. cheers ./daniel
On Mon, Feb 26, 2024 at 7:27 PM Daniel Gustafsson <daniel@yesql.se> wrote: > > > On 1 Jul 2020, at 10:36, Daniel Gustafsson <daniel@yesql.se> wrote: > > > >> On 27 Mar 2020, at 11:22, Haozhou Wang <hawang@pivotal.io> wrote: > > > >> We rebased this patch with the newest master. > > > > This patch now fails to build according to Travis: > > > > smgr.c: In function ‘smgrtruncate’: > > smgr.c:578:47: error: passing argument 4 of ‘smgrtruncate_hook’ makes integer from pointer without a cast [-Werror=int-conversion] > > (*smgrtruncate_hook)(reln, forknum, nforks, nblocks); > > ^ > > smgr.c:578:47: note: expected ‘BlockNumber {aka unsigned int}’ but argument is of type ‘BlockNumber * {aka unsigned int*}’ > > > > > > The warning is also present in the Windows build: > > > > src/backend/storage/smgr/smgr.c(578): warning C4047: 'function' : 'BlockNumber' differs in levels of indirection from'BlockNumber *' [C:\projects\postgresql\postgres.vcxproj] > > src/backend/storage/smgr/smgr.c(578): warning C4024: 'smgrtruncate_hook' : different types for formal and actual parameter4 [C:\projects\postgresql\postgres.vcxproj] > > 2 Warning(s) > > > > Marking the patch as Waiting for Author. > > As the thread has stalled and the above compilation issue hasn't been > addressed, I'm marking this entry Returned with Feedback. Feel free to open a > new entry when there is a fixed patch. Hi, Looks that many hackers are happy with the original patch except that the original patch needs a simple rebase, though it has been 3 years. Shall we push forward this patch so that it can benefit extensions not only diskquota? The attachment is a rebased version of the original patch. > > cheers ./daniel > > >
Attachment
Greetings, * Xing Guo (higuoxing@gmail.com) wrote: > Looks that many hackers are happy with the original patch except that > the original patch needs a simple rebase, though it has been 3 years. I'm not completely against the idea of adding these hooks, but I have to say that it's unfortunate to imagine having to use an extension for something like quotas as it's really something that would ideally be in core. > Shall we push forward this patch so that it can benefit extensions not > only diskquota? Would be great to hear about other use-cases for these hooks, which would also help us be comfortable that these are the right hooks to introduce with the correct arguments. What are the other extensions that you're referring to here..? Thanks, Stephen
Attachment
On Tue, Feb 27, 2024 at 6:38 AM Stephen Frost <sfrost@snowman.net> wrote: > > Greetings, > > * Xing Guo (higuoxing@gmail.com) wrote: > > Looks that many hackers are happy with the original patch except that > > the original patch needs a simple rebase, though it has been 3 years. > > I'm not completely against the idea of adding these hooks, but I have to > say that it's unfortunate to imagine having to use an extension for > something like quotas as it's really something that would ideally be in > core. > > > Shall we push forward this patch so that it can benefit extensions not > > only diskquota? > > Would be great to hear about other use-cases for these hooks, which > would also help us be comfortable that these are the right hooks to > introduce with the correct arguments. What are the other extensions > that you're referring to here..? Sorry, we don't have another concrete extension that utilizes the smgrcreate / smgrtruncate / smgrdounlinkall hooks right now. But we have a transparent data encryption extension that utilizes the smgrread and smgrwrite hooks to mutate the read buffer and write buffer in place to perform file level encryption and decryption. I think smgrwrite / smgrread / smgrcreate / smgrtruncate and smgrdounlinkall hooks fall into the same catagory and these hooks are different from the recent proposed storage manager API hooks[1]. Probably it is worth starting a new thread and discussing them. [1]: https://www.postgresql.org/message-id/flat/CAEze2WgMySu2suO_TLvFyGY3URa4mAx22WeoEicnK%3DPCNWEMrA%40mail.gmail.com > Thanks, > > Stephen Best Regards, Xing