Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date
Msg-id 20220331162159.2jh6553qi7ix6uac@alap3.anarazel.de
Whole thread Raw
In response to Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
Hi,

On 2022-03-31 13:22:24 +0530, Dilip Kumar wrote:
> 0001 is changing the strategy to file copy during initdb and 0002
> patch adds the test cases for both these cases.

Thanks!

> From 4a997e2a95074a520777cd2b369f9c728b360969 Mon Sep 17 00:00:00 2001
> From: Dilip Kumar <dilipkumar@localhost.localdomain>
> Date: Thu, 31 Mar 2022 10:43:16 +0530
> Subject: [PATCH 1/2] Use file_copy strategy during initdb
> 
> Because skipping the checkpoint during initdb will not result
> in significant savings, so there is no point in using wal_log
> as that will simply increase the cluster size by generating
> extra wal.
> ---
>  src/bin/initdb/initdb.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
> index 5e36943..1256082 100644
> --- a/src/bin/initdb/initdb.c
> +++ b/src/bin/initdb/initdb.c
> @@ -1856,6 +1856,11 @@ make_template0(FILE *cmdfd)
>       * it would fail. To avoid that, assign a fixed OID to template0 rather
>       * than letting the server choose one.
>       *
> +     * Using file_copy strategy is preferable over wal_log here because
> +     * skipping the checkpoint during initdb will not result in significant
> +     * savings, so there is no point in using wal_log as that will simply
> +     * increase the cluster size by generating extra wal.

It's not just the increase in size, it's also the increase in time due to WAL logging.


>       * (Note that, while the user could have dropped and recreated these
>       * objects in the old cluster, the problem scenario only exists if the OID
>       * that is in use in the old cluster is also used in the new cluster - and
> @@ -1863,7 +1868,7 @@ make_template0(FILE *cmdfd)
>       */
>      static const char *const template0_setup[] = {
>          "CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false OID = "
> -        CppAsString2(Template0ObjectId) ";\n\n",
> +        CppAsString2(Template0ObjectId) " STRATEGY = file_copy;\n\n",

I'd perhaps break this into a separate line, but...


> From d0759bcfc4fed674e938e4a03159f5953ca9718d Mon Sep 17 00:00:00 2001
> From: Dilip Kumar <dilipkumar@localhost.localdomain>
> Date: Thu, 31 Mar 2022 12:07:19 +0530
> Subject: [PATCH 2/2] Create database test coverage
> 
> Test create database strategy wal replay and alter database
> set tablespace.
> ---
>  src/test/modules/test_misc/t/002_tablespace.pl | 12 ++++++++++++
>  src/test/recovery/t/001_stream_rep.pl          | 24 ++++++++++++++++++++++++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/src/test/modules/test_misc/t/002_tablespace.pl b/src/test/modules/test_misc/t/002_tablespace.pl
> index 04e5439..f3bbddc 100644
> --- a/src/test/modules/test_misc/t/002_tablespace.pl
> +++ b/src/test/modules/test_misc/t/002_tablespace.pl
> @@ -83,7 +83,19 @@ $result = $node->psql('postgres',
>      "ALTER TABLE t SET tablespace regress_ts1");
>  ok($result == 0, 'move table in-place->abs');
>  
> +# Test ALTER DATABASE SET TABLESPACE
> +$result = $node->psql('postgres',
> +    "CREATE DATABASE testdb TABLESPACE regress_ts1");
> +ok($result == 0, 'create database in tablespace 1');
> +$result = $node->psql('testdb',
> +    "CREATE TABLE t ()");
> +ok($result == 0, 'create table in testdb database');
> +$result = $node->psql('postgres',
> +    "ALTER DATABASE testdb SET TABLESPACE regress_ts2");
> +ok($result == 0, 'move database to tablespace 2');

This just tests the command doesn't fail, but not whether it actually did
something useful. Seem we should at least insert a row or two into the the
table, and verify they can be accessed?


> +# Create database with different strategies and check its presence in standby
> +$node_primary->safe_psql('postgres',
> +    "CREATE DATABASE testdb1 STRATEGY = FILE_COPY; ");
> +$node_primary->safe_psql('testdb1',
> +    "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
> +$node_primary->safe_psql('postgres',
> +    "CREATE DATABASE testdb2 STRATEGY = WAL_LOG; ");
> +$node_primary->safe_psql('testdb2',
> +    "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
> +
> +# Wait for standbys to catch up
> +$primary_lsn = $node_primary->lsn('write');
> +$node_primary->wait_for_catchup($node_standby_1, 'replay', $primary_lsn);
> +
> +$result =
> +  $node_standby_1->safe_psql('testdb1', "SELECT count(*) FROM tab_int");
> +print "standby 1: $result\n";
> +is($result, qq(10), 'check streamed content on standby 1');
> +
> +$result =
> +  $node_standby_1->safe_psql('testdb2', "SELECT count(*) FROM tab_int");
> +print "standby 1: $result\n";
> +is($result, qq(10), 'check streamed content on standby 1');
> +
>  # Check that only READ-only queries can run on standbys
>  is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
>      3, 'read-only queries on standby 1');

I'd probably add a function for creating database / table and then testing it,
with a strategy parameter. That way we can afterwards add more tests verifying
that everything worked.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pgsql: Add 'basebackup_to_shell' contrib module.
Next
From: Andres Freund
Date:
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints