Thread: unlogged sequences

unlogged sequences

From
Peter Eisentraut
Date:
The discussion in bug #15631 revealed that serial/identity sequences of
temporary tables should really also be temporary (easy), and that
serial/identity sequences of unlogged tables should also be unlogged.
But there is no support for unlogged sequences, so I looked into that.

If you copy the initial sequence relation file to the init fork, then
this all seems to work out just fine.  Attached is a patch.  The
low-level copying seems to be handled quite inconsistently across the
code, so I'm not sure what the most appropriate way to do this would be.
 I'm looking for feedback from those who have worked on tableam and
storage manager to see what the right interfaces are or whether some new
interfaces might perhaps be appropriate.

(What's still missing in this patch is ALTER SEQUENCE SET
{LOGGED|UNLOGGED} as well as propagating the analogous ALTER TABLE
command to owned sequences.)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: unlogged sequences

From
Michael Paquier
Date:
On Thu, Jun 20, 2019 at 09:30:34AM +0200, Peter Eisentraut wrote:
> The discussion in bug #15631 revealed that serial/identity sequences of
> temporary tables should really also be temporary (easy), and that
> serial/identity sequences of unlogged tables should also be unlogged.
> But there is no support for unlogged sequences, so I looked into that.

Thanks for doing so.

> If you copy the initial sequence relation file to the init fork, then
> this all seems to work out just fine.  Attached is a patch.  The
> low-level copying seems to be handled quite inconsistently across the
> code, so I'm not sure what the most appropriate way to do this would be.
>  I'm looking for feedback from those who have worked on tableam and
> storage manager to see what the right interfaces are or whether some new
> interfaces might perhaps be appropriate.

But the actual deal is that the sequence meta-data is now in
pg_sequences and not the init forks, no?  I have just done a small
test:
1) Some SQL queries:
create unlogged sequence popo;
alter sequence popo increment 2;
select nextval('popo');
select nextval('popo');
2) Then a hard crash:
pg_ctl stop -m immediate
pg_ctl start
3) Again, with a crash:
select nextval('popo');
                                                    
#2  0x000055ce60f3208d in ExceptionalCondition
(conditionName=0x55ce610f0570 "!(((PageHeader) (page))->pd_special >=
(__builtin_offsetof (PageHeaderData, pd_linp)))",
errorType=0x55ce610f0507 "FailedAssertion",
fileName=0x55ce610f04e0 "../../../src/include/storage/bufpage.h",
lineNumber=317) at assert.c:54
#3  0x000055ce60b43200 in PageValidateSpecialPointer
(page=0x7ff7692b3d80 "") at
../../../src/include/storage/bufpage.h:317
#4  0x000055ce60b459d4 in read_seq_tuple (rel=0x7ff768ad27e0,
buf=0x7ffc5707f0bc, seqdatatuple=0x7ffc5707f0a0) at
sequence.c:1213
#5  0x000055ce60b447ee in nextval_internal (relid=16385,
check_permissions=true) at sequence.c:678
#6  0x000055ce60b44533 in nextval_oid (fcinfo=0x55ce62537570) at sequence.c:607
--
Michael

Attachment

Re: unlogged sequences

From
Peter Eisentraut
Date:
On 2019-06-21 07:31, Michael Paquier wrote:
> 1) Some SQL queries:
> create unlogged sequence popo;
> alter sequence popo increment 2;

The problem is that the above command does a relation rewrite but the
code doesn't know to copy the init fork of the sequence.  That will need
to be addressed.

> select nextval('popo');
> select nextval('popo');
> 2) Then a hard crash:
> pg_ctl stop -m immediate
> pg_ctl start
> 3) Again, with a crash:
> select nextval('popo');
                                                     
 
> #2  0x000055ce60f3208d in ExceptionalCondition

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: unlogged sequences

From
Andres Freund
Date:
Hi,

On 2019-06-20 09:30:34 +0200, Peter Eisentraut wrote:
> I'm looking for feedback from those who have worked on tableam and
> storage manager to see what the right interfaces are or whether some new
> interfaces might perhaps be appropriate.

Hm, it's not clear to me that tableam design matters much around
sequences? To me it's a historical accident that sequences kinda look
like tables, not more.



> +    /*
> +     * create init fork for unlogged sequences
> +     *
> +     * The logic follows that of RelationCreateStorage() and
> +     * RelationCopyStorage().
> +     */
> +    if (seq->sequence->relpersistence == RELPERSISTENCE_UNLOGGED)
> +    {
> +        SMgrRelation srel;
> +        PGAlignedBlock buf;
> +        Page        page = (Page) buf.data;
> +
> +        FlushRelationBuffers(rel);

That's pretty darn expensive, especially when we just need to flush out
a *single* page, as it needs to scan all of shared buffers. Seems better
to just to initialize the page from scratch? Any reason not to do that?


> +        srel = smgropen(rel->rd_node, InvalidBackendId);
> +        smgrcreate(srel, INIT_FORKNUM, false);
> +        log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
> +
> +        Assert(smgrnblocks(srel, MAIN_FORKNUM) == 1);
> +
> +        smgrread(srel, MAIN_FORKNUM, 0, buf.data);
> +
> +        if (!PageIsVerified(page, 0))
> +            ereport(ERROR,
> +                    (errcode(ERRCODE_DATA_CORRUPTED),
> +                     errmsg("invalid page in block %u of relation %s",
> +                            0,
> +                            relpathbackend(srel->smgr_rnode.node,
> +                                           srel->smgr_rnode.backend,
> +                                           MAIN_FORKNUM))));
> +
> +        log_newpage(&srel->smgr_rnode.node, INIT_FORKNUM, 0, page, false);
> +        PageSetChecksumInplace(page, 0);
> +        smgrextend(srel, INIT_FORKNUM, 0, buf.data, false);
> +        smgrclose(srel);
> +    }

I.e. I think it'd be better if we just added a fork argument to
fill_seq_with_data(), and then do something like

smgrcreate(srel, INIT_FORKNUM, false);
log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
fill_seq_with_data(rel, tuple, INIT_FORKNUM);

and add a FlushBuffer() to the end of fill_seq_with_data() if writing
INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
INIT_FORKNUM.

Alternatively you could just copy the contents from the buffer currently
filled in fill_seq_with_data() to the main fork, and do a memcpy. But
that seems unnecessarily complicated, because you'd again need to do WAL
logging etc.

Greetings,

Andres Freund



Re: unlogged sequences

From
Thomas Munro
Date:
On Wed, Jun 26, 2019 at 6:38 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-06-20 09:30:34 +0200, Peter Eisentraut wrote:
> > I'm looking for feedback from those who have worked on tableam and
> > storage manager to see what the right interfaces are or whether some new
> > interfaces might perhaps be appropriate.
>
> [lots of feedback that requires making decisions]

Seems to be actively under development but no new patch yet.  Moved to next CF.

-- 
Thomas Munro
https://enterprisedb.com



Re: unlogged sequences

From
Alvaro Herrera from 2ndQuadrant
Date:
On 2019-Aug-01, Thomas Munro wrote:

> On Wed, Jun 26, 2019 at 6:38 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2019-06-20 09:30:34 +0200, Peter Eisentraut wrote:
> > > I'm looking for feedback from those who have worked on tableam and
> > > storage manager to see what the right interfaces are or whether some new
> > > interfaces might perhaps be appropriate.
> >
> > [lots of feedback that requires making decisions]
> 
> Seems to be actively under development but no new patch yet.  Moved to next CF.

Marked Waiting on Author.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: unlogged sequences

From
Peter Eisentraut
Date:
On 25.06.19 20:37, Andres Freund wrote:
> I.e. I think it'd be better if we just added a fork argument to
> fill_seq_with_data(), and then do something like
> 
> smgrcreate(srel, INIT_FORKNUM, false);
> log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
> fill_seq_with_data(rel, tuple, INIT_FORKNUM);
> 
> and add a FlushBuffer() to the end of fill_seq_with_data() if writing
> INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
> INIT_FORKNUM.

Now that logical replication of sequences is nearing completion, I 
figured it would be suitable to dust off this old discussion on unlogged 
sequences, mainly so that sequences attached to unlogged tables can be 
excluded from replication.

Attached is a new patch that incorporates the above suggestions, with 
some slight refactoring.  The only thing I didn't/couldn't do was to 
call FlushBuffers(), since that is not an exported function.  So this 
still calls FlushRelationBuffers(), which was previously not liked. 
Ideas welcome.

I have also re-tested the crash reported by Michael Paquier in the old 
discussion and added test cases that catch them.

The rest of the patch is just documentation, DDL support, client 
support, etc.

What is not done yet is support for ALTER SEQUENCE ... SET 
LOGGED/UNLOGGED.  This is a bit of a problem because:

1. The new behavior is that a serial/identity sequence of a new unlogged 
table is now also unlogged.
2. There is also a new restriction that changing a table to logged is 
not allowed if it is linked to an unlogged sequence.  (This is IMO 
similar to the existing restriction on linking mixed logged/unlogged 
tables via foreign keys.)
3. Thus, currently, you can't create an unlogged table with a 
serial/identity column and then change it to logged.  This is reflected 
in some of the test changes I had to make in alter_table.sql to work 
around this.  These should eventually go away.

Interestingly, there is grammar support for ALTER SEQUENCE ... SET 
LOGGED/UNLOGGED because there is this:

         |   ALTER SEQUENCE qualified_name alter_table_cmds
                 {
                     AlterTableStmt *n = makeNode(AlterTableStmt);
                     n->relation = $3;
                     n->cmds = $4;
                     n->objtype = OBJECT_SEQUENCE;
                     n->missing_ok = false;
                     $$ = (Node *)n;
                 }

But it is rejected later in tablecmds.c.  In fact, it appears that this 
piece of grammar is currently useless because there are no 
alter_table_cmds that actually work for sequences.  (This used to be 
different because things like OWNER TO also went through here.)

I tried to make tablecmds.c handle sequences as well, but that became 
messy.  So I'm thinking about making ALTER SEQUENCE ... SET 
LOGGED/UNLOGGED an entirely separate code path and rip out the above 
grammar, but that needs some further pondering.

But all that is a bit of a separate effort, so in the meantime some 
review of the changes in and around fill_seq_with_data() would be useful.
Attachment

Re: unlogged sequences

From
Peter Eisentraut
Date:
rebased patch, no functional changes

On 11.02.22 10:12, Peter Eisentraut wrote:
> On 25.06.19 20:37, Andres Freund wrote:
>> I.e. I think it'd be better if we just added a fork argument to
>> fill_seq_with_data(), and then do something like
>>
>> smgrcreate(srel, INIT_FORKNUM, false);
>> log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
>> fill_seq_with_data(rel, tuple, INIT_FORKNUM);
>>
>> and add a FlushBuffer() to the end of fill_seq_with_data() if writing
>> INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
>> INIT_FORKNUM.
> 
> Now that logical replication of sequences is nearing completion, I 
> figured it would be suitable to dust off this old discussion on unlogged 
> sequences, mainly so that sequences attached to unlogged tables can be 
> excluded from replication.
> 
> Attached is a new patch that incorporates the above suggestions, with 
> some slight refactoring.  The only thing I didn't/couldn't do was to 
> call FlushBuffers(), since that is not an exported function.  So this 
> still calls FlushRelationBuffers(), which was previously not liked. 
> Ideas welcome.
> 
> I have also re-tested the crash reported by Michael Paquier in the old 
> discussion and added test cases that catch them.
> 
> The rest of the patch is just documentation, DDL support, client 
> support, etc.
> 
> What is not done yet is support for ALTER SEQUENCE ... SET 
> LOGGED/UNLOGGED.  This is a bit of a problem because:
> 
> 1. The new behavior is that a serial/identity sequence of a new unlogged 
> table is now also unlogged.
> 2. There is also a new restriction that changing a table to logged is 
> not allowed if it is linked to an unlogged sequence.  (This is IMO 
> similar to the existing restriction on linking mixed logged/unlogged 
> tables via foreign keys.)
> 3. Thus, currently, you can't create an unlogged table with a 
> serial/identity column and then change it to logged.  This is reflected 
> in some of the test changes I had to make in alter_table.sql to work 
> around this.  These should eventually go away.
> 
> Interestingly, there is grammar support for ALTER SEQUENCE ... SET 
> LOGGED/UNLOGGED because there is this:
> 
>          |   ALTER SEQUENCE qualified_name alter_table_cmds
>                  {
>                      AlterTableStmt *n = makeNode(AlterTableStmt);
>                      n->relation = $3;
>                      n->cmds = $4;
>                      n->objtype = OBJECT_SEQUENCE;
>                      n->missing_ok = false;
>                      $$ = (Node *)n;
>                  }
> 
> But it is rejected later in tablecmds.c.  In fact, it appears that this 
> piece of grammar is currently useless because there are no 
> alter_table_cmds that actually work for sequences.  (This used to be 
> different because things like OWNER TO also went through here.)
> 
> I tried to make tablecmds.c handle sequences as well, but that became 
> messy.  So I'm thinking about making ALTER SEQUENCE ... SET 
> LOGGED/UNLOGGED an entirely separate code path and rip out the above 
> grammar, but that needs some further pondering.
> 
> But all that is a bit of a separate effort, so in the meantime some 
> review of the changes in and around fill_seq_with_data() would be useful.

Attachment

Re: unlogged sequences

From
Peter Eisentraut
Date:
Here is an updated patch that now also includes SET LOGGED/UNLOGGED 
support.  So this version addresses all known issues and open problems.


On 28.02.22 10:56, Peter Eisentraut wrote:
> rebased patch, no functional changes
> 
> On 11.02.22 10:12, Peter Eisentraut wrote:
>> On 25.06.19 20:37, Andres Freund wrote:
>>> I.e. I think it'd be better if we just added a fork argument to
>>> fill_seq_with_data(), and then do something like
>>>
>>> smgrcreate(srel, INIT_FORKNUM, false);
>>> log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
>>> fill_seq_with_data(rel, tuple, INIT_FORKNUM);
>>>
>>> and add a FlushBuffer() to the end of fill_seq_with_data() if writing
>>> INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
>>> INIT_FORKNUM.
>>
>> Now that logical replication of sequences is nearing completion, I 
>> figured it would be suitable to dust off this old discussion on 
>> unlogged sequences, mainly so that sequences attached to unlogged 
>> tables can be excluded from replication.
>>
>> Attached is a new patch that incorporates the above suggestions, with 
>> some slight refactoring.  The only thing I didn't/couldn't do was to 
>> call FlushBuffers(), since that is not an exported function.  So this 
>> still calls FlushRelationBuffers(), which was previously not liked. 
>> Ideas welcome.
>>
>> I have also re-tested the crash reported by Michael Paquier in the old 
>> discussion and added test cases that catch them.
>>
>> The rest of the patch is just documentation, DDL support, client 
>> support, etc.
>>
>> What is not done yet is support for ALTER SEQUENCE ... SET 
>> LOGGED/UNLOGGED.  This is a bit of a problem because:
>>
>> 1. The new behavior is that a serial/identity sequence of a new 
>> unlogged table is now also unlogged.
>> 2. There is also a new restriction that changing a table to logged is 
>> not allowed if it is linked to an unlogged sequence.  (This is IMO 
>> similar to the existing restriction on linking mixed logged/unlogged 
>> tables via foreign keys.)
>> 3. Thus, currently, you can't create an unlogged table with a 
>> serial/identity column and then change it to logged.  This is 
>> reflected in some of the test changes I had to make in alter_table.sql 
>> to work around this.  These should eventually go away.
>>
>> Interestingly, there is grammar support for ALTER SEQUENCE ... SET 
>> LOGGED/UNLOGGED because there is this:
>>
>>          |   ALTER SEQUENCE qualified_name alter_table_cmds
>>                  {
>>                      AlterTableStmt *n = makeNode(AlterTableStmt);
>>                      n->relation = $3;
>>                      n->cmds = $4;
>>                      n->objtype = OBJECT_SEQUENCE;
>>                      n->missing_ok = false;
>>                      $$ = (Node *)n;
>>                  }
>>
>> But it is rejected later in tablecmds.c.  In fact, it appears that 
>> this piece of grammar is currently useless because there are no 
>> alter_table_cmds that actually work for sequences.  (This used to be 
>> different because things like OWNER TO also went through here.)
>>
>> I tried to make tablecmds.c handle sequences as well, but that became 
>> messy.  So I'm thinking about making ALTER SEQUENCE ... SET 
>> LOGGED/UNLOGGED an entirely separate code path and rip out the above 
>> grammar, but that needs some further pondering.
>>
>> But all that is a bit of a separate effort, so in the meantime some 
>> review of the changes in and around fill_seq_with_data() would be useful.

Attachment

Re: unlogged sequences

From
Peter Eisentraut
Date:
Patch rebased over some conflicts, and some tests simplified.

On 24.03.22 14:10, Peter Eisentraut wrote:
> Here is an updated patch that now also includes SET LOGGED/UNLOGGED 
> support.  So this version addresses all known issues and open problems.
> 
> 
> On 28.02.22 10:56, Peter Eisentraut wrote:
>> rebased patch, no functional changes
>>
>> On 11.02.22 10:12, Peter Eisentraut wrote:
>>> On 25.06.19 20:37, Andres Freund wrote:
>>>> I.e. I think it'd be better if we just added a fork argument to
>>>> fill_seq_with_data(), and then do something like
>>>>
>>>> smgrcreate(srel, INIT_FORKNUM, false);
>>>> log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
>>>> fill_seq_with_data(rel, tuple, INIT_FORKNUM);
>>>>
>>>> and add a FlushBuffer() to the end of fill_seq_with_data() if writing
>>>> INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || 
>>>> forkNum ==
>>>> INIT_FORKNUM.
>>>
>>> Now that logical replication of sequences is nearing completion, I 
>>> figured it would be suitable to dust off this old discussion on 
>>> unlogged sequences, mainly so that sequences attached to unlogged 
>>> tables can be excluded from replication.
>>>
>>> Attached is a new patch that incorporates the above suggestions, with 
>>> some slight refactoring.  The only thing I didn't/couldn't do was to 
>>> call FlushBuffers(), since that is not an exported function.  So this 
>>> still calls FlushRelationBuffers(), which was previously not liked. 
>>> Ideas welcome.
>>>
>>> I have also re-tested the crash reported by Michael Paquier in the 
>>> old discussion and added test cases that catch them.
>>>
>>> The rest of the patch is just documentation, DDL support, client 
>>> support, etc.
>>>
>>> What is not done yet is support for ALTER SEQUENCE ... SET 
>>> LOGGED/UNLOGGED.  This is a bit of a problem because:
>>>
>>> 1. The new behavior is that a serial/identity sequence of a new 
>>> unlogged table is now also unlogged.
>>> 2. There is also a new restriction that changing a table to logged is 
>>> not allowed if it is linked to an unlogged sequence.  (This is IMO 
>>> similar to the existing restriction on linking mixed logged/unlogged 
>>> tables via foreign keys.)
>>> 3. Thus, currently, you can't create an unlogged table with a 
>>> serial/identity column and then change it to logged.  This is 
>>> reflected in some of the test changes I had to make in 
>>> alter_table.sql to work around this.  These should eventually go away.
>>>
>>> Interestingly, there is grammar support for ALTER SEQUENCE ... SET 
>>> LOGGED/UNLOGGED because there is this:
>>>
>>>          |   ALTER SEQUENCE qualified_name alter_table_cmds
>>>                  {
>>>                      AlterTableStmt *n = makeNode(AlterTableStmt);
>>>                      n->relation = $3;
>>>                      n->cmds = $4;
>>>                      n->objtype = OBJECT_SEQUENCE;
>>>                      n->missing_ok = false;
>>>                      $$ = (Node *)n;
>>>                  }
>>>
>>> But it is rejected later in tablecmds.c.  In fact, it appears that 
>>> this piece of grammar is currently useless because there are no 
>>> alter_table_cmds that actually work for sequences.  (This used to be 
>>> different because things like OWNER TO also went through here.)
>>>
>>> I tried to make tablecmds.c handle sequences as well, but that became 
>>> messy.  So I'm thinking about making ALTER SEQUENCE ... SET 
>>> LOGGED/UNLOGGED an entirely separate code path and rip out the above 
>>> grammar, but that needs some further pondering.
>>>
>>> But all that is a bit of a separate effort, so in the meantime some 
>>> review of the changes in and around fill_seq_with_data() would be 
>>> useful.

Attachment

Re: unlogged sequences

From
Tomas Vondra
Date:
Hi,

Here's a slightly improved patch, adding a couple checks and tests for
owned sequences to ensure both objects have the same persistence. In
particular:

* When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
there's an ereport(ERROR) if the relpersistence values do not match.

* Disallow changing persistence for owned sequences directly.


But I wonder about two things:

1) Do we need to do something about pg_upgrade? I mean, we did not have
unlogged sequences until now, so existing databases may have unlogged
tables with logged sequences. If people run pg_upgrade, what should be
the end result? Should it convert the sequences to unlogged ones, should
it fail and force the user to fix this manually, or what?

2) Does it actually make sense to force owned sequences to have the same
relpersistence as the table? I can imagine use cases where it's OK to
discard and recalculate the data, but I'd still want to ensure unique
IDs. Like some data loads, for example.



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

Re: unlogged sequences

From
Andres Freund
Date:
Hi,

On 2022-03-31 16:14:25 +0200, Tomas Vondra wrote:
> 1) Do we need to do something about pg_upgrade? I mean, we did not have
> unlogged sequences until now, so existing databases may have unlogged
> tables with logged sequences. If people run pg_upgrade, what should be
> the end result? Should it convert the sequences to unlogged ones, should
> it fail and force the user to fix this manually, or what?

> 2) Does it actually make sense to force owned sequences to have the same
> relpersistence as the table? I can imagine use cases where it's OK to
> discard and recalculate the data, but I'd still want to ensure unique
> IDs. Like some data loads, for example.


I agree it makes sense to have logged sequences with unlogged tables. We
should call out the behavioural change somewhere prominent in the release
notes.

I don't think we should make pg_upgrade change the loggedness of sequences.

Greetings,

Andres Freund



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de> wrote:
I agree it makes sense to have logged sequences with unlogged tables. We
should call out the behavioural change somewhere prominent in the release
notes.


We can/do already support that unlikely use case by allowing one to remove the OWNERSHIP dependency between the table and the sequence.

I'm fine with owned sequences tracking the persistence attribute of the owning table.

I don't think we should make pg_upgrade change the loggedness of sequences.


We are willing to change the default behavior here so it is going to affect dump/restore anyway, might as well fully commit and do the same for pg_upgrade.  The vast majority of users will benefit from the new default behavior.

I don't actually get, though, how that would play with pg_dump since it always emits an unowned, and thus restored as logged, sequence first and then alters the sequence to be owned by the table.  Thus restoring an old SQL dump into the v15 is going to fail if we prohibit unlogged-table/logged-sequence; unless we actively change the logged-ness of the sequence when subordinating it to a table.

Thus, the choices seem to be:

1) implement forced persistence agreement for owned sequences, changing the sequence to match the table when the alter table happens, and during pg_upgrade.
2) do not force persistence agreement for owned sequences

If choosing option 2, are you on board with changing the behavior of CREATE UNLOGGED TABLE with respect to any auto-generated sequences?

David J.

Re: unlogged sequences

From
Tomas Vondra
Date:
On 3/31/22 19:35, David G. Johnston wrote:
> On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de
> <mailto:andres@anarazel.de>> wrote:
> 
>     I agree it makes sense to have logged sequences with unlogged tables. We
>     should call out the behavioural change somewhere prominent in the
>     release
>     notes.
> 

I'm not sure I follow. If we allow logged sequences with unlogged
tables, there's be no behavioral change, no?

> 
> We can/do already support that unlikely use case by allowing one to
> remove the OWNERSHIP dependency between the table and the sequence.
> 
> I'm fine with owned sequences tracking the persistence attribute of the
> owning table.
> 

So essentially an independent sequence, used in a default value.

>     I don't think we should make pg_upgrade change the loggedness of
>     sequences.
> 
> 
> We are willing to change the default behavior here so it is going to
> affect dump/restore anyway, might as well fully commit and do the same
> for pg_upgrade.  The vast majority of users will benefit from the new
> default behavior.
> 

Whatever we do, I think we should keep the pg_dump and pg_upgrade
behavior as consistent as possible.

> I don't actually get, though, how that would play with pg_dump since it
> always emits an unowned, and thus restored as logged, sequence first and
> then alters the sequence to be owned by the table.  Thus restoring an
> old SQL dump into the v15 is going to fail if we prohibit
> unlogged-table/logged-sequence; unless we actively change the
> logged-ness of the sequence when subordinating it to a table.
> 

Yeah. I guess we'd need to either automatically switch the sequence to
the right persistence when linking it to the table, or maybe we could
modify pg_dump to emit UNLOGGED when the table is unlogged (but that
would work only when using the new pg_dump).

> Thus, the choices seem to be:
> 
> 1) implement forced persistence agreement for owned sequences, changing
> the sequence to match the table when the alter table happens, and during
> pg_upgrade.
> 2) do not force persistence agreement for owned sequences
> 
> If choosing option 2, are you on board with changing the behavior of
> CREATE UNLOGGED TABLE with respect to any auto-generated sequences?
> 

What behavior change, exactly? To create the sequences as UNLOGGED, but
we'd not update the persistence after that?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 12:36 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/31/22 19:35, David G. Johnston wrote:
> On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de
> <mailto:andres@anarazel.de>> wrote:
>
>     I agree it makes sense to have logged sequences with unlogged tables. We
>     should call out the behavioural change somewhere prominent in the
>     release
>     notes.
>

I'm not sure I follow. If we allow logged sequences with unlogged
tables, there's be no behavioral change, no?


As noted below, the behavior change is in how CREATE TABLE behaves.  Not whether or not mixed persistence is allowed.
 
or maybe we could
modify pg_dump to emit UNLOGGED when the table is unlogged (but that
would work only when using the new pg_dump).

Yes, the horse has already left the barn.  I don't really have an opinion on whether to leave the barn door open or closed.
 

> If choosing option 2, are you on board with changing the behavior of
> CREATE UNLOGGED TABLE with respect to any auto-generated sequences?
>

What behavior change, exactly? To create the sequences as UNLOGGED, but
we'd not update the persistence after that?


Today, a newly created unlogged table with an automatically owned sequence (serial, generated identity) has a logged sequence.  This patch changes that so the new automatically owned sequence is unlogged.  This seems to be generally agreed upon as being desirable - but given the fact that unlogged tables will not have unlogged sequences it seems worth confirming that this minor inconsistency is acceptable.

The first newly added behavior is just allowing sequences to be unlogged.  That is the only mandatory feature introduced by this patch and doesn't seem contentious.

The second newly added behavior being proposed is to have the persistence of the sequence be forcibly matched to the table.  Whether this is desirable is the main point under discussion.

David J.

Re: unlogged sequences

From
Tomas Vondra
Date:

On 3/31/22 21:55, David G. Johnston wrote:
> On Thu, Mar 31, 2022 at 12:36 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
> wrote:
> 
>     On 3/31/22 19:35, David G. Johnston wrote:
>     > On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de
>     <mailto:andres@anarazel.de>
>     > <mailto:andres@anarazel.de <mailto:andres@anarazel.de>>> wrote:
>     >
>     >     I agree it makes sense to have logged sequences with unlogged
>     tables. We
>     >     should call out the behavioural change somewhere prominent in the
>     >     release
>     >     notes.
>     >
> 
>     I'm not sure I follow. If we allow logged sequences with unlogged
>     tables, there's be no behavioral change, no?
> 
> 
> As noted below, the behavior change is in how CREATE TABLE behaves.  Not
> whether or not mixed persistence is allowed.
>  
> 
>     or maybe we could
>     modify pg_dump to emit UNLOGGED when the table is unlogged (but that
>     would work only when using the new pg_dump).
> 
> 
> Yes, the horse has already left the barn.  I don't really have an
> opinion on whether to leave the barn door open or closed.
>  
> 
> 
>     > If choosing option 2, are you on board with changing the behavior of
>     > CREATE UNLOGGED TABLE with respect to any auto-generated sequences?
>     >
> 
>     What behavior change, exactly? To create the sequences as UNLOGGED, but
>     we'd not update the persistence after that?
> 
> 
> Today, a newly created unlogged table with an automatically owned
> sequence (serial, generated identity) has a logged sequence.  This patch
> changes that so the new automatically owned sequence is unlogged.  This
> seems to be generally agreed upon as being desirable - but given the
> fact that unlogged tables will not have unlogged sequences it seems
> worth confirming that this minor inconsistency is acceptable.
> 
> The first newly added behavior is just allowing sequences to be
> unlogged.  That is the only mandatory feature introduced by this patch
> and doesn't seem contentious.
> 
> The second newly added behavior being proposed is to have the
> persistence of the sequence be forcibly matched to the table.  Whether
> this is desirable is the main point under discussion.
> 

Right. The latest version of the patch also prohibits changing
persistence of owned sequences directly. But that's probably considered
 to be part of the second behavior.

I agree the first part is not contentious, so shall we extract this part
of the patch and get that committed for PG15? Or is that too late to
make such changes to the patch?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 1:05 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

I agree the first part is not contentious, so shall we extract this part
of the patch and get that committed for PG15? Or is that too late to
make such changes to the patch?


The minimum viable feature for me, given the written goal for the patch and the premise of not changing any existing behavior, is:

DB State: Allow a sequence to be unlogged.
Command: ALTER SEQUENCE SET UNLOGGED
Limitation: The above command fails if the sequence is unowned, or it is owned and the table owning it is not UNLOGGED

(optional safety) Limitation: Changing a table from unlogged to logged while owning unlogged sequences would be prohibited
(optional safety) Compensatory Behavior: Add the ALTER SEQUENCE SET LOGGED command for owned sequences to get them logged again in preparation for changing the table to being logged.

In particular, I don't see CREATE UNLOGGED SEQUENCE to be all that valuable since CREATE UNLOGGED TABLE wouldn't leverage it.

The above, possibly only half-baked, patch scope does not change any existing behavior but allows for the stated goal: an unlogged table having an unlogged sequence.  The DBA just has to execute the ALTER SEQUENCE command on all relevant sequences.  They can't even really get it wrong since only relevant sequences can be altered.  Not having CREATE TABLE make an unlogged sequence by default is annoying though and likely should be changed - though it can leverage ALTER SEQUENCE too.

Anything else they wish to do can be done via a combination of ownership manipulation and, worse case, dropping and recreating the sequence.  Though allowed for unowned unlogged sequences, while outside the explicit goal of the patch, would be an easy add (just don't error on the ALTER SEQUENCE SET UNLOGGED when the sequence is unowned).

David J.

Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 1:40 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
The DBA just has to execute the ALTER SEQUENCE command on all relevant sequences.

Additional, if we do not implement the forced matching of persistence mode, we should consider adding an "ALTER TABLE SET ALL SEQUENCES TO UNLOGGED" command for convenience.  Or maybe make it a function - which would allow for SQL execution against a catalog lookup.

David J.

Re: unlogged sequences

From
Tomas Vondra
Date:
On 3/31/22 22:40, David G. Johnston wrote:
> On Thu, Mar 31, 2022 at 1:05 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
> wrote:
> 
> 
>     I agree the first part is not contentious, so shall we extract this part
>     of the patch and get that committed for PG15? Or is that too late to
>     make such changes to the patch?
> 
> 
> The minimum viable feature for me, given the written goal for the patch
> and the premise of not changing any existing behavior, is:
> 
> DB State: Allow a sequence to be unlogged.
> Command: ALTER SEQUENCE SET UNLOGGED
> Limitation: The above command fails if the sequence is unowned, or it is
> owned and the table owning it is not UNLOGGED
> 
> (optional safety) Limitation: Changing a table from unlogged to logged
> while owning unlogged sequences would be prohibited
> (optional safety) Compensatory Behavior: Add the ALTER SEQUENCE SET
> LOGGED command for owned sequences to get them logged again in
> preparation for changing the table to being logged.
> 
> In particular, I don't see CREATE UNLOGGED SEQUENCE to be all that
> valuable since CREATE UNLOGGED TABLE wouldn't leverage it.
> 

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

IMHO (1) would address vast majority of cases, which simply want the
same persistence for the whole table and all auxiliary objects. (2)
would address use cases requiring different persistence for sequences
(including owned ones).

I'm not sure about (3) though, maybe that's overkill.

Of course, we'll always have problems with older releases, as it's not
clear whether a logged sequence on unlogged table would be desirable or
is used just because unlogged sequences were not supported. (We do have
the same issue for logged tables too, but I doubt anyone really needs
defining unlogged sequences on logged tables.)

So no matter what we do, we'll make the wrong decision in some cases.

> The above, possibly only half-baked, patch scope does not change any
> existing behavior but allows for the stated goal: an unlogged table
> having an unlogged sequence.  The DBA just has to execute the ALTER
> SEQUENCE command on all relevant sequences.  They can't even really get
> it wrong since only relevant sequences can be altered.  Not having
> CREATE TABLE make an unlogged sequence by default is annoying though and
> likely should be changed - though it can leverage ALTER SEQUENCE too.
> 
> Anything else they wish to do can be done via a combination of ownership
> manipulation and, worse case, dropping and recreating the sequence. 
> Though allowed for unowned unlogged sequences, while outside the
> explicit goal of the patch, would be an easy add (just don't error on
> the ALTER SEQUENCE SET UNLOGGED when the sequence is unowned).
> 

Yeah. I think my proposal is pretty close to that, except that the
sequence would first inherit persistence from the table, and there'd be
an ALTER SEQUENCE for owned sequences where it differs. (And non-owned
sequences would be created as logged/unlogged explicitly.)

I don't think we need to worry about old pg_dump versions on new PG
versions, because that's not really supported.

And for old PG versions the behavior would differ a bit depending on the
pg_dump version used. With old pg_dump version, the ALTER SEQUENCE would
not be emitted, so all owned sequences would inherit table persistence.
With new pg_dump we'd get the expected persistence (which might differ).

That's need to be documented, of course.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 3:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/31/22 22:40, David G. Johnston wrote:
> On Thu, Mar 31, 2022 at 1:05 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
> wrote:
>
>
>     I agree the first part is not contentious, so shall we extract this part
>     of the patch and get that committed for PG15? Or is that too late to
>     make such changes to the patch?
>
>
> The minimum viable feature for me, given the written goal for the patch
> and the premise of not changing any existing behavior, is:
>
> DB State: Allow a sequence to be unlogged.
> Command: ALTER SEQUENCE SET UNLOGGED
> Limitation: The above command fails if the sequence is unowned, or it is
> owned and the table owning it is not UNLOGGED
>
> (optional safety) Limitation: Changing a table from unlogged to logged
> while owning unlogged sequences would be prohibited
> (optional safety) Compensatory Behavior: Add the ALTER SEQUENCE SET
> LOGGED command for owned sequences to get them logged again in
> preparation for changing the table to being logged.
>
> In particular, I don't see CREATE UNLOGGED SEQUENCE to be all that
> valuable since CREATE UNLOGGED TABLE wouldn't leverage it.
>

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

This is the contentious point.  If we are going to do it by default - thus changing existing behavior - I would rather just do it always.  This is also underspecified, there are multiple ways for a sequence to become owned.

Personally I'm for the choice to effectively remove the sequence's own concept of logged/unlogged when it is owned by a table and to always just use the table's value.


2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

A generalization that is largely incontrovertible.

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

I'm leaning against this, leaving users to set each owned sequence to logged/unlogged as they see fit if they want something other than all-or-nothing.  I would stick to only providing an easy method to get the assumed desired all-same behavior.
ALTER TABLE SET [UN]LOGGED, SET ALL SEQUENCES TO [UN]LOGGED;


IMHO (1) would address vast majority of cases, which simply want the
same persistence for the whole table and all auxiliary objects. (2)
would address use cases requiring different persistence for sequences
(including owned ones).

I'm not sure about (3) though, maybe that's overkill.

Of course, we'll always have problems with older releases, as it's not
clear whether a logged sequence on unlogged table would be desirable or
is used just because unlogged sequences were not supported. (We do have
the same issue for logged tables too, but I doubt anyone really needs
defining unlogged sequences on logged tables.)

So no matter what we do, we'll make the wrong decision in some cases.

Again, I don't have too much concern here because you lose very little by having an unowned sequence.  Which is why I'm fine with owned sequences becoming even moreso implementation details that adhere to the persistence mode of the owning relation.  But if the goal here is to defer such a decision then the tradeoff is the DBA is given control and they get to enforce consistency even if they are not benefitting from the flexibility.
> Not having
> CREATE TABLE make an unlogged sequence by default is annoying though and
> likely should be changed - though it can leverage ALTER SEQUENCE too.

Yeah. I think my proposal is pretty close to that, except that the
sequence would first inherit persistence from the table, and there'd be
an ALTER SEQUENCE for owned sequences where it differs. (And non-owned
sequences would be created as logged/unlogged explicitly.)

I don't have any real problem with 1 or 2, they fill out the feature so it is generally designed as opposed to solving a very specific problem.

For 1:
The "ADD COLUMN" (whether in CREATE TABLE or ALTER TABLE) pathway will produce a new sequence whose persistence matches that of the target table.  While a behavior change it is one aligned with the goal of the patch for typical ongoing behavior and should benefit way more people than it may inconvenience.  The "sequence not found" error that would be generated seems minimally impactful.

The "ALTER SEQUENCE OWNED BY" pathway will not change the sequence's persistence.  This is what pg_dump will use for serial/bigserial
The "ALTER TABLE ALTER COLUMN" pathway will not change the sequence's persistence.  This is what pg_dump will use for generated always as identity

Provide a general purpose ALTER SEQUENCE SET [UN]LOGGED command

Provide an SQL Command to change all owned sequences of a table to be UNLOGGED or LOGGED (I mentioned a function as well if someone thinks it worth the time - in lieu of a function a psql script leveraging \gexec may be nice to reference).


I don't think we need to worry about old pg_dump versions on new PG
versions, because that's not really supported.

Correct.

And for old PG versions the behavior would differ a bit depending on the
pg_dump version used. With old pg_dump version, the ALTER SEQUENCE would
not be emitted,

Correct, nothing else is emitted either...


That's need to be documented, of course.


It (the general promises for pg_dump) is documented.


David J.

Re: unlogged sequences

From
Robert Haas
Date:
On Thu, Mar 31, 2022 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> * When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
> there's an ereport(ERROR) if the relpersistence values do not match.
>
> * Disallow changing persistence for owned sequences directly.

Wait, what? I don't understand why we would want to do either of these things.

It seems to me that it's totally fine to use a logged table with an
unlogged sequence, or an unlogged table with a logged sequence, or any
of the other combinations. You get what you ask for, so make sure to
ask for what you want. And that's it.

If you say something like CREATE [UNLOGGED] TABLE foo (a serial) it's
fine for serial to attribute the same persistence level to the
sequence as it does to the table. But when that's dumped, it's going
to be dumped as a CREATE TABLE command and a CREATE SEQUENCE command,
each of which has a separate persistence level. So you can recreate
whatever state you have.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: unlogged sequences

From
Tomas Vondra
Date:
On 4/1/22 02:25, Robert Haas wrote:
> On Thu, Mar 31, 2022 at 10:14 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> * When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
>> there's an ereport(ERROR) if the relpersistence values do not match.
>>
>> * Disallow changing persistence for owned sequences directly.
> 
> Wait, what? I don't understand why we would want to do either of these things.
> 
> It seems to me that it's totally fine to use a logged table with an
> unlogged sequence, or an unlogged table with a logged sequence, or any
> of the other combinations. You get what you ask for, so make sure to
> ask for what you want. And that's it.
> 
> If you say something like CREATE [UNLOGGED] TABLE foo (a serial) it's
> fine for serial to attribute the same persistence level to the
> sequence as it does to the table. But when that's dumped, it's going
> to be dumped as a CREATE TABLE command and a CREATE SEQUENCE command,
> each of which has a separate persistence level. So you can recreate
> whatever state you have.
> 

Well, yeah. I did this because the patch was somewhat inconsistent when
handling owned sequences - it updated persistence for owned sequences
when persistence for the table changed, expecting to keep them in sync,
but then it also allowed operations that'd break it.

But that started a discussion about exactly this, and AFAICS there's
agreement we want to allow the table and owned sequences to have
different persistence values.

The discussion about the details is still ongoing, but I think it's
clear we'll ditch the restrictions you point out.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 5:25 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 31, 2022 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> * When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
> there's an ereport(ERROR) if the relpersistence values do not match.
>
> * Disallow changing persistence for owned sequences directly.

Wait, what? I don't understand why we would want to do either of these things.

It seems to me that it's totally fine to use a logged table with an
unlogged sequence, or an unlogged table with a logged sequence, or any
of the other combinations. You get what you ask for, so make sure to
ask for what you want. And that's it.

It seems reasonable to extend the definition of "ownership of a sequence" in this way.  We always let you create unowned sequences with whatever persistence you like if you need flexibility.

The "give the user power" argument is also valid.  But since they already have power through unowned sequences, having the owned sequences more narrowly defined doesn't detract from usability, and in many ways enhances it by further reinforcing the fact that the sequence internally used when you say "GENERATED ALWAYS AS IDENTITY" is an implementation detail - one that has the same persistence as the table.

David J.

Re: unlogged sequences

From
Robert Haas
Date:
On Thu, Mar 31, 2022 at 8:42 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Well, yeah. I did this because the patch was somewhat inconsistent when
> handling owned sequences - it updated persistence for owned sequences
> when persistence for the table changed, expecting to keep them in sync,
> but then it also allowed operations that'd break it.

Oops.

> But that started a discussion about exactly this, and AFAICS there's
> agreement we want to allow the table and owned sequences to have
> different persistence values.
>
> The discussion about the details is still ongoing, but I think it's
> clear we'll ditch the restrictions you point out.

Great.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: unlogged sequences

From
Robert Haas
Date:
On Thu, Mar 31, 2022 at 8:44 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> It seems reasonable to extend the definition of "ownership of a sequence" in this way.  We always let you create
unownedsequences with whatever persistence you like if you need flexibility. 

I'd say it doesn't seem to have any benefit, and therefore seems
unreasonable. Right now, OWNED BY is documented as a way of getting
the sequence to automatically be dropped if the table column goes
away. If it already did five things, maybe you could argue that this
thing is just like the other five and therefore changing it is the
right idea. But going from one thing to two that don't seem to have
much to do with each other seems much less reasonable, especially
since it doesn't seem to buy anything.

> The "give the user power" argument is also valid.  But since they already have power through unowned sequences,
havingthe owned sequences more narrowly defined doesn't detract from usability, and in many ways enhances it by further
reinforcingthe fact that the sequence internally used when you say "GENERATED ALWAYS AS IDENTITY" is an implementation
detail- one that has the same persistence as the table. 

I think there's a question about what happens in the GENERATED ALWAYS
AS IDENTITY case. The DDL commands that create such sequences are of
the form ALTER TABLE something ALTER COLUMN somethingelse GENERATED
ALWAYS AS (sequence_parameters), and if we need to specify somewhere
in the whether the sequence should be logged or unlogged, how do we do
that? Consider:

rhaas=# create unlogged table xyz (a int generated always as identity);
CREATE TABLE
rhaas=# \d+ xyz
                                                 Unlogged table "public.xyz"
 Column |  Type   | Collation | Nullable |           Default
 | Storage | Compression | Stats target | Description

--------+---------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 a      | integer |           | not null | generated always as
identity | plain   |             |              |
Access method: heap

rhaas=# \d+ xyz_a_seq
                     Sequence "public.xyz_a_seq"
  Type   | Start | Minimum |  Maximum   | Increment | Cycles? | Cache
---------+-------+---------+------------+-----------+---------+-------
 integer |     1 |       1 | 2147483647 |         1 | no      |     1
Sequence for identity column: public.xyz.a

In this new system, does the user still get a logged sequence? If they
get an unlogged sequence, how does dump-and-restore work? What if they
want to still have a logged sequence? But for sequences that are
simply owned, there is no problem here, and I think that inventing one
would not be a good plan.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 6:03 PM Robert Haas <robertmhaas@gmail.com> wrote:

In this new system, does the user still get a logged sequence? If they
get an unlogged sequence, how does dump-and-restore work? What if they
want to still have a logged sequence? But for sequences that are
simply owned, there is no problem here, and I think that inventing one
would not be a good plan.

There is no design problem here, just coding (including special handling for pg_upgrade).  When making a sequence owned we can, without requiring any syntax, choose to change its persistence to match the table owning it.  Or not.  These are basically options 1 and 2 I laid out earlier:


I slightly favor 1, making owned sequences implementation details having a matched persistence mode.  But we seem to be leaning toward option 2 per subsequent emails. I'm ok with that - just give me an easy way to change all my upgraded logged sequences to unlogged.  And probably do the same if I change my table's mode as well.

That there is less implementation complexity is nice but the end user won't see that.  I think the typical end user would appreciate having the sequence stay in sync with the table instead of having to worry about those kinds of details.  Hence my slight favor given toward 1.

David J.



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Thu, Mar 31, 2022 at 6:03 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 31, 2022 at 8:44 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

> The "give the user power" argument is also valid.  But since they already have power through unowned sequences, having the owned sequences more narrowly defined doesn't detract from usability, and in many ways enhances it by further reinforcing the fact that the sequence internally used when you say "GENERATED ALWAYS AS IDENTITY" is an implementation detail - one that has the same persistence as the table.

I think there's a question about what happens in the GENERATED ALWAYS
AS IDENTITY case. The DDL commands that create such sequences are of
the form ALTER TABLE something ALTER COLUMN somethingelse GENERATED
ALWAYS AS (sequence_parameters), and if we need to specify somewhere
in the whether the sequence should be logged or unlogged, how do we do
that?

I give answers for the "owned sequences match their owning table's persistence" model below:

You would not need to specify it - the table is specified and that is sufficient to know what value to choose.
 
Consider:

rhaas=# create unlogged table xyz (a int generated always as identity);
CREATE TABLE
rhaas=# \d+ xyz
                                                 Unlogged table "public.xyz"
 Column |  Type   | Collation | Nullable |           Default
 | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 a      | integer |           | not null | generated always as
identity | plain   |             |              |
Access method: heap

rhaas=# \d+ xyz_a_seq
                     Sequence "public.xyz_a_seq"
  Type   | Start | Minimum |  Maximum   | Increment | Cycles? | Cache
---------+-------+---------+------------+-----------+---------+-------
 integer |     1 |       1 | 2147483647 |         1 | no      |     1
Sequence for identity column: public.xyz.a

In this new system, does the user still get a logged sequence?

No
 
If they
get an unlogged sequence, how does dump-and-restore work?

As described in the first response, since ALTER COLUMN is used during dump-and-restore, the sequence creation occurs in a command where we know the owning table is unlogged so the created sequence is unlogged.
 
What if they
want to still have a logged sequence?

I was expecting the following to work, though it does not presently:

ALTER SEQUENCE yetanotherthing OWNED BY NONE;
ERROR: cannot change ownership of identity sequence

ALTER SEQUENCE yetanotherthing SET LOGGED;

IMO, the generated case is the stronger one for not allowing them to be different.  They can fall back onto the DEFAULT nextval('sequence_that_is_unowned') option to get the desired behavior.

David J.


Re: unlogged sequences

From
Peter Eisentraut
Date:
On 01.04.22 00:43, Tomas Vondra wrote:
> Hmm, so what about doing a little bit different thing:
> 
> 1) owned sequences inherit persistence of the table by default
> 
> 2) allow ALTER SEQUENCE to change persistence for all sequences (no
> restriction for owned sequences)
> 
> 3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
> matching the initial table persistence

Consider that an identity sequence creates an "internal" dependency and 
a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't really 
be operated on directly.  (In some cases it's allowed for convenience.) 
So I think in that case the sequence must follow the table's persistence 
in all cases.  This is accomplished by setting the initial persistence 
to the table's, making ALTER TABLE propagate persistence changes, and 
prohibiting direct ALTER SEQUENCE SET.

An "auto" dependency is looser, so manipulating both objects 
independently can be allowed.  In that case, I would do (1), (2), and (3).

(I think your (3) is already the behavior in the patch, since there are 
only two persistence levels in play at that point.)

I wanted to check if you can have a persistent sequence owned by a temp 
table, but that is rejected because both sequence and table must be in 
the same schema.  So the sequence owned-by schema does insist on some 
tight coupling.  So for example, once a sequence is owned by a table, 
you can't move it around or change the ownership.



Re: unlogged sequences

From
Peter Eisentraut
Date:
On 01.04.22 18:22, Peter Eisentraut wrote:
> 
> On 01.04.22 00:43, Tomas Vondra wrote:
>> Hmm, so what about doing a little bit different thing:
>>
>> 1) owned sequences inherit persistence of the table by default
>>
>> 2) allow ALTER SEQUENCE to change persistence for all sequences (no
>> restriction for owned sequences)
>>
>> 3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
>> matching the initial table persistence
> 
> Consider that an identity sequence creates an "internal" dependency and 
> a serial sequence creates an "auto" dependency.
> 
> An "internal" dependency means that the internal object shouldn't really 
> be operated on directly.  (In some cases it's allowed for convenience.) 
> So I think in that case the sequence must follow the table's persistence 
> in all cases.  This is accomplished by setting the initial persistence 
> to the table's, making ALTER TABLE propagate persistence changes, and 
> prohibiting direct ALTER SEQUENCE SET.

But to make pg_upgrade work for identity sequences of unlogged tables, 
we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences.  Which 
I guess is not a real problem in the end.



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Fri, Apr 1, 2022 at 9:22 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

On 01.04.22 00:43, Tomas Vondra wrote:
> Hmm, so what about doing a little bit different thing:
>
> 1) owned sequences inherit persistence of the table by default
>
> 2) allow ALTER SEQUENCE to change persistence for all sequences (no
> restriction for owned sequences)
>
> 3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
> matching the initial table persistence

Consider that an identity sequence creates an "internal" dependency and
a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't really
be operated on directly.  (In some cases it's allowed for convenience.)
So I think in that case the sequence must follow the table's persistence
in all cases.  This is accomplished by setting the initial persistence
to the table's, making ALTER TABLE propagate persistence changes, and
prohibiting direct ALTER SEQUENCE SET.

An "auto" dependency is looser, so manipulating both objects
independently can be allowed.  In that case, I would do (1), (2), and (3).

(I think your (3) is already the behavior in the patch, since there are
only two persistence levels in play at that point.)

I would support having a serial sequence be allowed to be changed independently while an identity sequence is made to match the table it is owned by.  Older version restores would produce a logged serial sequence (since the sequence is independently created and then attached to the table) on unlogged tables but since identity sequences are only even implicitly created they would become unlogged as part of the restore.  Though I suspect that pg_upgrade will need to change them explicitly.

I would support all owned sequences as well, but that seems unreachable at the moment.

David J.

Re: unlogged sequences

From
"David G. Johnston"
Date:
On Fri, Apr 1, 2022 at 9:31 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 01.04.22 18:22, Peter Eisentraut wrote:
>
> On 01.04.22 00:43, Tomas Vondra wrote:
>> Hmm, so what about doing a little bit different thing:
>>
>> 1) owned sequences inherit persistence of the table by default
>>
>> 2) allow ALTER SEQUENCE to change persistence for all sequences (no
>> restriction for owned sequences)
>>
>> 3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
>> matching the initial table persistence
>
> Consider that an identity sequence creates an "internal" dependency and
> a serial sequence creates an "auto" dependency.
>
> An "internal" dependency means that the internal object shouldn't really
> be operated on directly.  (In some cases it's allowed for convenience.)
> So I think in that case the sequence must follow the table's persistence
> in all cases.  This is accomplished by setting the initial persistence
> to the table's, making ALTER TABLE propagate persistence changes, and
> prohibiting direct ALTER SEQUENCE SET.

But to make pg_upgrade work for identity sequences of unlogged tables,
we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences.  Which
I guess is not a real problem in the end.

Indeed, we need the syntax anyway.  We can constrain it though, and error when trying to make them different but allow making them the same.  To change a table's persistence you have to then change the table first - putting them back into different states - then sync up the sequence again.

David J.

Re: unlogged sequences

From
Robert Haas
Date:
On Fri, Apr 1, 2022 at 12:31 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> > An "internal" dependency means that the internal object shouldn't really
> > be operated on directly.  (In some cases it's allowed for convenience.)
> > So I think in that case the sequence must follow the table's persistence
> > in all cases.  This is accomplished by setting the initial persistence
> > to the table's, making ALTER TABLE propagate persistence changes, and
> > prohibiting direct ALTER SEQUENCE SET.
>
> But to make pg_upgrade work for identity sequences of unlogged tables,
> we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences.  Which
> I guess is not a real problem in the end.

And I think also SET UNLOGGED, since it would be weird IMHO to make
such a change irreversible.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: unlogged sequences

From
Peter Eisentraut
Date:
On 01.04.22 18:31, Peter Eisentraut wrote:
>> Consider that an identity sequence creates an "internal" dependency 
>> and a serial sequence creates an "auto" dependency.
>>
>> An "internal" dependency means that the internal object shouldn't 
>> really be operated on directly.  (In some cases it's allowed for 
>> convenience.) So I think in that case the sequence must follow the 
>> table's persistence in all cases.  This is accomplished by setting the 
>> initial persistence to the table's, making ALTER TABLE propagate 
>> persistence changes, and prohibiting direct ALTER SEQUENCE SET.
> 
> But to make pg_upgrade work for identity sequences of unlogged tables, 
> we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences.  Which 
> I guess is not a real problem in the end.

Here is an updated patch that fixes this pg_dump/pg_upgrade issue and 
also adds a few more comments and documentation sentences about what 
happens and what is allowed.  I didn't change any behaviors; it seems we 
didn't have consensus to do that.

These details about how tables and sequences are linked or not are 
pretty easy to adjust, if people still have some qualms.
Attachment

Re: unlogged sequences

From
"David G. Johnston"
Date:
On Sun, Apr 3, 2022 at 10:19 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
Here is an updated patch that fixes this pg_dump/pg_upgrade issue and
also adds a few more comments and documentation sentences about what
happens and what is allowed.  I didn't change any behaviors; it seems we
didn't have consensus to do that.

IIUC the patch behavior with respect to migration is to have pg_upgrade retain the current logged persistence mode for all owned sequences regardless of the owning table's persistence.  The same goes for pg_dump for serial sequences since they will never be annotated with UNLOGGED and simply adding an ownership link doesn't cause a table rewrite.

However, tables having an identity sequence seem to be unaddressed in this patch.  The existing (and unchanged) pg_dump.c code results in:

CREATE TABLE public.testgenid (
    getid bigint NOT NULL
);

ALTER TABLE public.testgenid OWNER TO postgres;

ALTER TABLE public.testgenid ALTER COLUMN getid ADD GENERATED ALWAYS AS IDENTITY (
    SEQUENCE NAME public.testgenid_getid_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1
);

ISTM that we need to add the ability to specify [UN]LOGGED in those sequence_options and have pg_dump.c output the choice explicitly instead of relying upon a default.

Without that, the post-patch dump/restore cannot retain the existing persistence mode value for the sequence.  For the default we would want to have ALTER TABLE ALTER COLUMN be LOGGED to match the claim that pg_dump doesn't change the persistence mode.  The main decision, then, is whether CREATE TABLE and ALTER TABLE ADD COLUMN should default to UNLOGGED (this combination preserves existing values via pg_dump while still letting the user benefit from the new feature without having to specify UNLOGGED in multiple places) or LOGGED (preserving existing values and consistency).  All UNLOGGED is an option but I think it would need to be considered along with pg_upgrade changing them all as well.  Again, limiting this decision to identity sequences only.

David J.



Re: unlogged sequences

From
Peter Eisentraut
Date:
On 03.04.22 20:50, David G. Johnston wrote:
> However, tables having an identity sequence seem to be unaddressed in 
> this patch.  The existing (and unchanged) pg_dump.c code results in:

It is addressed.  For example, run this in PG14:

create unlogged table t1 (a int generated always as identity, b text);

Then dump it with PG15 with this patch:

CREATE UNLOGGED TABLE public.t1 (
     a integer NOT NULL,
     b text
);


ALTER TABLE public.t1 OWNER TO peter;

--
-- Name: t1_a_seq; Type: SEQUENCE; Schema: public; Owner: peter
--

ALTER TABLE public.t1 ALTER COLUMN a ADD GENERATED ALWAYS AS IDENTITY (
     SEQUENCE NAME public.t1_a_seq
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
     NO MAXVALUE
     CACHE 1
);
ALTER SEQUENCE public.t1_a_seq SET LOGGED;



Re: unlogged sequences

From
"David G. Johnston"
Date:
On Sun, Apr 3, 2022 at 12:36 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 03.04.22 20:50, David G. Johnston wrote:
> However, tables having an identity sequence seem to be unaddressed in
> this patch.  The existing (and unchanged) pg_dump.c code results in:

It is addressed.  For example, run this in PG14:

create unlogged table t1 (a int generated always as identity, b text);

Then dump it with PG15 with this patch:

Sorry, I wasn't being specific enough.  Per our documentation (and I seem to recall many comments from Tom):
"Because pg_dump is used to transfer data to newer versions of PostgreSQL, the output of pg_dump can be expected to load into PostgreSQL server versions newer than pg_dump's version." [1]

That is what I'm getting on about when talking about migrations.  So a v14 SQL backup produced by a v14 pg_dump restored by a v15 psql. (custom format and pg_restore supposedly aren't supposed to be different though, right?)


David J.

Re: unlogged sequences

From
"David G. Johnston"
Date:
On Sun, Apr 3, 2022 at 12:36 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 03.04.22 20:50, David G. Johnston wrote:
> However, tables having an identity sequence seem to be unaddressed in
> this patch.  The existing (and unchanged) pg_dump.c code results in:

It is addressed.  For example, run this in PG14:

ALTER TABLE public.t1 ALTER COLUMN a ADD GENERATED ALWAYS AS IDENTITY (
     SEQUENCE NAME public.t1_a_seq
     START WITH 1
     INCREMENT BY 1
     NO MINVALUE
     NO MAXVALUE
     CACHE 1
);
ALTER SEQUENCE public.t1_a_seq SET LOGGED;

OK, I do see the new code for this and see how my prior email was confusing/wrong.  I do still have the v14 dump file restoration concern but that actually isn't something pg_dump.c has to (or even can) worry about.  Ensuring that a v15+ dump represents the existing state correctly is basically a given which is why I wasn't seeing how my comments would be interpreted relative to that.

For the patch I'm still thinking we want to add [UN]LOGGED to sequence_options.  Even if pg_dump doesn't utilize it, though aside from potential code cleanliness I don't see why it wouldn't.  If absent, the default behavior shown here (sequence matches table, as per "+ seqstmt->sequence->relpersistence = cxt->relation->relpersistence;" would take effect) applies, otherwise the newly created sequence is as requested.

From this, in the current patch, a pg_dump v14- produced dump file restoration will change the persistence of owned sequences on an unlogged table to unlogged from logged during restoration into v15+ (since the alter sequence will not be present after the alter table).  A v15+ pg_dump produced dump file will retain the logged persistence mode for the sequence.  The only way to avoid this discrepancy is to have sequence_options taken on a [UN]LOGGED option that defaults to LOGGED.  This then correctly reflects historical behavior and will produce a consistently restored dump file.

David J.

Re: unlogged sequences

From
Peter Eisentraut
Date:
On 04.04.22 01:58, David G. Johnston wrote:
> "Because pg_dump is used to transfer data to newer versions of 
> PostgreSQL, the output of pg_dump can be expected to load into 
> PostgreSQL server versions newer than pg_dump's version." [1]
> 
> That is what I'm getting on about when talking about migrations.  So a 
> v14 SQL backup produced by a v14 pg_dump restored by a v15 psql.

It has always been the case that if you want the best upgrade 
experience, you need to use the pg_dump that is >= server version.

The above quote is a corollary to that we don't want to gratuitously 
break SQL syntax compatibility.  But I don't think that implies that the 
behavior of those commands cannot change at all.  Otherwise we could 
never add new behavior with new defaults.




Re: unlogged sequences

From
Peter Eisentraut
Date:
On 03.04.22 19:19, Peter Eisentraut wrote:
> 
> On 01.04.22 18:31, Peter Eisentraut wrote:
>>> Consider that an identity sequence creates an "internal" dependency 
>>> and a serial sequence creates an "auto" dependency.
>>>
>>> An "internal" dependency means that the internal object shouldn't 
>>> really be operated on directly.  (In some cases it's allowed for 
>>> convenience.) So I think in that case the sequence must follow the 
>>> table's persistence in all cases.  This is accomplished by setting 
>>> the initial persistence to the table's, making ALTER TABLE propagate 
>>> persistence changes, and prohibiting direct ALTER SEQUENCE SET.
>>
>> But to make pg_upgrade work for identity sequences of unlogged tables, 
>> we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences.  
>> Which I guess is not a real problem in the end.
> 
> Here is an updated patch that fixes this pg_dump/pg_upgrade issue and 
> also adds a few more comments and documentation sentences about what 
> happens and what is allowed.  I didn't change any behaviors; it seems we 
> didn't have consensus to do that.
> 
> These details about how tables and sequences are linked or not are 
> pretty easy to adjust, if people still have some qualms.

This patch is now in limbo because it appears that the logical 
replication of sequences feature might end up being reverted for PG15. 
This unlogged sequences feature is really a component of that overall 
feature.

If we think that logical replication of sequences might stay in, then I 
would like to commit this patch as well.

If we think that it will be reverted, then this patch is probably just 
going to be in the way of that.

We could also move forward with this patch independently of the other 
one.  If we end up reverting the other one, then this one won't be very 
useful but it won't really hurt anything and it would presumably become 
useful eventually.  What we presumably don't want is that the sequence 
replication patch gets repaired for PG15 and we didn't end up committing 
this patch because of uncertainty.

Thoughts?



Re: unlogged sequences

From
Peter Eisentraut
Date:
On 06.04.22 11:12, Peter Eisentraut wrote:
> We could also move forward with this patch independently of the other 
> one.  If we end up reverting the other one, then this one won't be very 
> useful but it won't really hurt anything and it would presumably become 
> useful eventually.  What we presumably don't want is that the sequence 
> replication patch gets repaired for PG15 and we didn't end up committing 
> this patch because of uncertainty.

I have received some encouragement off-list to go ahead with this, so 
it's been committed.