Re: wal_dump output on CREATE DATABASE - Mailing list pgsql-hackers

From Jean-Christophe Arnu
Subject Re: wal_dump output on CREATE DATABASE
Date
Msg-id CAHZmTm3C_r0c964-1WHjO-r-7t4ePrxKE38hMYZzBryhXm6A7w@mail.gmail.com
Whole thread Raw
In response to Re: wal_dump output on CREATE DATABASE  (Jean-Christophe Arnu <jcarnu@gmail.com>)
Responses Re: wal_dump output on CREATE DATABASE
Re: wal_dump output on CREATE DATABASE
List pgsql-hackers


Le lun. 5 nov. 2018 à 15:37, Jean-Christophe Arnu <jcarnu@gmail.com> a écrit :


Le dim. 4 nov. 2018 à 18:01, Jean-Christophe Arnu <jcarnu@gmail.com> a écrit :
Le ven. 2 nov. 2018 à 08:37, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> a écrit :
On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> Exemple on CREATE DATABASE (without defining a template database) :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
>
> It comes out (to me) it may be more consistent to use the same template
> than the other operations in pg_waldump.
> I propose to swap the parameters positions for the copy dir operation
> output.
>
> You'll find a patch file included which does the switching job :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384

I see your point.  I suspect this was mainly implemented that way
because that's how the fields occur in the underlying
xl_dbase_create_rec structure.  (But that structure also has the target
location first, so it's not entirely consistent that way either.)  We
could switch the fields around in the output, but I wonder whether we
couldn't make the whole thing a bit more readable like this:

desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384

or maybe like this

desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384


Thank you Peter for your review and proposal . 
I like the last one which gives the fields semantics in a concise way.
Pushing the idea a bit farther we could think of applying that description to any other operation that involves the ts/db/oid fields. What do you think ?

Example in nbtdesc.c we could add the "(ts/db/oid)" information to help the DBA to identify the objects:
    case XLOG_BTREE_REUSE_PAGE:
            {
                xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;

                appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u; latestRemovedXid %u",
                                 xlrec->node.spcNode, xlrec->node.dbNode,
                                 xlrec->node.relNode, xlrec->latestRemovedXid);
                break;
            }

This may help DBAs to better identify the objects related to the messages.

Any thought/comments/suggestions ?

~~~ Side story
Well to be clear, my first proposal is born while i was writting a side script to textually identify which objects were involved into pg_waldump operations (if that objects already exists at script run time). I'm wondering whether it could be interesting to incllde such a "textual decoding" into pg_waldump or not (as a command line switch). Anyway, my script would be available as proof of concept first. We have time to discuss that last point in another thread.
~~~

Thank you

I've dug a little more in that way and spotted different locations in the code where such semantics might be useful too.
Sometimes just like in XLOG_HEAP_TRUNCATE (heapdesc.c), there's a bunch of rels and descendants that are held by a single db. Adding the database id may be useful in that case. Decoding tablespace for each object may be interesting  (but not mandatory IMHO) for each rel. That information does not seem to be included in the structure, but as newcomer I assume there's a convenient way to retrieve that information easily.

Another operation that might be eligible to end user message improvements is the one handled by xact_desc_commit() function (xactdesc.c) . Each time a COMMIT (XLOG_XACT_COMMIT or XLOG_XACT_COMMIT_PREPARED)  occurs the folllowing snippets is output :

 desc: COMMIT 2018-11-05 15:11:03.087546 CET; rels: base/16384/16385 base/16384/16388 base/16384/16390;

in that case, file path is output using relpathperm macro which ends up callin gthe GetRelationPath() function.  In that last function, we can have the dbNode, spcNode (tablespace) and relNode Oid (include/common/relpath.h )

The question is now to know if it would be interesting to have a consistent way to represent all objects and their hierarchy :
 BTW, we could have db/ts/rel or ts/db/rel ?
What about the "base/ts/rel" from XLOG_XACT_COMMIT/COMMIT_PREPARED ? Why is it different from the other representation (it must serve a purpose I assume) ?
How do we represent multiples rels such as XLOG_HEAP_TRUNCATE ? My proposal (db/rels) dboid/ reloid1 reloid2 reloid3 ... reloidN  (as TRUNCATE only deals with one DB, but no tablespace is defined, this may be another point ?)

Well, if you could give me some directions, I would appreciate !

As ever, any thought, comments are more than welcomed.

Hello,
May I request any update on my (not so important) proposal ?

Thank you!
--
Jean-Christophe Arnu

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: BUG #15212: Default values in partition tables don't work asexpected and allow NOT NULL violation
Next
From: Andres Freund
Date:
Subject: Re: Refactoring the checkpointer's fsync request queue