Home > mailing lists

Thread: wal_dump output on CREATE DATABASE

wal_dump output on CREATE DATABASE

From

Jean-Christophe Arnu

Date:

26 October 2018, 13:53:01

Dear hackers,

This is my first try to post on that list to propose a workaround on an issue I noticed while using pg_waldump. Each time an object is referenced by "oids" following output template :

tablespace oid/database oid/relfilenodeid

That template seems to be used for each operation but the *copy dir* operation. The latter is performed (at least) when CREATE DATABASE statement is called with the template: oid/tablespace oid

Exemple on CREATE DATABASE (without defining a template database) :
rmgr: Database len (rec/tot): 42/ 42, tx: 568, lsn: 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663

It comes out (to me) it may be more consistent to use the same template than the other operations in pg_waldump.
I propose to swap the parameters positions for the copy dir operation output.

You'll find a patch file included which does the switching job :
rmgr: Database len (rec/tot): 42/ 42, tx: 568, lsn: 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384

Project name ; PostgreSQL
File : copy_dir_message_switch_parameters_v0.patch
Description : Swaps copy dir output parameters
This patch is not WIP (but may be discussed)
Patch applied against master branch
Compiles and tests successfully
No platform specific code
No need of regression test
Not a new feature
No performance impact

Any comment or advice are more than welcome.

If I didn't do the things the right way, I would appreciate some help from a kind mentor.

I'll put the patch for the next commitfest.

Thank you,

--

Jean-Christophe Arnu

Attachment

copy_dir_message_switch_parameters_v0.patch

Re: wal_dump output on CREATE DATABASE

From

Peter Eisentraut

Date:

02 November 2018, 07:37:13

On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> Exemple on CREATE DATABASE (without defining a template database) :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
> 
> It comes out (to me) it may be more consistent to use the same template
> than the other operations in pg_waldump.
> I propose to swap the parameters positions for the copy dir operation
> output.
> 
> You'll find a patch file included which does the switching job :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384

I see your point.  I suspect this was mainly implemented that way
because that's how the fields occur in the underlying
xl_dbase_create_rec structure.  (But that structure also has the target
location first, so it's not entirely consistent that way either.)  We
could switch the fields around in the output, but I wonder whether we
couldn't make the whole thing a bit more readable like this:

desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384

or maybe like this

desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: wal_dump output on CREATE DATABASE

From

Jean-Christophe Arnu

Date:

04 November 2018, 17:01:51

Le ven. 2 nov. 2018 à 08:37, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> a écrit :

On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> Exemple on CREATE DATABASE (without defining a template database) :
> rmgr: Database len (rec/tot): 42/ 42, tx: 568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
>
> It comes out (to me) it may be more consistent to use the same template
> than the other operations in pg_waldump.
> I propose to swap the parameters positions for the copy dir operation
> output.
>
> You'll find a patch file included which does the switching job :
> rmgr: Database len (rec/tot): 42/ 42, tx: 568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384

I see your point. I suspect this was mainly implemented that way
because that's how the fields occur in the underlying
xl_dbase_create_rec structure. (But that structure also has the target
location first, so it's not entirely consistent that way either.) We
could switch the fields around in the output, but I wonder whether we
couldn't make the whole thing a bit more readable like this:

desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384

or maybe like this

desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384

Thank you Peter for your review and proposal .

I like the last one which gives the fields semantics in a concise way.

Pushing the idea a bit farther we could think of applying that description to any other operation that involves the ts/db/oid fields. What do you think ?

Example in nbtdesc.c we could add the "(ts/db/oid)" information to help the DBA to identify the objects:
    case XLOG_BTREE_REUSE_PAGE:
            {
                xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;

                appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u; latestRemovedXid %u",
                                xlrec->node.spcNode, xlrec->node.dbNode,
                                xlrec->node.relNode, xlrec->latestRemovedXid);
                break;
            }

This may help DBAs to better identify the objects related to the messages.

Any thought/comments/suggestions ?

~~~ Side story
Well to be clear, my first proposal is born while i was writting a side script to textually identify which objects were involved into pg_waldump operations (if that objects already exists at script run time). I'm wondering whether it could be interesting to incllde such a "textual decoding" into pg_waldump or not (as a command line switch). Anyway, my script would be available as proof of concept first. We have time to discuss that last point in another thread.
~~~

Thank you

Re: wal_dump output on CREATE DATABASE

From

Jean-Christophe Arnu

Date:

05 November 2018, 14:37:06

Le dim. 4 nov. 2018 à 18:01, Jean-Christophe Arnu <jcarnu@gmail.com> a écrit :

Le ven. 2 nov. 2018 à 08:37, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> a écrit :
On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> Exemple on CREATE DATABASE (without defining a template database) :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
>
> It comes out (to me) it may be more consistent to use the same template
> than the other operations in pg_waldump.
> I propose to swap the parameters positions for the copy dir operation
> output.
>
> You'll find a patch file included which does the switching job :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384

I see your point. I suspect this was mainly implemented that way
because that's how the fields occur in the underlying
xl_dbase_create_rec structure. (But that structure also has the target
location first, so it's not entirely consistent that way either.) We
could switch the fields around in the output, but I wonder whether we
couldn't make the whole thing a bit more readable like this:

desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384

or maybe like this

desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384

Thank you Peter for your review and proposal .
I like the last one which gives the fields semantics in a concise way.
Pushing the idea a bit farther we could think of applying that description to any other operation that involves the ts/db/oid fields. What do you think ?

Example in nbtdesc.c we could add the "(ts/db/oid)" information to help the DBA to identify the objects:
    case XLOG_BTREE_REUSE_PAGE:
            {
                xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;

                appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u; latestRemovedXid %u",
                                xlrec->node.spcNode, xlrec->node.dbNode,
                                xlrec->node.relNode, xlrec->latestRemovedXid);
                break;
            }

This may help DBAs to better identify the objects related to the messages.

Any thought/comments/suggestions ?

~~~ Side story
Well to be clear, my first proposal is born while i was writting a side script to textually identify which objects were involved into pg_waldump operations (if that objects already exists at script run time). I'm wondering whether it could be interesting to incllde such a "textual decoding" into pg_waldump or not (as a command line switch). Anyway, my script would be available as proof of concept first. We have time to discuss that last point in another thread.
~~~

Thank you

I've dug a little more in that way and spotted different locations in the code where such semantics might be useful too.
Sometimes just like in XLOG_HEAP_TRUNCATE (heapdesc.c), there's a bunch of rels and descendants that are held by a single db. Adding the database id may be useful in that case. Decoding tablespace for each object may be interesting (but not mandatory IMHO) for each rel. That information does not seem to be included in the structure, but as newcomer I assume there's a convenient way to retrieve that information easily.

Another operation that might be eligible to end user message improvements is the one handled by xact_desc_commit() function (xactdesc.c) . Each time a COMMIT (XLOG_XACT_COMMIT or XLOG_XACT_COMMIT_PREPARED) occurs the folllowing snippets is output :

desc: COMMIT 2018-11-05 15:11:03.087546 CET; rels: base/16384/16385 base/16384/16388 base/16384/16390;

in that case, file path is output using relpathperm macro which ends up callin gthe GetRelationPath() function. In that last function, we can have the dbNode, spcNode (tablespace) and relNode Oid (include/common/relpath.h )

The question is now to know if it would be interesting to have a consistent way to represent all objects and their hierarchy :
BTW, we could have db/ts/rel or ts/db/rel ?
What about the "base/ts/rel" from XLOG_XACT_COMMIT/COMMIT_PREPARED ? Why is it different from the other representation (it must serve a purpose I assume) ?
How do we represent multiples rels such as XLOG_HEAP_TRUNCATE ? My proposal (db/rels) dboid/ reloid1 reloid2 reloid3 ... reloidN (as TRUNCATE only deals with one DB, but no tablespace is defined, this may be another point ?)

Well, if you could give me some directions, I would appreciate !

As ever, any thought, comments are more than welcomed.

--

Jean-Christophe Arnu

Re: wal_dump output on CREATE DATABASE

From

Jean-Christophe Arnu

Date:

13 November 2018, 17:53:06

Le lun. 5 nov. 2018 à 15:37, Jean-Christophe Arnu <jcarnu@gmail.com> a écrit :

Le dim. 4 nov. 2018 à 18:01, Jean-Christophe Arnu <jcarnu@gmail.com> a écrit :
Le ven. 2 nov. 2018 à 08:37, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> a écrit :
On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> Exemple on CREATE DATABASE (without defining a template database) :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
>
> It comes out (to me) it may be more consistent to use the same template
> than the other operations in pg_waldump.
> I propose to swap the parameters positions for the copy dir operation
> output.
>
> You'll find a patch file included which does the switching job :
> rmgr: Database    len (rec/tot):     42/    42, tx:        568, lsn:
> 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384

I see your point. I suspect this was mainly implemented that way
because that's how the fields occur in the underlying
xl_dbase_create_rec structure. (But that structure also has the target
location first, so it's not entirely consistent that way either.) We
could switch the fields around in the output, but I wonder whether we
couldn't make the whole thing a bit more readable like this:

desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384

or maybe like this

desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384

Thank you Peter for your review and proposal .
I like the last one which gives the fields semantics in a concise way.
Pushing the idea a bit farther we could think of applying that description to any other operation that involves the ts/db/oid fields. What do you think ?

Example in nbtdesc.c we could add the "(ts/db/oid)" information to help the DBA to identify the objects:
    case XLOG_BTREE_REUSE_PAGE:
            {
                xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;

                appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u; latestRemovedXid %u",
                                xlrec->node.spcNode, xlrec->node.dbNode,
                                xlrec->node.relNode, xlrec->latestRemovedXid);
                break;
            }

This may help DBAs to better identify the objects related to the messages.

Any thought/comments/suggestions ?

~~~ Side story
Well to be clear, my first proposal is born while i was writting a side script to textually identify which objects were involved into pg_waldump operations (if that objects already exists at script run time). I'm wondering whether it could be interesting to incllde such a "textual decoding" into pg_waldump or not (as a command line switch). Anyway, my script would be available as proof of concept first. We have time to discuss that last point in another thread.
~~~

Thank you

I've dug a little more in that way and spotted different locations in the code where such semantics might be useful too.
Sometimes just like in XLOG_HEAP_TRUNCATE (heapdesc.c), there's a bunch of rels and descendants that are held by a single db. Adding the database id may be useful in that case. Decoding tablespace for each object may be interesting (but not mandatory IMHO) for each rel. That information does not seem to be included in the structure, but as newcomer I assume there's a convenient way to retrieve that information easily.

Another operation that might be eligible to end user message improvements is the one handled by xact_desc_commit() function (xactdesc.c) . Each time a COMMIT (XLOG_XACT_COMMIT or XLOG_XACT_COMMIT_PREPARED) occurs the folllowing snippets is output :

desc: COMMIT 2018-11-05 15:11:03.087546 CET; rels: base/16384/16385 base/16384/16388 base/16384/16390;

in that case, file path is output using relpathperm macro which ends up callin gthe GetRelationPath() function. In that last function, we can have the dbNode, spcNode (tablespace) and relNode Oid (include/common/relpath.h )

The question is now to know if it would be interesting to have a consistent way to represent all objects and their hierarchy :
BTW, we could have db/ts/rel or ts/db/rel ?
What about the "base/ts/rel" from XLOG_XACT_COMMIT/COMMIT_PREPARED ? Why is it different from the other representation (it must serve a purpose I assume) ?
How do we represent multiples rels such as XLOG_HEAP_TRUNCATE ? My proposal (db/rels) dboid/ reloid1 reloid2 reloid3 ... reloidN (as TRUNCATE only deals with one DB, but no tablespace is defined, this may be another point ?)

Well, if you could give me some directions, I would appreciate !

As ever, any thought, comments are more than welcomed.

Hello,
May I request any update on my (not so important) proposal ?

Thank you!

--

Jean-Christophe Arnu

Re: wal_dump output on CREATE DATABASE

From

Peter Eisentraut

Date:

13 November 2018, 19:54:28

On 13/11/2018 18:53, Jean-Christophe Arnu wrote:
> May I request any update on my (not so important) proposal ?

Other opinions out there perhaps?  I think your proposals are good, but
I'm not sure to what extent people are automatically parsing the
pg_waldump output and are relying on a particular fixed format.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: wal_dump output on CREATE DATABASE

From

Alvaro Herrera

Date:

13 November 2018, 20:18:32

On 2018-Nov-13, Peter Eisentraut wrote:

> On 13/11/2018 18:53, Jean-Christophe Arnu wrote:
> > May I request any update on my (not so important) proposal ?
> 
> Other opinions out there perhaps?  I think your proposals are good, but
> I'm not sure to what extent people are automatically parsing the
> pg_waldump output and are relying on a particular fixed format.

While I've used pg_waldump a number of times to investigate various
problems, I have never written a script that would be broken by the
proposed changes.  It seems the requirements vary every time.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: wal_dump output on CREATE DATABASE

From

Andres Freund

Date:

13 November 2018, 20:24:59

Hi,

On 2018-11-13 17:18:32 -0300, Alvaro Herrera wrote:
> On 2018-Nov-13, Peter Eisentraut wrote:
> 
> > On 13/11/2018 18:53, Jean-Christophe Arnu wrote:
> > > May I request any update on my (not so important) proposal ?
> > 
> > Other opinions out there perhaps?  I think your proposals are good, but
> > I'm not sure to what extent people are automatically parsing the
> > pg_waldump output and are relying on a particular fixed format.

I personally don't care either way about this change. But:

> While I've used pg_waldump a number of times to investigate various
> problems, I have never written a script that would be broken by the
> proposed changes.  It seems the requirements vary every time.

I'm not too concerned either. There's plenty change of the WAL contents
across versions anyway. We shouldn't break things gratuitously, but we
shouldn't hesitate to make the format better either.

If somebody really wanted something that parses reasonably across
versions I'd like to a) hear that usecase b) come up with an output
format that's oriented towards that.

Greetings,

Andres Freund

Re: wal_dump output on CREATE DATABASE

From

Tomas Vondra

Date:

13 November 2018, 20:31:56

On Tue, 2018-11-13 at 12:24 -0800, Andres Freund wrote:
> Hi,
> 
> On 2018-11-13 17:18:32 -0300, Alvaro Herrera wrote:
> > On 2018-Nov-13, Peter Eisentraut wrote:
> > 
> > > On 13/11/2018 18:53, Jean-Christophe Arnu wrote:
> > > > May I request any update on my (not so important) proposal ?
> > > 
> > > Other opinions out there perhaps?  I think your proposals are
> > > good, but
> > > I'm not sure to what extent people are automatically parsing the
> > > pg_waldump output and are relying on a particular fixed format.
> 
> I personally don't care either way about this change. But:
> 
> > While I've used pg_waldump a number of times to investigate various
> > problems, I have never written a script that would be broken by the
> > proposed changes.  It seems the requirements vary every time.
> 
> I'm not too concerned either. There's plenty change of the WAL
> contents across versions anyway. We shouldn't break things
> gratuitously, but we shouldn't hesitate to make the format better
> either.
> 
> If somebody really wanted something that parses reasonably across
> versions I'd like to a) hear that usecase b) come up with an output
> format that's oriented towards that.
> 

+1 to that

I do process pg_waldump output quite often, and IMHO it's more valuable
to produce consistently formatted output (for a given version) than try
maintaining cross-version compatibility at all costs.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: wal_dump output on CREATE DATABASE

From

Tomas Vondra

Date:

13 November 2018, 20:39:55

On Tue, 2018-11-13 at 18:53 +0100, Jean-Christophe Arnu wrote:
> Le lun. 5 nov. 2018 à 15:37, Jean-Christophe Arnu <jcarnu@gmail.com>
> a
> écrit :
> 
> > 
> > 
> > Le dim. 4 nov. 2018 à 18:01, Jean-Christophe Arnu <jcarnu@gmail.com
> > > a
> > écrit :
> > 
> > > Le ven. 2 nov. 2018 à 08:37, Peter Eisentraut <
> > > peter.eisentraut@2ndquadrant.com> a écrit :
> > > 
> > > > On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> > > > > Exemple on CREATE DATABASE (without defining a template
> > > > > database) :
> > > > > rmgr: Database    len (rec/tot):     42/    42,
> > > > > tx:        568, lsn:
> > > > > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to
> > > > > 16384/1663
> > > > > 
> > > > > It comes out (to me) it may be more consistent to use the
> > > > > same template
> > > > > than the other operations in pg_waldump.
> > > > > I propose to swap the parameters positions for the copy dir
> > > > > operation
> > > > > output.
> > > > > 
> > > > > You'll find a patch file included which does the switching
> > > > > job :
> > > > > rmgr: Database    len (rec/tot):     42/    42,
> > > > > tx:        568, lsn:
> > > > > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to
> > > > > 1663/16384
> > > > 
> > > > I see your point.  I suspect this was mainly implemented that
> > > > way
> > > > because that's how the fields occur in the underlying
> > > > xl_dbase_create_rec structure.  (But that structure also has
> > > > the target
> > > > location first, so it's not entirely consistent that way
> > > > either.)  We
> > > > could switch the fields around in the output, but I wonder
> > > > whether we
> > > > couldn't make the whole thing a bit more readable like this:
> > > > 
> > > > desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384
> > > > 
> > > > or maybe like this
> > > > 
> > > > desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384
> > > > 
> > > 

I don't know, but this feels like a step too far to me.

The patch started with the goal of making the CREATE DATABASE format
more consistent w.r.t. to other pg_waldump output, and this seems more
like redesigning the whole format with no clear use case.

People reading pg_waldump output quickly learn to read the A/B/C format
and what those fields mean. Breaking that into ts=A db=B relfilenode=C
does not make that particularly clearer or easier to read. I'd say it'd
also makes it harder to parse, and it increases the size of the output
(both in terms of line length and data size).

So -1 to this from me. I'd just swap the fields in the copy dir message
and call it a day.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: wal_dump output on CREATE DATABASE

From

Michael Paquier

Date:

14 November 2018, 01:45:57

On Tue, Nov 13, 2018 at 09:39:55PM +0100, Tomas Vondra wrote:
> So -1 to this from me. I'd just swap the fields in the copy dir message
> and call it a day.

+1.  Using the opposite order when what is actually a database path is
kind of confusing..
--
Michael

Attachment

signature.asc

Re: wal_dump output on CREATE DATABASE

From

Robert Haas

Date:

15 November 2018, 18:44:43

On Tue, Nov 13, 2018 at 3:40 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> People reading pg_waldump output quickly learn to read the A/B/C format
> and what those fields mean. Breaking that into ts=A db=B relfilenode=C
> does not make that particularly clearer or easier to read. I'd say it'd
> also makes it harder to parse, and it increases the size of the output
> (both in terms of line length and data size).

I agree.

> So -1 to this from me. I'd just swap the fields in the copy dir message
> and call it a day.

Here, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: wal_dump output on CREATE DATABASE

From

Jean-Christophe Arnu

Date:

16 November 2018, 11:05:00

Le jeu. 15 nov. 2018 à 19:44, Robert Haas <robertmhaas@gmail.com> a écrit :

On Tue, Nov 13, 2018 at 3:40 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> People reading pg_waldump output quickly learn to read the A/B/C format
> and what those fields mean. Breaking that into ts=A db=B relfilenode=C
> does not make that particularly clearer or easier to read. I'd say it'd
> also makes it harder to parse, and it increases the size of the output
> (both in terms of line length and data size).

I agree.

First, thank you all for your reviews.

I also agree that the A/B/C format is right (and it may be a good thing to document it, maybe by adding some changes in the doc/src/sgml/ref/pg_waldump.sgml file to this patch).

To reply to Andres, I agree we should not change things for a target format that would not fit clearly defined syntax. In that way, I agree with Tomas on the fact that people reading

pg_waldump output are quickly familiar with the A/B/C notation.

My first use case was to decode the ids with a processing script to identify each id in A/B/C or pg_waldump output with a "human readable" item. For this, my processing script connects the cluster and tries resolve the ids with simple queries (and building a local cache for this). Then it replaces each looked up id item with its corresponding text.

In some cases, this could be useful for DBA to find more easily when a specific relation was modified (searching for DELETE BTW). But that's only my use case and my little script.

Going back to the code :

As I can figure by crawling the source tree (and discovering it) there are messages with :

* A/B/C notation which seems to be the one we should adopt ( meaning ts/db/refilenode )

some are only

* A/B for the COPY message we discussed later

On the other hand, and I don't know if it's relevant, I've pointed some examples such as XLOG_RELMAP_UPDATE in relmapdesc.c which could benefit of that "notation" :

appendStringInfo(buf, "database %u tablespace %u size %u",
xlrec->dbid, xlrec->tsid, xlrec->nbytes);

could be written like this :

appendStringInfo(buf, "%u/%u size %u",
xlrec->tsid, xlrec->dbid, xlrec->nbytes);

In that case ts and db should also be switched. In that case the message would only by B/C which is confusing, but we have other place where "base/" is put in prefix.

The same transform may be also applied to standbydesc.c in standby_desc() function.

appendStringInfo(buf, "xid %u db %u rel %u ",
xlrec->locks[i].xid, xlrec->locks[i].dbOid,
xlrec->locks[i].relOid);

may be changed to

appendStringInfo(buf, "xid %u (db/rel) %u/%u ",
xlrec->locks[i].xid, xlrec->locks[i].dbOid,
xlrec->locks[i].relOid);

As I said, I don't know whether it's relevant to perform these changes or not.

If the A/B/C notation is to be generalized, it would be worth document it in the SGML file.

If not, the first patch provided should be enough.

Regards

--

Jean-Christophe Arnu

Re: wal_dump output on CREATE DATABASE

From

Michael Paquier

Date:

16 November 2018, 12:16:04

On Fri, Nov 16, 2018 at 12:05:00PM +0100, Jean-Christophe Arnu wrote:
> As I said, I don't know whether it's relevant to perform these changes or
> not.  If the A/B/C notation is to be generalized, it would be worth document it
> in the SGML file.  If not, the first patch provided should be enough.

Just applying the first patch would be fine in my opinion.
--
Michael

Attachment

signature.asc

Re: wal_dump output on CREATE DATABASE

From

Tomas Vondra

Date:

16 November 2018, 13:32:50


On 11/16/18 12:05 PM, Jean-Christophe Arnu wrote:
> 
> 
> Le jeu. 15 nov. 2018 à 19:44, Robert Haas <robertmhaas@gmail.com 
> <mailto:robertmhaas@gmail.com>> a écrit :
> 
>     On Tue, Nov 13, 2018 at 3:40 PM Tomas Vondra
>     <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>>
>     wrote:
>      > People reading pg_waldump output quickly learn to read the A/B/C
>     format
>      > and what those fields mean. Breaking that into ts=A db=B
>     relfilenode=C
>      > does not make that particularly clearer or easier to read. I'd
>     say it'd
>      > also makes it harder to parse, and it increases the size of the
>     output
>      > (both in terms of line length and data size).
> 
>     I agree.
> 
> 
> First, thank you all for your reviews.
> 
> I also agree that the A/B/C format is right (and it may be a good thing 
> to document it, maybe by adding some changes in the 
> doc/src/sgml/ref/pg_waldump.sgml file to this patch).
> 
> To reply to Andres, I agree we should not change things for a target 
> format that would not fit clearly defined syntax. In that way, I agree 
> with Tomas on the fact that people reading
> pg_waldump output are quickly familiar with the A/B/C notation.
> 
> My first use case was to decode the ids with a processing script to 
> identify each id in A/B/C or pg_waldump output with a "human readable" 
> item. For this, my processing script connects the cluster and tries 
> resolve the ids with simple queries (and building a local cache for 
> this). Then it replaces each looked up id item with its corresponding text.
> In some cases, this could be useful for DBA to find more easily when a 
> specific relation was modified (searching for DELETE BTW). But that's 
> only my use case and my little script.
> 
> Going back to the code :
> 
> As I can figure by crawling the source tree (and discovering it) there 
> are messages with :
>    * A/B/C notation which seems to be the one we should adopt ( meaning 
> ts/db/refilenode )
> some are only
>    * A/B for the COPY message we discussed later
> 
> On the other hand, and I don't know if it's relevant, I've pointed some 
> examples such as XLOG_RELMAP_UPDATE in relmapdesc.c which could benefit 
> of that "notation" :
> 
> appendStringInfo(buf, "database %u tablespace %u size %u",
>                           xlrec->dbid, xlrec->tsid, xlrec->nbytes);
> 
> could be written like this :
> 
> appendStringInfo(buf, "%u/%u size %u",
>                           xlrec->tsid, xlrec->dbid, xlrec->nbytes);
> 
> In that case ts and db should also be switched. In that case the message 
> would only by B/C which is confusing, but we have other place where 
> "base/" is put in prefix.
> 
> The same transform may be also applied to standbydesc.c in 
> standby_desc() function.
> 
> appendStringInfo(buf, "xid %u db %u rel %u ",
>                               xlrec->locks[i].xid, xlrec->locks[i].dbOid,
>                               xlrec->locks[i].relOid);
> 
> may be  changed to
> 
> appendStringInfo(buf, "xid %u (db/rel) %u/%u ",
>                               xlrec->locks[i].xid, xlrec->locks[i].dbOid,
>                               xlrec->locks[i].relOid);
> 
> 
> As I said, I don't know whether it's relevant to perform these changes 
> or not.

Maybe, I'm not against doing that. But if we do that, I don't think we 
need to add the "(db/rel)" bit - we don't do that elsewhere, so why 
here? And if we adopt the same format, should that also include the 
tablespace? Although, maybe for locks that doesn't make much sense.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: wal_dump output on CREATE DATABASE

From

Jean-Christophe Arnu

Date:

16 November 2018, 16:28:32

Le ven. 16 nov. 2018 à 14:32, Tomas Vondra <tomas.vondra@2ndquadrant.com> a écrit :

>
> appendStringInfo(buf, "xid %u (db/rel) %u/%u ",
> xlrec->locks[i].xid, xlrec->locks[i].dbOid,
> xlrec->locks[i].relOid);
>
>
> As I said, I don't know whether it's relevant to perform these changes
> or not.

Maybe, I'm not against doing that. But if we do that, I don't think we
need to add the "(db/rel)" bit - we don't do that elsewhere, so why
here?

Yes, you spotted a bad copy/paste from me ; "(db/rel)" should bit should not be there. Sorry.
Why that error so ? I admit I tried some changes on a local git branch of mine with another format helping the user to understand what ids really were. (each line containing ids had "(ts/db/rel)" or "(db/rel)" inside. On COMMIT messages, there also was only one "(db/rels)" bit.
This was just an answer to Peter proposal on a different format. I wanted to keep A/B/C-like format and help DBA to understand what it was using a prefix string like (ts/db/rel).
But as discussions were going on, I figured out it might be a bad idea so I did not submit.

And if we adopt the same format, should that also include the
tablespace? Although, maybe for locks that doesn't make much sense.

I agree that, in a way, it may be a good idea to use that A/B/C format every where each time we use ids. It would make sense for the dba to know what kind of ids are involved if we have partial ones (A/B or B/C).

I don't know the code enough to say whether it would be difficult to retrieve each time the tablespace or not.

There's also lines with base/db/rel ... So there's some different format.
Anyway, I handle all of them (I hope so) in my processing script. So keeping the things just as they are (out of the initial COPY patch that should be applied *to me*) is no problem for me.

On the other hand, there's a consensus not to go further than the initial patch.

--

Jean-Christophe Arnu

Re: wal_dump output on CREATE DATABASE

From

Peter Eisentraut

Date:

20 November 2018, 12:33:59

On 16/11/2018 17:28, Jean-Christophe Arnu wrote:
> On the other hand, there's a consensus not to go further than the
> initial patch.

I have committed your original patch.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: wal_dump output on CREATE DATABASE

From

Jean-Christophe Arnu

Date:

20 November 2018, 12:44:14

Le mar. 20 nov. 2018 à 13:34, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> a écrit :

On 16/11/2018 17:28, Jean-Christophe Arnu wrote:
> On the other hand, there's a consensus not to go further than the
> initial patch.

I have committed your original patch.

Great, thank you Peter. And thank you all for that interesting discussion.

Regards

--

Jean-Christophe Arnu