Thread: pg_xlogdump --stats

pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

04 June 2014, 13:47:30

Hi.

Here's a patch to make pg_xlogdump print summary statistics instead of
individual records.

By default, for each rmgr it shows the number of records, the size of
rmgr-specific records, the size of full-page images, and the combined
size. With --stats=record it shows these figures for each rmgr/xl_info
combination (omitting those with zero counts for readability).

Here's an example of the output after a few pgbench runs with the
default checkpoint settings. I raised wal_keep_segments, resulting
in 3.5GB of WAL data in pg_xlog.

As you can see in the "Total" line, 96.83% of this is full-page images.
As Andres observed, this is a good demonstration of why one should not
use the default checkpoint_segments in production.

$ ../bin/pg_xlogdump --stats=record 000000010000000000000001 0000000100000000000000DE
Type                                           N      (%)          Record size      (%)             FPI size      (%)
    Combined size      (%) 
----                                           -      ---          -----------      ---             --------      ---
    -------------      --- 
XLOG/CHECKPOINT_SHUTDOWN                      16 (  0.00)                 1152 (  0.00)                    0 (  0.00)
             1152 (  0.00) 
XLOG/CHECKPOINT_ONLINE                        80 (  0.00)                 5760 (  0.01)                    0 (  0.00)
             5760 (  0.00) 
XLOG/NEXTOID                                  12 (  0.00)                   48 (  0.00)                    0 (  0.00)
               48 (  0.00) 
Transaction/COMMIT                            71 (  0.00)                 4708 (  0.00)                    0 (  0.00)
             4708 (  0.00) 
Transaction/COMMIT_COMPACT                413956 ( 14.82)              4967472 (  4.35)                    0 (  0.00)
          4967472 (  0.14) 
Storage/CREATE                               231 (  0.01)                 3696 (  0.00)                    0 (  0.00)
             3696 (  0.00) 
Storage/TRUNCATE                               1 (  0.00)                   16 (  0.00)                    0 (  0.00)
               16 (  0.00) 
CLOG/ZEROPAGE                                 13 (  0.00)                   52 (  0.00)                    0 (  0.00)
               52 (  0.00) 
Database/CREATE                                3 (  0.00)                   48 (  0.00)                    0 (  0.00)
               48 (  0.00) 
RelMap/UPDATE                                 14 (  0.00)                 7336 (  0.01)                    0 (  0.00)
             7336 (  0.00) 
Heap2/CLEAN                               369312 ( 13.22)             10769122 (  9.43)           2698910088 ( 77.33)
       2709679210 ( 75.17) 
Heap2/FREEZE_PAGE                             53 (  0.00)                 3276 (  0.00)               327732 (  0.01)
           331008 (  0.01) 
Heap2/VISIBLE                              58160 (  2.08)              1163200 (  1.02)               599768 (  0.02)
          1762968 (  0.05) 
Heap2/MULTI_INSERT                             1 (  0.00)                   59 (  0.00)                    0 (  0.00)
               59 (  0.00) 
Heap2/MULTI_INSERT+INIT                        7 (  0.00)                38874 (  0.03)                    0 (  0.00)
            38874 (  0.00) 
Heap/INSERT                               425472 ( 15.23)             22392664 ( 19.61)              6081712 (  0.17)
         28474376 (  0.79) 
Heap/DELETE                                 1638 (  0.06)                42588 (  0.04)                19800 (  0.00)
            62388 (  0.00) 
Heap/UPDATE                                53912 (  1.93)              7145531 (  6.26)            390264760 ( 11.18)
        397410291 ( 11.03) 
Heap/HOT_UPDATE                          1185607 ( 42.43)             59538947 ( 52.13)             48724168 (  1.40)
        108263115 (  3.00) 
Heap/LOCK                                 199320 (  7.13)              4983000 (  4.36)              1656812 (  0.05)
          6639812 (  0.18) 
Heap/INPLACE                                 469 (  0.02)                66676 (  0.06)               558604 (  0.02)
           625280 (  0.02) 
Heap/INSERT+INIT                            2992 (  0.11)               272895 (  0.24)                    0 (  0.00)
           272895 (  0.01) 
Heap/UPDATE+INIT                            1184 (  0.04)               146420 (  0.13)              6611352 (  0.19)
          6757772 (  0.19) 
Btree/INSERT_LEAF                          81058 (  2.90)              2224916 (  1.95)            336444372 (  9.64)
        338669288 (  9.40) 
Btree/INSERT_UPPER                           128 (  0.00)                 5272 (  0.00)                16368 (  0.00)
            21640 (  0.00) 
Btree/SPLIT_L                                 48 (  0.00)               171384 (  0.15)                46712 (  0.00)
           218096 (  0.01) 
Btree/SPLIT_R                                 80 (  0.00)               233736 (  0.20)                39116 (  0.00)
           272852 (  0.01) 
Btree/SPLIT_L_ROOT                             7 (  0.00)                 5488 (  0.00)                    0 (  0.00)
             5488 (  0.00) 
Btree/SPLIT_R_ROOT                             4 (  0.00)                 2880 (  0.00)                    0 (  0.00)
             2880 (  0.00) 
Btree/DELETE                                   3 (  0.00)                  454 (  0.00)                    0 (  0.00)
              454 (  0.00) 
Btree/UNLINK_PAGE                              4 (  0.00)                  176 (  0.00)                    0 (  0.00)
              176 (  0.00) 
Btree/NEWROOT                                 33 (  0.00)                  980 (  0.00)                    0 (  0.00)
              980 (  0.00) 
Btree/MARK_PAGE_HALFDEAD                       4 (  0.00)                  144 (  0.00)                    0 (  0.00)
              144 (  0.00) 
Btree/VACUUM                                  66 (  0.00)                 7890 (  0.01)                18400 (  0.00)
            26290 (  0.00) 
                                        --------                      --------                      --------
         -------- 
Total                                    2793959                     114206860 [3.17%]            3490319764 [96.83%]
       3604526624 [100%] 
pg_xlogdump: FATAL:  error in WAL record at 0/DEA52150: record with zero length at 0/DEA521B8

(Note: the FATAL error above is just the normal end of WAL.)

In each row,

    - Type is rmgr/record
    - N is the number of records of that type
    - % is the percentage of total records
    - Record size is sum(xl_len+SizeOfXLogRecord)
    - % is the percentage of the total record size
    - FPI size is the size of full-page images
    - % is the percentage of the total FPI size
    - Combined size is sum(xl_tot_len)
    - % is the percentage of the total combined size

The last line ("Total") shows the total number of records of all types,
the total record size (and what percentage that is of the total size),
the total full-page image size (and ditto), and the total combined size
(which is 100% by definition).

The patch is quite straightforward, but became quite long due to the
many switch/cases needed to name each meaningful xl_rmid/xl_info
combination.

I'll add it to the CF.

-- Abhijit

Attachment

stats.diff

Re: pg_xlogdump --stats

From

Date:

10 June 2014, 12:06:12

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Abhijit
> Menon-Sen
> Sent: Wednesday, June 04, 2014 7:47 PM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] pg_xlogdump --stats
> 
> Hi.
> 
> Here's a patch to make pg_xlogdump print summary statistics instead of
> individual records.
> 
The function works fine. It is a good to the learning of PostgreSQL.
But By applying the patch,warning comes out then make.
Although I think that's no particular problem, However I think better to fix is good.


$ make
...
pg_xlogdump.c: In function ‘XLogDumpStatsRow’:
pg_xlogdump.c:851: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64’
pg_xlogdump.c:851: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 5 has type ‘uint64’
pg_xlogdump.c:851: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 7 has type ‘uint64’
pg_xlogdump.c:851: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 9 has type ‘uint64’
pg_xlogdump.c: In function ‘XLogDumpDisplayStats’:
pg_xlogdump.c:948: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64’
pg_xlogdump.c:948: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 5 has type ‘uint64’
pg_xlogdump.c:948: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 7 has type ‘uint64’
pg_xlogdump.c:948: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 9 has type ‘uint64'
...

My environment:
RHEL 6.4
gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)

Regards,

-- 
Furuya Osamu

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

13 June 2014, 10:31:16

At 2014-06-10 18:04:13 +0900, furuyao@pm.nttdata.co.jp wrote:
>
> The function works fine. It is a good to the learning of PostgreSQL.

Thanks for having a look.

> pg_xlogdump.c: In function ‘XLogDumpStatsRow’:
> pg_xlogdump.c:851: warning: format ‘%20llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64’

I've changed this to use %zu at Álvaro's suggestion. I'll post an
updated patch after I've finished some (unrelated) refactoring.

-- Abhijit

Re: pg_xlogdump --stats

From

Dilip kumar

Date:

30 June 2014, 08:19:43

On 13 June 2014 13:01, Abhijit Menon-Sen Wrote

> 
> I've changed this to use %zu at Álvaro's suggestion. I'll post an
> updated patch after I've finished some (unrelated) refactoring.

I have started reviewing the patch..

1. Patch applied to git head cleanly.
2. Compiled in Linux  -- Some warnings same as mentioned by furuyao 

3. Some compilation error in windows
.\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2065: 'optional_argument' : undeclared identifier
.\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2099: initializer is not a constant

optional_argument should be added to getopt_long.h file for windows.

Please fix these issues and send the updated patch..

I will continue reviewing the patch..


Regards,
Dilip

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

30 June 2014, 09:35:21

At 2014-06-30 05:19:10 +0000, dilip.kumar@huawei.com wrote:
>
> I have started reviewing the patch..

Thanks.

> 1. Patch applied to git head cleanly.
> 2. Compiled in Linux  -- Some warnings same as mentioned by furuyao

I've attached an updated version of the patch which should fix the
warnings by using %zu.

> 3. Some compilation error in windows
> .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2065: 'optional_argument' : undeclared identifier
> .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2099: initializer is not a constant
>
> optional_argument should be added to getopt_long.h file for windows.

Hmm. I have no idea what to do about this. I did notice when I wrote the
code that nothing else used optional_argument, but I didn't realise that
it wouldn't work on Windows.

It may be that the best thing to do would be to avoid using
optional_argument altogether, and have separate --stats and
--stats-per-record options. Thoughts?

> Please fix these issues and send the updated patch..

I've also attached a separate patch to factor out the identify_record
function into rm_identify functions in the individual rmgr files, so
that the code can live next to the respective _desc functions.

This allows us to remove a number of #includes from pg_xlogdump, and
makes the code a bit more straightforward to read.

-- Abhijit

Attachment

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

30 June 2014, 09:49:30

At 2014-06-30 12:05:06 +0530, ams@2ndQuadrant.com wrote:
>
> It may be that the best thing to do would be to avoid using
> optional_argument altogether, and have separate --stats and
> --stats-per-record options. Thoughts?

That's what I've done in the attached patch, except I've called the new
option --record-stats. Both options now use no_argument. This should
apply on top of the diff I posted a little while ago.

-- Abhijit

Attachment

stats-newopt.diff

Re: pg_xlogdump --stats

From

Marti Raudsepp

Date:

01 July 2014, 16:40:43

On Wed, Jun 4, 2014 at 1:47 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
> Here's a patch to make pg_xlogdump print summary statistics instead of
> individual records.

Thanks! I had a use for this feature so I backported the (first) patch
to PostgreSQL 9.3. It's a rush job so it's ugly and may have bugs, but
it works for me. Just posting here, maybe someone else will find it
useful too.

Regards,
Marti

Attachment

xlogdiff_stats_93.patch.gz

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

01 July 2014, 23:13:30

At 2014-07-01 16:39:57 +0300, marti@juffo.org wrote:
>
> > Here's a patch to make pg_xlogdump print summary statistics instead
> > of individual records.
> 
> Thanks! I had a use for this feature so I backported the (first) patch
> to PostgreSQL 9.3. It's a rush job so it's ugly and may have bugs, but
> it works for me. Just posting here, maybe someone else will find it
> useful too.

Thanks for taking the trouble.

In CF terms, did you form any opinion while porting the patch I posted
about whether it's sensible/ready for inclusion in 9.5?

-- Abhijit

P.S. In your patch, the documentation changes are duplicated.

Re: pg_xlogdump --stats

From

Marti Raudsepp

Date:

02 July 2014, 04:21:22

On Tue, Jul 1, 2014 at 11:13 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
> In CF terms, did you form any opinion while porting the patch I posted
> about whether it's sensible/ready for inclusion in 9.5?

I didn't look at the code more than necessary to make the build work.

As far as functionality goes, it does exactly what I needed it to do;
the output is very clear. I wanted to see how much WAL is being
generated from SELECT FOR UPDATE locks.

Here are some thoughts:

The "FPI" acronym made me wonder at first. Perhaps "Full page image
size" would be clearer (exactly 20 characters long, so it fits).

But on the other hand, I find that the table is too wide, I always
need to stretch my terminal window to fit it. I don't think we need to
accommodate for 10^20 bytes ~ 87 exabytes of WAL files. :)

You might also add units (kB/MB) to the table like pg_size_pretty,
although that would make the magnitudes harder to gauge.

> P.S. In your patch, the documentation changes are duplicated.

Oh right you are, I don't know how that happened. I also missed some
record types (Btree/0x80, Btree/0xb0).

Regards,
Marti

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

02 July 2014, 06:07:14

At 2014-07-02 04:20:31 +0300, marti@juffo.org wrote:
>
> As far as functionality goes, it does exactly what I needed it to do;
> the output is very clear.

Good to hear.

> You might also add units (kB/MB) to the table like pg_size_pretty,
> although that would make the magnitudes harder to gauge.

I think df-style -k/m/g switches might be useful to scale all printed
values. If we did that, we could also shrink the columns accordingly.
But that would also complicate the code a bit. I think it's better to
start with the simplest thing, and add more tweaks later.

-- Abhijit

Re: pg_xlogdump --stats

From

Craig Ringer

Date:

02 July 2014, 08:22:01

On 07/02/2014 09:20 AM, Marti Raudsepp wrote:

> You might also add units (kB/MB) to the table like pg_size_pretty,
> although that would make the magnitudes harder to gauge.

What 'du' does, or a subset of, may be worthwhile.

-k: kB
-b: blocks
-m: MB
-h: human-readable auto-scaling

though I think just the "-h" flag would be useful here, meaning "wrap in
pg_size_pretty".

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

04 July 2014, 09:36:48

At 2014-06-30 05:19:10 +0000, dilip.kumar@huawei.com wrote:
>
> Please fix these issues and send the updated patch..
> 
> I will continue reviewing the patch..

Did you get anywhere with the updated patch?

-- Abhijit

Re: pg_xlogdump --stats

From

Dilip kumar

Date:

04 July 2014, 11:38:38

On 04 July 2014 12:07, Abhijit Menon-Sen Wrote,

> -----Original Message-----
> From: Abhijit Menon-Sen [mailto:ams@2ndQuadrant.com]
> Sent: 04 July 2014 12:07
> To: Dilip kumar
> Cc: pgsql-hackers@postgresql.org; furuyao@pm.nttdata.co.jp
> Subject: Re: [HACKERS] pg_xlogdump --stats
>
> At 2014-06-30 05:19:10 +0000, dilip.kumar@huawei.com wrote:
> >
> > Please fix these issues and send the updated patch..
> >
> > I will continue reviewing the patch..
>
> Did you get anywhere with the updated patch?
>

Patch looks fine to me, except few small comments.

1. Update this new option in "usage" function also  this still have the old way { -z, --stats[=record] }
    {"stats", no_argument, NULL, 'z'},    {"record-stats", no_argument, NULL, 'Z'},

2. While applying stats-newopt.dif (after applying stat2.diff), some error in merging sgml file.

patching file `doc/src/sgml/pg_xlogdump.sgml'
Hunk #1 FAILED at 181.
1 out of 1 hunk FAILED -- saving rejects to doc/src/sgml/pg_xlogdump.sgml.rej

Once you fix above erros, I think patch is ok from my side.

Thanks & Regards,
Dilip Kumar

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

04 July 2014, 11:46:53

At 2014-07-04 08:38:17 +0000, dilip.kumar@huawei.com wrote:
>
> Once you fix above erros, I think patch is ok from my side.

I've attached two updated patches here, including the fix to the usage
message. I'll mark this ready for committer now. (The rm_identify patch
is posted separately from the xlogdump one only to make the whole thing
easier to follow.)

Thank you.

-- Abhijit

Attachment

Re: pg_xlogdump --stats

From

Andres Freund

Date:

04 July 2014, 11:54:22

On 2014-07-04 14:16:42 +0530, Abhijit Menon-Sen wrote:
> At 2014-07-04 08:38:17 +0000, dilip.kumar@huawei.com wrote:
> >
> > Once you fix above erros, I think patch is ok from my side.
> 
> I've attached two updated patches here, including the fix to the usage
> message. I'll mark this ready for committer now. (The rm_identify patch
> is posted separately from the xlogdump one only to make the whole thing
> easier to follow.)

I'm pretty sure that most committers would want to apply the rm_identity
part in a separate commit first. Could you make it two patches, one
introducing rm_identity, and another with xlogdump using it?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

04 July 2014, 12:12:14

At 2014-07-04 10:54:08 +0200, andres@2ndquadrant.com wrote:
>
> I'm pretty sure that most committers would want to apply the
> rm_identity part in a separate commit first. Could you make it
> two patches, one introducing rm_identity, and another with
> xlogdump using it?

Sure, attached.

-- Abhijit

Attachment

Re: pg_xlogdump --stats

From

Andres Freund

Date:

04 July 2014, 12:34:31

Hi,

On 2014-07-04 14:42:03 +0530, Abhijit Menon-Sen wrote:
> Sure, attached.
> 
> +const char *
> +heap_identify(uint8 info)
> +{
> +    const char *id = NULL;
> +
> +    switch (info & XLOG_HEAP_OPMASK)
> +    {
> +        case XLOG_HEAP_INSERT:
> +            id = "INSERT";
> +            break;
> +        case XLOG_HEAP_DELETE:
> +            id = "DELETE";
> +            break;
> +        case XLOG_HEAP_UPDATE:
> +            id = "UPDATE";
> +            break;
> +        case XLOG_HEAP_HOT_UPDATE:
> +            id = "HOT_UPDATE";
> +            break;
> +        case XLOG_HEAP_NEWPAGE:
> +            id = "NEWPAGE";
> +            break;
> +        case XLOG_HEAP_LOCK:
> +            id = "LOCK";
> +            break;
> +        case XLOG_HEAP_INPLACE:
> +            id = "INPLACE";
> +            break;
> +    }
> +
> +    if (info & XLOG_HEAP_INIT_PAGE)
> +        id = psprintf("%s+INIT", id);
> +
> +    return id;
> +}

So we're leaking memory here? For --stats that could well be relevant...
> +/*
> + * Display a single row of record counts and sizes for an rmgr or record.
> + */
> +static void
> +XLogDumpStatsRow(const char *name,
> +                 uint64 n, double n_pct,
> +                 uint64 rec_len, double rec_len_pct,
> +                 uint64 fpi_len, double fpi_len_pct,
> +                 uint64 total_len, double total_len_pct)
> +{
> +    printf("%-27s %20zu (%6.02f) %20zu (%6.02f) %20zu (%6.02f) %20zu (%6.02f)\n",
> +           name, n, n_pct, rec_len, rec_len_pct, fpi_len, fpi_len_pct,
> +           total_len, total_len_pct);
> +}

This (and following locations) is going to break on 32bit platforms. %z
indicates size_t, not 64bit. I think we're going to have to redefine the
PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT callsite in configure.in to
define INT64_MODIFIER='"ll/l/I64D"'
and then define
#define INT64_FORMAT "%"CppAsString(INT64_MODIFIER)"d"
#define UINT64_FORMAT "%"CppAsString(INT64_MODIFIER)"u"
in c.h based on that, or something like i. This was written blindly, so
it'd might need further work.

Then you can use INT64_MODIFIER in you format strings. Won't be pretty,
but at least it'd work...

> +static void
> +XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
> +{

> +    /*
> +     * 27 is strlen("Transaction/COMMIT_PREPARED"),
> +     * 20 is strlen(2^64), 8 is strlen("(100.00%)")
> +     */

It's far from guaranteed that 27 will always suffice. I guess it's ok
because it doesn't cause bad breakage, but just some misalignment...

> +    for (ri = 0; ri < RM_NEXT_ID; ri++)
> +    {
> +        uint64        count, rec_len, fpi_len;
> +        const RmgrDescData *desc = &RmgrDescTable[ri];
> +
> +        if (!config->stats_per_record)
> +        {
> +            count = stats->rmgr_stats[ri].count;
> +            rec_len = stats->rmgr_stats[ri].rec_len;
> +            fpi_len = stats->rmgr_stats[ri].fpi_len;
> +
> +            XLogDumpStatsRow(desc->rm_name,
> +                             count, 100*(double)count/total_count,
> +                             rec_len, 100*(double)rec_len/total_rec_len,
> +                             fpi_len, 100*(double)fpi_len/total_fpi_len,
> +                             rec_len+fpi_len, 100*(double)(rec_len+fpi_len)/total_len);
> +        }

Many missing spaces here.

> +        else
> +        {
> +            for (rj = 0; rj < 16; rj++)
> +            {
> +                const char *id;
> +
> +                count = stats->record_stats[ri][rj].count;
> +                rec_len = stats->record_stats[ri][rj].rec_len;
> +                fpi_len = stats->record_stats[ri][rj].fpi_len;
> +
> +                /* Skip undefined combinations and ones that didn't occur */
> +                if (count == 0)
> +                    continue;
> +
> +                id = desc->rm_identify(rj << 4);
> +                if (id == NULL)
> +                    id = psprintf("0x%x", rj << 4);
> +
> +                XLogDumpStatsRow(psprintf("%s/%s", desc->rm_name, id),
> +                                 count, 100*(double)count/total_count,
> +                                 rec_len, 100*(double)rec_len/total_rec_len,
> +                                 fpi_len, 100*(double)fpi_len/total_fpi_len,
> +                                 rec_len+fpi_len, 100*(double)(rec_len+fpi_len)/total_len);
> +            }
> +        }
> +    }

Some additional leaking here.

> diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c
> index cbcaaa6..dc27fd1 100644
> --- a/contrib/pg_xlogdump/rmgrdesc.c
> +++ b/contrib/pg_xlogdump/rmgrdesc.c
> @@ -27,8 +27,8 @@
>  #include "storage/standby.h"
>  #include "utils/relmapper.h"
>  
> -#define PG_RMGR(symname,name,redo,desc,startup,cleanup) \
> -    { name, desc, },
> +#define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup) \
> +    { name, desc, identify, },
>  
>  const RmgrDescData RmgrDescTable[RM_MAX_ID + 1] = {
>  #include "access/rmgrlist.h"
> diff --git a/contrib/pg_xlogdump/rmgrdesc.h b/contrib/pg_xlogdump/rmgrdesc.h
> index d964118..da805c5 100644
> --- a/contrib/pg_xlogdump/rmgrdesc.h
> +++ b/contrib/pg_xlogdump/rmgrdesc.h
> @@ -14,6 +14,7 @@ typedef struct RmgrDescData
>  {
>      const char *rm_name;
>      void        (*rm_desc) (StringInfo buf, XLogRecord *record);
> +    const char *(*rm_identify) (uint8 info);
>  } RmgrDescData;

Looks like that should at least partially have been in the other patch?
The old PG_RMGR looks like it wouldn't compile with the rm_identity
argument added?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

04 July 2014, 13:04:20

At 2014-07-04 11:34:21 +0200, andres@2ndquadrant.com wrote:
>
> So we're leaking memory here? For --stats that could well be
> relevant...

Will fix (I think using a static buffer would be OK here).

> I think we're going to have to redefine the
> PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT callsite in configure.in […]

OK, will do.

> It's far from guaranteed that 27 will always suffice. I guess it's ok
> because it doesn't cause bad breakage, but just some misalignment...

Yes, that was my thought too.

I could measure the lengths of things and align columns dynamically, but
I really doubt it's worth the effort in this case.

> Many missing spaces here. […]
> Some additional leaking here.

Will fix both.

> Looks like that should at least partially have been in the other
> patch?

Yes, sorry. Will fix.

(I'll set this back to waiting on author. Thanks for having a look.)

-- Abhijit

Re: pg_xlogdump --stats

From

Andres Freund

Date:

04 July 2014, 13:08:18

On 2014-07-04 15:34:14 +0530, Abhijit Menon-Sen wrote:
> At 2014-07-04 11:34:21 +0200, andres@2ndquadrant.com wrote:
> >
> > So we're leaking memory here? For --stats that could well be
> > relevant...
> 
> Will fix (I think using a static buffer would be OK here).

In that place, yes. There there's several other places where that's
probably isn't going to work. Probably makes sense to check memory usage
after running it over a larger amount of WAL.

> > It's far from guaranteed that 27 will always suffice. I guess it's ok
> > because it doesn't cause bad breakage, but just some misalignment...
> 
> Yes, that was my thought too.
> 
> I could measure the lengths of things and align columns dynamically, but
> I really doubt it's worth the effort in this case.

Nah. Too much work for no gain ;)

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

"gotoschool6g"

Date:

04 July 2014, 13:31:53

> I'm pretty sure that most committers would want to apply the

> rm_identity part in a separate commit first. Could you make it

> two patches, one introducing rm_identity, and another with

> xlogdump using it?

Carp the code:

const char *
clog_identify(uint8 info)
{
switch (info)
{
      case CLOG_ZEROPAGE:
       return "ZEROPAGE";
      case CLOG_TRUNCATE:
       return "TRUNCATE";
       break;
}
return NULL;
}

const char *
clog_identify(uint8 info)
{
        if(info==CLOG_ZEROPAGE)return "ZEROPAGE";
        if(info==CLOG_TRUNCATE)return "TRUNCATE";
        return NULL;
}

is a bit faster than:

const char *
clog_identify(uint8 info)
{
const char *id = NULL;

switch (info)
{
  case CLOG_ZEROPAGE:
   id = "ZEROPAGE";
   break;
  case CLOG_TRUNCATE:
   id = "TRUNCATE";
   break;
}

return id;
}

Re: pg_xlogdump --stats

From

Andres Freund

Date:

04 July 2014, 13:36:26

On 2014-07-04 18:31:34 +0800, gotoschool6g wrote:
> > 
> > I'm pretty sure that most committers would want to apply the 
> > rm_identity part in a separate commit first. Could you make it 
> > two patches, one introducing rm_identity, and another with 
> > xlogdump using it? 
> 
> Carp the code:
> 
> const char *
> clog_identify(uint8 info)
> { 
>  switch (info)
>  {
>       case CLOG_ZEROPAGE:
>        return "ZEROPAGE";
>       case CLOG_TRUNCATE:
>        return "TRUNCATE";
>        break;
>  }
>  return NULL;
> }
> 
> or
> 
> const char *
> clog_identify(uint8 info)
> {
>         if(info==CLOG_ZEROPAGE)return "ZEROPAGE";
>         if(info==CLOG_TRUNCATE)return "TRUNCATE";
>         return NULL;
> }
> 
> is a bit faster than:

Any halfway decent compiler will not use a local variable here. Don't
think that matters much. Also the code isn't a performance bottleneck...

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

04 July 2014, 16:29:17

At 2014-07-04 11:34:21 +0200, andres@2ndquadrant.com wrote:
>
> I think we're going to have to redefine the
> PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT callsite in configure.in to
> define INT64_MODIFIER='"ll/l/I64D"'

I've attached a patch to do this, and also add INT64_MODIFIER to
pg_config.h.in: 0001-modifier.diff

I reran autoconf, and just for convenience I've attached the resulting
changes to configure: 0002-configure.diff

Then there are the rm_identify changes: 0003-rmid.diff

Finally, the xlogdump patch using INT64_MODIFIER: 0004-xlogdump.diff

I can confirm that this series applies in-order to master, and that the
result builds cleanly (including after each patch) on my machine, and
that the resulting pg_xlogdump works as expected.

NOTE: I do not know what to do about pg_config.h.win32. If someone tells
me what to do, I can submit another patch.

> Some additional leaking here.

Two of the extra calls to psprintf in pg_xlogdump happen at most
RM_MAX_ID*16 (i.e. O(record-types) not O(records)) times, and the other
two happen just before exit. It would be easy to use a static buffer and
snprintf, but I don't think it's worth doing in this case.

-- Abhijit, hoping with crossed fingers to not forget attachments now.

Attachment

Re: pg_xlogdump --stats

From

Alvaro Herrera

Date:

04 July 2014, 20:44:01

Andres Freund wrote:
> On 2014-07-04 18:31:34 +0800, gotoschool6g wrote:

> > Carp the code:
> > 
> > const char *
> > clog_identify(uint8 info)
> > { 
> >  switch (info)
> >  {
> >       case CLOG_ZEROPAGE:
> >        return "ZEROPAGE";
> >       case CLOG_TRUNCATE:
> >        return "TRUNCATE";
> >        break;
> >  }
> >  return NULL;
> > }

I agree that performance is secondary here.  The part that I don't quite
like in all these blocks is the fact that they initialize the return
value to NULL beforehand, and there's no 'default' label.  Now, I see
that pg_xlogdump is checking for NULL return, but I'm not sure this is
the best possible API.  Shouldn't we perhaps return a different string
that indicates there is no known description?

Also, are we certain that we're never going to need anything other than
the "info" to identify the record?  In practice we seem to follow that
rule currently, but it's not totally out of the question that someday
the rm_redo function uses more than just "info" to identify the record.
I wonder if it'd be better to pass the whole record instead and have the
rm_identify function pull out the info if it's all it needs, but leave
the possibility open that it could read, say, some header bytes in the
record data.

Also I didn't check how you handle the "init" bit in RM_HEAP and
RM_HEAP2.  Are we just saying that "insert (init)" is a different record
type from "insert"?  Maybe that's okay, but I'm not 100% sure.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

04 July 2014, 21:06:53

At 2014-07-04 13:43:46 -0400, alvherre@2ndquadrant.com wrote:
>
> Now, I see that pg_xlogdump is checking for NULL return, but I'm not
> sure this is the best possible API. 

Note that the patched pg_xlogdump will display "rm_name/0xNN" for any
records that rm_identify returns a NULL for.

Earlier, when there was the possibility that new record types could be
added without updating pg_xlogdump's identification code, this made good
sense. Now one could argue that pg_xlogdump should fully depend on every
record in WAL being properly identified.

If that's the case, rm_identify could return "UNKNOWN", and pg_xlogdump
could use the return value without checking. I considered that, but the
only thing that stopped me was the thought that if a "weird" record DOES
show up in WAL somehow, it'd be pretty handy to (a) know the value of
xl_info, and (b) see if there's more than one kind (per-rmid).

But I don't feel strongly enough about it that I'd object to displaying
just "UNKNOWN".

> I wonder if it'd be better to pass the whole record instead and have
> the rm_identify function pull out the info if it's all it needs, but
> leave the possibility open that it could read, say, some header bytes
> in the record data.

I don't *have* an XLogRecord at the point where I'm calling rm_identify.
I could call rm_identify earlier, and either store the name in the stats
structure, or hash the name and use the hash value to store that record
type's statistics.

We don't even have any other potential callers for rm_identify. Adding
it and making it significantly more difficult to use for the only code
that actually needs it… no, I pretty much hate that idea.

Personally, I think it's fine to keep rm_identify the way it is. It's
hardly very difficult code, and if the assumption that records can be
identified by xl_info ever becomes invalid, this isn't the only code
that will need to be changed either.

(Otherwise, I'd actually prefer to keep all the identification code in
pg_xlogdump, i.e. the pre-rm_identify version of my patch. At least that
would make it possible to maintain a version that could be built against
9.3, the way Marti did.)

> Also I didn't check how you handle the "init" bit in RM_HEAP and
> RM_HEAP2.  Are we just saying that "insert (init)" is a different
> record type from "insert"?

Yes, that's what the patch does currently. "X" vs. "X+INIT".

-- Abhijit

Re: pg_xlogdump --stats

From

Andres Freund

Date:

07 July 2014, 12:29:17

On 2014-07-04 18:59:07 +0530, Abhijit Menon-Sen wrote:
> At 2014-07-04 11:34:21 +0200, andres@2ndquadrant.com wrote:
> >
> > I think we're going to have to redefine the
> > PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT callsite in configure.in to
> > define INT64_MODIFIER='"ll/l/I64D"'
> 
> I've attached a patch to do this, and also add INT64_MODIFIER to
> pg_config.h.in: 0001-modifier.diff
> 
> I reran autoconf, and just for convenience I've attached the resulting
> changes to configure: 0002-configure.diff
> 
> Then there are the rm_identify changes: 0003-rmid.diff
> 
> Finally, the xlogdump patch using INT64_MODIFIER: 0004-xlogdump.diff
> 
> I can confirm that this series applies in-order to master, and that the
> result builds cleanly (including after each patch) on my machine, and
> that the resulting pg_xlogdump works as expected.

> Two of the extra calls to psprintf in pg_xlogdump happen at most
> RM_MAX_ID*16 (i.e. O(record-types) not O(records)) times, and the other
> two happen just before exit. It would be easy to use a static buffer and
> snprintf, but I don't think it's worth doing in this case.

Agreed.

> diff --git a/config/c-library.m4 b/config/c-library.m4
> index 8f45010..4821a61 100644
> --- a/config/c-library.m4
> +++ b/config/c-library.m4
> @@ -221,22 +221,22 @@ HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals
>  AC_SUBST(HAVE_POSIX_SIGNALS)])# PGAC_FUNC_POSIX_SIGNALS
>  
>  
> -# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT
> +# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER
>  # ---------------------------------------
> -# Determine which format snprintf uses for long long int.  We handle
> -# %lld, %qd, %I64d.  The result is in shell variable
> -# LONG_LONG_INT_FORMAT.
> +# Determine which length modifier snprintf uses for long long int.  We
> +# handle ll, q, and I64.  The result is in shell variable
> +# LONG_LONG_INT_MODIFIER.
>  #
>  # MinGW uses '%I64d', though gcc throws an warning with -Wall,
>  # while '%lld' doesn't generate a warning, but doesn't work.
>  #
> -AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT],
> -[AC_MSG_CHECKING([snprintf format for long long int])
> -AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_format,
> -[for pgac_format in '%lld' '%qd' '%I64d'; do
> +AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER],
> +[AC_MSG_CHECKING([snprintf length modifier for long long int])
> +AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_modifier,
> +[for pgac_modifier in 'll' 'q' 'I64'; do
>  AC_TRY_RUN([#include <stdio.h>
>  typedef long long int ac_int64;
> -#define INT64_FORMAT "$pgac_format"
> +#define INT64_FORMAT "%${pgac_modifier}d"

I'd rather not define INT64_FORMAT here.

> +INT64_FORMAT="\"%${LONG_LONG_INT_MODIFIER}d\""
> +UINT64_FORMAT="\"%${LONG_LONG_INT_MODIFIER}u\""
> +INT64_MODIFIER="\"$LONG_LONG_INT_MODIFIER\""
> +

>  AC_DEFINE_UNQUOTED(INT64_FORMAT, $INT64_FORMAT,
>                     [Define to the appropriate snprintf format for 64-bit ints.])
>  
>  AC_DEFINE_UNQUOTED(UINT64_FORMAT, $UINT64_FORMAT,
>                     [Define to the appropriate snprintf format for unsigned 64-bit ints.])
>  
> +AC_DEFINE_UNQUOTED(INT64_MODIFIER, $INT64_MODIFIER,
> +                   [Define to the appropriate snprintf length modifier for 64-bit ints.])
> +

I'd suggest only defining INT64_MODIFIER here and build INT64_FORMAT,
UINT64_FORMAT ontop, in c.h.

> NOTE: I do not know what to do about pg_config.h.win32. If someone tells
> me what to do, I can submit another patch.

Which would also take care of pg_config.h.win32 - just define
INT64_MODIFIER in there instead of INT64_FORMAT/UINT64_FORMAT and you
should be good.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

07 July 2014, 22:32:57

At 2014-07-07 09:48:44 +0200, andres@2ndquadrant.com wrote:
>
> I'd suggest only defining INT64_MODIFIER here and build INT64_FORMAT,
> UINT64_FORMAT ontop, in c.h.

Oh, I see. I'm sorry, I misread your earlier suggestion. Regenerated
patches attached. Is this what you had in mind?

-- Abhijit

Attachment

Re: pg_xlogdump --stats

From

Heikki Linnakangas

Date:

21 August 2014, 10:07:22

On 07/07/2014 10:32 PM, Abhijit Menon-Sen wrote:
> At 2014-07-07 09:48:44 +0200, andres@2ndquadrant.com wrote:
>>
>> I'd suggest only defining INT64_MODIFIER here and build INT64_FORMAT,
>> UINT64_FORMAT ontop, in c.h.
>
> Oh, I see. I'm sorry, I misread your earlier suggestion. Regenerated
> patches attached. Is this what you had in mind?

Committed the patch to add INT64_MODIFIER, with minor fixes.

The new rm_identify method needs to be documented. Not sure where; 
surprisingly I can't find any documentation on the existing methods 
either. Perhaps add comments to the RmgrData struct, explaining all of 
the methods. In particular, should give guidelines on what output 
belongs in rm_desc and what in rm_identify.

I think the names that rm_identify returns should match those that the 
rm_desc functions print. As the patch stands, for example when heap_desc 
prints out something like:

hot_update(init): xmax 123; new tid 1/2 xmax 456

The corresponding rm_identify output is:

HOT_UPDATE+INIT

It would be better to spell them the same, so that when you look at the 
summary stats and the raw pg_xlogdump output, you can tell which records 
contributed to which summary line.

- Heikki

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

11 September 2014, 11:43:25

At 2014-08-21 10:06:39 +0300, hlinnakangas@vmware.com wrote:
>
> Committed the patch to add INT64_MODIFIER, with minor fixes.

Thank you.

> The new rm_identify method needs to be documented. […]
> Perhaps add comments to the RmgrData struct, explaining
> all of the methods.

OK, I'll submit a patch to add these comments.

> I think the names that rm_identify returns should match those that the
> rm_desc functions print.

I had originally started off trying to keep the output in sync, but it
doesn't work very well. There are rm_desc functions that print things
like "truncate before" and "Create posting tree", and many decisions
are quite arbitrary ("freeze_page", "cleanup info", "multi-insert").

I think it's better the (grep-friendly) way it is. If anything, perhaps
rm_desc should output "${rm_identify}[: optional explanation]". That
would also make it very clear what should go in rm_identify and what
should go in rm_desc.

Thoughts?

> The corresponding rm_identify output is:
> 
> HOT_UPDATE+INIT

The +INIT is admittedly a special case, and I would have no objection to
writing that as (INIT) or something instead.

-- Abhijit

Re: pg_xlogdump --stats

From

Heikki Linnakangas

Date:

11 September 2014, 12:15:03

On 09/11/2014 11:43 AM, Abhijit Menon-Sen wrote:
> At 2014-08-21 10:06:39 +0300, hlinnakangas@vmware.com wrote:
>> I think the names that rm_identify returns should match those that the
>> rm_desc functions print.
>
> I had originally started off trying to keep the output in sync, but it
> doesn't work very well. There are rm_desc functions that print things
> like "truncate before" and "Create posting tree", and many decisions
> are quite arbitrary ("freeze_page", "cleanup info", "multi-insert").

It would be good to clean up those messages to be more sensible and 
consistent.

> I think it's better the (grep-friendly) way it is. If anything, perhaps
> rm_desc should output "${rm_identify}[: optional explanation]". That
> would also make it very clear what should go in rm_identify and what
> should go in rm_desc.

Yeah, that makes sense too.

>> The corresponding rm_identify output is:
>>
>> HOT_UPDATE+INIT
>
> The +INIT is admittedly a special case, and I would have no objection to
> writing that as (INIT) or something instead.

I don't care much, as long as it's consistent the rm_desc output.

It's quite arbitrary that the INIT cases are considered different record 
types, with their own "identity" string, instead of being just a flag in 
the same record type. It's an implementation detail that that flag is 
stored in the xl_info, and that implementation detail is now exposed in 
the stats pg_xlogdump --stats output. There are similar cases in B-tree 
splits, for example, where a split where the new tuple goes to the left 
half is considered a different record type than one where it goes ot the 
right half. It might be interesting to count them separately in the 
stats, but then again it might just be confusing. The xl_info flags and 
the rm_desc output were never intended to be a useful categorization for 
statistics purposes, so it's not surprising that it's not too well 
suited for that. Nevertheless, the "pg_xlogdump --stats" is useful as it 
is, if you know the limitations.

- Heikki

Re: pg_xlogdump --stats

From

Andres Freund

Date:

11 September 2014, 12:23:09

On 2014-09-11 12:14:42 +0300, Heikki Linnakangas wrote:
> On 09/11/2014 11:43 AM, Abhijit Menon-Sen wrote:
> >At 2014-08-21 10:06:39 +0300, hlinnakangas@vmware.com wrote:
> >>I think the names that rm_identify returns should match those that the
> >>rm_desc functions print.
> >
> >I had originally started off trying to keep the output in sync, but it
> >doesn't work very well. There are rm_desc functions that print things
> >like "truncate before" and "Create posting tree", and many decisions
> >are quite arbitrary ("freeze_page", "cleanup info", "multi-insert").
> 
> It would be good to clean up those messages to be more sensible and
> consistent.

> >I think it's better the (grep-friendly) way it is. If anything, perhaps
> >rm_desc should output "${rm_identify}[: optional explanation]". That
> >would also make it very clear what should go in rm_identify and what
> >should go in rm_desc.

> Yeah, that makes sense too.

I'm not sure I agree here. From a theoretical POV sure, but wouldn't
that lead to even longer lines for xlogdump and other error messages for
some records?
We probably need to see how it plays out.

> >>The corresponding rm_identify output is:
> >>
> >>HOT_UPDATE+INIT
> >
> >The +INIT is admittedly a special case, and I would have no objection to
> >writing that as (INIT) or something instead.
> 
> I don't care much, as long as it's consistent the rm_desc output.
> 
> It's quite arbitrary that the INIT cases are considered different record
> types, with their own "identity" string, instead of being just a flag in the
> same record type. It's an implementation detail that that flag is stored in
> the xl_info, and that implementation detail is now exposed in the stats
> pg_xlogdump --stats output.

It's also actually quite good from a display purpose. +INIT will have
quite different wal logging characteristics :)

> There are similar cases in B-tree splits, for
> example, where a split where the new tuple goes to the left half is
> considered a different record type than one where it goes ot the right half.
> It might be interesting to count them separately in the stats, but then
> again it might just be confusing. The xl_info flags and the rm_desc output
> were never intended to be a useful categorization for statistics purposes,
> so it's not surprising that it's not too well suited for that. Nevertheless,
> the "pg_xlogdump --stats" is useful as it is, if you know the limitations.

I agree it's not perfect, but I think it's a huge step forward. We very
well might want to improve upon the stats granularity once in, but I
think it already gives a fair amount of direction where to look.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

Heikki Linnakangas

Date:

11 September 2014, 12:32:48

On 09/11/2014 12:22 PM, Andres Freund wrote:
> On 2014-09-11 12:14:42 +0300, Heikki Linnakangas wrote:
>> On 09/11/2014 11:43 AM, Abhijit Menon-Sen wrote:
>>> At 2014-08-21 10:06:39 +0300, hlinnakangas@vmware.com wrote:
>>>> I think the names that rm_identify returns should match those that the
>>>> rm_desc functions print.
>>>
>>> I had originally started off trying to keep the output in sync, but it
>>> doesn't work very well. There are rm_desc functions that print things
>>> like "truncate before" and "Create posting tree", and many decisions
>>> are quite arbitrary ("freeze_page", "cleanup info", "multi-insert").
>>
>> It would be good to clean up those messages to be more sensible and
>> consistent.
>
>>> I think it's better the (grep-friendly) way it is. If anything, perhaps
>>> rm_desc should output "${rm_identify}[: optional explanation]". That
>>> would also make it very clear what should go in rm_identify and what
>>> should go in rm_desc.
>
>> Yeah, that makes sense too.
>
> I'm not sure I agree here. From a theoretical POV sure, but wouldn't
> that lead to even longer lines for xlogdump and other error messages for
> some records?

No. All the rm_desc functions already follow that convention, and print 
"foo: details", where "foo" identifies the record type based on xl_info. 
This proposal would just make that convention more official, and 
stipulate that the "foo" part should match what the new rm_identify() 
function returns. (Or perhaps even better, define it so that rm_desc 
prints only the "details" part, and rm_identify() the "foo" part, and 
have the callers put the two strings together if they want)

>> There are similar cases in B-tree splits, for
>> example, where a split where the new tuple goes to the left half is
>> considered a different record type than one where it goes ot the right half.
>> It might be interesting to count them separately in the stats, but then
>> again it might just be confusing. The xl_info flags and the rm_desc output
>> were never intended to be a useful categorization for statistics purposes,
>> so it's not surprising that it's not too well suited for that. Nevertheless,
>> the "pg_xlogdump --stats" is useful as it is, if you know the limitations.
>
> I agree it's not perfect, but I think it's a huge step forward. We very
> well might want to improve upon the stats granularity once in, but I
> think it already gives a fair amount of direction where to look.

Agreed. That's what I was also trying to say.

- Heikki

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

19 September 2014, 10:57:31

Hi.

I've attached two patches here:

0001-Make-pg_xlogdump-record-stats-display-summary-statis.patch is my
earlier patch to pg_xlogdump, rebased to master. It introduces the new
rm_identify callback, but doesn't touch rm_desc. Other than rebasing to
master after the INT64_FORMAT changes, I haven't changed anything.

0002-Clearly-separate-rm_identify-and-rm_desc-outputs.patch then makes
the change you (Heikki) wanted to see: rm_identify returns a name, and
rm_desc can be used to obtain optional additional detail, and xlog.c
just glues the two together in a new xlog_outdesc function. rm_desc
is changed largely only to (a) remove the "prefix: ", and (b) not
append UNKNOWN for unidentified records.

(I've done a little cleaning-up in the second patch, e.g. nbtdesc.c had
a bunch of repeated cases that I've unified, which seems a pretty nice
readability improvement overall.)

Good enough?

-- Abhijit

Attachment

Re: pg_xlogdump --stats

From

Andres Freund

Date:

19 September 2014, 11:44:57

On 2014-09-19 13:24:11 +0530, Abhijit Menon-Sen wrote:
> diff --git a/configure.in b/configure.in
> index 1392277..c3c458c 100644
> --- a/configure.in
> +++ b/configure.in
> @@ -1637,7 +1637,7 @@ fi
>  # If we found "long int" is 64 bits, assume snprintf handles it.  If
>  # we found we need to use "long long int", better check.  We cope with
>  # snprintfs that use %lld, %qd, or %I64d as the format.  If none of these
> -# work, fall back to our own snprintf emulation (which we know uses %lld).
> +# works, fall back to our own snprintf emulation (which we know uses %lld).

spurious independent change? Also I actually think the original version
is correct?


> +typedef struct XLogDumpStats
> +{
> +    uint64        count;
> +    Stats        rmgr_stats[RM_NEXT_ID];
> +    Stats        record_stats[RM_NEXT_ID][16];
> +} XLogDumpStats;

I dislike the literal 16 here and someplace later. A define for the max
number of records would make it clearer.

>  /*
> + * Store per-rmgr and per-record statistics for a given record.
> + */
> +static void
> +XLogDumpCountRecord(XLogDumpConfig *config, XLogDumpStats *stats, XLogRecPtr ReadRecPtr, XLogRecord *record)
> +{
> +    RmgrId        rmid;
> +    uint8          recid;
> +
> +    if (config->filter_by_rmgr != -1 &&
> +        config->filter_by_rmgr != record->xl_rmid)
> +        return;
> +
> +    if (config->filter_by_xid_enabled &&
> +        config->filter_by_xid != record->xl_xid)
> +        return;

Perhaps we should move these kind of checks outside? So
XLogDumpDisplayRecord and this don't have to repeat them. I sure hope
we'll get some more. I e.g. really, really would like to have a
relfilenode check once Heikki's changes to make that possible are in.

> +    stats->count++;
> +
> +    /* Update per-rmgr statistics */
> +
> +    rmid = record->xl_rmid;
> +
> +    stats->rmgr_stats[rmid].count++;
> +    stats->rmgr_stats[rmid].rec_len +=
> +        record->xl_len + SizeOfXLogRecord;
> +    stats->rmgr_stats[rmid].fpi_len +=
> +        record->xl_tot_len - (record->xl_len + SizeOfXLogRecord);

a) Whoever introduced the notion of rec_len vs tot_len in regards to  including/excluding SizeOfXLogRecord ...

b) I'm not against it, but I wonder if the best way to add the  SizeOfXLogRecord to the record size. It's just as much
partof the  FPI. And this means that the record length will be > 0 even if all  the record data has been removed due to
theFPI.
 

>  static void
>  usage(void)
>  {
> @@ -401,6 +581,8 @@ usage(void)
>      printf("                         (default: 1 or the value used in STARTSEG)\n");
>      printf("  -V, --version          output version information, then exit\n");
>      printf("  -x, --xid=XID          only show records with TransactionId XID\n");
> +    printf("  -z, --stats            show per-rmgr statistics\n");
> +    printf("  -Z, --record-stats     show per-record statistics\n");
>      printf("  -?, --help             show this help, then exit\n");
>  }

What was the reason you moved away from --stats=record/rmgr? I think we
possibly will add further ones, so that seems more
extensible?

> diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c
> index cbcaaa6..dc27fd1 100644
> --- a/contrib/pg_xlogdump/rmgrdesc.c
> +++ b/contrib/pg_xlogdump/rmgrdesc.c

It's trivial to separate in this case, but I'd much rather have patches
like this rm_identity stuff split up in the future.


> diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
> index 24b6f92..91a8e72 100644
> --- a/src/backend/access/rmgrdesc/heapdesc.c
> +++ b/src/backend/access/rmgrdesc/heapdesc.c
> @@ -193,3 +193,86 @@ heap2_desc(StringInfo buf, XLogRecord *record)
>      else
>          appendStringInfoString(buf, "UNKNOWN");
>  }
> +
> +static const char *
> +append_init(const char *str)
> +{
> +    static char x[32];
> +
> +    strcpy(x, str);
> +    strcat(x, "+INIT");
> +
> +    return x;
> +}
> +
> +const char *
> +heap_identify(uint8 info)
> +{
> +    const char *id = NULL;
> +
> +    switch (info & XLOG_HEAP_OPMASK)
> +    {
> +        case XLOG_HEAP_INSERT:
> +            id = "INSERT";
> +            break;
> +        case XLOG_HEAP_DELETE:
> +            id = "DELETE";
> +            break;
> +        case XLOG_HEAP_UPDATE:
> +            id = "UPDATE";
> +            break;
> +        case XLOG_HEAP_HOT_UPDATE:
> +            id = "HOT_UPDATE";
> +            break;
> +        case XLOG_HEAP_LOCK:
> +            id = "LOCK";
> +            break;
> +        case XLOG_HEAP_INPLACE:
> +            id = "INPLACE";
> +            break;
> +    }
> +
> +    if (info & XLOG_HEAP_INIT_PAGE)
> +        id = append_init(id);
> +
> +    return id;
> +}

Hm. I'm a bit doubtful about the static buffer used in
append_init(). That means the returned value from heap_identity() is
only valid until the next call. That at the very least needs to be
written down explicitly somewhere.

> diff --git a/src/include/c.h b/src/include/c.h
> index 2ceaaf6..cf3cbd1 100644
> --- a/src/include/c.h
> +++ b/src/include/c.h
> @@ -867,6 +867,9 @@ typedef NameData *Name;
>   * ----------------------------------------------------------------
>   */
>  
> +#define INT64_FORMAT "%" INT64_MODIFIER "d"
> +#define UINT64_FORMAT "%" INT64_MODIFIER "u"
> +

That's already in there afaics:

/* snprintf format strings to use for 64-bit integers */
#define INT64_FORMAT "%" INT64_MODIFIER "d"
#define UINT64_FORMAT "%" INT64_MODIFIER "u"


> +/*
> + * Returns a string describing an XLogRecord, consisting of its identity
> + * optionally followed by a colon, a space, and a further description.
> + */
> +static void
> +xlog_outdesc(StringInfo buf, RmgrId rmid, XLogRecord *record)
> +{
> +    const char *id;
> +
> +    id = RmgrTable[rmid].rm_identify(record->xl_info);
> +    if (id == NULL)
> +        appendStringInfo(buf, "%X", record->xl_info);

Given that you've removed the UNKNOWNs from the rm_descs, this really
should add it here.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

19 September 2014, 11:48:34

At 2014-09-19 13:24:11 +0530, ams@2ndQuadrant.com wrote:
>
> Good enough?

Not quite. I've attached a small additional patch that shifts the
responsibility of adding rm_name to the output from xlog_outrec to
xlog_outdesc. Now we get WAL_DEBUG output like this:

LOG:  INSERT @ 0/16C51D0: prev 0/16C5160; xid 692; len 31: Heap/INSERT: rel 1663/16384/16385; tid 0/5

…which is consistent with pg_xlogdump --stats output too.

-- Abhijit

Attachment

move-rm-name.diff

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

19 September 2014, 12:08:37

At 2014-09-19 10:44:48 +0200, andres@2ndquadrant.com wrote:
>
> >  # snprintfs that use %lld, %qd, or %I64d as the format.  If none of these
> > -# work, fall back to our own snprintf emulation (which we know uses %lld).
> > +# works, fall back to our own snprintf emulation (which we know uses %lld).
> 
> spurious independent change?

It was part of the original patch, but I guess Heikki didn't commit it,
so it was left over in the rebase.

> Also I actually think the original version is correct?

It is not. I suspect it will begin to sound wrong to you if you replace
"none" with "not one" in the sentence. But I don't care enough to argue
about it. It's a very common error.

> > +    Stats        record_stats[RM_NEXT_ID][16];
> > +} XLogDumpStats;
> 
> I dislike the literal 16 here and someplace later. A define for the max
> number of records would make it clearer.

OK, will change.

> Perhaps we should move these kind of checks outside?

OK, will change.

> a) Whoever introduced the notion of rec_len vs tot_len in regards to
>    including/excluding SizeOfXLogRecord ...

(It wasn't me, honest!)

> b) I'm not against it, but I wonder if the best way to add the
>    SizeOfXLogRecord to the record size. It's just as much part of the
>    FPI. And this means that the record length will be > 0 even if all
>    the record data has been removed due to the FPI.

I'm not sure I understand what you are proposing here.

> What was the reason you moved away from --stats=record/rmgr? I think
> we possibly will add further ones, so that seems more extensible?

It was because I wanted --stats to default to "=rmgr", so I tried to
make the argument optional, but getopt in Windows didn't like that.
Here's an excerpt from the earlier discussion:
   > 3. Some compilation error in windows   > .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2065:
'optional_argument': undeclared identifier   > .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2099: initializer is
nota constant   >   > optional_argument should be added to getopt_long.h file for windows.
 
   Hmm. I have no idea what to do about this. I did notice when I wrote the   code that nothing else used
optional_argument,but I didn't realise that   it wouldn't work on Windows.
 
   It may be that the best thing to do would be to avoid using   optional_argument altogether, and have separate
--statsand   --stats-per-record options. Thoughts?
 

I have no objection to doing it differently if someone tells me how to
make Windows happy (preferably without making me unhappy).

> It's trivial to separate in this case, but I'd much rather have
> patches like this rm_identity stuff split up in the future.

Sorry. I'd split it up that way in the past, but forgot to do it again
in this round. Will do when I resend with the changes above.

> That means the returned value from heap_identity() is only valid until
> the next call. That at the very least needs to be written down
> explicitly somewhere.

Where? In the comment in xlog_internal.h, perhaps?

> That's already in there afaics:

Will fix (rebase cruft).

> Given that you've removed the UNKNOWNs from the rm_descs, this really
> should add it here.

You would prefer to see HEAP/UNKNOWN rather than HEAP/32 for an
unidentifiable xl_info?

-- Abhijit

Re: pg_xlogdump --stats

From

Andres Freund

Date:

19 September 2014, 12:17:09

Hi,

On 2014-09-19 14:38:29 +0530, Abhijit Menon-Sen wrote:
> > b) I'm not against it, but I wonder if the best way to add the
> >    SizeOfXLogRecord to the record size. It's just as much part of the
> >    FPI. And this means that the record length will be > 0 even if all
> >    the record data has been removed due to the FPI.
> 
> I'm not sure I understand what you are proposing here.

I'm not really proposing anything. I'm just stating that it's
notationally not clear and might be improved. Doesn't need to be now.

> > What was the reason you moved away from --stats=record/rmgr? I think
> > we possibly will add further ones, so that seems more extensible?
> 
> It was because I wanted --stats to default to "=rmgr", so I tried to
> make the argument optional, but getopt in Windows didn't like that.
> Here's an excerpt from the earlier discussion:
> 
>     > 3. Some compilation error in windows
>     > .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2065: 'optional_argument' : undeclared identifier
>     > .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2099: initializer is not a constant
>     >
>     > optional_argument should be added to getopt_long.h file for windows.
> 
>     Hmm. I have no idea what to do about this. I did notice when I wrote the
>     code that nothing else used optional_argument, but I didn't realise that
>     it wouldn't work on Windows.
> 
>     It may be that the best thing to do would be to avoid using
>     optional_argument altogether, and have separate --stats and
>     --stats-per-record options. Thoughts?

Oh, I've since implemented optional arguments for windows/replacement
getopt_long. It's used by psql since a week or so ago.

> I have no objection to doing it differently if someone tells me how to
> make Windows happy (preferably without making me unhappy).

[x] Done

> > Given that you've removed the UNKNOWNs from the rm_descs, this really
> > should add it here.
> 
> You would prefer to see HEAP/UNKNOWN rather than HEAP/32 for an
> unidentifiable xl_info?

UNKNOWN (32)? It should visually stand out more than just the
number. Somebody borked something if that happens.

Greetings,

Andres Freund

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

19 September 2014, 13:09:50

I've attached the revised patch, split up differently:

1. Introduce rm_identify, change rm_desc, glue the two together in
   xlog.c
2. Introduce pg_xlogdump --stats[=record].

The requested changes (16, filter, z:, UNKNOWN) are included. The
grammar nitpicking and rebase cruft is^Ware^Wain't included.

I ran Postgres with WAL_DEBUG for a while, and I ran pg_xlogdump with
--stats (and --stats=rmgr) and --stats=record with and without a few
different variants of "-r heap". Everything looked OK to me.

I hope I didn't miss anything this time.

-- Abhijit

Attachment

Re: pg_xlogdump --stats

From

Abhijit Menon-Sen

Date:

19 September 2014, 13:14:31

At 2014-09-19 15:39:37 +0530, ams@2ndQuadrant.com wrote:
>
> I hope I didn't miss anything this time.

But of course I did. The attached fixup makes the output of pg_xlogdump
match that of xlog_outdesc for unidentifiable records (UNKNOWN (%x)).
Sorry for the inconvenience.

-- Abhijit

Attachment

rmid-unknown-fixup.diff

Re: pg_xlogdump --stats

From

Andres Freund

Date:

19 September 2014, 17:54:23

Hi,

On 2014-09-19 15:39:37 +0530, Abhijit Menon-Sen wrote:
> I've attached the revised patch, split up differently:
> 
> 1. Introduce rm_identify, change rm_desc, glue the two together in
>    xlog.c
> 2. Introduce pg_xlogdump --stats[=record].

I've pushed these after some fixing up.

As we previously discussed, I think we might want to adjust the stats
further. But this is already really useful.

Thanks!

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: pg_xlogdump --stats

From

Alvaro Herrera

Date:

20 September 2014, 01:34:16

Andres Freund wrote:
> Hi,
> 
> On 2014-09-19 15:39:37 +0530, Abhijit Menon-Sen wrote:
> > I've attached the revised patch, split up differently:
> > 
> > 1. Introduce rm_identify, change rm_desc, glue the two together in
> >    xlog.c
> > 2. Introduce pg_xlogdump --stats[=record].
> 
> I've pushed these after some fixing up.

Hm, did you keep the static thingy you mentioned upthread (append_init
which is duped unless I read wrong)?  There's quite some angst due to
pg_dump's fmtId() -- I don't think we should be introducing more such
things ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: pg_xlogdump --stats

From

Andres Freund

Date:

20 September 2014, 01:36:53

On 2014-09-19 19:34:08 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > Hi,
> > 
> > On 2014-09-19 15:39:37 +0530, Abhijit Menon-Sen wrote:
> > > I've attached the revised patch, split up differently:
> > > 
> > > 1. Introduce rm_identify, change rm_desc, glue the two together in
> > >    xlog.c
> > > 2. Introduce pg_xlogdump --stats[=record].
> > 
> > I've pushed these after some fixing up.
> 
> Hm, did you keep the static thingy you mentioned upthread (append_init
> which is duped unless I read wrong)?

Yes, I did.

>  There's quite some angst due to pg_dump's fmtId() -- I don't think we
> should be introducing more such things ...

I don't think these two are comparable. I don't really see many more
users of this springing up and printing the identity of two different
records in the same printf doesn't seem likely either.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services