Thread: [PATCH] Support for pg_stat_archiver view
Hello, please find attached the patch that adds basic support for the pg_stat_archiver system view, which allows users that have continuous archiving procedures in place to keep track of some important metrics and information. Currently, pg_stat_archiver displays: * archived_wals: number of successfully archived WAL files since start (or the last reset) * last_archived_wal: last successfully archived WAL file * last_archived_wal_time: timestamp of the latest successful WAL archival * stats_reset: time of last stats reset This is an example of output: postgres=# select * from pg_stat_archiver ; -[ RECORD 1 ]----------+------------------------------ archived_wals | 1 last_archived_wal | 000000010000000000000001 last_archived_wal_time | 2014-01-04 01:01:08.858648+01 stats_reset | 2014-01-04 00:59:25.895034+01 Similarly to pg_stat_bgwriter, it is possible to reset statistics just for this context, calling the pg_stat_reset_shared('archiver') function. The patch is here for discussion and has been prepared against HEAD. It includes also changes in the documentation and the rules.out test. I plan to add further information to the pg_stat_archiver view, including the number of failed attempts of archival and the WAL and timestamp of the latest failure. However, before proceeding, I'd like to get some feedback on this small patch as well as advice on possible regression tests to be added. Thank you. Cheers, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
On Sat, Jan 4, 2014 at 1:33 AM, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote:
Hello,
please find attached the patch that adds basic support for the
pg_stat_archiver system view, which allows users that have continuous
archiving procedures in place to keep track of some important metrics
and information.
Currently, pg_stat_archiver displays:
* archived_wals: number of successfully archived WAL files since start
(or the last reset)
* last_archived_wal: last successfully archived WAL file
* last_archived_wal_time: timestamp of the latest successful WAL archival
* stats_reset: time of last stats reset
This is an example of output:
postgres=# select * from pg_stat_archiver ;
-[ RECORD 1 ]----------+------------------------------
archived_wals | 1
last_archived_wal | 000000010000000000000001
last_archived_wal_time | 2014-01-04 01:01:08.858648+01
stats_reset | 2014-01-04 00:59:25.895034+01
Similarly to pg_stat_bgwriter, it is possible to reset statistics just
for this context, calling the pg_stat_reset_shared('archiver') function.
The patch is here for discussion and has been prepared against HEAD.
It includes also changes in the documentation and the rules.out test.
I plan to add further information to the pg_stat_archiver view,
including the number of failed attempts of archival and the WAL and
timestamp of the latest failure. However, before proceeding, I'd like to
get some feedback on this small patch as well as advice on possible
regression tests to be added.
My first reaction was that exactly those two things were missing. And then I read your whole email :)
With those two, I think it would make much sense to have a view like this.
I'd suggest making the view on top of an SRF like pg_stat_replication and pg_stat_activity (for example), instead of a whole lot of separate function calls like the older stats views.
in pgarch_ArchiveDone() you seem to be increasing the m_archived_vals value for each call and then sending it off. And then you add that number in the stats collector. Isn't that going to add the wrong number in the end - after a while, the archiver is going to send "add 100" when it's just sent one file? ISTM that pgstat_recv_archiver should just do ++ on the value?
Oh, and you need to change the format id number of the stats file.
There's a quick review you for ;) I think it's definitely worthwhile with those things fixed (and a proper review, that was just a quick one-over)
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Hi Magnus, Il 04/01/14 13:25, Magnus Hagander ha scritto: > My first reaction was that exactly those two things were missing. And > then I read your whole email :) :) > With those two, I think it would make much sense to have a view like > this. Ok, I will prepare version 2 with those. > I'd suggest making the view on top of an SRF like pg_stat_replication > and pg_stat_activity (for example), instead of a whole lot of separate > function calls like the older stats views. Ok, good idea. > in pgarch_ArchiveDone() you seem to be increasing the m_archived_vals > value for each call and then sending it off. And then you add that > number in the stats collector. Isn't that going to add the wrong > number in the end - after a while, the archiver is going to send "add > 100" when it's just sent one file? ISTM that pgstat_recv_archiver > should just do ++ on the value? You are right. The purpose was to set it to 1 in ArchiveDone (I might have missed that change), so that I can manage the failed counters in the same way. I will fix this in version 2. > Oh, and you need to change the format id number of the stats file. I have not found any instruction on how to set it. I assume you are talking about this: PGSTAT_FILE_FORMAT_ID 0x01A5BC9B Any suggestion is welcome. > There's a quick review you for ;) I think it's definitely worthwhile > with those things fixed (and a proper review, that was just a quick > one-over) Thanks for that. It already means a lot if you agree too it is worth it. Ciao, Gabriele -- Gabriele Bartolini - 2ndQuadrant ItaliaPostgreSQL Training, Services and Supportgabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
On Sat, Jan 4, 2014 at 2:01 PM, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote:
Il 04/01/14 13:25, Magnus Hagander ha scritto:> With those two, I think it would make much sense to have a view likeOk, I will prepare version 2 with those.
> this.
> Oh, and you need to change the format id number of the stats file.I have not found any instruction on how to set it. I assume you are
talking about this:
PGSTAT_FILE_FORMAT_ID 0x01A5BC9B
Any suggestion is welcome.
Yes, that's what I'm talking about. And just increment it by 1.
Not sure where the original value came from, but that's what people have been doing recently.
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Il 05/01/14 13:52, Magnus Hagander ha scritto: > Yes, that's what I'm talking about. And just increment it by 1. Done. I am attaching version 2 of the patch, which now implements only one function (pg_stat_get_archiver()) and adds: * failed attempts * WAL of the last failed attempt * time of the last failed attempt Thanks for your inputs. Ciao, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
Enviado via iPhone > Em 05/01/2014, às 16:27, Gabriele Bartolini <gabriele.bartolini@2ndQuadrant.it> escreveu: > > Il 05/01/14 13:52, Magnus Hagander ha scritto: >> Yes, that's what I'm talking about. And just increment it by 1. > Done. I am attaching version 2 of the patch, which now implements only > one function (pg_stat_get_archiver()) and adds: > > * failed attempts > * WAL of the last failed attempt > * time of the last failed attempt Hi, I don't see your code yet, but I would like to know if is possible to implement this view as an extension. Regards, Fabrízio Mello
Hi Fabrizio, Il 05/01/14 20:46, Fabrizio Mello ha scritto: > I don't see your code yet, but I would like to know if is possible to > implement this view as an extension. I wanted to do it as an extension - so that I could backport that to previous versions of Postgres. I do not think it is a possibility, given that the client code that is aware of the events lies in pgarch.c. Ciao, Gabriele -- Gabriele Bartolini - 2ndQuadrant ItaliaPostgreSQL Training, Services and Supportgabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
On Mon, Jan 6, 2014 at 5:37 PM, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote: > Hi Fabrizio, > > Il 05/01/14 20:46, Fabrizio Mello ha scritto: >> I don't see your code yet, but I would like to know if is possible to >> implement this view as an extension. > I wanted to do it as an extension - so that I could backport that to > previous versions of Postgres. > > I do not think it is a possibility, given that the client code that is > aware of the events lies in pgarch.c. I don't see particularly any point in doing that as an extension as you want to log this statistical information once archive has either failed or passed. I just had a quick look at v2, and it looks that the patch is in good shape. Sorry to be picky, but I am not sure that using the character m_type is adapted when delivering messages to pgstat facility. You might want to use a boolean instead... MAX_XFN_CHARS is not adapted as well IMO: you should remove it and use instead MAXFNAMELEN of xlog_internal.h, where all the file names related to WAL files, including history and backup files, are generated. Then, the patch looks to be working as expected, here are some results with a short test: =# \x Expanded display (expanded) is on. =# select * from pg_stat_get_archiver(); -[ RECORD 1 ]----------+------------------------------ archived_wals | 6 last_archived_wal | 000000010000000000000005 last_archived_wal_time | 2014-01-07 17:27:34.752903+09 failed_attempts | 12 last_failed_wal | 000000010000000000000006 last_failed_wal_time | 2014-01-07 17:31:18.409528+09 stats_reset (1 row) =# select * from pg_stat_archiver; -[ RECORD 1 ]----------+------------------------------ archived_wals | 6 last_archived_wal | 000000010000000000000005 last_archived_wal_time | 2014-01-07 17:27:34.752903+09 failed_attempts | 12 last_failed_wal | 000000010000000000000006 last_failed_wal_time | 2014-01-07 17:31:18.409528+09 stats_reset | 2014-01-07 17:25:51.949498+09 Regards, -- Michael
Hi Michael, Il 07/01/14 09:33, Michael Paquier ha scritto: > I just had a quick look at v2, and it looks that the patch is in good > shape. Thanks. > Sorry to be picky, but I am not sure that using the character m_type > is adapted when delivering messages to pgstat facility. You might want > to use a boolean instead... I had had the same doubt, and opted for a more generic approach using a char. I have followed your tip and made the field a boolean. > MAX_XFN_CHARS is not adapted as well IMO: you should remove it and use > instead MAXFNAMELEN of xlog_internal.h Again, I had some concerns about redefining a constant (as MAX_XFN_CHARS was defined only in pgarch.c). I wonder now if we should change this behaviour too in pgarch.c (but that's a different patch). Please find attached version 3. I will update the commitfest page and add your previous message as comment/review. I will tentatively add you as reviewer of the patch (as well as Magnus). Thanks, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
On 4 January 2014 13:01, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote: >> I'd suggest making the view on top of an SRF like pg_stat_replication >> and pg_stat_activity (for example), instead of a whole lot of separate >> function calls like the older stats views. > > Ok, good idea. Not sure I see why it needs to be an SRF. It only returns one row. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jan 8, 2014 at 6:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 4 January 2014 13:01, Gabriele Bartolini<gabriele.bartolini@2ndquadrant.it> wrote:Not sure I see why it needs to be an SRF. It only returns one row.
>> I'd suggest making the view on top of an SRF like pg_stat_replication
>> and pg_stat_activity (for example), instead of a whole lot of separate
>> function calls like the older stats views.
>
> Ok, good idea.
Good point, it could/should be a general function returning a composite type.
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Hi Simon, Il 08/01/14 18:42, Simon Riggs ha scritto: > Not sure I see why it needs to be an SRF. It only returns one row. Attached is version 4, which removes management of SRF stages. Thanks for pointing it out. Cheers, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote: > Il 08/01/14 18:42, Simon Riggs ha scritto: >> Not sure I see why it needs to be an SRF. It only returns one row. > Attached is version 4, which removes management of SRF stages. I have been looking at v4 a bit more, and found a couple of small things: - a warning in pgstatfuncs.c - some typos and format fixing in the sgml docs - some corrections in a couple of comments - Fixed an error message related to pg_stat_reset_shared referring only to bgwriter and not the new option archiver. Here is how the new message looks: =# select pg_stat_reset_shared('popo'); ERROR: 22023: unrecognized reset target: "popo" HINT: Target must be "bgwriter" or "archiver" A new version is attached containing those fixes. I played also with the patch and pgbench, simulating some archive failures and successes while running pgbench and the reports given by pg_stat_archiver were correct, so I am marking this patch as "Ready for committer". Regards, -- Michael
Attachment
On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini > <gabriele.bartolini@2ndquadrant.it> wrote: >> Il 08/01/14 18:42, Simon Riggs ha scritto: >>> Not sure I see why it needs to be an SRF. It only returns one row. >> Attached is version 4, which removes management of SRF stages. > > I have been looking at v4 a bit more, and found a couple of small things: > - a warning in pgstatfuncs.c > - some typos and format fixing in the sgml docs > - some corrections in a couple of comments > - Fixed an error message related to pg_stat_reset_shared referring > only to bgwriter and not the new option archiver. Here is how the new > message looks: > =# select pg_stat_reset_shared('popo'); > ERROR: 22023: unrecognized reset target: "popo" > HINT: Target must be "bgwriter" or "archiver" > A new version is attached containing those fixes. I played also with > the patch and pgbench, simulating some archive failures and successes > while running pgbench and the reports given by pg_stat_archiver were > correct, so I am marking this patch as "Ready for committer". I refactored the patch further. * Remove ArchiverStats global variable to simplify pgarch.c. * Remove stats_timestamp field from PgStat_ArchiverStats struct because it's not required. I have some review comments: + s.archived_wals, + s.last_archived_wal, + s.last_archived_wal_time, + s.failed_attempts, + s.last_failed_wal, + s.last_failed_wal_time, The column names of pg_stat_archiver look not consistent at least to me. What about the followings? archive_count last_archived_wal last_archived_time fail_count last_failed_wal last_failed_time I think that it's time to rename all the variables related to pg_stat_bgwriter. For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. I think that it's okay to make this change as separate patch, though. + char last_archived_wal[MAXFNAMELEN]; /* last WAL file archived */ + TimestampTz last_archived_wal_timestamp; /* last archival success */ + PgStat_Counter failed_attempts; + char last_failed_wal[MAXFNAMELEN]; /* last WAL file involved in failure */ Some hackers don't like the increase of the size of the statsfile. In order to reduce the size as much as possible, we should use the exact size of WAL file here instead of MAXFNAMELEN? Regards, -- Fujii Masao
Attachment
On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini >> <gabriele.bartolini@2ndquadrant.it> wrote: >>> Il 08/01/14 18:42, Simon Riggs ha scritto: >>>> Not sure I see why it needs to be an SRF. It only returns one row. >>> Attached is version 4, which removes management of SRF stages. >> >> I have been looking at v4 a bit more, and found a couple of small things: >> - a warning in pgstatfuncs.c >> - some typos and format fixing in the sgml docs >> - some corrections in a couple of comments >> - Fixed an error message related to pg_stat_reset_shared referring >> only to bgwriter and not the new option archiver. Here is how the new >> message looks: >> =# select pg_stat_reset_shared('popo'); >> ERROR: 22023: unrecognized reset target: "popo" >> HINT: Target must be "bgwriter" or "archiver" >> A new version is attached containing those fixes. I played also with >> the patch and pgbench, simulating some archive failures and successes >> while running pgbench and the reports given by pg_stat_archiver were >> correct, so I am marking this patch as "Ready for committer". > > I refactored the patch further. > > * Remove ArchiverStats global variable to simplify pgarch.c. > * Remove stats_timestamp field from PgStat_ArchiverStats struct because > it's not required. Thanks, pgstat_send_archiver is cleaner now. > > I have some review comments: > > + s.archived_wals, > + s.last_archived_wal, > + s.last_archived_wal_time, > + s.failed_attempts, > + s.last_failed_wal, > + s.last_failed_wal_time, > > The column names of pg_stat_archiver look not consistent at least to me. > What about the followings? > > archive_count > last_archived_wal > last_archived_time > fail_count > last_failed_wal > last_failed_time And what about archived_count and failed_count instead of respectively archive_count and failed_count? The rest of the names are better now indeed. > I think that it's time to rename all the variables related to pg_stat_bgwriter. > For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. > I think that it's okay to make this change as separate patch, though. Yep, this is definitely a different patch. > + char last_archived_wal[MAXFNAMELEN]; /* last WAL file archived */ > + TimestampTz last_archived_wal_timestamp; /* last archival success */ > + PgStat_Counter failed_attempts; > + char last_failed_wal[MAXFNAMELEN]; /* last WAL file involved > in failure */ > Some hackers don't like the increase of the size of the statsfile. In order to > reduce the size as much as possible, we should use the exact size of WAL file > here instead of MAXFNAMELEN? The first versions of the patch used a more limited size length more inline with what you say: +#define MAX_XFN_CHARS 40 But this is inconsistent with xlog_internal.h. Regards, -- Michael
On Sat, Jan 25, 2014 at 3:19 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>> On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini >>> <gabriele.bartolini@2ndquadrant.it> wrote: >>>> Il 08/01/14 18:42, Simon Riggs ha scritto: >>>>> Not sure I see why it needs to be an SRF. It only returns one row. >>>> Attached is version 4, which removes management of SRF stages. >>> >>> I have been looking at v4 a bit more, and found a couple of small things: >>> - a warning in pgstatfuncs.c >>> - some typos and format fixing in the sgml docs >>> - some corrections in a couple of comments >>> - Fixed an error message related to pg_stat_reset_shared referring >>> only to bgwriter and not the new option archiver. Here is how the new >>> message looks: >>> =# select pg_stat_reset_shared('popo'); >>> ERROR: 22023: unrecognized reset target: "popo" >>> HINT: Target must be "bgwriter" or "archiver" >>> A new version is attached containing those fixes. I played also with >>> the patch and pgbench, simulating some archive failures and successes >>> while running pgbench and the reports given by pg_stat_archiver were >>> correct, so I am marking this patch as "Ready for committer". >> >> I refactored the patch further. >> >> * Remove ArchiverStats global variable to simplify pgarch.c. >> * Remove stats_timestamp field from PgStat_ArchiverStats struct because >> it's not required. > Thanks, pgstat_send_archiver is cleaner now. > >> >> I have some review comments: >> >> + s.archived_wals, >> + s.last_archived_wal, >> + s.last_archived_wal_time, >> + s.failed_attempts, >> + s.last_failed_wal, >> + s.last_failed_wal_time, >> >> The column names of pg_stat_archiver look not consistent at least to me. >> What about the followings? >> >> archive_count >> last_archived_wal >> last_archived_time >> fail_count >> last_failed_wal >> last_failed_time > And what about archived_count and failed_count instead of respectively > archive_count and failed_count? The rest of the names are better now > indeed. Please find attached an updated patch updated with the new column names (in context diffs this time). Regards, -- Michael
Attachment
On Sat, Jan 25, 2014 at 3:19 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>> On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini >>> <gabriele.bartolini@2ndquadrant.it> wrote: >>>> Il 08/01/14 18:42, Simon Riggs ha scritto: >>>>> Not sure I see why it needs to be an SRF. It only returns one row. >>>> Attached is version 4, which removes management of SRF stages. >>> >>> I have been looking at v4 a bit more, and found a couple of small things: >>> - a warning in pgstatfuncs.c >>> - some typos and format fixing in the sgml docs >>> - some corrections in a couple of comments >>> - Fixed an error message related to pg_stat_reset_shared referring >>> only to bgwriter and not the new option archiver. Here is how the new >>> message looks: >>> =# select pg_stat_reset_shared('popo'); >>> ERROR: 22023: unrecognized reset target: "popo" >>> HINT: Target must be "bgwriter" or "archiver" >>> A new version is attached containing those fixes. I played also with >>> the patch and pgbench, simulating some archive failures and successes >>> while running pgbench and the reports given by pg_stat_archiver were >>> correct, so I am marking this patch as "Ready for committer". >> >> I refactored the patch further. >> >> * Remove ArchiverStats global variable to simplify pgarch.c. >> * Remove stats_timestamp field from PgStat_ArchiverStats struct because >> it's not required. > Thanks, pgstat_send_archiver is cleaner now. > >> >> I have some review comments: >> >> + s.archived_wals, >> + s.last_archived_wal, >> + s.last_archived_wal_time, >> + s.failed_attempts, >> + s.last_failed_wal, >> + s.last_failed_wal_time, >> >> The column names of pg_stat_archiver look not consistent at least to me. >> What about the followings? >> >> archive_count >> last_archived_wal >> last_archived_time >> fail_count >> last_failed_wal >> last_failed_time > And what about archived_count and failed_count instead of respectively > archive_count and failed_count? The rest of the names are better now > indeed. > >> I think that it's time to rename all the variables related to pg_stat_bgwriter. >> For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. >> I think that it's okay to make this change as separate patch, though. > Yep, this is definitely a different patch. > >> + char last_archived_wal[MAXFNAMELEN]; /* last WAL file archived */ >> + TimestampTz last_archived_wal_timestamp; /* last archival success */ >> + PgStat_Counter failed_attempts; >> + char last_failed_wal[MAXFNAMELEN]; /* last WAL file involved >> in failure */ >> Some hackers don't like the increase of the size of the statsfile. In order to >> reduce the size as much as possible, we should use the exact size of WAL file >> here instead of MAXFNAMELEN? > The first versions of the patch used a more limited size length more > inline with what you say: > +#define MAX_XFN_CHARS 40 > But this is inconsistent with xlog_internal.h. Using MAX_XFN_CHARS instead of MAXFNAMELEN has a clear advantage to reduce the size of the statistics file. Though I'm not sure how much this change improve the performance of the statistics collector, basically I'd like to use MAX_XFN_CHARS here. + (errmsg("corrupted statistics file (global) \"%s\"", statfile))); I think that it's better to use the view name "pg_stat_bgwriter" instead of internal name "global" here. The view name is easier-to-understand to a user. + (errmsg("corrupted statistics file (archiver) \"%s\"", statfile))); Same as above. What using "pg_stat_archiver" instead of just "archive"? + if (fread(&myArchiverStats, 1, sizeof(myArchiverStats), + fpin) != sizeof(myArchiverStats)) + { + ereport(pgStatRunningInCollector ? LOG : WARNING, + (errmsg("corrupted statistics file \"%s\"", statfile))); Same as above. Isn't it better to add something like "pg_stat_archiver" even here? Regards, -- Fujii Masao
On Tue, Jan 28, 2014 at 3:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Sat, Jan 25, 2014 at 3:19 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier >>> <michael.paquier@gmail.com> wrote: >>>> On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini >>>> <gabriele.bartolini@2ndquadrant.it> wrote: >>>>> Il 08/01/14 18:42, Simon Riggs ha scritto: >>>>>> Not sure I see why it needs to be an SRF. It only returns one row. >>>>> Attached is version 4, which removes management of SRF stages. >>>> >>>> I have been looking at v4 a bit more, and found a couple of small things: >>>> - a warning in pgstatfuncs.c >>>> - some typos and format fixing in the sgml docs >>>> - some corrections in a couple of comments >>>> - Fixed an error message related to pg_stat_reset_shared referring >>>> only to bgwriter and not the new option archiver. Here is how the new >>>> message looks: >>>> =# select pg_stat_reset_shared('popo'); >>>> ERROR: 22023: unrecognized reset target: "popo" >>>> HINT: Target must be "bgwriter" or "archiver" >>>> A new version is attached containing those fixes. I played also with >>>> the patch and pgbench, simulating some archive failures and successes >>>> while running pgbench and the reports given by pg_stat_archiver were >>>> correct, so I am marking this patch as "Ready for committer". >>> >>> I refactored the patch further. >>> >>> * Remove ArchiverStats global variable to simplify pgarch.c. >>> * Remove stats_timestamp field from PgStat_ArchiverStats struct because >>> it's not required. >> Thanks, pgstat_send_archiver is cleaner now. >> >>> >>> I have some review comments: >>> >>> + s.archived_wals, >>> + s.last_archived_wal, >>> + s.last_archived_wal_time, >>> + s.failed_attempts, >>> + s.last_failed_wal, >>> + s.last_failed_wal_time, >>> >>> The column names of pg_stat_archiver look not consistent at least to me. >>> What about the followings? >>> >>> archive_count >>> last_archived_wal >>> last_archived_time >>> fail_count >>> last_failed_wal >>> last_failed_time >> And what about archived_count and failed_count instead of respectively >> archive_count and failed_count? The rest of the names are better now >> indeed. >> >>> I think that it's time to rename all the variables related to pg_stat_bgwriter. >>> For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. >>> I think that it's okay to make this change as separate patch, though. >> Yep, this is definitely a different patch. >> >>> + char last_archived_wal[MAXFNAMELEN]; /* last WAL file archived */ >>> + TimestampTz last_archived_wal_timestamp; /* last archival success */ >>> + PgStat_Counter failed_attempts; >>> + char last_failed_wal[MAXFNAMELEN]; /* last WAL file involved >>> in failure */ >>> Some hackers don't like the increase of the size of the statsfile. In order to >>> reduce the size as much as possible, we should use the exact size of WAL file >>> here instead of MAXFNAMELEN? >> The first versions of the patch used a more limited size length more >> inline with what you say: >> +#define MAX_XFN_CHARS 40 >> But this is inconsistent with xlog_internal.h. > > Using MAX_XFN_CHARS instead of MAXFNAMELEN has a clear advantage > to reduce the size of the statistics file. Though I'm not sure how much this > change improve the performance of the statistics collector, basically > I'd like to > use MAX_XFN_CHARS here. I changed the patch in this way, fixed some existing bugs (e.g., correct the column names of pg_stat_archiver in rules.out), and then just committed it. Regards, -- Fujii Masao
Anybody knows about this patch? http://www.postgresql.org/message-id/508DFEC9.4020102@uptime.jp -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jan 29, 2014 at 3:03 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Tue, Jan 28, 2014 at 3:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Sat, Jan 25, 2014 at 3:19 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>> On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >>>> On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier >>>> <michael.paquier@gmail.com> wrote: >>>>> On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini >>>>> <gabriele.bartolini@2ndquadrant.it> wrote: >>>>>> Il 08/01/14 18:42, Simon Riggs ha scritto: >>>>>>> Not sure I see why it needs to be an SRF. It only returns one row. >>>>>> Attached is version 4, which removes management of SRF stages. >>>>> >>>>> I have been looking at v4 a bit more, and found a couple of small things: >>>>> - a warning in pgstatfuncs.c >>>>> - some typos and format fixing in the sgml docs >>>>> - some corrections in a couple of comments >>>>> - Fixed an error message related to pg_stat_reset_shared referring >>>>> only to bgwriter and not the new option archiver. Here is how the new >>>>> message looks: >>>>> =# select pg_stat_reset_shared('popo'); >>>>> ERROR: 22023: unrecognized reset target: "popo" >>>>> HINT: Target must be "bgwriter" or "archiver" >>>>> A new version is attached containing those fixes. I played also with >>>>> the patch and pgbench, simulating some archive failures and successes >>>>> while running pgbench and the reports given by pg_stat_archiver were >>>>> correct, so I am marking this patch as "Ready for committer". >>>> >>>> I refactored the patch further. >>>> >>>> * Remove ArchiverStats global variable to simplify pgarch.c. >>>> * Remove stats_timestamp field from PgStat_ArchiverStats struct because >>>> it's not required. >>> Thanks, pgstat_send_archiver is cleaner now. >>> >>>> >>>> I have some review comments: >>>> >>>> + s.archived_wals, >>>> + s.last_archived_wal, >>>> + s.last_archived_wal_time, >>>> + s.failed_attempts, >>>> + s.last_failed_wal, >>>> + s.last_failed_wal_time, >>>> >>>> The column names of pg_stat_archiver look not consistent at least to me. >>>> What about the followings? >>>> >>>> archive_count >>>> last_archived_wal >>>> last_archived_time >>>> fail_count >>>> last_failed_wal >>>> last_failed_time >>> And what about archived_count and failed_count instead of respectively >>> archive_count and failed_count? The rest of the names are better now >>> indeed. >>> >>>> I think that it's time to rename all the variables related to pg_stat_bgwriter. >>>> For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. >>>> I think that it's okay to make this change as separate patch, though. >>> Yep, this is definitely a different patch. >>> >>>> + char last_archived_wal[MAXFNAMELEN]; /* last WAL file archived */ >>>> + TimestampTz last_archived_wal_timestamp; /* last archival success */ >>>> + PgStat_Counter failed_attempts; >>>> + char last_failed_wal[MAXFNAMELEN]; /* last WAL file involved >>>> in failure */ >>>> Some hackers don't like the increase of the size of the statsfile. In order to >>>> reduce the size as much as possible, we should use the exact size of WAL file >>>> here instead of MAXFNAMELEN? >>> The first versions of the patch used a more limited size length more >>> inline with what you say: >>> +#define MAX_XFN_CHARS 40 >>> But this is inconsistent with xlog_internal.h. >> >> Using MAX_XFN_CHARS instead of MAXFNAMELEN has a clear advantage >> to reduce the size of the statistics file. Though I'm not sure how much this >> change improve the performance of the statistics collector, basically >> I'd like to >> use MAX_XFN_CHARS here. > > I changed the patch in this way, fixed some existing bugs (e.g., > correct the column > names of pg_stat_archiver in rules.out), and then just committed it. Thanks, after re-reading the code using MAX_XFN_CHARS makes sense. I didn't look at the comments on top of pg_arch.c. Thanks as well for fixing the regression output :) Regards, -- Michael
On Wed, Jan 29, 2014 at 3:07 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Anybody knows about this patch? > http://www.postgresql.org/message-id/508DFEC9.4020102@uptime.jp Though I'm not sure whether Nagayasu is still working on that patch, it's worth thinking to introduce that together with pg_stat_archiver. Regards, -- Fujii Masao
On Wed, Jan 29, 2014 at 10:55 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Wed, Jan 29, 2014 at 3:07 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> >> Anybody knows about this patch? >> http://www.postgresql.org/message-id/508DFEC9.4020102@uptime.jp > > Though I'm not sure whether Nagayasu is still working on that patch, > it's worth thinking to introduce that together with pg_stat_archiver. This patch uses globalStats to implement the new stat fields for walwriter, I think that we should use a different structure for clarity and to be in-line with what is done now with the archiver part. Btw, I agree that renaming the globalStats part to something more appropriate related to bgwriter is a sane target for this CF, but isn't it too late to late to consider this walwriter patch for this release? It introduces a new feature as it tracks xlog dirty writes and CF is already half way. Regards, -- Michael
On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > I think that it's time to rename all the variables related to pg_stat_bgwriter. > For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. > I think that it's okay to make this change as separate patch, though. Please find attached a simple patch renaming PgStat_GlobalStats to PgStat_BgWriterStats. Its related variables and functions are renamed as well and use the term "bgwriter" instead of "global". Comments are updated as well. This patch also removes stats_timestamp from PgStat_GlobalStats and uses instead a static variable in pgstat.c, making all the structures dedicated to each component clearer. Regards, -- Michael
On Wed, Jan 29, 2014 at 10:38 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> I think that it's time to rename all the variables related to pg_stat_bgwriter. >> For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. >> I think that it's okay to make this change as separate patch, though. > Please find attached a simple patch renaming PgStat_GlobalStats to > PgStat_BgWriterStats. And of course I forgot the patch... Now attached. -- Michael
Attachment
On Wed, Jan 29, 2014 at 12:35 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Jan 29, 2014 at 10:55 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Wed, Jan 29, 2014 at 3:07 AM, Alvaro Herrera >> <alvherre@2ndquadrant.com> wrote: >>> >>> Anybody knows about this patch? >>> http://www.postgresql.org/message-id/508DFEC9.4020102@uptime.jp >> >> Though I'm not sure whether Nagayasu is still working on that patch, >> it's worth thinking to introduce that together with pg_stat_archiver. > This patch uses globalStats to implement the new stat fields for > walwriter, I think that we should use a different structure for > clarity and to be in-line with what is done now with the archiver > part. Is there clear reason why we should define separate structure for each global stats view? If we introduce something like pg_stat_walwriter later, it seems better to use only one structure, i.e., globalStats, for all global stats views. Which would simplify the code and can reduce the number of calls of fwrite()/fread(). Regards, -- Fujii Masao
On Wed, Jan 29, 2014 at 3:03 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Tue, Jan 28, 2014 at 3:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Sat, Jan 25, 2014 at 3:19 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>> On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >>>> On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier >>>> <michael.paquier@gmail.com> wrote: >>>>> On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini >>>>> <gabriele.bartolini@2ndquadrant.it> wrote: >>>>>> Il 08/01/14 18:42, Simon Riggs ha scritto: >>>>>>> Not sure I see why it needs to be an SRF. It only returns one row. >>>>>> Attached is version 4, which removes management of SRF stages. >>>>> >>>>> I have been looking at v4 a bit more, and found a couple of small things: >>>>> - a warning in pgstatfuncs.c >>>>> - some typos and format fixing in the sgml docs >>>>> - some corrections in a couple of comments >>>>> - Fixed an error message related to pg_stat_reset_shared referring >>>>> only to bgwriter and not the new option archiver. Here is how the new >>>>> message looks: >>>>> =# select pg_stat_reset_shared('popo'); >>>>> ERROR: 22023: unrecognized reset target: "popo" >>>>> HINT: Target must be "bgwriter" or "archiver" >>>>> A new version is attached containing those fixes. I played also with >>>>> the patch and pgbench, simulating some archive failures and successes >>>>> while running pgbench and the reports given by pg_stat_archiver were >>>>> correct, so I am marking this patch as "Ready for committer". >>>> >>>> I refactored the patch further. >>>> >>>> * Remove ArchiverStats global variable to simplify pgarch.c. >>>> * Remove stats_timestamp field from PgStat_ArchiverStats struct because >>>> it's not required. >>> Thanks, pgstat_send_archiver is cleaner now. >>> >>>> >>>> I have some review comments: >>>> >>>> + s.archived_wals, >>>> + s.last_archived_wal, >>>> + s.last_archived_wal_time, >>>> + s.failed_attempts, >>>> + s.last_failed_wal, >>>> + s.last_failed_wal_time, >>>> >>>> The column names of pg_stat_archiver look not consistent at least to me. >>>> What about the followings? >>>> >>>> archive_count >>>> last_archived_wal >>>> last_archived_time >>>> fail_count >>>> last_failed_wal >>>> last_failed_time >>> And what about archived_count and failed_count instead of respectively >>> archive_count and failed_count? The rest of the names are better now >>> indeed. >>> >>>> I think that it's time to rename all the variables related to pg_stat_bgwriter. >>>> For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. >>>> I think that it's okay to make this change as separate patch, though. >>> Yep, this is definitely a different patch. >>> >>>> + char last_archived_wal[MAXFNAMELEN]; /* last WAL file archived */ >>>> + TimestampTz last_archived_wal_timestamp; /* last archival success */ >>>> + PgStat_Counter failed_attempts; >>>> + char last_failed_wal[MAXFNAMELEN]; /* last WAL file involved >>>> in failure */ >>>> Some hackers don't like the increase of the size of the statsfile. In order to >>>> reduce the size as much as possible, we should use the exact size of WAL file >>>> here instead of MAXFNAMELEN? >>> The first versions of the patch used a more limited size length more >>> inline with what you say: >>> +#define MAX_XFN_CHARS 40 >>> But this is inconsistent with xlog_internal.h. >> >> Using MAX_XFN_CHARS instead of MAXFNAMELEN has a clear advantage >> to reduce the size of the statistics file. Though I'm not sure how much this >> change improve the performance of the statistics collector, basically >> I'd like to >> use MAX_XFN_CHARS here. > > I changed the patch in this way, fixed some existing bugs (e.g., > correct the column > names of pg_stat_archiver in rules.out), and then just committed it. "Depesz" mentioned here that it would be interesting to have as well in this view the size of archive queue: http://www.depesz.com/2014/01/29/waiting-for-9-4-add-pg_stat_archiver-statistics-view/ And it looks indeed interesting to have. What do you think about adding it as a TODO item? -- Michael
On Sat, Feb 1, 2014 at 9:32 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Jan 29, 2014 at 3:03 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Tue, Jan 28, 2014 at 3:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> On Sat, Jan 25, 2014 at 3:19 PM, Michael Paquier >>> <michael.paquier@gmail.com> wrote: >>>> On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >>>>> On Thu, Jan 23, 2014 at 4:10 PM, Michael Paquier >>>>> <michael.paquier@gmail.com> wrote: >>>>>> On Thu, Jan 9, 2014 at 6:36 AM, Gabriele Bartolini >>>>>> <gabriele.bartolini@2ndquadrant.it> wrote: >>>>>>> Il 08/01/14 18:42, Simon Riggs ha scritto: >>>>>>>> Not sure I see why it needs to be an SRF. It only returns one row. >>>>>>> Attached is version 4, which removes management of SRF stages. >>>>>> >>>>>> I have been looking at v4 a bit more, and found a couple of small things: >>>>>> - a warning in pgstatfuncs.c >>>>>> - some typos and format fixing in the sgml docs >>>>>> - some corrections in a couple of comments >>>>>> - Fixed an error message related to pg_stat_reset_shared referring >>>>>> only to bgwriter and not the new option archiver. Here is how the new >>>>>> message looks: >>>>>> =# select pg_stat_reset_shared('popo'); >>>>>> ERROR: 22023: unrecognized reset target: "popo" >>>>>> HINT: Target must be "bgwriter" or "archiver" >>>>>> A new version is attached containing those fixes. I played also with >>>>>> the patch and pgbench, simulating some archive failures and successes >>>>>> while running pgbench and the reports given by pg_stat_archiver were >>>>>> correct, so I am marking this patch as "Ready for committer". >>>>> >>>>> I refactored the patch further. >>>>> >>>>> * Remove ArchiverStats global variable to simplify pgarch.c. >>>>> * Remove stats_timestamp field from PgStat_ArchiverStats struct because >>>>> it's not required. >>>> Thanks, pgstat_send_archiver is cleaner now. >>>> >>>>> >>>>> I have some review comments: >>>>> >>>>> + s.archived_wals, >>>>> + s.last_archived_wal, >>>>> + s.last_archived_wal_time, >>>>> + s.failed_attempts, >>>>> + s.last_failed_wal, >>>>> + s.last_failed_wal_time, >>>>> >>>>> The column names of pg_stat_archiver look not consistent at least to me. >>>>> What about the followings? >>>>> >>>>> archive_count >>>>> last_archived_wal >>>>> last_archived_time >>>>> fail_count >>>>> last_failed_wal >>>>> last_failed_time >>>> And what about archived_count and failed_count instead of respectively >>>> archive_count and failed_count? The rest of the names are better now >>>> indeed. >>>> >>>>> I think that it's time to rename all the variables related to pg_stat_bgwriter. >>>>> For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter. >>>>> I think that it's okay to make this change as separate patch, though. >>>> Yep, this is definitely a different patch. >>>> >>>>> + char last_archived_wal[MAXFNAMELEN]; /* last WAL file archived */ >>>>> + TimestampTz last_archived_wal_timestamp; /* last archival success */ >>>>> + PgStat_Counter failed_attempts; >>>>> + char last_failed_wal[MAXFNAMELEN]; /* last WAL file involved >>>>> in failure */ >>>>> Some hackers don't like the increase of the size of the statsfile. In order to >>>>> reduce the size as much as possible, we should use the exact size of WAL file >>>>> here instead of MAXFNAMELEN? >>>> The first versions of the patch used a more limited size length more >>>> inline with what you say: >>>> +#define MAX_XFN_CHARS 40 >>>> But this is inconsistent with xlog_internal.h. >>> >>> Using MAX_XFN_CHARS instead of MAXFNAMELEN has a clear advantage >>> to reduce the size of the statistics file. Though I'm not sure how much this >>> change improve the performance of the statistics collector, basically >>> I'd like to >>> use MAX_XFN_CHARS here. >> >> I changed the patch in this way, fixed some existing bugs (e.g., >> correct the column >> names of pg_stat_archiver in rules.out), and then just committed it. > "Depesz" mentioned here that it would be interesting to have as well > in this view the size of archive queue: > http://www.depesz.com/2014/01/29/waiting-for-9-4-add-pg_stat_archiver-statistics-view/ > And it looks indeed interesting to have. What do you think about > adding it as a TODO item? I think that it's OK to add that as TODO item. There might be the system that the speed of WAL archiving is slower than that of WAL generation, at peak time. The DBA of that system might want to monitor the size of archive queue. We can implement this by just counting the files with .ready extension in pg_xlog/archive_status directory. Or we can also implement that by adding new counter field in pg_stat_archiver structure, incrementing it whenever creating .ready file, and decrementing it whenever changing .ready to .done. Regards, -- Fujii Masao
Hi Michael and Fujii, Il 01/02/14 17:46, Fujii Masao ha scritto: > I think that it's OK to add that as TODO item. There might be > the system that the speed of WAL archiving is slower than > that of WAL generation, at peak time. The DBA of that system > might want to monitor the size of archive queue. I agree that it is an interesting thing to do. The reason I didn't introduce it in the first place was that I did not want to make too many changes in this first attempt. > We can implement this by just counting the files with .ready > extension in pg_xlog/archive_status directory. Or we can also > implement that by adding new counter field in pg_stat_archiver > structure, incrementing it whenever creating .ready file, and > decrementing it whenever changing .ready to .done. I would love to give it a shot at the next opportunity. Thanks, Gabriele -- Gabriele Bartolini - 2ndQuadrant ItaliaPostgreSQL Training, Services and Supportgabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
On Sun, Feb 2, 2014 at 2:43 AM, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote: > Hi Michael and Fujii, > > Il 01/02/14 17:46, Fujii Masao ha scritto: >> I think that it's OK to add that as TODO item. There might be >> the system that the speed of WAL archiving is slower than >> that of WAL generation, at peak time. The DBA of that system >> might want to monitor the size of archive queue. > I agree that it is an interesting thing to do. The reason I didn't > introduce it in the first place was that I did not want to make too many > changes in this first attempt. For the time being I have added an item in "Statistics Collector" about that. >> We can implement this by just counting the files with .ready >> extension in pg_xlog/archive_status directory. Or we can also >> implement that by adding new counter field in pg_stat_archiver >> structure, incrementing it whenever creating .ready file, and >> decrementing it whenever changing .ready to .done. > I would love to give it a shot at the next opportunity. Will be happy to look at it once you get something. Regards, -- Michael