Thread: pg_controldata gobbledygook
I'm not sure who is supposed to be able to read this sort of stuff:

Latest checkpoint's NextXID: 0/7575
Latest checkpoint's NextOID: 49152
Latest checkpoint's NextMultiXactId: 7
Latest checkpoint's NextMultiOffset: 13
Latest checkpoint's oldestXID: 1265
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1

Note that these symbols don't even correspond to the actual symbols used in the source code in some cases.

The comments in the pg_control.h header file use much more pleasant terms, which when put to use would lead to output similar to this:

Latest checkpoint's next free transaction ID: 0/7575
Latest checkpoint's next free OID: 49152
Latest checkpoint's next free MultiXactId: 7
Latest checkpoint's next free MultiXact offset: 13
Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
Latest checkpoint's oldest transaction ID still running: 0
Latest checkpoint's cluster-wide minimum datminmxid: 1
Latest checkpoint's database with cluster-wide minimum datminmxid: 1

One could even rearrange the layout a little bit like this:

Control data as of latest checkpoint:
next free transaction ID: 0/7575
next free OID: 49152
etc.

Comments?
Peter Eisentraut <peter_e@gmx.net> writes:
> The comments in the pg_control.h header file use much more pleasant
> terms, which when put to use would lead to output similar to this:
>
> Latest checkpoint's next free transaction ID: 0/7575
> Latest checkpoint's next free OID: 49152
> Latest checkpoint's next free MultiXactId: 7
> Latest checkpoint's next free MultiXact offset: 13
> Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
> Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
> Latest checkpoint's oldest transaction ID still running: 0
> Latest checkpoint's cluster-wide minimum datminmxid: 1
> Latest checkpoint's database with cluster-wide minimum datminmxid: 1
>
> One could even rearrange the layout a little bit like this:
>
> Control data as of latest checkpoint:
> next free transaction ID: 0/7575
> next free OID: 49152
> etc.
>
> Comments?

I think I've heard of scripts grepping the output of pg_controldata for this that or the other. Any rewording of the labels would break that. While I'm not opposed to improving the labels, I would vote against your second, abbreviated scheme because it would make things ambiguous for simple grep-based scripts.

regards, tom lane
On Thu, Apr 25, 2013 at 8:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> Comments?

+1 from me.

I don't think that these particular changes would break WAL-E, Heroku's continuous archiving tool, which has a class called PgControlDataParser. However, it's possible to imagine someone being affected in a similar way. So I'd be sure to document it clearly, and to perhaps preserve the old label names to avoid breaking scripts.

--
Peter Geoghegan
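To make the concern concrete, here is a minimal sketch of the kind of script that keys on the exact label text. It is not WAL-E's actual PgControlDataParser; the function name `parse_controldata` and the sample input are assumptions made up for illustration, using values from the output quoted at the top of this thread.

```python
def parse_controldata(text):
    """Split 'label: value' lines into a dict keyed by the exact label text."""
    fields = {}
    for line in text.splitlines():
        # Split at the first colon only, so values containing colons survive.
        label, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon at all
            fields[label.strip()] = value.strip()
    return fields


# Hypothetical sample, shaped like the output quoted in this thread.
SAMPLE = """\
Latest checkpoint's NextXID:          0/7575
Latest checkpoint's NextOID:          49152
"""

fields = parse_controldata(SAMPLE)
print(fields["Latest checkpoint's NextOID"])
```

A lookup such as `fields["Latest checkpoint's NextOID"]` is exactly what would start raising `KeyError` if the label were reworded to "next free OID".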
Tom Lane wrote:
> I think I've heard of scripts grepping the output of pg_controldata for
> this that or the other. Any rewording of the labels would break that.
> While I'm not opposed to improving the labels, I would vote against your
> second, abbreviated scheme because it would make things ambiguous for
> simple grep-based scripts.

We could provide two alternative outputs, one for human consumption with the proposed format and something else that uses, say, shell assignment syntax. (I did propose this years ago and I might have an unfinished patch still lingering about somewhere.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
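For readers wondering what "shell assignment syntax" might look like, the following is a rough sketch that rewrites the existing text output into NAME=value lines. The variable naming scheme and the function name `to_shell_assignments` are invented here and are not taken from any patch mentioned in this thread.

```python
import re
import shlex


def to_shell_assignments(text):
    """Turn 'label: value' lines into NAME=value lines (naming scheme is invented)."""
    lines = []
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'label: value' line
        # Collapse every run of non-alphanumeric characters into one underscore.
        name = re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_").upper()
        # shlex.quote adds quoting only when the value needs it.
        lines.append(f"{name}={shlex.quote(value.strip())}")
    return lines


sample = (
    "Latest checkpoint's NextXID:          0/7575\n"
    "Latest checkpoint's NextOID:          49152\n"
)
for assignment in to_shell_assignments(sample):
    print(assignment)
```

Output in this shape could be fed to a shell's `eval`, which is presumably the appeal of the format, though the names would then become part of the interface just as the labels are today.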
On Fri, Apr 26, 2013 at 12:22 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Apr 25, 2013 at 8:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> > Comments?
> +1 from me.
> I don't think that these particular changes would break WAL-E,
> Heroku's continuous archiving tool, which has a class called
> PgControlDataParser. However, it's possible to imagine someone being
> affected in a similar way. So I'd be sure to document it clearly, and
> to perhaps preserve the old label names to avoid breaking scripts.
Why don't we add options to pg_controldata so that it can output the info in several other formats, like JSON, YAML, XML, or another one?
Best regards,
Fabrízio de Royes Mello
PostgreSQL Consulting/Coaching
>> Blog about IT: http://fabriziomello.blogspot.com
>> LinkedIn profile: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
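As a sketch of what a JSON rendering could carry, the snippet below wraps the existing text output in a small script. It assumes nothing about pg_controldata itself (no such option is described in this thread); the function name `controldata_to_json` is hypothetical, and values are left as strings because the text output gives no type information.

```python
import json


def controldata_to_json(text):
    """Render 'label: value' lines as a JSON object with string values."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return json.dumps(fields, indent=2)


sample = "Latest checkpoint's NextOID:          49152\n"
print(controldata_to_json(sample))
```

Note that the keys are still the label text, so a structured format alone would not settle the question of whether the labels may be reworded.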
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I think I've heard of scripts grepping the output of pg_controldata for
>> this that or the other. Any rewording of the labels would break that.
>> While I'm not opposed to improving the labels, I would vote against your
>> second, abbreviated scheme because it would make things ambiguous for
>> simple grep-based scripts.

> We could provide two alternative outputs, one for human consumption with
> the proposed format and something else that uses, say, shell assignment
> syntax. (I did propose this years ago and I might have an unfinished
> patch still lingering about somewhere.)

And a script would use that how? "pg_controldata --machine-friendly" would fail outright on older versions. I think it's okay to ask script writers to write

pg_controldata | grep -e 'old label|new label'

but not okay to ask them to deal with anything as complicated as trying a switch to see if it works or not.

regards, tom lane
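A script that accepts either label could look roughly like the sketch below. The old label is from the current output and the new one from the proposal in this thread; the function name `next_oid` is made up for illustration. One caveat for anyone copying the grep form literally: a pattern given with `-e` is a basic regular expression, where `|` is an ordinary character, so the alternation needs `grep -E` (or `\|` with GNU grep). In Python's `re` module, `|` alternates directly.

```python
import re

# Accept the current label or the proposed rewording of the same field.
PATTERN = re.compile(
    r"^Latest checkpoint's (?:NextOID|next free OID):\s*(\S+)\s*$",
    re.MULTILINE,
)


def next_oid(text):
    """Return the next-OID value as a string, or None if neither label is present."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(next_oid("Latest checkpoint's NextOID:          49152\n"))
print(next_oid("Latest checkpoint's next free OID:    49152\n"))
```

This works across versions only as long as the "Latest checkpoint's" prefix stays on each line, which is the ambiguity objection to the abbreviated layout.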
On Thu, Apr 25, 2013 at 9:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> I think I've heard of scripts grepping the output of pg_controldata for
>>> this that or the other. Any rewording of the labels would break that.
>>> While I'm not opposed to improving the labels, I would vote against your
>>> second, abbreviated scheme because it would make things ambiguous for
>>> simple grep-based scripts.
>
>> We could provide two alternative outputs, one for human consumption with
>> the proposed format and something else that uses, say, shell assignment
>> syntax. (I did propose this years ago and I might have an unfinished
>> patch still lingering about somewhere.)
>
> And a script would use that how? "pg_controldata --machine-friendly"
> would fail outright on older versions. I think it's okay to ask script
> writers to write
> pg_controldata | grep -e 'old label|new label'
> but not okay to ask them to deal with anything as complicated as trying
> a switch to see if it works or not.

From what I'm reading, it seems like the main benefit of the changes is to make things easier for humans to skim over. Automated programs that care about precise meanings of each field are awkwardly but otherwise well-served by the precise output as rendered right now.

What about doing something similar but different from the --machine-readable proposal, such as adding an option for the *human*-readable variant that is guaranteed to mercilessly change as human-readers/-hackers see fit on a whim? It's a bit of a kludge that this is not the default, but would prevent having to serve two quite different masters with the same output. Although I'm not seriously proposing explicitly "-h" (as seen in some GNU programs in rendering byte sizes and the like...yet could be confused for 'help'), something like that may serve as prior art.
On 26/04/13 18:53, Daniel Farina wrote:
> On Thu, Apr 25, 2013 at 9:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>>> Tom Lane wrote:
>>>> I think I've heard of scripts grepping the output of pg_controldata for
>>>> this that or the other. Any rewording of the labels would break that.
>>>> While I'm not opposed to improving the labels, I would vote against your
>>>> second, abbreviated scheme because it would make things ambiguous for
>>>> simple grep-based scripts.
>>
>>> We could provide two alternative outputs, one for human consumption with
>>> the proposed format and something else that uses, say, shell assignment
>>> syntax. (I did propose this years ago and I might have an unfinished
>>> patch still lingering about somewhere.)
>>
>> And a script would use that how? "pg_controldata --machine-friendly"
>> would fail outright on older versions. I think it's okay to ask script
>> writers to write
>> pg_controldata | grep -e 'old label|new label'
>> but not okay to ask them to deal with anything as complicated as trying
>> a switch to see if it works or not.
>
> From what I'm reading, it seems like the main benefit of the changes is
> to make things easier for humans to skim over. Automated programs that
> care about precise meanings of each field are awkwardly but otherwise
> well-served by the precise output as rendered right now.
>
> What about doing something similar but different from the
> --machine-readable proposal, such as adding an option for the
> *human*-readable variant that is guaranteed to mercilessly change as
> human-readers/-hackers sees fit on whim? It's a bit of a kludge that
> this is not the default, but would prevent having to serve two quite
> different masters with the same output. Although I'm not seriously
> proposing explicitly "-h" (as seen in some GNU programs in rendering
> byte sizes and the like...yet could be confused for 'help'), something
> like that may serve as prior art.

I think the current way should remain the default, as Daniel suggests - but a '--human-readable' (or suitable abbreviation) flag could be added.

Such as in the command to list directory details, using the 'ls' command in Linux...

(Below, Y = 1024 * 1024 * 1024 * 1024 * 1024 * 1024 * 1024 * 1024 bytes = 2^80 bytes.)

man ls
[...]
       -h, --human-readable
              with -l, print sizes in human readable format (e.g., 1K 234M 2G)
[...]
       SIZE may be (or may be an integer optionally followed by) one of following:
       KB 1000, K 1024, MB 1000*1000, M 1024*1024, and so on for G, T, P, E, Z, Y.
[...]

Cheers,
Gavin
On 2013-04-25 23:07:02 -0400, Peter Eisentraut wrote:
> I'm not sure who is supposed to be able to read this sort of stuff:
>
> Latest checkpoint's NextXID: 0/7575
> Latest checkpoint's NextOID: 49152
> Latest checkpoint's NextMultiXactId: 7
> Latest checkpoint's NextMultiOffset: 13
> Latest checkpoint's oldestXID: 1265
> Latest checkpoint's oldestXID's DB: 1
> Latest checkpoint's oldestActiveXID: 0
> Latest checkpoint's oldestMultiXid: 1
> Latest checkpoint's oldestMulti's DB: 1
>
> Note that these symbols don't even correspond to the actual symbols used
> in the source code in some cases.
>
> The comments in the pg_control.h header file use much more pleasant
> terms, which when put to use would lead to output similar to this:
>
> Latest checkpoint's next free transaction ID: 0/7575
> Latest checkpoint's next free OID: 49152
> Latest checkpoint's next free MultiXactId: 7
> Latest checkpoint's next free MultiXact offset: 13
> Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
> Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
> Latest checkpoint's oldest transaction ID still running: 0
> Latest checkpoint's cluster-wide minimum datminmxid: 1
> Latest checkpoint's database with cluster-wide minimum datminmxid: 1
>
> One could even rearrange the layout a little bit like this:
>
> Control data as of latest checkpoint:
> next free transaction ID: 0/7575
> next free OID: 49152

I have to admit I don't see the point. None of those values is particularly interesting to anybody without implementation level knowledge and those will likely deal with them just fine. And I find the version with the shorter names far quicker to read.

The clarity win here doesn't seem to be worth the price of potentially breaking some tools.

Greetings,
Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--On 25. April 2013 23:19:14 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think I've heard of scripts grepping the output of pg_controldata for
> this that or the other. Any rewording of the labels would break that.
> While I'm not opposed to improving the labels, I would vote against your
> second, abbreviated scheme because it would make things ambiguous for
> simple grep-based scripts.

I had exactly this kind of discussion just a few days ago with a customer, who wants to use the output in their scripts and was a little worried about the compatibility between major versions.

I don't think we explicitly guarantee any output format compatibility between corresponding symbols in major versions, but given that pg_controldata seems to have a broad use case here, we should maybe document somewhere whether to discourage or encourage people to rely on it?

--
Thanks
Bernd
On Fri, Apr 26, 2013 at 5:08 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> I have to admit I don't see the point. None of those values is particularly
> interesting to anybody without implementation level knowledge and those
> will likely deal with them just fine. And I find the version with the
> shorter names far quicker to read.
> The clarity win here doesn't seem to be worth the price of potentially
> breaking some tools.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Apr 26, 2013 at 2:08 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-04-25 23:07:02 -0400, Peter Eisentraut wrote:
> > I'm not sure who is supposed to be able to read this sort of stuff:
> >
> > Latest checkpoint's NextXID: 0/7575
> > Latest checkpoint's NextOID: 49152
> > Latest checkpoint's NextMultiXactId: 7
> > Latest checkpoint's NextMultiOffset: 13
> > Latest checkpoint's oldestXID: 1265
> > Latest checkpoint's oldestXID's DB: 1
> > Latest checkpoint's oldestActiveXID: 0
> > Latest checkpoint's oldestMultiXid: 1
> > Latest checkpoint's oldestMulti's DB: 1
> >
> > Note that these symbols don't even correspond to the actual symbols used
> > in the source code in some cases.
> >
> > The comments in the pg_control.h header file use much more pleasant
> > terms, which when put to use would lead to output similar to this:
> >
> > Latest checkpoint's next free transaction ID: 0/7575
> > Latest checkpoint's next free OID: 49152
> > Latest checkpoint's next free MultiXactId: 7
> > Latest checkpoint's next free MultiXact offset: 13
> > Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
> > Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
> > Latest checkpoint's oldest transaction ID still running: 0
> > Latest checkpoint's cluster-wide minimum datminmxid: 1
> > Latest checkpoint's database with cluster-wide minimum datminmxid: 1
> >
> > One could even rearrange the layout a little bit like this:
> >
> > Control data as of latest checkpoint:
> > next free transaction ID: 0/7575
> > next free OID: 49152
>
> I have to admit I don't see the point. None of those values is particularly
> interesting to anybody without implementation level knowledge and those
> will likely deal with them just fine. And I find the version with the
> shorter names far quicker to read.
I agree. For the ones I didn't know the meaning of, I still don't know the meaning of them based on the long form, either. While a tutorial on what these things mean might be useful, embedding the tutorial into the output of pg_controldata probably isn't the right place.
Cheers,
Jeff
On Fri, Apr 26, 2013 at 08:51:23AM -0400, Robert Haas wrote:
> On Fri, Apr 26, 2013 at 5:08 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I have to admit I don't see the point. None of those values is particularly
> > interesting to anybody without implementation level knowledge and those
> > will likely deal with them just fine. And I find the version with the
> > shorter names far quicker to read.
> > The clarity win here doesn't seem to be worth the price of potentially
> > breaking some tools.
>
> +1.

FYI, pg_upgrade would certainly have to be updated to handle this change.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +