Thread: Log timestamps at higher resolution

Log timestamps at higher resolution

From
David Fetter
Date:
Folks,

Per gripes I've been hearing with increasing frequency, please find
attached a patch that implements $Subject. It's microsecond resolution
because at least at the moment, nanosecond resolution doesn't appear
to be helpful in this context.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment

Re: Log timestamps at higher resolution

From
Thomas Munro
Date:
On Wed, Oct 24, 2018 at 7:51 AM David Fetter <david@fetter.org> wrote:
> Per gripes I've been hearing with increasing frequency, please find
> attached a patch that implements $Subject. It's microsecond resolution
> because at least at the moment, nanosecond resolution doesn't appear
> to be helpful in this context.

Wouldn't you want to choose a new letter or some other way to make
existing format control strings do what they always did?

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Log timestamps at higher resolution

From
David Fetter
Date:
On Wed, Oct 24, 2018 at 08:00:24AM +1300, Thomas Munro wrote:
> On Wed, Oct 24, 2018 at 7:51 AM David Fetter <david@fetter.org> wrote:
> > Per gripes I've been hearing with increasing frequency, please find
> > attached a patch that implements $Subject. It's microsecond resolution
> > because at least at the moment, nanosecond resolution doesn't appear
> > to be helpful in this context.
> 
> Wouldn't you want to choose a new letter or some other way to make
> existing format control strings do what they always did?

I hadn't because I'd looked at the existing format as merely buggy in
lacking precision, although I guess with really fussy log processors,
this change could break things.

Have you seen processors like that in the wild?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Log timestamps at higher resolution

From
Alvaro Herrera
Date:
On 2018-Oct-23, David Fetter wrote:

> On Wed, Oct 24, 2018 at 08:00:24AM +1300, Thomas Munro wrote:
> > On Wed, Oct 24, 2018 at 7:51 AM David Fetter <david@fetter.org> wrote:
> > > Per gripes I've been hearing with increasing frequency, please find
> > > attached a patch that implements $Subject. It's microsecond resolution
> > > because at least at the moment, nanosecond resolution doesn't appear
> > > to be helpful in this context.
> > 
> > Wouldn't you want to choose a new letter or some other way to make
> > existing format control strings do what they always did?
> 
> I hadn't because I'd looked at the existing format as merely buggy in
> lacking precision, although I guess with really fussy log processors,
> this change could break things.
> 
> Have you seen processors like that in the wild?

pgbadger does this:
        '%m' => [('t_mtimestamp',   '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+(?: [A-Z\+\-\d]{3,6})?')], # timestamp
withmilliseconds
 

which should cope with however many digits there are (\d+).
But I would expect others to be less forgiving ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Log timestamps at higher resolution

From
David Fetter
Date:
On Tue, Oct 23, 2018 at 04:14:50PM -0300, Alvaro Herrera wrote:
> On 2018-Oct-23, David Fetter wrote:
> 
> > On Wed, Oct 24, 2018 at 08:00:24AM +1300, Thomas Munro wrote:
> > > On Wed, Oct 24, 2018 at 7:51 AM David Fetter <david@fetter.org> wrote:
> > > > Per gripes I've been hearing with increasing frequency, please find
> > > > attached a patch that implements $Subject. It's microsecond resolution
> > > > because at least at the moment, nanosecond resolution doesn't appear
> > > > to be helpful in this context.
> > > 
> > > Wouldn't you want to choose a new letter or some other way to make
> > > existing format control strings do what they always did?
> > 
> > I hadn't because I'd looked at the existing format as merely buggy in
> > lacking precision, although I guess with really fussy log processors,
> > this change could break things.
> > 
> > Have you seen processors like that in the wild?
> 
> pgbadger does this:
>         '%m' => [('t_mtimestamp',   '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+(?: [A-Z\+\-\d]{3,6})?')], # timestamp
withmilliseconds
 
> 
> which should cope with however many digits there are (\d+).
> But I would expect others to be less forgiving ...

That's an interesting point.  pgbadger is the only one I recall using
that's still maintained. Which others would it be useful to test?

Also, do we have tests--or at least ideas of how to
test--functionality relating to logging? I was a little bit taken
aback by the fact that `make check-world` passed after the change.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Log timestamps at higher resolution

From
Michael Paquier
Date:
On Wed, Oct 24, 2018 at 12:11:18AM +0200, David Fetter wrote:
> That's an interesting point.  pgbadger is the only one I recall using
> that's still maintained. Which others would it be useful to test?

There could be private solutions as well.  My take is that we should use
separate letters and not change the existing ones or we'll get
complains.

> Also, do we have tests--or at least ideas of how to
> test--functionality relating to logging? I was a little bit taken
> aback by the fact that `make check-world` passed after the change.

This requires server-level changes where a TAP test is usually adapted,
and there is no test for logging yet.
--
Michael

Attachment

Re: Log timestamps at higher resolution

From
David Fetter
Date:
On Wed, Oct 24, 2018 at 08:46:47AM +0900, Michael Paquier wrote:
> On Wed, Oct 24, 2018 at 12:11:18AM +0200, David Fetter wrote:
> > That's an interesting point.  pgbadger is the only one I recall using
> > that's still maintained. Which others would it be useful to test?
> 
> There could be private solutions as well.  My take is that we should use
> separate letters and not change the existing ones or we'll get
> complains.

I believe that the utility of having finer time resolution outweighs
the inconvenience of changing processing systems, to the extent that
that's a consideration.

Adding yet more random letter options to log_line_prefix isn't free
either.

For one thing, there are a limited number of single-letter options we
can use, and we need to make sure they can continue to make sense, or
at least have reasonably consistent meanings.

For another, having separate letter rather than number modifiers as
printf("%03d") does, is just lousy API design.  Baroque is OK for
music if you're in the mood, but not so great for log line prefix
format codes.  If we do go the number option route, we can only not
break people's fussy parsers by having a default format of length 3, a
decision which will get harder and harder to justify as time goes on.

I'm happy to code up a number option if that's what people want, but
that'll break things the same way the current patch does, modulo The
Inexplicable Decision To Default To Three Decimal Places, which code
archaeologists will love and system implementers will hate.

> > Also, do we have tests--or at least ideas of how to
> > test--functionality relating to logging? I was a little bit taken
> > aback by the fact that `make check-world` passed after the change.
> 
> This requires server-level changes where a TAP test is usually adapted,
> and there is no test for logging yet.

Is that worth a separate patch?

I suspect that as things Cloud™ (a.k.a. Renting Other People's Data
Centers) get even more popular, we'll want to improve our logging
game, and that will sooner rather than later have features complicated
enough to be worth testing.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Log timestamps at higher resolution

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> On Wed, Oct 24, 2018 at 08:00:24AM +1300, Thomas Munro wrote:
>> On Wed, Oct 24, 2018 at 7:51 AM David Fetter <david@fetter.org> wrote:
>>> Per gripes I've been hearing with increasing frequency, please find
>>> attached a patch that implements $Subject. It's microsecond resolution
>>> because at least at the moment, nanosecond resolution doesn't appear
>>> to be helpful in this context.

>> Wouldn't you want to choose a new letter or some other way to make
>> existing format control strings do what they always did?

> I hadn't because I'd looked at the existing format as merely buggy in
> lacking precision, although I guess with really fussy log processors,
> this change could break things.

Aside from breaking things, this would increase log volume, which is
already a pain point for many people.  So I think your odds of getting
it committed as-is are nil.  It might be OK as a different letter, or
with some other way of controlling the precision.

            regards, tom lane


Re: Log timestamps at higher resolution

From
Alvaro Herrera
Date:
On 2018-Oct-24, David Fetter wrote:

> For another, having separate letter rather than number modifiers as
> printf("%03d") does, is just lousy API design.

I don't think the API is lousy as all that, but a further improvement to
allow a precision specifier might be a worthy feature addition -- say
%.6t or such (where %m would have the same meaning as %.3t).

> Baroque is OK for music if you're in the mood,

Haha!

> I suspect that as things Cloud™ (a.k.a. Renting Other People's Data
> Centers) get even more popular, we'll want to improve our logging
> game, and that will sooner rather than later have features complicated
> enough to be worth testing.

Improved logging is already a sore need we have, regardless of weather
conditions.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Log timestamps at higher resolution

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2018-Oct-24, David Fetter wrote:
>> For another, having separate letter rather than number modifiers as
>> printf("%03d") does, is just lousy API design.

> I don't think the API is lousy as all that, but a further improvement to
> allow a precision specifier might be a worthy feature addition -- say
> %.6t or such (where %m would have the same meaning as %.3t).

+1.  I think the patch as it stands is way too influenced by the purely
chance factor that "milli" and "micro" have the same initial letter in
English.  David would never have submitted it in this form if that weren't
true; but that doesn't make either resolution the right answer for
everyone.  And I do not want to have this discussion again in ten years
when somebody starts moaning that microsecond resolution is so last
century.  Let's put in a number so that the format-string API can stay the
same as the required resolution moves.

            regards, tom lane


Re: Log timestamps at higher resolution

From
David Fetter
Date:
On Wed, Oct 24, 2018 at 11:10:02AM +0100, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Wed, Oct 24, 2018 at 08:00:24AM +1300, Thomas Munro wrote:
> >> On Wed, Oct 24, 2018 at 7:51 AM David Fetter <david@fetter.org> wrote:
> >>> Per gripes I've been hearing with increasing frequency, please find
> >>> attached a patch that implements $Subject. It's microsecond resolution
> >>> because at least at the moment, nanosecond resolution doesn't appear
> >>> to be helpful in this context.
> 
> >> Wouldn't you want to choose a new letter or some other way to make
> >> existing format control strings do what they always did?
> 
> > I hadn't because I'd looked at the existing format as merely buggy in
> > lacking precision, although I guess with really fussy log processors,
> > this change could break things.
> 
> Aside from breaking things, this would increase log volume, which is
> already a pain point for many people.

As to log volume, this may or may not be in our ambit to address.
There are Linux distros that "handle" this by automatically using
compression so slow that it completely tanks performance, as in
shrinks TPS by an order of magnitude or more. For me, the
lesson there is to make those trade-offs explicit and controllable by
our users rather than magically imposed by whatever means.

> So I think your odds of getting it committed as-is are nil.  It
> might be OK as a different letter, or with some other way of
> controlling the precision.

Would you be OK with a way to control the precision that defaults to
more precise and comes with a big loud release note about how to get
the previous behavior?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Log timestamps at higher resolution

From
David Fetter
Date:
On Wed, Oct 24, 2018 at 03:17:09PM +0100, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > On 2018-Oct-24, David Fetter wrote:
> >> For another, having separate letter rather than number modifiers as
> >> printf("%03d") does, is just lousy API design.
> 
> > I don't think the API is lousy as all that, but a further improvement to
> > allow a precision specifier might be a worthy feature addition -- say
> > %.6t or such (where %m would have the same meaning as %.3t).
> 
> +1.  I think the patch as it stands is way too influenced by the purely
> chance factor that "milli" and "micro" have the same initial letter in
> English.  David would never have submitted it in this form if that weren't
> true; but that doesn't make either resolution the right answer for
> everyone.  And I do not want to have this discussion again in ten years
> when somebody starts moaning that microsecond resolution is so last
> century.  Let's put in a number so that the format-string API can stay the
> same as the required resolution moves.

Digging a teensy bit deeper, I noticed that there's already a
"padding" (space padding, if I understand correctly) system for parts
of the log_line_prefix specification including %m. Did we get painted
into a corner here back in 9.4, when the padding feature was added?
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=4334639f4bb9fb88c13b8dd5faca22b207248504

Strangely, there were no tests that came with that either.  David, did
you mean to expand it past space padding, or...?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Log timestamps at higher resolution

From
David Rowley
Date:
On 25 October 2018 at 11:25, David Fetter <david@fetter.org> wrote:
> Digging a teensy bit deeper, I noticed that there's already a
> "padding" (space padding, if I understand correctly) system for parts
> of the log_line_prefix specification including %m. Did we get painted
> into a corner here back in 9.4, when the padding feature was added?
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=4334639f4bb9fb88c13b8dd5faca22b207248504

I'm not sure "painted into a corner" is the right way to think about
this. Numerics immediately followed by the '%' are simply already used
by the space padding feature.  If you want some other modifier then
you'll need to invent some other syntax to identify that. It looks
like using a '.' after the '%' was mentioned by Alvaro.

I think the best way to implement this is to rename
process_log_prefix_padding into process_log_prefix_modifiers and add
an "int *precision" variable and add some code to parse and populate
that based on if there's a '.' found in the prefix. Then in
log_line_prefix() just take into account the precision for any prefix
variable that it should apply to. So, %s likely should be altered too,
although that might require some additional code to NULLify the string
if the server config is reloaded, or perhaps you can just remember how
much precision that string was made with and only rebuild it if the
required precision changes.  We don't seem to have any other floating
point type prefix variables, so I don't think any of the others need
to be altered.

> Strangely, there were no tests that came with that either.  David, did
> you mean to expand it past space padding, or...?

Unsure what infrastructure existed then for testing this.

I didn't intend to add anything else later. I don't think we have any
variables that zero padding would be that useful for, but maybe
someone might want to zero pad their pids or xids... ?

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Log timestamps at higher resolution

From
Michael Paquier
Date:
On Thu, Oct 25, 2018 at 01:00:08PM +1300, David Rowley wrote:
> On 25 October 2018 at 11:25, David Fetter <david@fetter.org> wrote:
>> Strangely, there were no tests that came with that either.  David, did
>> you mean to expand it past space padding, or...?
>
> Unsure what infrastructure existed then for testing this.

Close to none in-core for such things, the first set of TAP tests
included was 9.6, and at the time of 9.4 it was not even clear what we
were lookng for.  You could have done things similar to what
test_decoding or pg_stat_statements are doing to enforce some
configuration parameters, but that would not have scaled with different
configurations.  Another possibility would have been a buildfarm module.

Anyway, what has been done in this case looks fine to me with what was
present.
--
Michael

Attachment

Re: Log timestamps at higher resolution

From
David Fetter
Date:
On Thu, Oct 25, 2018 at 01:00:08PM +1300, David Rowley wrote:
> On 25 October 2018 at 11:25, David Fetter <david@fetter.org> wrote:
> > Digging a teensy bit deeper, I noticed that there's already a
> > "padding" (space padding, if I understand correctly) system for parts
> > of the log_line_prefix specification including %m. Did we get painted
> > into a corner here back in 9.4, when the padding feature was added?
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=4334639f4bb9fb88c13b8dd5faca22b207248504
> 
> I'm not sure "painted into a corner" is the right way to think about
> this.

Excellent!

> Numerics immediately followed by the '%' are simply already used by
> the space padding feature.  If you want some other modifier then
> you'll need to invent some other syntax to identify that. It looks
> like using a '.' after the '%' was mentioned by Alvaro.
> 
> I think the best way to implement this is to rename
> process_log_prefix_padding into process_log_prefix_modifiers and add
> an "int *precision" variable and add some code to parse and populate
> that based on if there's a '.' found in the prefix.  Then in
> log_line_prefix() just take into account the precision for any
> prefix variable that it should apply to. So, %s likely should be
> altered too, although that might require some additional code to
> NULLify the string if the server config is reloaded, or perhaps you
> can just remember how much precision that string was made with and
> only rebuild it if the required precision changes.  We don't seem to
> have any other floating point type prefix variables, so I don't
> think any of the others need to be altered.

Thanks for the detailed roadmap on this.

> > Strangely, there were no tests that came with that either.  David,
> > did you mean to expand it past space padding, or...?
> 
> Unsure what infrastructure existed then for testing this.
> 
> I didn't intend to add anything else later. I don't think we have
> any variables that zero padding would be that useful for, but maybe
> someone might want to zero pad their pids or xids... ?

They might, so I'll either put in a comment where that would go or
just implement it.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Log timestamps at higher resolution

From
Alvaro Herrera
Date:
On 2018-Oct-25, David Fetter wrote:

> > I didn't intend to add anything else later. I don't think we have
> > any variables that zero padding would be that useful for, but maybe
> > someone might want to zero pad their pids or xids... ?
> 
> They might, so I'll either put in a comment where that would go or
> just implement it.

I've never seen a need for zero-padding XIDs or PIDs, frankly.  Can you
imagine the mess if we ever grew XIDs to 64 bits?  And what about
OS-level configurability of PIDs?  I don't know that there's a good way
to learn their widths.

I don't think it's worth of a comment even.  

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services