Thread: logging stuff

logging stuff

From
Andrew Dunstan
Date:
(Responding to the deafening silence regarding my posts a couple of days 
ago about logging dbnames and disconnections) ;-)

The dbname patch is now done. If nobody objects to the format 
("[db:yourdbname]") I'll submit it - I did it that way to make it fairly 
easy to split a log file based on it, although you would have to be 
careful with multiline log entries such as query strings. It is 
intentionally minimalist.

I had some thoughts about logging disconnections - I can see a way to do 
it via an on_proc_exit handler, I think. Then I started wondering if it 
might be useful to log session times instead of just noting a disconnect 
and letting the user have to calculate the time.

But I won't bother with this if there's no interest. *I* have no current 
use for it, but I could well imagine others might. (I might too in the 
future if I wanted to debug my connection pooling app).

cheers

andrew





Re: logging stuff

From
"scott.marlowe"
Date:
On Tue, 5 Aug 2003, Andrew Dunstan wrote:

> 
> (Responding to the deafening silence regarding my posts a couple of days 
> ago about logging dbnames and disconnections) ;-)
> 
> The dbname patch is now done. If nobody objects to the format 
> ("[db:yourdbname]") I'll submit it - I did it that way to make it fairly 
> easy to split a log file based on it, although you would have to be 
> careful with multiline log entries such as query strings. It is 
> intentionally minimalist.
> 
> I had some thoughts about logging disconnections - I can see a way to do 
> it via an on_proc_exit handler, I think. Then I started wondering if it 
> might be useful to log session times instead of just noting a disconnect 
> and letting the user have to calculate the time.
> 
> But I won't bother with this if there's no interest. *I* have no current 
> use for it, but I could well imagine others might. (I might too in the 
> future if I wanted to debug my connection pooling app).

Actually, I'd certainly like to see it done (both dbname and disconnect).

I'd guess the deafening silence was more because of no objctions than lack 
of interest.  I know for me it was.



Re: logging stuff

From
Bruce Momjian
Date:
I think we need a more general variable that can take several values,
separated by commas, like:
log_line:  dbname,user

or something like that.

In fact, looking at the postgresql.conf file, I see only two setting
that print on every line:  log_pid and log_timestamp.  Perhaps those two
should be merged into log_line.

Of course, this is all for 7.5.

---------------------------------------------------------------------------

scott.marlowe wrote:
> On Tue, 5 Aug 2003, Andrew Dunstan wrote:
> 
> > 
> > (Responding to the deafening silence regarding my posts a couple of days 
> > ago about logging dbnames and disconnections) ;-)
> > 
> > The dbname patch is now done. If nobody objects to the format 
> > ("[db:yourdbname]") I'll submit it - I did it that way to make it fairly 
> > easy to split a log file based on it, although you would have to be 
> > careful with multiline log entries such as query strings. It is 
> > intentionally minimalist.
> > 
> > I had some thoughts about logging disconnections - I can see a way to do 
> > it via an on_proc_exit handler, I think. Then I started wondering if it 
> > might be useful to log session times instead of just noting a disconnect 
> > and letting the user have to calculate the time.
> > 
> > But I won't bother with this if there's no interest. *I* have no current 
> > use for it, but I could well imagine others might. (I might too in the 
> > future if I wanted to debug my connection pooling app).
> 
> Actually, I'd certainly like to see it done (both dbname and disconnect).
> 
> I'd guess the deafening silence was more because of no objctions than lack 
> of interest.  I know for me it was.
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: logging stuff

From
Andrew Dunstan
Date:
It's more work for a very small cosmetic benefit, ISTM. Setting up 
atomic GUC variables is trivially easy, I'm happy to say.  Not that I 
can't write a simple parser, but why bother for something so tiny?

Also, would we be saying (or implying) that the order given on the line 
would affect the output? If so, that could make LOTS more work, again 
for very small benefit, I think.

(Yes, of course this is for 7.5).

cheers

andrew

Bruce Momjian wrote:

>I think we need a more general variable that can take several values,
>separated by commas, like:
>
>    log_line:  dbname,user
>
>or something like that.
>
>In fact, looking at the postgresql.conf file, I see only two setting
>that print on every line:  log_pid and log_timestamp.  Perhaps those two
>should be merged into log_line.
>
>Of course, this is all for 7.5.
>
>  
>




Re: logging stuff

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> The dbname patch is now done. If nobody objects to the format 
> ("[db:yourdbname]") I'll submit it - I did it that way to make it fairly 
> easy to split a log file based on it, although you would have to be 
> careful with multiline log entries such as query strings.

I'd tend to just put the dbname in a known column, and not bother with
the decoration --- decoration adds up fast when it's on every log line,
and I don't think it helps make the log more parseable.  Compare what we
do with timestamps and pids.
        regards, tom lane


Re: logging stuff

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I think we need a more general variable that can take several values,
> separated by commas, like:
>     log_line:  dbname,user
> or something like that.

Strikes me as overkill --- adding two more booleans comparable to
log_pid would do the job just as well with lots less coding effort.
        regards, tom lane


Re: logging stuff

From
Andrew Dunstan
Date:
Not quite the same - timestamps and pids have known formats, while db 
names are almost arbitrary. I know including spaces in names is horrible 
to my *nix way of thinking, but others might not have my prejudices. 
(interesting question - what characters are NOT allowed in a database 
name?).

BTW, we're talking about 5 extra chars per line here. I know it adds up, 
but these days disk space is cheap and plentiful - for a million log 
lines we're still only talking 5Mb (says he whose first machine had a 
whopping 20Mb of disk space, and who paid hundreds of (aussie) dollars 
for the extra 10 Mb.)

Syslog puts [] around pids, and typically has a lot of redundancy.

I'm prepared to be guided by concensus, though.

cheers

andrew


Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>The dbname patch is now done. If nobody objects to the format 
>>("[db:yourdbname]") I'll submit it - I did it that way to make it fairly 
>>easy to split a log file based on it, although you would have to be 
>>careful with multiline log entries such as query strings.
>>    
>>
>
>I'd tend to just put the dbname in a known column, and not bother with
>the decoration --- decoration adds up fast when it's on every log line,
>and I don't think it helps make the log more parseable.  Compare what we
>do with timestamps and pids.
>
>            regards, tom lane
>
>  
>




Re: logging stuff

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> I'm prepared to be guided by concensus, though.

I'm not dead set on it either, just wanted to raise a flag.  Who else
has an opinion?
        regards, tom lane


Re: logging stuff

From
Josh Berkus
Date:
Tom,

> > I'm prepared to be guided by concensus, though.
>
> I'm not dead set on it either, just wanted to raise a flag.  Who else
> has an opinion?

From my perspective, we could really use a "delimiter" between the "fields" of 
log output which is unlikely to appear within those fields instead of parsing 
by character count, rather than making dbname a special case.

Or do we already do this and I'm not parsing my log right?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: logging stuff

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> From my perspective, we could really use a "delimiter" between the
> "fields" of log output which is unlikely to appear within those fields
> instead of parsing by character count, rather than making dbname a
> special case.

Well, Andrew was arguing that space is a likely character in dbnames,
but then again it's possible to put "]" into dbnames.  I think the only
way to make this completely unambiguous would be to introduce a quoting
convention for dbnames (and usernames when those get added).  But if
the goal is to allow simple parsing of log entries, that won't improve
matters at all.

My feeling about it is that DBAs who want to automatically parse log
entries can just forbid spaces in the dbnames and usernames they allow.
The KISS principle applies here.

A variant (which'd be okay with me) is to separate these fields with
tabs instead of spaces; then the rule for DBAs would be "don't allow
tabs in db/user names".
        regards, tom lane


Re: logging stuff

From
Larry Rosenman
Date:

--On Tuesday, August 05, 2003 16:27:55 -0400 Tom Lane <tgl@sss.pgh.pa.us> 
wrote:

> Josh Berkus <josh@agliodbs.com> writes:
>> From my perspective, we could really use a "delimiter" between the
>> "fields" of log output which is unlikely to appear within those fields
>> instead of parsing by character count, rather than making dbname a
>> special case.
>
> Well, Andrew was arguing that space is a likely character in dbnames,
> but then again it's possible to put "]" into dbnames.  I think the only
> way to make this completely unambiguous would be to introduce a quoting
> convention for dbnames (and usernames when those get added).  But if
> the goal is to allow simple parsing of log entries, that won't improve
> matters at all.
>
> My feeling about it is that DBAs who want to automatically parse log
> entries can just forbid spaces in the dbnames and usernames they allow.
> The KISS principle applies here.
>
> A variant (which'd be okay with me) is to separate these fields with
> tabs instead of spaces; then the rule for DBAs would be "don't allow
> tabs in db/user names".
Tabs cause issues in syslog output if I remember correctly.


>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>



-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749



Re: logging stuff

From
Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> --On Tuesday, August 05, 2003 16:27:55 -0400 Tom Lane <tgl@sss.pgh.pa.us> 
>> A variant (which'd be okay with me) is to separate these fields with
>> tabs instead of spaces; then the rule for DBAs would be "don't allow
>> tabs in db/user names".

> Tabs cause issues in syslog output if I remember correctly.

Really?  What kind of issues?  We're already quite likely to be sticking
tabs into the log output, e.g. if log_statement is turned on.
        regards, tom lane


Re: logging stuff

From
Larry Rosenman
Date:

--On Tuesday, August 05, 2003 16:41:34 -0400 Tom Lane <tgl@sss.pgh.pa.us> 
wrote:

> Larry Rosenman <ler@lerctr.org> writes:
>> --On Tuesday, August 05, 2003 16:27:55 -0400 Tom Lane
>> <tgl@sss.pgh.pa.us>
>>> A variant (which'd be okay with me) is to separate these fields with
>>> tabs instead of spaces; then the rule for DBAs would be "don't allow
>>> tabs in db/user names".
>
>> Tabs cause issues in syslog output if I remember correctly.
>
> Really?  What kind of issues?  We're already quite likely to be sticking
> tabs into the log output, e.g. if log_statement is turned on.
I remember dealing with \n's, and don't recall if \t is a problem as well
in SCO's syslogd.  I'll play tonite, and see what it does/doesn't deal 
with.

LER

>
>             regards, tom lane



-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749



Re: logging stuff

From
Bruce Momjian
Date:
My issue is that I think there are reasonable people might want
username, dbname, hostname, host ip, and host port on the log lines. 
This is the information that we currently report during a connection, if
enabled.

I have no problem adding those five booleans if people prefer booleans
--- my big point is that dbname isn't the only possible thing folks
would want.  And of course, we already have pid and timestamp, so once
we are done, we will have seven possible data items on each line, and
with booleans there will be no control over their order on the line.

---------------------------------------------------------------------------

Andrew Dunstan wrote:
> Not quite the same - timestamps and pids have known formats, while db 
> names are almost arbitrary. I know including spaces in names is horrible 
> to my *nix way of thinking, but others might not have my prejudices. 
> (interesting question - what characters are NOT allowed in a database 
> name?).
> 
> BTW, we're talking about 5 extra chars per line here. I know it adds up, 
> but these days disk space is cheap and plentiful - for a million log 
> lines we're still only talking 5Mb (says he whose first machine had a 
> whopping 20Mb of disk space, and who paid hundreds of (aussie) dollars 
> for the extra 10 Mb.)
> 
> Syslog puts [] around pids, and typically has a lot of redundancy.
> 
> I'm prepared to be guided by concensus, though.
> 
> cheers
> 
> andrew
> 
> 
> Tom Lane wrote:
> 
> >Andrew Dunstan <andrew@dunslane.net> writes:
> >  
> >
> >>The dbname patch is now done. If nobody objects to the format 
> >>("[db:yourdbname]") I'll submit it - I did it that way to make it fairly 
> >>easy to split a log file based on it, although you would have to be 
> >>careful with multiline log entries such as query strings.
> >>    
> >>
> >
> >I'd tend to just put the dbname in a known column, and not bother with
> >the decoration --- decoration adds up fast when it's on every log line,
> >and I don't think it helps make the log more parseable.  Compare what we
> >do with timestamps and pids.
> >
> >            regards, tom lane
> >
> >  
> >
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: logging stuff

From
Bruce Momjian
Date:
Another interesting idea, if we do a single log variable with parameters
separated by commas, is to allow some syntax where you could specify the
delimiter between fields, so it could be:
log_line:     "|", dbname, username

or
log_line:     " ", dbname, username

---------------------------------------------------------------------------

Josh Berkus wrote:
> Tom,
> 
> > > I'm prepared to be guided by concensus, though.
> >
> > I'm not dead set on it either, just wanted to raise a flag.  Who else
> > has an opinion?
> 
> >From my perspective, we could really use a "delimiter" between the "fields" of 
> log output which is unlikely to appear within those fields instead of parsing 
> by character count, rather than making dbname a special case.
> 
> Or do we already do this and I'm not parsing my log right?
> 
> -- 
> Josh Berkus
> Aglio Database Solutions
> San Francisco
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: logging stuff

From
"scott.marlowe"
Date:
If we're looking at this, we might want to look at how apache does it with 
it's customlog feature.  This allows you to first define custom log types, 
then set them according to which virtual server you're setting up.

I could see that being nice so you could create a couple of different 
custom log types, then use one for one database, and another for a 
different database.  

For people running large numbers of databases on a single cluster, it 
might be quite useful to be able to log each database differently.

On Tue, 5 Aug 2003, Bruce Momjian wrote:

> 
> Another interesting idea, if we do a single log variable with parameters
> separated by commas, is to allow some syntax where you could specify the
> delimiter between fields, so it could be:
> 
>     log_line:     "|", dbname, username
> 
> or
> 
>     log_line:     " ", dbname, username
> 
> ---------------------------------------------------------------------------
> 
> Josh Berkus wrote:
> > Tom,
> > 
> > > > I'm prepared to be guided by concensus, though.
> > >
> > > I'm not dead set on it either, just wanted to raise a flag.  Who else
> > > has an opinion?
> > 
> > >From my perspective, we could really use a "delimiter" between the "fields" of 
> > log output which is unlikely to appear within those fields instead of parsing 
> > by character count, rather than making dbname a special case.
> > 
> > Or do we already do this and I'm not parsing my log right?
> > 
> > -- 
> > Josh Berkus
> > Aglio Database Solutions
> > San Francisco
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 8: explain analyze is your friend
> > 
> 
> 



Re: logging stuff

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> ... And of course, we already have pid and timestamp, so once
> we are done, we will have seven possible data items on each line, and
> with booleans there will be no control over their order on the line.

Which is exactly the way I want it ;-).  I can't see any use that would
justify the amount of extra logic needed to allow user-specified
ordering of the entries.  This feature discussion seems to be
degenerating into a gild-the-lily contest ...
        regards, tom lane


Re: logging stuff

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > ... And of course, we already have pid and timestamp, so once
> > we are done, we will have seven possible data items on each line, and
> > with booleans there will be no control over their order on the line.
> 
> Which is exactly the way I want it ;-).  I can't see any use that would
> justify the amount of extra logic needed to allow user-specified
> ordering of the entries.  This feature discussion seems to be
> degenerating into a gild-the-lily contest ...

Depends if someone needs a lilly, though I have not heard anyone say
they do.  ;-)

Adding several new variables is fine, but what do we call the hostname
option if we already have log_hostname?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: logging stuff

From
Rod Taylor
Date:
> I could see that being nice so you could create a couple of different
> custom log types, then use one for one database, and another for a
> different database.

Point them to different log files and you've got yourself a great
feature.

> For people running large numbers of databases on a single cluster, it
> might be quite useful to be able to log each database differently.

You bet. Hosting companies give clients the option to pick up their
weblogs. This would be a good feature for those.  Make the log come out
similarly as apache, and you've got yourself some webalizer screens with
funny URLs (queries ;).


Re: logging stuff

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Adding several new variables is fine, but what do we call the hostname
> option if we already have log_hostname?

<shrug>  We've renamed GUC variables before for consistency.  I'd opt
for picking names that show the common purpose, maybe log_line_FOO?
        regards, tom lane


Re: logging stuff

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Adding several new variables is fine, but what do we call the hostname
> > option if we already have log_hostname?
> 
> <shrug>  We've renamed GUC variables before for consistency.  I'd opt
> for picking names that show the common purpose, maybe log_line_FOO?

That's what I was thinking, that we might rename log_pid and
log_timestamp to log_line_pid and log_line_timestamp to indicate it is
printed on every line, and have the new booleans follow that pattern.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: logging stuff

From
"Mendola Gaetano"
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> Josh Berkus <josh@agliodbs.com> writes:
> > From my perspective, we could really use a "delimiter" between the
> > "fields" of log output which is unlikely to appear within those fields
> > instead of parsing by character count, rather than making dbname a
> > special case.
>
> Well, Andrew was arguing that space is a likely character in dbnames,
> but then again it's possible to put "]" into dbnames.  I think the only
> way to make this completely unambiguous would be to introduce a quoting
> convention for dbnames (and usernames when those get added).  But if
> the goal is to allow simple parsing of log entries, that won't improve
> matters at all.
>
> My feeling about it is that DBAs who want to automatically parse log
> entries can just forbid spaces in the dbnames and usernames they allow.
> The KISS principle applies here.
>
> A variant (which'd be okay with me) is to separate these fields with
> tabs instead of spaces; then the rule for DBAs would be "don't allow
> tabs in db/user names".

The tabs are not properly treated by syslogd, look this piece of log:

Aug  6 01:02:16 dell-01 postgres[19490]: [157296] LOG:  connection received:
host=[local]
Aug  6 01:02:16 dell-01 postgres[19490]: [157297] LOG:  connection
authorized: user=kalman database=kalman
Aug  6 01:02:16 dell-01 postgres[19490]: [157298] DEBUG:
/usr/bin/postmaster child[19490]: starting with (
Aug  6 01:02:16 dell-01 postgres[19490]: [157299] DEBUG:  ^Ipostgres
Aug  6 01:02:16 dell-01 postgres[19490]: [157300] DEBUG:  ^I-v131072
Aug  6 01:02:16 dell-01 postgres[19490]: [157301] DEBUG:  ^I-p
Aug  6 01:02:16 dell-01 postgres[19490]: [157302] DEBUG:  ^Ikalman
Aug  6 01:02:16 dell-01 postgres[19490]: [157303] DEBUG:  )


do you see:   ^I   that are "chars" that syslogd hate.


Regards
Gaeatano Mendola




Re: logging stuff

From
Tom Lane
Date:
"Mendola Gaetano" <mendola@bigfoot.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> A variant (which'd be okay with me) is to separate these fields with
>> tabs instead of spaces; then the rule for DBAs would be "don't allow
>> tabs in db/user names".

> do you see:   ^I   that are "chars" that syslogd hate.

Ugh.  Okay, scratch that idea ...
        regards, tom lane


Re: logging stuff

From
"Andrew Dunstan"
Date:
Gathering some of the threads, here's what I see:

1. There is some demand for per database logging. Doing that is rather
beyond the scope of what I intended - in fact what I intended was to enable
sensible log splitting out of band. I'll even write a perl script to do it
for you if you like :-)

2. There is a suggestion of custom log formats, beyond the current on/off
flags used, similar to Apache's logging mechanism. Of course, there are some
problems - web logs cover one thing - http transactions. Even there, custom
logging has led people to grief, and led to the common logging format that
web log analysers usually use. And Pg logs cover various kinds of events.
Some don't have a sensible db name, for example (like server startup logs
messages). Writing code that would allow administrators to specify the log
format almost arbitrarily, or even just to order the fields arbitrarily,
would be quite a bit of work, and I suspect a constant source of irritating
bugs, and I have no interest in doing the work. There are far more valuable
things that need to be done.

3. There is debate about field delimiters in the logs. Currently, timestamp
when used is a fixed length field at the start of the record, and pid when
used is inside [ ]. This is complicated in the case of dbname by the facts
that a) dbnames can be almost anything and b) not all records will have a
(sensible) dbname to report. I chose [ ] originally to conform to what was
done with pid, but now I'm leaning towards something else - too many M$
people have used [ ] in db work for years, so it is likely to crop up in
names somehow. Maybe < > or | | - whatever seems least liketo to occur
inside a dbname. I want to logs to be readable as well as parseable, or I'd
use something like ^ or ~ or `. The other reason to move away from [ ] is
that Tom (quite reasonably) wants to keep the char count down, and that
would let me drop the "db:" prefix, so the fixed overhead would be 3 chars
per line instead of 6 (if you count the trailing space).

The bottom line for me is to get the info needed in the logs, and then make
them look pretty, split them, or whatever, out of band - that isn't (and
shouldn't, IMNSHO) be the server's job.

andrew