Thread: Add support for logging the current role

Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 10:23:53

Greetings,

Minor enhancement, but a valuable one imv.  Hopefully there aren't any
issues with it. :)
Thanks!
    Stephen

commit 3cb707aa9f228e629e7127625a76a223751a778b
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 09:17:31 2011 -0500
   Add support for logging the current role      This adds a '%o' option to the log_line_prefix GUC which will log the
current role.  The '%u' option only logs the Session user, which can   be misleading, but it's valuable to have both
options.

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 3508,3513 **** local0.*    /var/log/postgresql
--- 3508,3518 ----              <entry>yes</entry>             </row>             <row>
+              <entry><literal>%o</literal></entry>
+              <entry>Current role name</entry>
+              <entry>yes</entry>
+             </row>
+             <row>              <entry><literal>%d</literal></entry>              <entry>Database name</entry>
    <entry>yes</entry> 
*** a/src/backend/utils/error/elog.c
--- b/src/backend/utils/error/elog.c
***************
*** 1826,1831 **** log_line_prefix(StringInfo buf, ErrorData *edata)
--- 1826,1841 ----                     appendStringInfoString(buf, username);                 }                 break;
+             case 'o':
+                 if (MyProcPort)
+                 {
+                     const char *rolename = GetUserNameFromId(GetUserId());
+
+                     if (rolename == NULL || *rolename == '\0')
+                         rolename = _("[unknown]");
+                     appendStringInfoString(buf, rolename);
+                 }
+                 break;             case 'd':                 if (MyProcPort)                 {

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 11:06:20

On Wed, Jan 12, 2011 at 9:23 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Minor enhancement, but a valuable one imv.  Hopefully there aren't any
> issues with it. :)

1. Why %o?  That's not obviously mnemonic.  Perhaps %U?

2. It won't be clear to people reading this what the difference is
between %u and this.  You probably need to reword the documentation
for the existing option as well as documenting the new one.

3. Please attach the patch rather than including it inline, if possible.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 11:12:31

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jan 12, 2011 at 9:23 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > Minor enhancement, but a valuable one imv.  Hopefully there aren't any
> > issues with it. :)
>
> 1. Why %o?  That's not obviously mnemonic.  Perhaps %U?

r was taken? :)  I'm not sure I like %U, but in the end I don't *really*
care.  I'll update it to %U and wait for someone else to complain.

> 2. It won't be clear to people reading this what the difference is
> between %u and this.  You probably need to reword the documentation
> for the existing option as well as documenting the new one.

Fair enough.

> 3. Please attach the patch rather than including it inline, if possible.

Hrm, I could have sworn that Tom had asked for the exact opposite in the
past, but either way is fine by me.
Stephen

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 11:14:18

On Wed, Jan 12, 2011 at 10:12 AM, Stephen Frost <sfrost@snowman.net> wrote:
> r was taken? :)  I'm not sure I like %U, but in the end I don't *really*
> care.  I'll update it to %U and wait for someone else to complain.

The joys of community...

>> 3. Please attach the patch rather than including it inline, if possible.
>
> Hrm, I could have sworn that Tom had asked for the exact opposite in the
> past, but either way is fine by me.

Really?  I don't remember that, but it's certainly possible.  My
problem is that cutting and pasting from a browser window into a patch
file tends to be a little iffy.  If you paste too much or too little
or the whitespace doesn't come out quite right, the patch doesn't
apply.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 11:28:10

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> 1. Why %o?  That's not obviously mnemonic.  Perhaps %U?
>
> 2. It won't be clear to people reading this what the difference is
> between %u and this.  You probably need to reword the documentation
> for the existing option as well as documenting the new one.
>
> 3. Please attach the patch rather than including it inline, if possible.

Updated patch attached-

commit 7319e8ddc91d62addea25b85f7dbe2f95132cdc1
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 10:23:13 2011 -0500

    Use %U for role in log_line_prefix; improve docs

    Change the variable for logging the current role in log_line_prefix
    from %o to %U, to better reflect the 'user'-type mnemonic.
    Improve the documentation for the %U and %u log_line_prefix options
    to better differentiate them from each other.

commit 3cb707aa9f228e629e7127625a76a223751a778b
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 09:17:31 2011 -0500

    Add support for logging the current role

    This adds a '%o' option to the log_line_prefix GUC which will log the
    current role.  The '%u' option only logs the Session user, which can
    be misleading, but it's valuable to have both options.

    Thanks!

        Stephen

Attachment

Re: Add support for logging the current role

From

Tom Lane

Date:

12 January 2011, 11:33:17

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 12, 2011 at 10:12 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> Hrm, I could have sworn that Tom had asked for the exact opposite in the
>> past, but either way is fine by me.

> Really?  I don't remember that, but it's certainly possible.

I don't remember saying exactly that either.  The main point is to
ensure the patch doesn't get mangled in transmission.  I've seen people
screw it up both ways: inline is much more vulnerable to mailers
deciding to whack whitespace around, while attachments are vulnerable to
being encoded in all sorts of weird ways, some of which come out nicely
in the archives and some of which don't.  I'm not in favor of gzipping
small patches that could perfectly well be left in readable form.

This particular patch looks fine here:
http://archives.postgresql.org/pgsql-hackers/2011-01/msg00845.php
so I'm thinking Stephen doesn't need to revisit his technique.

+1 for choosing something more mnemonic than "%o", btw.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 11:43:15

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> +1 for choosing something more mnemonic than "%o", btw.

Alright, not to be *too* ridiculous about this, but I'm feeling like
'%R' might be better than '%U', if we don't mind overloading a single
letter based on case.  I've always been annoyed at the lack of
distinction between 'user' and 'role' in our docs and feel it does lead
to some confusion.

Updated patch attached, if people agree.  Compiles, passes regressions,
works as advertised, etc.

commit bba27fe63702405514ed2c3bb72b70cc178f9ce1
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 10:38:24 2011 -0500

    Change log_line_prefix for current role to %R

    As we're going for a mnemonic, and this is really about roles
    instead of users, change log_line_prefix argument to %R from
    %U for current_role.

    Thanks,

        Stephen

Attachment

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 11:46:26

On Wed, Jan 12, 2011 at 10:43 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> +1 for choosing something more mnemonic than "%o", btw.
>
> Alright, not to be *too* ridiculous about this, but I'm feeling like
> '%R' might be better than '%U', if we don't mind overloading a single
> letter based on case.  I've always been annoyed at the lack of
> distinction between 'user' and 'role' in our docs and feel it does lead
> to some confusion.
>
> Updated patch attached, if people agree.  Compiles, passes regressions,
> works as advertised, etc.

I was thinking that %u/%U would have the advantage of implying some
connection between the two things which is in fact present.  %r/%R
seems not quite as good to me.  Also, let's paint it tangerine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 12:00:38

* Robert Haas (robertmhaas@gmail.com) wrote:
> I was thinking that %u/%U would have the advantage of implying some
> connection between the two things which is in fact present.  %r/%R
> seems not quite as good to me.  Also, let's paint it tangerine.

I figured that's where you were going.

+1 for whatever the committer wants to commit. ;)
Stephen

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 12:41:48

On Wed, Jan 12, 2011 at 11:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> I was thinking that %u/%U would have the advantage of implying some
>> connection between the two things which is in fact present.  %r/%R
>> seems not quite as good to me.  Also, let's paint it tangerine.
>
> I figured that's where you were going.
>
> +1 for whatever the committer wants to commit. ;)

OK, done.  :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Tom Lane

Date:

12 January 2011, 12:53:40

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 12, 2011 at 11:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> +1 for whatever the committer wants to commit. ;)

> OK, done.  :-)

Uh, did you actually stop to *think* about this patch?

What you have just committed puts a syscache lookup into the elog output
path.  Quite aside from the likely performance hit, this will
malfunction badly in any case where we're trying to log from an aborted
transaction.

Please revert and rethink.
        regards, tom lane

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 12:59:24

On Wed, Jan 12, 2011 at 11:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jan 12, 2011 at 11:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
>>> +1 for whatever the committer wants to commit. ;)
>
>> OK, done.  :-)
>
> Uh, did you actually stop to *think* about this patch?

You have a valid point here, but this isn't the most tactful way of putting it.

> What you have just committed puts a syscache lookup into the elog output
> path.  Quite aside from the likely performance hit, this will
> malfunction badly in any case where we're trying to log from an aborted
> transaction.
>
> Please revert and rethink.

I think it's going to take more than a rethink - I don't see any way
to salvage it.  :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 13:13:56

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Uh, did you actually stop to *think* about this patch?

Actually, I was worried about exactly that, but I didn't see anything at
the top of elog.c that indicated if it'd be a problem or not (and the
Syscache lookup issue was *exactly* what I was looking for). :(  There
was much discussion about recursion and memory commit and whatnot, but
nothing about SysCache lookups.

> What you have just committed puts a syscache lookup into the elog output
> path.  Quite aside from the likely performance hit, this will
> malfunction badly in any case where we're trying to log from an aborted
> transaction.

I had been looking into storing the current role inside the Proc struct
or in some new variable and then pulling it from there (updating it when
needed during a SET ROLE, of course), but it seemed a bit of overkill if
it wasn't necessary (which wasn't obvious to me). We could also just log
the role's OID (%o anyone..?), since that doesn't need a syscache lookup
to get at.  I'd much rather log the role name if we can tho.

I had looked through some of the other calls happening in log_line_prefix
and didn't see any explicit syscache lookups but it seemed like we were
doing quite a few other things that might have issues, so I had assumed
it'd be alright.  Sorry about that.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 13:25:19

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> What you have just committed puts a syscache lookup into the elog output
> path.  Quite aside from the likely performance hit, this will
> malfunction badly in any case where we're trying to log from an aborted
> transaction.

Attached is my (admittedly horrible) attempt to add some comments to
elog.c regarding this issue.  Reviewing this, I'm not sure the
performance concern is really an issue (given that the user could choose
to enable it or not), but clearly the other issue is a concern.

Thanks,
Stephen

commit 4dcf23e007967892557b7b113a9229cb9fc4575d
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 12:22:16 2011 -0500
   Improve comments at the top of elog.c      Add in some comments about how certain usually available backend
systemsmay be unavailable or which won't function properly in   elog.c due to the current transaction being in a failed
state.

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 13:25:38

On Wed, Jan 12, 2011 at 12:13 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> What you have just committed puts a syscache lookup into the elog output
>> path.  Quite aside from the likely performance hit, this will
>> malfunction badly in any case where we're trying to log from an aborted
>> transaction.
>
> I had been looking into storing the current role inside the Proc struct
> or in some new variable and then pulling it from there (updating it when
> needed during a SET ROLE, of course), but it seemed a bit of overkill if
> it wasn't necessary (which wasn't obvious to me). We could also just log
> the role's OID (%o anyone..?), since that doesn't need a syscache lookup
> to get at.  I'd much rather log the role name if we can tho.

Logging the OID seems to be of questionable value.  I thought of the
update-the-variable-when-it-changes approach too, but besides being a
bit expensive if it's changing frequently, it's not necessarily safe
to do the syscache lookup there either - see the comments for
GetUserIdAndSecContext (which are really for SetUserIdAndSecContext,
but they're in an odd place).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 13:26:17

On Wed, Jan 12, 2011 at 12:25 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Attached is ...

I don't see an attachment, other than signature.asc.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 13:37:23

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jan 12, 2011 at 12:25 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > Attached is ...
>
> I don't see an attachment, other than signature.asc.

I suck, sorry about that, here it is..

See, inlining is better!  I wouldn't have forgotten it! ;)

    Stephen

Attachment

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 14:00:06

* Robert Haas (robertmhaas@gmail.com) wrote:
> Logging the OID seems to be of questionable value.

I certainly disagree about this, not being able to figure out what's
causing a 'permissions denied' error because you don't know which role
the log is coming from is *very* annoying.  Having to go look up the
role from the OID in the log is also annoying, but less so, imv. :)

> I thought of the
> update-the-variable-when-it-changes approach too, but besides being a
> bit expensive if it's changing frequently, it's not necessarily safe
> to do the syscache lookup there either - see the comments for
> GetUserIdAndSecContext (which are really for SetUserIdAndSecContext,
> but they're in an odd place).

Alright, here's a patch which adds the ability to log the current role's
OID and which calls GetUserIdAndSecContext() directly and handles the
possibility that CurrentUserId isn't valid.  Perhaps we should just grab
CurrentUserId directly rather than going through
GetUserIdAndSecContext()?  I could certainly do that instead.

Also includes those additional comments in elog.c.

    Thanks,

        Stephen

commit d9a7acd5ea1f5214b44875b6d257c5c59590167c
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 12:53:50 2011 -0500

    Use GetUserIdAndSecContext to get role OID in elog

    We can't be sure that GetUserId() will be called when current_user
    is a valid Oid, per the comments in GetUserIdAndSecContext, when
    called from elog.c, so instead call GetUserIdAndSecContext directly
    and handle the possible invalid Oid ourselves.

commit 605497b062298ea195d8999f8cefca10968ae22f
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 12:29:44 2011 -0500

    Change to logging role's OID instead of name

    Remove the SysCache lookup from elog.c/log_line_prefix by logging
    the role's OID instead, this addresses a concern where a SysCache
    lookup could malfunction badly due to logging from a failed
    transaction.  Note that using SysCache from the elog routines could
    also be a performance hit, though this would only be the case if a
    user chose to enable that logging.

Attachment

Re: Add support for logging the current role

From

Robert Haas

Date:

12 January 2011, 15:03:23

On Wed, Jan 12, 2011 at 12:59 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Logging the OID seems to be of questionable value.
>
> I certainly disagree about this, not being able to figure out what's
> causing a 'permissions denied' error because you don't know which role
> the log is coming from is *very* annoying.

Interesting.  I wonder if we shouldn't try to fix this by including
the relevant role name in the error message.  Or is that just going to
be too messy to live?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 20:59:02

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jan 12, 2011 at 12:59 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > I certainly disagree about this, not being able to figure out what's
> > causing a 'permissions denied' error because you don't know which role
> > the log is coming from is *very* annoying.
>
> Interesting.  I wonder if we shouldn't try to fix this by including
> the relevant role name in the error message.  Or is that just going to
> be too messy to live?

It might be possible to do and answer that specific question- but what
about the obvious next question: which role was this command run with?
iow, if I log dml, how do I know what the role was when the dml
statement was run?  ie- why was this command allowed?

Let's ask another question- why do we provide a %u option in
log_line_prefix instead of just logging it as part of each statement?
When you have roles that aren't 'inherit' and have a lot of 'set role's
happening, you end up asking the same questions about role that you
would about user.

As a side-note, CurrentUserId isn't actually exported (I'm not suprised,
tbh, but I've actually checked now), so you have to go through
GetUserIdAndSecContext().
Thanks,        Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

12 January 2011, 21:16:29

Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Interesting.  I wonder if we shouldn't try to fix this by including
>> the relevant role name in the error message.  Or is that just going to
>> be too messy to live?

> It might be possible to do and answer that specific question- but what
> about the obvious next question: which role was this command run with?
> iow, if I log dml, how do I know what the role was when the dml
> statement was run?  ie- why was this command allowed?

I'm less than excited about that argument because it's after the fact
--- if you needed to know the information, you probably didn't have
log_line_prefix set correctly, even assuming you had adequate logging
otherwise.  And logging an OID just seems too ugly to live.

Another little problem with the quick and dirty solution is that stuff
that's important enough to warrant a log_line_prefix escape is generally
thought to be important enough to warrant inclusion in CSV logs.  That
would imply adding a column and taking the resultant compatibility hit.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 21:22:01

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > It might be possible to do and answer that specific question- but what
> > about the obvious next question: which role was this command run with?
> > iow, if I log dml, how do I know what the role was when the dml
> > statement was run?  ie- why was this command allowed?
>
> I'm less than excited about that argument because it's after the fact
> --- if you needed to know the information, you probably didn't have
> log_line_prefix set correctly, even assuming you had adequate logging
> otherwise.  And logging an OID just seems too ugly to live.

Erm, really?  Ok, fine, maybe you didn't have log_line_prefix set
correctly the first time you needed the information, but after you
discover that you *don't know*, you're going to be looking for an option
to let you get that information for the future.  I would also suggest
that more experienced admins are going to have a default log_line_prefix
that they install on new systems they set up (I know I do...), and I'd
be suprised if knowing the role that a command is actually run as wasn't
popular among that set.

I don't like logging an OID either.

> Another little problem with the quick and dirty solution is that stuff
> that's important enough to warrant a log_line_prefix escape is generally
> thought to be important enough to warrant inclusion in CSV logs.  That
> would imply adding a column and taking the resultant compatibility hit.

I'd be more than happy to add support for this to the CSV logs.  I agree
that it'd make sense to do.  I think we need to solve the bigger problem
of OID vs. rolename vs. lookups from elog first though.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 21:26:13

* Stephen Frost (sfrost@snowman.net) wrote:
> Erm, really?  Ok, fine, maybe you didn't have log_line_prefix set
> correctly the first time you needed the information, but after you
> discover that you *don't know*, you're going to be looking for an option
> to let you get that information for the future.

Oh, yeah, and honestly, the above is the reason I'm after this myself- I
was having a difficult time figuring out which of 300-odd users a given
error was happening for and was annoyed that I couldn't figure out what
role it was.  The web server logs in with a 'general' role that doesn't
inherit anything and then has to 'set role' to whatever role the user
authenticated with.  After that 'set role' happens, we've got no way to
know from the logs who is impacted by a given error.

Guess I'm just trying to say that I didn't write this patch as an
academic exercise but rather because it solves a real world problem for
me.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

12 January 2011, 21:29:56

Stephen Frost <sfrost@snowman.net> writes:
> Guess I'm just trying to say that I didn't write this patch as an
> academic exercise but rather because it solves a real world problem for
> me.

I understand.  But doing this right is going to take more than ten lines
of code, and more than a negligible performance penalty.  We have to
consider whether it's worth it.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

12 January 2011, 21:45:30

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I understand.  But doing this right is going to take more than ten lines
> of code, and more than a negligible performance penalty.  We have to
> consider whether it's worth it.

It'd be ideal if the performance hit could only be felt by people who
want to enable the option.  On the flip side, I don't know that adding a
bit of extra work to SET ROLE would be that bad.  If it helps (and I
don't know if it does, I'm still trying to wrap my head around
GetUserIdAndSecContext/SetUserIdAndSecContext), I'd be fine with *not*
trying to log the right role when inside Security Definter functions
(after all, if those are getting called, the user could go look at the
function definition to see which role it's being run as).

I gather one issue is how we can pick up what the correct role name is
when resetting the role due to a failed transaction..?  Building a stack
with all the role names pre-cached to deal with that wouldn't be likely
to work and we'd need more than one level to deal with savepoints, I
assume?  We could reset it to an Invalid name on abort and then detect
that it needs to be corrected at the start of the next transaction,
perhaps?
Thanks,
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

12 January 2011, 22:00:03

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> I understand.  But doing this right is going to take more than ten lines
>> of code, and more than a negligible performance penalty.  We have to
>> consider whether it's worth it.

> It'd be ideal if the performance hit could only be felt by people who
> want to enable the option.  On the flip side, I don't know that adding a
> bit of extra work to SET ROLE would be that bad.  If it helps (and I
> don't know if it does, I'm still trying to wrap my head around
> GetUserIdAndSecContext/SetUserIdAndSecContext), I'd be fine with *not*
> trying to log the right role when inside Security Definter functions
> (after all, if those are getting called, the user could go look at the
> function definition to see which role it's being run as).

> I gather one issue is how we can pick up what the correct role name is
> when resetting the role due to a failed transaction..?  Building a stack
> with all the role names pre-cached to deal with that wouldn't be likely
> to work and we'd need more than one level to deal with savepoints, I
> assume?  We could reset it to an Invalid name on abort and then detect
> that it needs to be corrected at the start of the next transaction,
> perhaps?

I seem to recall that the assign hook for role stores the string form of
the role name anyway.  So in principle you could arrange for that to get
dumped someplace where elog.c could look at it (think about just adding
a string parameter to SetUserIdAndSecContext).  It wouldn't track the
effects of RENAME ROLE against an actively-used role, but then again
neither does %u.

I'm not actually concerned about adding a few extra cycles to SET ROLE
for this.  What bothered me more was the cost of adding another output
column to CSV log mode.  That's not something you're going to be able to
finesse such that only people who care pay the cost.
        regards, tom lane

Re: Add support for logging the current role

From

Dimitri Fontaine

Date:

14 January 2011, 11:58:12

Tom Lane <tgl@sss.pgh.pa.us> writes:
> Another little problem with the quick and dirty solution is that stuff
> that's important enough to warrant a log_line_prefix escape is generally
> thought to be important enough to warrant inclusion in CSV logs.  That
> would imply adding a column and taking the resultant compatibility hit.

Well if we're down to adding columns to the CSV format, what about
adding an explicit column where to output the query duration as an
interval literal, rather than putting it in the query string (IIRC)?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

14 January 2011, 12:36:49

On 01/12/2011 08:59 PM, Tom Lane wrote:
>
> I'm not actually concerned about adding a few extra cycles to SET ROLE
> for this.  What bothered me more was the cost of adding another output
> column to CSV log mode.  That's not something you're going to be able to
> finesse such that only people who care pay the cost.
>
>             

I think it's time to revisit the design of CSV logs again, now we have 
two or three releases worth of experience with it. It needs some 
flexibility and refinement.

cheers

andrew

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 12:40:34

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I seem to recall that the assign hook for role stores the string form of
> the role name anyway.

Indeed it does, and it's already exposed through show_role() since it's
needed in guc.c.  Based on my review and understanding of the comments
and calls, it also doesn't do anything particularly complicated or any
syscache searches or anything.

> It wouldn't track the
> effects of RENAME ROLE against an actively-used role, but then again
> neither does %u.

Right, I didn't specifically point that out in the documentation
changes, but I can if people feel it's neceessary.

> What bothered me more was the cost of adding another output
> column to CSV log mode.  That's not something you're going to be able to
> finesse such that only people who care pay the cost.

I definitely feel this is something that we should be logging in the CSV
also, and you're right, there doesn't appear to be a way to do that
without just outright changing the format and causing people to have to
update anything/everything that uses it.  I have a hard time with the
idea that we'll commit to never changing that format though, so do we
want to provide a way for users to specify the format (ala
log_line_prefix), or just ask users to expect and deal with format
changes when they happen..?

I noticed Dimitri would like another change to the CSV log format (which
looked reasonable to me, asking to have something split out from the
query string itself), it'd certainly be better to change both in the
same release than split them across two (of course, we might come up
with something else in the future...).

I have to admit to being a bit suprised the CSV logging wasn't
implemented with a 'format' type option.  I'm not sure if I have the
cycles or even if we would want to try and add that now, but it
strikes me as something we should probably do.

Updated patch attached, included new comments for elog.c too.

    Thanks,

        Stephen

commit 4e27ab79ef9b0d0c3c9824d672e06160dd227cc2
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 12 12:22:16 2011 -0500

    Improve comments at the top of elog.c

    Add in some comments about how certain usually available backend
    systems may be unavailable or which won't function properly in
    elog.c due to the current transaction being in a failed state.

commit d3ca4063ba8e16930278947c32c336b5b80cdaba
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Jan 14 11:19:45 2011 -0500

    Add %U option to log_line_prefix for current role

    This adds a new option to log_line_prefix (%U) to allow the current
    role to be logged, which is valuable information when an application
    or user is using SET ROLE and roles which are set 'noinherit'.

    This also changes the current definition of %u to be 'Session user',
    to avoid confusion when a superuser uses 'SET SESSION AUTHORIZATION'.
    Otherwise, a log might read 'login_user none' but actually be running
    as a different user due to SET SESSION AUTHORIZATION.  The 'username'
    field for CSV logging was also updated to be 'Session user'.  Note:
    SET SESSION AUTHORIZATION is only allowed for superusers, and the
    logged username will only change if SET SESSION AUTHORIZATION is
    called, so this is unlikely to have signifigant user impact.

    Last, but certainly not least, role_name was added as a new column to
    the CSV log output and corresponding example CSV table definition.
    This is a user-visible change which should be called out in the release
    notes.

Attachment

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 12:44:10

Andrew Dunstan <andrew@dunslane.net> writes:
> On 01/12/2011 08:59 PM, Tom Lane wrote:
>> I'm not actually concerned about adding a few extra cycles to SET ROLE
>> for this.  What bothered me more was the cost of adding another output
>> column to CSV log mode.  That's not something you're going to be able to
>> finesse such that only people who care pay the cost.

> I think it's time to revisit the design of CSV logs again, now we have 
> two or three releases worth of experience with it. It needs some 
> flexibility and refinement.

It would definitely be nice to support optional columns a little better.
I'm not even sure whether the runtime overhead is worth worrying about
(maybe it is, maybe it isn't, I have no data).  But I do know that
adding a column to the CSV output format spec causes a flag day for
users.  How can we avoid that?
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 12:48:52

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > I think it's time to revisit the design of CSV logs again, now we have
> > two or three releases worth of experience with it. It needs some
> > flexibility and refinement.
>
> It would definitely be nice to support optional columns a little better.
> I'm not even sure whether the runtime overhead is worth worrying about
> (maybe it is, maybe it isn't, I have no data).  But I do know that
> adding a column to the CSV output format spec causes a flag day for
> users.  How can we avoid that?

My first thought would be to have a 'log_csv_format' GUC that's very
similar to 'log_line_prefix' (and uses the same variables if
possible..).  We could then ship a default in postgresql.conf that
matches what the current format is while adding the other options if
people want to use them.

If we could have all the processing to generate that line go through the
same function for log_line_prefix and log_csv_format, that'd be even
better.  Makes me tempted to throw out the current notion of
'log_line_*prefix*' and replace it with 'log_line_*format*' to match
exactly the 'log_csv_format' that I'm proposing.  That'd undoubtably
cause more user headaches tho... :(
Thanks,
    Stephen

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

14 January 2011, 17:56:31


On 01/14/2011 11:48 AM, Stephen Frost wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Andrew Dunstan<andrew@dunslane.net>  writes:
>>> I think it's time to revisit the design of CSV logs again, now we have
>>> two or three releases worth of experience with it. It needs some
>>> flexibility and refinement.
>> It would definitely be nice to support optional columns a little better.
>> I'm not even sure whether the runtime overhead is worth worrying about
>> (maybe it is, maybe it isn't, I have no data).  But I do know that
>> adding a column to the CSV output format spec causes a flag day for
>> users.  How can we avoid that?
> My first thought would be to have a 'log_csv_format' GUC that's very
> similar to 'log_line_prefix' (and uses the same variables if
> possible..).  We could then ship a default in postgresql.conf that
> matches what the current format is while adding the other options if
> people want to use them.
>
> If we could have all the processing to generate that line go through the
> same function for log_line_prefix and log_csv_format, that'd be even
> better.  Makes me tempted to throw out the current notion of
> 'log_line_*prefix*' and replace it with 'log_line_*format*' to match
> exactly the 'log_csv_format' that I'm proposing.  That'd undoubtably
> cause more user headaches tho... :(
>
>     


I'm not sure I really want to make it that flexible :-)

To deal with the issue Tom's referring to, I think it would be 
sufficient if we just allowed users to suppress production of certain 
columns (as long as we never do anything so evil as to add a new column 
in the middle).

There are some other issues with the format. I know Josh has bitched 
about the presence of command tags in certain fields, for example.

cheers

andrew

Re: Add support for logging the current role

From

Aidan Van Dyk

Date:

14 January 2011, 18:04:52

On Fri, Jan 14, 2011 at 4:56 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> I'm not sure I really want to make it that flexible :-)
>
> To deal with the issue Tom's referring to, I think it would be sufficient if
> we just allowed users to suppress production of certain columns (as long as
> we never do anything so evil as to add a new column in the middle).
>
> There are some other issues with the format. I know Josh has bitched about
> the presence of command tags in certain fields, for example.

If there is going to be any change, how about using fixed columns (an
possibly allowing them to be empty for stuff that's expensive to
create/write), but adding a 1st column that contains a "version"
identifyer.  And to make it easy, maybe the PG major version as the
version value.

If the 1st column is always the version,  tools can easily know if
they understand all the columns (and what order they are in) and it'
easy to write a "conversion" that strips/re-aranges columns from a
newer CVS dump to match an older one if you have tools that don't know
about newer column layouts..

Personally, I'm not worried about the CSV logs being backwards
compatible as long as there's a very easy way to know what I might be
looking at, so conversion is easy...

But then again, I don't have multiple gigabytes of logs to process either.

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 18:08:17

Andrew Dunstan <andrew@dunslane.net> writes:
> On 01/14/2011 11:48 AM, Stephen Frost wrote:
>> My first thought would be to have a 'log_csv_format' GUC that's very
>> similar to 'log_line_prefix' (and uses the same variables if
>> possible..).  We could then ship a default in postgresql.conf that
>> matches what the current format is while adding the other options if
>> people want to use them.

> I'm not sure I really want to make it that flexible :-)

It actually sounded like a pretty good idea to me.  The current CSV
format is already overly bulky/verbose, because it includes absolutely
everything anybody ever wanted before now.  Allowing people to select
what they actually need, and thereby get rid of some of the overhead
they're currently paying, would be a good thing.
        regards, tom lane

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

14 January 2011, 18:10:47


On 01/14/2011 05:04 PM, Aidan Van Dyk wrote:
> If there is going to be any change, how about using fixed columns (an
> possibly allowing them to be empty for stuff that's expensive to
> create/write), but adding a 1st column that contains a "version"
> identifyer.  And to make it easy, maybe the PG major version as the
> version value.
>
> If the 1st column is always the version,  tools can easily know if
> they understand all the columns (and what order they are in) and it'
> easy to write a "conversion" that strips/re-aranges columns from a
> newer CVS dump to match an older one if you have tools that don't know
> about newer column layouts..
>
>


The whole point of having CSV logs is so you can load them into a 
database table without needing preprocessing tools. So I'm not going to 
be very receptive to changes that are predicated on using such tools.

cheers

andrew

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 18:44:35

Aidan Van Dyk <aidan@highrise.ca> writes:
> If there is going to be any change, how about using fixed columns (an
> possibly allowing them to be empty for stuff that's expensive to
> create/write), but adding a 1st column that contains a "version"
> identifyer.  And to make it easy, maybe the PG major version as the
> version value.

Seems like that just adds even more overhead, without really solving any
of the problems we're concerned about.  Code consuming the CSV log would
still need a-priori knowledge of what columns to expect.
        regards, tom lane

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

14 January 2011, 19:48:27

On 01/14/2011 05:08 PM, Tom Lane wrote:
> Andrew Dunstan<andrew@dunslane.net>  writes:
>> On 01/14/2011 11:48 AM, Stephen Frost wrote:
>>> My first thought would be to have a 'log_csv_format' GUC that's very
>>> similar to 'log_line_prefix' (and uses the same variables if
>>> possible..).  We could then ship a default in postgresql.conf that
>>> matches what the current format is while adding the other options if
>>> people want to use them.
>> I'm not sure I really want to make it that flexible :-)
> It actually sounded like a pretty good idea to me.  The current CSV
> format is already overly bulky/verbose, because it includes absolutely
> everything anybody ever wanted before now.  Allowing people to select
> what they actually need, and thereby get rid of some of the overhead
> they're currently paying, would be a good thing.
>
>             

If you have a format string, what do you want to do with the bits of the 
format that aren't field references? What about delimiters? A format 
string makes it too easy to muck up and too hard to get right, IMNSHO. 
History has shown how easy it is to muck up CSVs.  The suggestion I made 
of allowing people to suppress production of certain columns would take 
care of the bulk problem much more safely, I think. We've actually had 
remarkably few issues with CSV logs not being loadable, that I know of 
anyway. When we implemented it, I expected many more issues with it than 
we've had. I'd like to keep it that way.

cheers

andrew

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 21:00:14

Andrew Dunstan <andrew@dunslane.net> writes:
> On 01/14/2011 05:08 PM, Tom Lane wrote:
>> It actually sounded like a pretty good idea to me.

> If you have a format string, what do you want to do with the bits of the 
> format that aren't field references?

I was thinking of it as being strictly a field list.  I don't know
whether it's really practical to borrow log_line_prefix's one-character
names for the fields (for one thing, there would need to be names for
all the existing CSV columns, not all of which equate to log_line_prefix
escapes); but in any case anything other than field references would be
disallowed.  If you prefer to use a name list as the syntax that's fine
with me.
        regards, tom lane

Re: Add support for logging the current role

From

Robert Haas

Date:

14 January 2011, 21:41:22

On Fri, Jan 14, 2011 at 8:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 01/14/2011 05:08 PM, Tom Lane wrote:
>>> It actually sounded like a pretty good idea to me.
>
>> If you have a format string, what do you want to do with the bits of the
>> format that aren't field references?
>
> I was thinking of it as being strictly a field list.  I don't know
> whether it's really practical to borrow log_line_prefix's one-character
> names for the fields (for one thing, there would need to be names for
> all the existing CSV columns, not all of which equate to log_line_prefix
> escapes); but in any case anything other than field references would be
> disallowed.  If you prefer to use a name list as the syntax that's fine
> with me.

I think we're in the process of designing a manned mission to Mars to
solve the problem that our shoelaces are untied.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 21:44:47

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > If you have a format string, what do you want to do with the bits of the
> > format that aren't field references?
>
> I was thinking of it as being strictly a field list.  I don't know
> whether it's really practical to borrow log_line_prefix's one-character
> names for the fields (for one thing, there would need to be names for
> all the existing CSV columns, not all of which equate to log_line_prefix
> escapes);

I'm not really happy about the idea that you can only get certain
information in a log file if you use CSV format.  I also don't know
that there's really any particular reason log_line_prefix's names
have to be one character.

> but in any case anything other than field references would be
> disallowed.  If you prefer to use a name list as the syntax that's fine
> with me.

I do like the idea of having just a field list though, to keep things
simple for the CSV users, and we could also pre-process the list into
flag variables or a bitmask or similar to be able to quickly check if a
certain field should be included or not.  I'm not really keen about how
log_line_prefix currently parses the direct user-provided syntax every
time; strikes me as inefficient.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 22:01:06

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> I was thinking of it as being strictly a field list.  I don't know
>> whether it's really practical to borrow log_line_prefix's one-character
>> names for the fields (for one thing, there would need to be names for
>> all the existing CSV columns, not all of which equate to log_line_prefix
>> escapes);

> I'm not really happy about the idea that you can only get certain
> information in a log file if you use CSV format.

I said no such thing!  The point here is that there is a great deal of
stuff in the textual log format that is not governed by log_line_prefix,
so log_line_prefix provides no precedent for naming it: the error level,
the SQLSTATE, the primary message, the detail, the hint, etc, all come
out without any connection to log_line_prefix.  In CSV all of those
already exist as columns.  If we want users to be able to control which
CSV columns get emitted, they'll need to be able to name those columns,
and log_line_prefix doesn't provide any precedent for that.

> I also don't know
> that there's really any particular reason log_line_prefix's names
> have to be one character.

Well, it's pretty much conventional for %-escapes to be that way,
though I agree we're kind of straining the system already.

> I do like the idea of having just a field list though, to keep things
> simple for the CSV users, and we could also pre-process the list into
> flag variables or a bitmask or similar to be able to quickly check if a
> certain field should be included or not.  I'm not really keen about how
> log_line_prefix currently parses the direct user-provided syntax every
> time; strikes me as inefficient.

For log_line_prefix I'm not sure you could do a lot better.  I agree
that a field name list for CSV would have to be preprocessed somehow for
efficiency.
        regards, tom lane

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

14 January 2011, 22:24:47


On 01/14/2011 08:41 PM, Robert Haas wrote:
>
> I think we're in the process of designing a manned mission to Mars to
> solve the problem that our shoelaces are untied.
>

What's your suggestion, then?

cheers

andre

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 22:28:57

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jan 14, 2011 at 8:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I was thinking of it as being strictly a field list.

> I think we're in the process of designing a manned mission to Mars to
> solve the problem that our shoelaces are untied.

How so?  ISTM the problems at hand are (a) we can't add a new CSV column
without causing a flag day for users who may not even care about the
information, and (b) we're worried that emitting all these columns may
result in a performance hit, again for information that a particular
user may not need.  A user-settable column list seems pretty on-target
for solving those problems to me.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 22:30:55

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> A user-settable column list seems pretty on-target
> for solving those problems to me.

I'm looking into implementing this.

An interesting initial question is- should the users be able to control
the *order* of the columns?  My gut feeling, if we're giving them a GUC
that's a list of fields, is 'yes', but I'm happy to listen to other
thoughts.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 22:51:54

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> A user-settable column list seems pretty on-target
>> for solving those problems to me.

> I'm looking into implementing this.

> An interesting initial question is- should the users be able to control
> the *order* of the columns?  My gut feeling, if we're giving them a GUC
> that's a list of fields, is 'yes', but I'm happy to listen to other
> thoughts.

Yeah, I was just thinking about that in connection with the suggestion
of using a bitmap as the pre-parsed representation (which would more or
less force adoption of the fixed-column-order approach).  I really think
we can't get away with that.  Remember what Andrew pointed out upthread:
it's important to be able to load the csvlog output directly into a
table without any extra processing.  Suppose a DBA is logging columns
A,B,D and he later realizes that logging C would be a good thing too.
He's going to have to ALTER TABLE ADD COLUMN to add C to his logging
table ... and now it's at the end.  This is no problem if he can set the
GUC to be "A,B,D,C" and have the field order be honored.  Otherwise he's
got a problem.

In any case, if the GUC representation is a list of field names, I think
the POLA demands that the system honor the list order.  You could escape
that expectation by controlling the feature with a pile of booleans
(csv_log_pid = on, csv_log_timestamp = off, etc) but I can't say that
that sort of API appeals to me.

BTW, in case you didn't know, there are some GUCs defined as lists of
identifiers already (look for GUC_LIST bits).  Be sure to steal code.
        regards, tom lane

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

14 January 2011, 22:59:49


On 01/14/2011 09:51 PM, Tom Lane wrote:
> Stephen Frost<sfrost@snowman.net>  writes:
>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>> A user-settable column list seems pretty on-target
>>> for solving those problems to me.
>> I'm looking into implementing this.
>> An interesting initial question is- should the users be able to control
>> the *order* of the columns?  My gut feeling, if we're giving them a GUC
>> that's a list of fields, is 'yes', but I'm happy to listen to other
>> thoughts.
> Yeah, I was just thinking about that in connection with the suggestion
> of using a bitmap as the pre-parsed representation (which would more or
> less force adoption of the fixed-column-order approach).  I really think
> we can't get away with that.  Remember what Andrew pointed out upthread:
> it's important to be able to load the csvlog output directly into a
> table without any extra processing.  Suppose a DBA is logging columns
> A,B,D and he later realizes that logging C would be a good thing too.
> He's going to have to ALTER TABLE ADD COLUMN to add C to his logging
> table ... and now it's at the end.  This is no problem if he can set the
> GUC to be "A,B,D,C" and have the field order be honored.  Otherwise he's
> got a problem.

Ok, you sold me. Until I read this I was inclined to say not, on KISS 
principles.


The only thing I'd suggest extra is that we might allow "version_n_m" as 
shorthand for the default table layout from the relevant version.

cheers

andrew

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 23:05:39

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> In any case, if the GUC representation is a list of field names, I think
> the POLA demands that the system honor the list order.

Agreed.  That puts us back into the question of how to make it
efficient.  My best thought at the moment, which doesn't strike me as
particularly efficient, is to build an array of the columns as enum's
and then have loop through the array and use a switch() on the enum.  At
least it's all integer-based there then and we're not calling strcmp()
for every field or strchr to find the next field, but couldn't we do
better?

> BTW, in case you didn't know, there are some GUCs defined as lists of
> identifiers already (look for GUC_LIST bits).  Be sure to steal code.

No, I didn't..  Excellent.
Thanks!
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 23:07:53

* Andrew Dunstan (andrew@dunslane.net) wrote:
> The only thing I'd suggest extra is that we might allow
> "version_n_m" as shorthand for the default table layout from the
> relevant version.

I like that idea, makes the default a lot simpler to deal with too. :)
Thanks!
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 23:09:16

Andrew Dunstan <andrew@dunslane.net> writes:
> The only thing I'd suggest extra is that we might allow "version_n_m" as 
> shorthand for the default table layout from the relevant version.

Mmm ... seems like that just complicates matters.  To make that useful,
you have to assume that there *is* a default table layout, it's
different across versions, and everything that looks at this GUC value
will know instantly what it is for each version.  The last bit is kind
of a killer for tools like pgfouine, no?  In any case I thought the
expectation here was that the default column list would be frozen at
what it is now, and probably will never change.
        regards, tom lane

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 23:16:08

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> In any case, if the GUC representation is a list of field names, I think
>> the POLA demands that the system honor the list order.  

> Agreed.  That puts us back into the question of how to make it
> efficient.  My best thought at the moment, which doesn't strike me as
> particularly efficient, is to build an array of the columns as enum's
> and then have loop through the array and use a switch() on the enum.

Yeah, an array or list of integer codes was what I was thinking too.

> At least it's all integer-based there then and we're not calling
> strcmp() for every field or strchr to find the next field, but
> couldn't we do better?

I really doubt that the cycles spent in the loop + switch are going to
amount to anything at all, compared to the cycles involved in formatting
each field and then pushing it through the CSV logic.  Not to mention
the I/O costs of sending the string somewhere afterwards.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 23:20:48

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> everything that looks at this GUC value
> will know instantly what it is for each version.
> The last bit is kind of a killer for tools like pgfouine, no?

Ugh..  Could we just accept it as input but return the full list if
asked for it>

> In any case I thought the
> expectation here was that the default column list would be frozen at
> what it is now, and probably will never change.

This I don't like..  When I install a new version fresh, I like to get
all of the "bells & whistles" that go along with it, which, in my view,
would include new fields that the smart PG folks have decided might be
useful for me.  I'd like to provide a way for users who are upgrading to
be able to get the old behavior back, to minimize the trouble for them,
and being able to say "just change the version_9_1 to version_9_0 in
your log_csv_fields GUC" is a heck of a lot better than "well, rip out
the default and replace it with this huge list of fields".

I'd been puzzling over how to deal with this big list of fields in
postgresql.conf and I do like the idea of some kind of short-cut being
provided to ease the pain for users.  What about something other than
version_x_y?  I could maybe see having a 'default' and an 'all'
instead..  Then have the default be what we have currently and 'all' be
the full list I'm thinking about.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 January 2011, 23:25:43

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Yeah, an array or list of integer codes was what I was thinking too.

Hm, yeah, a list of integer codes might be even better/simpler.

Okay, next user-interface question- thoughts on handling SIGHUP?  My
first reaction is that we should create a new log file on SIGHUP (if we
don't already, havn't checked), or maybe just on SIGHUP if this variable
changes.

Point being, until we get Andrew's jagged-csv-import magic committed to
core, we won't be able to import a log file that a user has changed the
field list for mid-stream (following the add-a-new-column use-case we've
been discussing).
Thanks,
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 23:34:49

Stephen Frost <sfrost@snowman.net> writes:
> I'd been puzzling over how to deal with this big list of fields in
> postgresql.conf and I do like the idea of some kind of short-cut being
> provided to ease the pain for users.

Yeah, I agree with the worry that a default value that's a mile long
is going to be a bit of a PITA.  But I don't think we're there yet on
having a better solution.

> What about something other than
> version_x_y?  I could maybe see having a 'default' and an 'all'
> instead..  Then have the default be what we have currently and 'all' be
> the full list I'm thinking about.

If "default" always means what it means today, I can live with that.
But if the meaning of "all" changes from version to version, that seems
like a royal mess.  Again, I'm concerned that an external tool like
pgfouine be able to make sense of the value without too much context.
If it doesn't know what some of the columns are, it can just ignore them
--- but a magic summary identifier that it doesn't understand at all is
a problem.
        regards, tom lane

Re: Add support for logging the current role

From

Tom Lane

Date:

14 January 2011, 23:43:42

Stephen Frost <sfrost@snowman.net> writes:
> Okay, next user-interface question- thoughts on handling SIGHUP?  My
> first reaction is that we should create a new log file on SIGHUP (if we
> don't already, havn't checked), or maybe just on SIGHUP if this variable
> changes.

> Point being, until we get Andrew's jagged-csv-import magic committed to
> core, we won't be able to import a log file that a user has changed the
> field list for mid-stream (following the add-a-new-column use-case we've
> been discussing).

Now I think you're reaching the mission-to-mars stage that Robert was
complaining about.  Solving that sort of problem is well outside the
scope of this patch.  I don't care if people have to shut down and
restart their servers in order to make a change to the log format.
Even if I did, the other patch sounds like a better approach.
        regards, tom lane

Re: Add support for logging the current role

From

Robert Haas

Date:

15 January 2011, 08:14:45

On Fri, Jan 14, 2011 at 9:24 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 01/14/2011 08:41 PM, Robert Haas wrote:
>> I think we're in the process of designing a manned mission to Mars to
>> solve the problem that our shoelaces are untied.
> What's your suggestion, then?

If there's a practical way to add the requested escape, add it to the
text format and leave reengineering the CSV format for another day.
Yeah, I know that's not the most beautiful solution in the world, but
we're doing engineering here, not theology.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Tom Lane

Date:

15 January 2011, 12:08:59

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jan 14, 2011 at 9:24 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> What's your suggestion, then?

> If there's a practical way to add the requested escape, add it to the
> text format and leave reengineering the CSV format for another day.
> Yeah, I know that's not the most beautiful solution in the world, but
> we're doing engineering here, not theology.

Well, the original patch was exactly that.  But I don't agree with that
approach; I think allowing the capabilities of text and CSV logs to
diverge significantly would be a mistake.  If a piece of information is
valuable enough to need a way to include it in textual log entries,
then you need a way to include it in CSV log entries too.  If it's not
valuable enough to do the work to support it in CSV, then we can live
without it.
        regards, tom lane

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

15 January 2011, 12:19:13


On 01/15/2011 11:08 AM, Tom Lane wrote:
> Robert Haas<robertmhaas@gmail.com>  writes:
>> On Fri, Jan 14, 2011 at 9:24 PM, Andrew Dunstan<andrew@dunslane.net>  wrote:
>>> What's your suggestion, then?
>> If there's a practical way to add the requested escape, add it to the
>> text format and leave reengineering the CSV format for another day.
>> Yeah, I know that's not the most beautiful solution in the world, but
>> we're doing engineering here, not theology.
> Well, the original patch was exactly that.  But I don't agree with that
> approach; I think allowing the capabilities of text and CSV logs to
> diverge significantly would be a mistake.  If a piece of information is
> valuable enough to need a way to include it in textual log entries,
> then you need a way to include it in CSV log entries too.  If it's not
> valuable enough to do the work to support it in CSV, then we can live
> without it.
>
>             

Yeah, I agree, that's exactly the kind of divergence we usually try to 
avoid. And it's hardly theology to say let's not do a half-assed job on 
this.

cheers

andrew

Re: Add support for logging the current role

From

Alvaro Herrera

Date:

17 January 2011, 12:44:49

Excerpts from Tom Lane's message of sáb ene 15 00:34:40 -0300 2011:
> Stephen Frost <sfrost@snowman.net> writes:

> > What about something other than
> > version_x_y?  I could maybe see having a 'default' and an 'all'
> > instead..  Then have the default be what we have currently and 'all' be
> > the full list I'm thinking about.
> 
> If "default" always means what it means today, I can live with that.
> But if the meaning of "all" changes from version to version, that seems
> like a royal mess.  Again, I'm concerned that an external tool like
> pgfouine be able to make sense of the value without too much context.
> If it doesn't know what some of the columns are, it can just ignore them
> --- but a magic summary identifier that it doesn't understand at all is
> a problem.

Maybe if we offered a way for the utility to find out the field list
from the magic identifier, it would be enough.

(It would be neat to have magic identifiers for "terse", "verbose",
etc, that mimicked the behavior of client processing)

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

17 January 2011, 12:52:00


On 01/17/2011 11:44 AM, Alvaro Herrera wrote:
> Excerpts from Tom Lane's message of sáb ene 15 00:34:40 -0300 2011:
>> Stephen Frost<sfrost@snowman.net>  writes:
>>> What about something other than
>>> version_x_y?  I could maybe see having a 'default' and an 'all'
>>> instead..  Then have the default be what we have currently and 'all' be
>>> the full list I'm thinking about.
>> If "default" always means what it means today, I can live with that.
>> But if the meaning of "all" changes from version to version, that seems
>> like a royal mess.  Again, I'm concerned that an external tool like
>> pgfouine be able to make sense of the value without too much context.
>> If it doesn't know what some of the columns are, it can just ignore them
>> --- but a magic summary identifier that it doesn't understand at all is
>> a problem.
> Maybe if we offered a way for the utility to find out the field list
> from the magic identifier, it would be enough.
>
> (It would be neat to have magic identifiers for "terse", "verbose",
> etc, that mimicked the behavior of client processing)
>

Just output a header line with the column names. We've long been able to 
import such files. If the list of columns changes we should rotate log 
files before outputting the new format. That might get a little tricky 
to coordinate between backends.

cheers

andrew

Re: Add support for logging the current role

From

Stephen Frost

Date:

19 January 2011, 01:36:14

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Point being, until we get Andrew's jagged-csv-import magic committed to
> > core, we won't be able to import a log file that a user has changed the
> > field list for mid-stream (following the add-a-new-column use-case we've
> > been discussing).
>
> Now I think you're reaching the mission-to-mars stage that Robert was
> complaining about.  Solving that sort of problem is well outside the
> scope of this patch.  I don't care if people have to shut down and
> restart their servers in order to make a change to the log format.
> Even if I did, the other patch sounds like a better approach.

Alright, here's the latest on this patch.  I've added a log_csv_fields
GUC along with the associated magic to make it work (at least for me).
Also added 'role_name' and '%U' options.  Requires postmaster restart to
change, didn't include any 'short-cut' field options, though I don't
think it'd be hard to do if we can decide on it.  Default remains the
same as what was in 9.0.

commit ff249aeac7216da623bf77840380d5e767f681fc
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 00:26:52 2011 -0500

    Add log_csv_fields GUC for CSV output & curr_role

    This patch adds a new GUC called 'log_csv_fields'.  This GUC allows
    the user to control the set of fields written to the CSV output as
    well as the order in which they are written.  The default set of
    fields remains those that were included in 9.0, to avoid breaking
    existing user configurations.

    In passing, update 'user_name' for log_line_prefix and log_csv_fields
    to mean 'session user' (which could be reset by a superuser with
    set session authorization), and add a 'role_name' option (%U) to
    log_line_prefix and log_csv_fields, to allow users to log the
    current role (as set by SET ROLE- not impacted by SECURITY DEFINER
    functions).

    Thanks,

        Stephen

Attachment

Re: Add support for logging the current role

From

Itagaki Takahiro

Date:

24 January 2011, 06:15:36

On Wed, Jan 19, 2011 at 14:36, Stephen Frost <sfrost@snowman.net> wrote:
> Alright, here's the latest on this patch.  I've added a log_csv_fields
> GUC along with the associated magic to make it work (at least for me).
> Also added 'role_name' and '%U' options.  Requires postmaster restart to
> change, didn't include any 'short-cut' field options, though I don't
> think it'd be hard to do if we can decide on it.  Default remains the
> same as what was in 9.0.

Hi, I reviewed log_csv_options.patch. It roughly looks good,
but I'd like to discuss about the design in some points.

==== Features ====
The patch adds "log_csv_fields" GUC variable. It allows to customize
columns in csvlog. The default setting is what we were writing 9.0 or
earlier versions.

It also add "role_name" for log_csv_fields and "%U" for log_line_prefix
to record role names. They are the name set by SET ROLE. OTOH, user_name
and %u shows authorization user names.

==== Discussions ====
* How about "csvlog_fields" rather than "log_csv_fields"? Since we have variables with syslog_ prefix, csvlog_ prefix
seemsto be better. 

* We use %<what> syntax to specify fields in logs for log_line_prefix, but will use long field names for
log_csv_fields.Do you have any better idea to share the names in both options?  At least I want to share the short
descriptionfor them in postgresql.conf. 

* log_csv_fields's GUC context is PGC_POSTMASTER. Is it by design? PGC_SIGHUP would be more consistent compared with
log_line_prefix.However, the csv format will not be valid because the column definitions will be changed in the middle
offile. 

* "none" is not so useful for the initial "role_name" field. Should we use user_name instead of the empty role_name?

==== Comments for codes ====
Some of the items are trivial, though.

* What objects do you want to allocate in TopMemoryContext in assign_log_csv_fields() ? AFAICS, we don't have to keep
rawstringand column_list in long-term context. Or, if you need TopMemoryContext, those variables should be pfree'ed at
theend of function. 

* appendStringInfoChar() calls in write_csvlog() seem to be wrong when the last field is not application_name.

* Docs need more cross-reference hyper-links for "see also" items.

* Docs need some tags for itemized elements or pre-formatted codes. They looks itemized in the sgml files, but will be
flattenedin complied HTML files. 

* A declaration of assign_log_csv_fields() at the top of elog.c needs "extern".
* There is a duplicated declaration for build_default_csvlog_list().
* list_free() is NIL-safe. You don't have to check whether the list is NIL before call the function.

--
Itagaki Takahiro

Re: Add support for logging the current role

From

Stephen Frost

Date:

29 January 2011, 00:06:21

Itagaki,

* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> ==== Discussions ====
> * How about "csvlog_fields" rather than "log_csv_fields"?
>   Since we have variables with syslog_ prefix, csvlog_ prefix
>   seems to be better.

Sure, not a big deal, to be honest, as long as we can actually agree on
something...  Not changed in the patch, but I can if people want.

> * We use %<what> syntax to specify fields in logs for log_line_prefix,
>   but will use long field names for log_csv_fields. Do you have any
>   better idea to share the names in both options?  At least I want to
>   share the short description for them in postgresql.conf.

No, I don't, and that's going well beyond what I feel makes sense for
this patch at this time.  We could review that for 9.2 or later, but
we've had quite enough expanding-of-requirements for this patch
already, imnsho.

> * log_csv_fields's GUC context is PGC_POSTMASTER. Is it by design?
>   PGC_SIGHUP would be more consistent compared with log_line_prefix.
>   However, the csv format will not be valid because the column
>   definitions will be changed in the middle of file.

Doing SIGHUP would require addressing how to get all of the backends to
close the old log file and open the new one, because we don't want to
have a given log file which has two different CSV formats in it (we
wouldn't be able to load it into the database...).  This was
specifically addressed in the thread leading up to this patch...

> * "none" is not so useful for the initial "role_name" field.
>   Should we use user_name instead of the empty role_name?

none is what it is, however, if you query 'show role'.  I would rather
be consistant with that than log something else.

> ==== Comments for codes ====
> Some of the items are trivial, though.
>
> * What objects do you want to allocate in TopMemoryContext in
>   assign_log_csv_fields() ? AFAICS, we don't have to keep rawstring
>   and column_list in long-term context. Or, if you need TopMemoryContext,
>   those variables should be pfree'ed at the end of function.

You're right, rawstring and column_list don't need to be in
TopMemoryContext.  I just moved the switch to Top to be after those are
allocated.

> * appendStringInfoChar() calls in write_csvlog() seem to be wrong
>   when the last field is not application_name.

Urgh, right, fixed to *not* include a trailing comma on the last column.

> * Docs need more cross-reference hyper-links for "see also" items.
>
> * Docs need some tags for itemized elements or pre-formatted codes.
>   They looks itemized in the sgml files, but will be flattened in
>   complied HTML files.

Not sure what you're referring to here...?  Can you elaborate?  I'm not
great with the docs. :/

> * A declaration of assign_log_csv_fields() at the top of elog.c
>   needs "extern".

Err, no, I don't think it does.  None of the other exported functions
from elog.c have extern declarations in elog.c...  I did realize that I
probably shouldnt have the declaration at the top of elog.c for
assign_loc_csv_fields() or build_default_csvlog_list().

> * There is a duplicated declaration for build_default_csvlog_list().

Removed the duplicate in elog.c.

> * list_free() is NIL-safe. You don't have to check whether the list
>   is NIL before call the function.

Fixed.

Updated patch attached.

    Thanks!

        Stephen

Attachment

Re: Add support for logging the current role

From

Itagaki Takahiro

Date:

01 February 2011, 07:48:47

> Updated patch attached.

I think we need to improve postgresql.conf.sample a bit more, especially
the long line for #log_csv_fields = '...'. 330 characters in it! #1. Leave the long line because it is needed. #2. Hide
thevariable from the default conf. #3. Use short %x mnemonic both in log_line_prefix and log_csv_fields.     (It might
requiremany additional mnemonics.) 
Which is better, or another idea?

On Sat, Jan 29, 2011 at 13:06, Stephen Frost <sfrost@snowman.net> wrote:
>> * log_csv_fields's GUC context is PGC_POSTMASTER. Is it by design?
>
> Doing SIGHUP would require addressing how to get all of the backends to
> close the old log file and open the new one, because we don't want to
> have a given log file which has two different CSV formats in it (we
> wouldn't be able to load it into the database...).  This was
> specifically addressed in the thread leading up to this patch...

I think it depends default log filename, that contains %S (seconds)
suffix. We can remove %S from log_filename; if we use a log per-day,
those log might contain different columns even after restart. If we
cannot avoid zigged csv fields completely, SIGHUP seems reasonable for it.

>> * What objects do you want to allocate in TopMemoryContext in
>>   assign_log_csv_fields() ?
> I just moved the switch to Top to be after those are allocated.

How about changing the type of csv_log_fields from List* to fixed
array of LogCSVFields? If so, we can use an array-initializer
instead of build_default_csvlog_list() ?  The code will be simplified.
Fixed length won't be a problem because it would be rare that the
same field are specified many times.

>> * Docs need some tags for itemized elements or pre-formatted codes.
>>   They looks itemized in the sgml files, but will be flattened in
>>   complied HTML files.
>
> Not sure what you're referring to here...?  Can you elaborate?  I'm not
> great with the docs. :/

Could you try to "make html" in the doc directory?
Your new decumentation after
| These columns may be included in the CSV output:
will be unaligned plain text without some tags.

--
Itagaki Takahiro

Re: Add support for logging the current role

From

Stephen Frost

Date:

06 February 2011, 10:40:50

* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> I think we need to improve postgresql.conf.sample a bit more, especially
> the long line for #log_csv_fields = '...'. 330 characters in it!
>   #1. Leave the long line because it is needed.

It's needed to match what the current default is..

>   #2. Hide the variable from the default conf.

I don't like that idea.

>   #3. Use short %x mnemonic both in log_line_prefix and log_csv_fields.
>       (It might require many additional mnemonics.)

It would require a lot more, and wouldn't scale well either (this was
discussed previously..).

> I think it depends default log filename, that contains %S (seconds)
> suffix. We can remove %S from log_filename; if we use a log per-day,
> those log might contain different columns even after restart. If we
> cannot avoid zigged csv fields completely, SIGHUP seems reasonable for it.

This problem is bigger than SIGHUP, but at least with a restart
required, the default will work properly.  The default configuration
wouldn't work w/ a change to the log line and a SIGHUP done.  If you
change the log filename then it could generate jagged CSV files even on
restart.  We'd have to move the old log file out of the way when/if we
detect that it had a different CSV format and that's really just not
practical.  We don't want to cause problems for people, but I don't
think there's a way to prevent jagged CSVs if they're going to go out of
thier way to configure PG to create them.

> How about changing the type of csv_log_fields from List* to fixed
> array of LogCSVFields? If so, we can use an array-initializer
> instead of build_default_csvlog_list() ?  The code will be simplified.
> Fixed length won't be a problem because it would be rare that the
> same field are specified many times.

I really don't think changing it to an array is necessary or even
particularly helpful, and I don't agree that the code would actually
be simpler- you still have to make sure the CSV fields are kept in
the order requested by the user.  Also, you'd have to remember to update
the length of the static array every time a new log option was added or
risk writing past the end of the array..

> Could you try to "make html" in the doc directory?
> Your new decumentation after
> | These columns may be included in the CSV output:
> will be unaligned plain text without some tags.

Alright, I can take a look at improving the documentation situation,
though I certainly wouldn't complain if someone who has it all working
already were to help..
Thanks,
    Stephen

Re: Add support for logging the current role

From

Itagaki Takahiro

Date:

06 February 2011, 11:14:02

On Sun, Feb 6, 2011 at 23:31, Stephen Frost <sfrost@snowman.net> wrote:
> * Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
>> I think we need to improve postgresql.conf.sample a bit more, especially
>> the long line for #log_csv_fields = '...'. 330 characters in it!
>>   #1. Leave the long line because it is needed.
> It's needed to match what the current default is..

I agree that it's logically good design, but we could not accept it
as long as it breaks tools in the real world...
Will it break "SHOW ALL" and "SELECT * FROM pg_settings" output?
I'm worried that they are not designed to display such a long value.

>> I think it depends default log filename, that contains %S (seconds)
>> suffix. We can remove %S from log_filename; if we use a log per-day,
>> those log might contain different columns even after restart. If we
>> cannot avoid zigged csv fields completely, SIGHUP seems reasonable for it.
>
> This problem is bigger than SIGHUP, but at least with a restart
> required, the default will work properly.  The default configuration
> wouldn't work w/ a change to the log line and a SIGHUP done.

"Only works with the default settings" is just wrong design.
If we cannot provide a perfect solution, we should allow users to
control everything as they like. I still think PGC_SIGHUP is the
best mode for the parameter, with a note of caution in the docs.

--
Itagaki Takahiro

Re: Add support for logging the current role

From

Stephen Frost

Date:

06 February 2011, 14:03:18

* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> I agree that it's logically good design, but we could not accept it
> as long as it breaks tools in the real world...

If it does, I think it's pretty clear that those tools are themselves
broken..

> Will it break "SHOW ALL" and "SELECT * FROM pg_settings" output?
> I'm worried that they are not designed to display such a long value.

It certainly won't break those commands in PostgreSQL.  If tools made
assumptions about how long a string could be that don't match PG's
understand, those tools are broken.  This also isn't the only string and
such tools could be broken by, eg, setting log_prefix to a very long
value (entirely possible to do..).

> >> I think it depends default log filename, that contains %S (seconds)
> >> suffix. We can remove %S from log_filename; if we use a log per-day,
> >> those log might contain different columns even after restart. If we
> >> cannot avoid zigged csv fields completely, SIGHUP seems reasonable for it.
> >
> > This problem is bigger than SIGHUP, but at least with a restart
> > required, the default will work properly.  The default configuration
> > wouldn't work w/ a change to the log line and a SIGHUP done.
>
> "Only works with the default settings" is just wrong design.

It's also not anywhere close to what I said.  If you have a suggestion
about how to fix it that's reasonable, please suggest it.  Suggesting
that "because we could, given a complicated enough configuration,
create jagged files anyway, we should encourage it to happen by allowing
the change on SIGHUP" isn't a solution to the problem and will just
create more problems.  It will certainly work just fine with more than
just the default settings, but, yes, there are some settings which would
cause PG to create jagged CSV files.

If we keep it as requiring restart and then automatically move or
truncate log files which are in the way on that restart, or decide to
not start at all, we could make it so PG doesn't create a jagged CSV
file.  I don't particularly care for any of those options.  At least
one of those would be a solution, however none of those include
allowing it on SIGHUP, which is what you're advocating.

> If we cannot provide a perfect solution, we should allow users to
> control everything as they like. I still think PGC_SIGHUP is the
> best mode for the parameter, with a note of caution in the docs.

Doing it on a SIGHUP would be *guaranteed* to create jagged CSV files
which then couldn't be loaded into PG.  I don't agree with this and the
impression I got from Tom and Andrew was that they agreed to not do it
on SIGHUP.  It's unfortuante that we seem to be at an impasse here, but
of everyone voicing opinions on it so far, it would appear that you're
in the minority.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

06 February 2011, 15:10:36

* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> Could you try to "make html" in the doc directory?

Yeah, doesn't seem to work for me (missing '/bin/collateindex.pl',
apparently..).

> Your new decumentation after
> | These columns may be included in the CSV output:
> will be unaligned plain text without some tags.

Ok, I've cleaned up that part of the documentation to be a table instead
of the listings that were there, seems like a better approach anyway.
Updated patch attached, which has also been rebased against HEAD.

    Thanks!

        Stephen

commit d8dddd1c425a4c320540769084ceeb7d23bc3662
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 14:02:05 2011 -0500

    Change log_csv_options listing to a table

    This patch changes the listing of field options available to
    log_csv_options into a table, which will hopefully both look
    better and be clearer.

commit f9851cdfaeb931f01c015f5651b72d16957c7114
Merge: 3e71e33 5ed45ac
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 13:26:17 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit 3e71e338a2b9352d730f59a989027e33d99bea50
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Jan 28 22:44:33 2011 -0500

    Cleanup log_csv_options patch

    Clean up of various function declarations to hopefully be correct
    and clean and matching PG conventions.  Also move TopMemoryContext
    usage to later, since the local variables don't need to be in
    TopMemoryContext.  Lastly, ensure that a comma is not produced
    after the last CSV field, and that one is produced if
    application_name is not the last field.

    Review by Itagaki Takahiro, thanks!

commit 1825def11badd661d219fa4c516f06e0ad423443
Merge: ff249ae 847e8c7
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 06:50:03 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit ff249aeac7216da623bf77840380d5e767f681fc
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 00:26:52 2011 -0500

    Add log_csv_fields GUC for CSV output & curr_role

    This patch adds a new GUC called 'log_csv_fields'.  This GUC allows
    the user to control the set of fields written to the CSV output as
    well as the order in which they are written.  The default set of
    fields remains those that were included in 9.0, to avoid breaking
    existing user configurations.

    In passing, update 'user_name' for log_line_prefix and log_csv_fields
    to mean 'session user' (which could be reset by a superuser with
    set session authorization), and add a 'role_name' option (%U) to
    log_line_prefix and log_csv_fields, to allow users to log the
    current role (as set by SET ROLE- not impacted by SECURITY DEFINER
    functions).

Attachment

Re: Add support for logging the current role

From

Itagaki Takahiro

Date:

10 February 2011, 05:56:24

On Mon, Feb 7, 2011 at 04:10, Stephen Frost <sfrost@snowman.net> wrote:
> Yeah, doesn't seem to work for me (missing '/bin/collateindex.pl',
> apparently..).

You might need "yum install openjade stylesheets" or similar packages
and re-"configure".

> Ok, I've cleaned up that part of the documentation to be a table instead
> of the listings that were there, seems like a better approach anyway.

Yeah, that's a good job!

>> I agree that it's logically good design, but we could not accept it
>> as long as it breaks tools in the real world...
> If it does, I think it's pretty clear that those tools are themselves
> broken..

The word "break" was my wrong choice, but your new parameter still
requires very wide monitors to display SHOW ALL and pg_settings.
I'd like to solve the issue even though the feature itself is useful.
One fast and snappy solution might be to set the default value to
"default", that means the compatible set of columns.
Other better ideas?

For implementation, write_csvlog() has many following lines:if (curr_field != num_fields) appendStringInfoChar(&buf,
',');
It will be cleaner if we add first_col flag and move it out of
the switch statement.

Other questions I raised before might be matters of preference.
I'd like to here about them form third person.* name: log_csv_fields vs. csvlog_fields* when to assign: PGC_POSTMASTER
vs.PGC_SIGHUP

-- 
Itagaki Takahiro

Re: Add support for logging the current role

From

Noah Misch

Date:

10 February 2011, 07:27:18

On Thu, Feb 10, 2011 at 06:56:15PM +0900, Itagaki Takahiro wrote:
> On Mon, Feb 7, 2011 at 04:10, Stephen Frost <sfrost@snowman.net> wrote:
> >> I agree that it's logically good design, but we could not accept it
> >> as long as it breaks tools in the real world...
> > If it does, I think it's pretty clear that those tools are themselves
> > broken..
> 
> The word "break" was my wrong choice, but your new parameter still
> requires very wide monitors to display SHOW ALL and pg_settings.
> I'd like to solve the issue even though the feature itself is useful.
> One fast and snappy solution might be to set the default value to
> "default", that means the compatible set of columns.
> Other better ideas?

If some tool barfs on a 330-byte GUC value, we might as well have that tool barf
early and often, not just on non-default values.

FWIW, a 330 byte boot_val doesn't seem like a big deal to me.  If it were over
_POSIX2_LINE_MAX (2048), that might be another matter.

> Other questions I raised before might be matters of preference.
> I'd like to here about them form third person.
>  * name: log_csv_fields vs. csvlog_fields

+1 for csvlog_fields.  We have the precedent of syslog_* and that log_* are all
applicable to more than one log destination.

>  * when to assign: PGC_POSTMASTER vs. PGC_SIGHUP

+1 for PGC_SIGHUP.  PGC_POSTMASTER is mostly for things where we have not
implemented code to instigate the change after startup (usually because the
difficulty/value ratio of doing so is too high).  There's no such problem here,
merely the risk that the DBA might not be prepared to deal with a column list
change mid-logfile.  If anything, let's have the documentation mention
pg_rotate_logfile() as potentially useful in conjunction.

nm

Re: Add support for logging the current role

From

Robert Haas

Date:

10 February 2011, 09:09:51

On Thu, Feb 10, 2011 at 6:27 AM, Noah Misch <noah@leadboat.com> wrote:
> FWIW, a 330 byte boot_val doesn't seem like a big deal to me.  If it were over
> _POSIX2_LINE_MAX (2048), that might be another matter.

I don't think it's entirely stupid to worry about this completely
screwing up the output of "SHOW ALL" on people using 80-character
terminal windows.  I haven't checked, but if it renders the output
totally unreadable then I think we should try to find an alternative
that doesn't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

11 February 2011, 11:02:37

* Robert Haas (robertmhaas@gmail.com) wrote:
> I don't think it's entirely stupid to worry about this completely
> screwing up the output of "SHOW ALL" on people using 80-character
> terminal windows.  I haven't checked, but if it renders the output
> totally unreadable then I think we should try to find an alternative
> that doesn't.

Alright, so I looked into this a bit and have a couple of comments:

show all; output, at least on my test rig, is *already* well over
80-characters wide.  The longest GUC is
max_predicate_locks_per_transaction, which forces the first column to be
38 columns.  We then have some rather long descriptions (which are only
shown on show all, no clue why that is..), the longest being "Sets
whether XML data in implicit parsing and serialization operations is to
be considered as documents or content fragments." (for xmloption).  Now,
it's true that the longest default *setting* w/ this patch is the list
of CSV fields, but it's not like 'show all;' really works that well on
an 80-character terminal today.  The second longest setting, on my
system, is the path to postgresql.conf:
/data/sfrost/pgsql/test/data/postgresql.conf

That, plus the length of max_predicate_locks_per_transaction, would make
'show all;' go beyond 80 characters even if we took out the description
(but we don't currently support that..).  This new option *would* make a
query against an individual 'show <name>;' return, in the default
configuration, a value longer than 80-characters, but we're just talking
1 row being returned there.

My feeling is that this could be improved by supporting multi-line
configuration settings, and then changing the longer descriptions to be
multi-line, but that still wouldn't get us down to 80-characters due to
the combination of max_predicate_locks_per_transaction and config_file.
Renaming a configuration option isn't exactly a trivial thing to do
either. :/

One thing that would be nice, but probably non-trivial, would be to
allow "show all;" to be a subselect, so you could do things like, I
dunno, pull the max length of certain columns. :)  We could also have a
'show all;' which just returns the name and setting and then a 'show all
verbose;' for including the description, or a 'show verbose <name>;' for
getting all three fields when looking at a specific variable.

All-in-all, I think we're past the point of being able to make show all;
fit completely on an 80-character terminal, even in \x mode.  I'd be
willing to work on the multi-line stuff for 9.2, if people are really
interested in it, but I don't think not having that should be a
show-stopper for this patch, and I think that would be more likely to
break client applications than this change..
Thanks,
    Stephen

Re: Add support for logging the current role

From

"Kevin Grittner"

Date:

11 February 2011, 11:34:58

>Stephen Frost <sfrost@snowman.net> wrote:
> That, plus the length of max_predicate_locks_per_transaction,
> would make 'show all;' go beyond 80 characters even if we took out
> the description
Should we abbreviate something there?  max_pred_locks_per_tran,
maybe?
-Kevin

Re: Add support for logging the current role

From

Robert Haas

Date:

11 February 2011, 11:37:40

On Fri, Feb 11, 2011 at 10:34 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>>Stephen Frost <sfrost@snowman.net> wrote:
>
>> That, plus the length of max_predicate_locks_per_transaction,
>> would make 'show all;' go beyond 80 characters even if we took out
>> the description
>
> Should we abbreviate something there?  max_pred_locks_per_tran,
> maybe?

If we're going to abbreviate transaction, I'd vote for txn over tran,
but I think Stephen's point that this is already a lost cause may have
some validity.  Not sure what other people think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

11 February 2011, 11:52:51

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Feb 11, 2011 at 10:34 AM, Kevin Grittner
> > Should we abbreviate something there?  max_pred_locks_per_tran,
> > maybe?
>
> If we're going to abbreviate transaction, I'd vote for txn over tran,
> but I think Stephen's point that this is already a lost cause may have
> some validity.  Not sure what other people think.

There's lots of other GUCs with "transaction" spelled out in them.. :/

Another option, which I don't like, would be to use 'default' by
'default', and build the list on the fly every time if that's what it
is..  That would give no insight into what the list of fields is for
someone though, they'd have to go back to the documentation to figure
it out, and that sucks..
Thanks,
    Stephen

Re: Add support for logging the current role

From

Robert Haas

Date:

11 February 2011, 11:54:57

On Fri, Feb 11, 2011 at 10:52 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Fri, Feb 11, 2011 at 10:34 AM, Kevin Grittner
>> > Should we abbreviate something there?  max_pred_locks_per_tran,
>> > maybe?
>>
>> If we're going to abbreviate transaction, I'd vote for txn over tran,
>> but I think Stephen's point that this is already a lost cause may have
>> some validity.  Not sure what other people think.
>
> There's lots of other GUCs with "transaction" spelled out in them.. :/
>
> Another option, which I don't like, would be to use 'default' by
> 'default', and build the list on the fly every time if that's what it
> is..  That would give no insight into what the list of fields is for
> someone though, they'd have to go back to the documentation to figure
> it out, and that sucks..

Yeah.  The root cause of this problem is that the way psql handles
tabular output with a few very wide rows stinks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

11 February 2011, 11:59:54

* Robert Haas (robertmhaas@gmail.com) wrote:
> Yeah.  The root cause of this problem is that the way psql handles
> tabular output with a few very wide rows stinks.

True, but it would be kinda nice to support multi-line configuration
variables.  I still vote for that being "not required to get this patch
in", but it's certainly something we could do later.  Of course, you do
have to ask yourself what having \n's in log_line_prefix would mean in
the config file..
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

11 February 2011, 12:23:47

* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> On Mon, Feb 7, 2011 at 04:10, Stephen Frost <sfrost@snowman.net> wrote:
> > Yeah, doesn't seem to work for me (missing '/bin/collateindex.pl',
> > apparently..).
>
> You might need "yum install openjade stylesheets" or similar packages
> and re-"configure".

I've got openjade, etc, installed, but I'm on Debian and it doesn't
appear to include that collateindex.pl anywhere..

> For implementation, write_csvlog() has many following lines:
>  if (curr_field != num_fields) appendStringInfoChar(&buf, ',');
> It will be cleaner if we add first_col flag and move it out of
> the switch statement.

Done.

> Other questions I raised before might be matters of preference.
> I'd like to here about them form third person.
>  * name: log_csv_fields vs. csvlog_fields

Done.

>  * when to assign: PGC_POSTMASTER vs. PGC_SIGHUP

I'm still in the PGC_POSTMASTER camp on this and I really think it's a
more complicated change than just changing that value in the code, even
if we all agreed it should be allowed on SIGHUP (which certainly isn't
the case anyway..).  In the end, if we really want that, we can always
add it in the future.

Updated patch attached, full git log below.

    Thanks,

        Stephen

commit 6bd2b9f1d2bc3b166a3e5598ee590e25159c61a5
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Feb 11 11:16:17 2011 -0500

    Rename log_csv_fields GUC to csvlog_fields

    This patch renames the log_csv_fileds GUC to csvlog_fields, to better
    match the other csvlog_* options.

    Also cleaned up the CSV generation code a bit by moving the comma-adding
    code out of the switch() statement.

commit a281ca611e6181339e92b488c815e0cb8c1298d2
Merge: d8dddd1 183d3cf
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Feb 11 08:37:27 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit d8dddd1c425a4c320540769084ceeb7d23bc3662
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 14:02:05 2011 -0500

    Change log_csv_options listing to a table

    This patch changes the listing of field options available to
    log_csv_options into a table, which will hopefully both look
    better and be clearer.

commit f9851cdfaeb931f01c015f5651b72d16957c7114
Merge: 3e71e33 5ed45ac
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 13:26:17 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit 3e71e338a2b9352d730f59a989027e33d99bea50
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Jan 28 22:44:33 2011 -0500

    Cleanup log_csv_options patch

    Clean up of various function declarations to hopefully be correct
    and clean and matching PG conventions.  Also move TopMemoryContext
    usage to later, since the local variables don't need to be in
    TopMemoryContext.  Lastly, ensure that a comma is not produced
    after the last CSV field, and that one is produced if
    application_name is not the last field.

    Review by Itagaki Takahiro, thanks!

commit 1825def11badd661d219fa4c516f06e0ad423443
Merge: ff249ae 847e8c7
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 06:50:03 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit ff249aeac7216da623bf77840380d5e767f681fc
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 00:26:52 2011 -0500

    Add log_csv_fields GUC for CSV output & curr_role

    This patch adds a new GUC called 'log_csv_fields'.  This GUC allows
    the user to control the set of fields written to the CSV output as
    well as the order in which they are written.  The default set of
    fields remains those that were included in 9.0, to avoid breaking
    existing user configurations.

    In passing, update 'user_name' for log_line_prefix and log_csv_fields
    to mean 'session user' (which could be reset by a superuser with
    set session authorization), and add a 'role_name' option (%U) to
    log_line_prefix and log_csv_fields, to allow users to log the
    current role (as set by SET ROLE- not impacted by SECURITY DEFINER
    functions).

Attachment

Re: Add support for logging the current role

From

Tom Lane

Date:

11 February 2011, 12:27:02

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Feb 11, 2011 at 10:34 AM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> Should we abbreviate something there? �max_pred_locks_per_tran,
>> maybe?

> If we're going to abbreviate transaction, I'd vote for txn over tran,
> but I think Stephen's point that this is already a lost cause may have
> some validity.  Not sure what other people think.

Aren't we already using "xact" for that purpose in some user-visible
places?  But personally I'd be happy with "max_pred_locks_per_transaction"
which gets the worst case down without being too obviously at variance
with "max_locks_per_transaction".
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

11 February 2011, 12:33:31

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > If we're going to abbreviate transaction, I'd vote for txn over tran,
> > but I think Stephen's point that this is already a lost cause may have
> > some validity.  Not sure what other people think.

I agree w/ reducing that particular GUC a bit in size, but just to make
it clear- that doesn't even come close to solving or fixing the
80-character terminal issue wrt 'show all;'...

> Aren't we already using "xact" for that purpose in some user-visible
> places?  But personally I'd be happy with "max_pred_locks_per_transaction"
> which gets the worst case down without being too obviously at variance
> with "max_locks_per_transaction".

Sounds good to me.  The header length for show all would drop to only 206
characters (or so) with that change.  If we offered a 'show all;' which
didn't include 'description' and didn't have any settings longer than
about 46 characters, *then* it'd fit on an 80-char terminal.  Of course,
if we had multi-line GUC support, we could put each field on a new line
and each of those is well under 46 characters..
Thanks,
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

11 February 2011, 12:49:43

Stephen Frost <sfrost@snowman.net> writes:
> * Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
>> You might need "yum install openjade stylesheets" or similar packages
>> and re-"configure".

> I've got openjade, etc, installed, but I'm on Debian and it doesn't
> appear to include that collateindex.pl anywhere..

FWIW, on Fedora 13 I see

$ which collateindex.pl
/usr/bin/collateindex.pl
$ rpm -qf /usr/bin/collateindex.pl
docbook-style-dsssl-1.79-11.fc13.noarch
$ rpm -qi docbook-style-dsssl
Name        : docbook-style-dsssl          Relocations: (not relocatable)
Version     : 1.79                              Vendor: Fedora Project
Release     : 11.fc13                       Build Date: Mon Jun  7 10:06:48 2010
Install Date: Fri Oct  1 10:07:37 2010         Build Host: x86-07.phx2.fedoraproject.org
Group       : Applications/Text             Source RPM: docbook-style-dsssl-1.79-11.fc13.src.rpm
Size        : 2308505                          License: Copyright only
Signature   : RSA/SHA256, Mon Jun  7 10:34:27 2010, Key ID 7edc6ad6e8e40fde
Packager    : Fedora Project
URL         : http://docbook.sourceforge.net/
Summary     : Norman Walsh's modular stylesheets for DocBook
Description :
These DSSSL stylesheets allow to convert any DocBook document to another
printed (for example, RTF or PostScript) or online (for example, HTML) format.
They are highly customizable.
        regards, tom lane

Re: Add support for logging the current role

From

Alvaro Herrera

Date:

11 February 2011, 12:54:21

Excerpts from Tom Lane's message of vie feb 11 13:49:33 -0300 2011:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> >> You might need "yum install openjade stylesheets" or similar packages
> >> and re-"configure".
> 
> > I've got openjade, etc, installed, but I'm on Debian and it doesn't
> > appear to include that collateindex.pl anywhere..

$ apt-file search collateindex.pl
docbook-dsssl: /usr/bin/collateindex.pl
docbook-dsssl: /usr/share/man/man1/collateindex.pl.1.gz

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Add support for logging the current role

From

Stephen Frost

Date:

11 February 2011, 12:56:01

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> $ which collateindex.pl
> /usr/bin/collateindex.pl
> $ rpm -qf /usr/bin/collateindex.pl
> docbook-style-dsssl-1.79-11.fc13.noarch

Ah-hah, thanks for that!  Apparently on Debian it's docbook-dsssl that
contains it, and yes, it gets installed into /usr/bin (not /bin, should
have figured that..).  I'll give building the docs another shot.
Thanks again,
    Stephen

Re: Add support for logging the current role

From

"Kevin Grittner"

Date:

11 February 2011, 19:07:34

Stephen Frost <sfrost@snowman.net> wrote:
> Tom Lane (tgl@sss.pgh.pa.us) wrote:

>> I'd be happy with "max_pred_locks_per_transaction" which gets the
>> worst case down without being too obviously at variance with
>> "max_locks_per_transaction".
>
> Sounds good to me.  The header length for show all would drop to
> only 206 characters (or so) with that change.

Patch attached.

-Kevin

Attachment

max_pred_locks_per_transaction.patch

Re: Add support for logging the current role

From

"Kevin Grittner"

Date:

11 February 2011, 19:20:28

I wrote:

> Patch attached.

This time with src/backend/utils/misc/postgresql.conf.sample fixed.

-Kevin

Attachment

max_pred_locks_per_transaction-2.patch

Re: Add support for logging the current role

From

Robert Haas

Date:

13 February 2011, 09:01:55

On Fri, Feb 11, 2011 at 11:23 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Updated patch attached, full git log below.

The documentation for csvlog_fields should probably use <literal>
around the default value.

The sentence that begins "For details on what these fields are" should
hyperlink the referenced sections of the documentation.

The function prototype you added to elog.c is misformatted - the type
should be on the line preceding the function name only for the
definition, not for prototypes.

The code for log_line_prefix = 'u' is indented wrong.  Also, an else
clause that has only an if hanging off of it can be turned into an
"else if" for better readability.

This part kind of concerns me:

+ * This is more of a 'safety valve' than anything else,
+ * since GUC processing really should happen before we do any error logging.
+ * We might even want to change this eventually to just not log CSV format logs
+ * if this ever happens, to avoid a discrepency in the CSV log file which would
+ * make it difficult to load into PG.

I'm not really convinced that making the CSV log format variable is a
good thing.  One of the reasons we added that format in the first
place is to make sure that we could generate log output in an easily
parseable format, and this seems like a big step backwards for not
much, especially if we can't even guarantee that we're not going to
inject random differently-formatted lines during startup.

Stylistically, build_default_csvlog_list and assign_csvlog_fields
ought to be loops driven off an array, rather than hand-coded.

I think this was discussed before, and I hate to remention it, but
would it make sense to reuse the single-character codes from
log_line_prefix rather than inventing new codes for the same fields?

It would be awfully nice if we could inject something into the csvlog
output format that would let client programs know which fields are
present.  This would be especially useful for people writing tools
that are intended to work on ANY PG installation, rather than just,
say, their own.  I'm not sure if there's a clean way to do that,
though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

13 February 2011, 09:26:41

* Robert Haas (robertmhaas@gmail.com) wrote:
> The documentation for csvlog_fields should probably use <literal>
> around the default value.
>
> The sentence that begins "For details on what these fields are" should
> hyperlink the referenced sections of the documentation.
>
> The function prototype you added to elog.c is misformatted - the type
> should be on the line preceding the function name only for the
> definition, not for prototypes.
>
> The code for log_line_prefix = 'u' is indented wrong.  Also, an else
> clause that has only an if hanging off of it can be turned into an
> "else if" for better readability.

Will fix.

> This part kind of concerns me:
>
> + * This is more of a 'safety valve' than anything else,
> + * since GUC processing really should happen before we do any error logging.
> + * We might even want to change this eventually to just not log CSV format logs
> + * if this ever happens, to avoid a discrepency in the CSV log file which would
> + * make it difficult to load into PG.
>
> I'm not really convinced that making the CSV log format variable is a
> good thing.  One of the reasons we added that format in the first
> place is to make sure that we could generate log output in an easily
> parseable format, and this seems like a big step backwards for not
> much, especially if we can't even guarantee that we're not going to
> inject random differently-formatted lines during startup.

I couldn't make it actually produce incorrect lines.  I was worried
about the possibility, but I don't think it's actually possible because
postgresql.conf needs to be parsed and GUC handling has to work before
we will even start trying to do CSV logging.  I'll rework the comment.

> Stylistically, build_default_csvlog_list and assign_csvlog_fields
> ought to be loops driven off an array, rather than hand-coded.

Sure, will fix.

> I think this was discussed before, and I hate to remention it, but
> would it make sense to reuse the single-character codes from
> log_line_prefix rather than inventing new codes for the same fields?

As I recall, Tom didn't like that idea and neither did I- there's only
so many letters..  It also sucks wrt being self-documenting.  I'd much
rather support multi-line GUCs..

> It would be awfully nice if we could inject something into the csvlog
> output format that would let client programs know which fields are
> present.  This would be especially useful for people writing tools
> that are intended to work on ANY PG installation, rather than just,
> say, their own.  I'm not sure if there's a clean way to do that,
> though.

This would be called a 'header' in most typical CSV scenarios.
Unfortunately, last I checked (maybe it's changed?), COPY w/ HEADER just
throws the header away instead of doing anything useful with it.  If it
actually used the header to build the column list, then adding a header
would be sufficient, provided all the necessary fields are in the table.

If I wanted something to throw away the first record of a file before
loading it, I'd use tail.

I'll see about adding an option to have it output a header.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

13 February 2011, 20:28:12

On 02/13/2011 08:26 AM, Stephen Frost wrote:
> This would be called a 'header' in most typical CSV scenarios.
> Unfortunately, last I checked (maybe it's changed?), COPY w/ HEADER just
> throws the header away instead of doing anything useful with it.  If it
> actually used the header to build the column list, then adding a header
> would be sufficient, provided all the necessary fields are in the table.

See the discussion back around the the 8.1 release (IIRC) when we added 
the HEADER option.

> If I wanted something to throw away the first record of a file before
> loading it, I'd use tail.
>

The whole point of us having direct CSV import is to minimise the 
requirement for preprocessing.

That said, I think there's probably a good case for an option to use the 
header line as a column list. I know of at least one application I have 
written that could benefit from it. But that's work for 9.2 or later.

cheers

andrew

Re: Add support for logging the current role

From

Stephen Frost

Date:

13 February 2011, 22:51:18

Robert,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Feb 11, 2011 at 11:23 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > Updated patch attached, full git log below.

Thanks again for the review.  Updated patch attached.

> The documentation for csvlog_fields should probably use <literal>
> around the default value.

Fixed.

> The sentence that begins "For details on what these fields are" should
> hyperlink the referenced sections of the documentation.

Fixed.

> The function prototype you added to elog.c is misformatted - the type
> should be on the line preceding the function name only for the
> definition, not for prototypes.

Fixed.

> The code for log_line_prefix = 'u' is indented wrong.  Also, an else
> clause that has only an if hanging off of it can be turned into an
> "else if" for better readability.

Fixed.

> This part kind of concerns me:
>
> + * This is more of a 'safety valve' than anything else,
> + * since GUC processing really should happen before we do any error logging.
> + * We might even want to change this eventually to just not log CSV format logs
> + * if this ever happens, to avoid a discrepency in the CSV log file which would
> + * make it difficult to load into PG.
>
> I'm not really convinced that making the CSV log format variable is a
> good thing.  One of the reasons we added that format in the first
> place is to make sure that we could generate log output in an easily
> parseable format, and this seems like a big step backwards for not
> much, especially if we can't even guarantee that we're not going to
> inject random differently-formatted lines during startup.

Comment, function, and whole issue removed.

> Stylistically, build_default_csvlog_list and assign_csvlog_fields
> ought to be loops driven off an array, rather than hand-coded.

Done, added CSVFieldNames and modiffied assign_csvlog_fileds to use it
(build_default_csvlog_list was removed).

> I think this was discussed before, and I hate to remention it, but
> would it make sense to reuse the single-character codes from
> log_line_prefix rather than inventing new codes for the same fields?

I'd rather ditch the log_line_prefix idea of single-letter codes since
it's never going to scale.  Perhaps making log_line_prefix with work
%csvlog_name instead of just the %<single-letter> codes might work.  I
don't see a solution which doesn't involve changing log_line_prefix
though, in any case.

> It would be awfully nice if we could inject something into the csvlog
> output format that would let client programs know which fields are
> present.  This would be especially useful for people writing tools
> that are intended to work on ANY PG installation, rather than just,
> say, their own.  I'm not sure if there's a clean way to do that,
> though.

Added csvlog_header GUC to allow the user to ask for the header to be
printed at the top of each log file.  If and when an option is added to
PG's COPY to respect the header, this should resolve that issue.

Also updated to HEAD.

Full git log below, patch attached.

    Thanks,

        Stephen

commit 592c256ffff4ffde77fc29ff28fdedd2c9f2dafd
Merge: 33639eb cebbaa1
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 13 21:11:44 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit 33639ebfe67b0dd58a0a89161e9f0d5237830ed4
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 13 21:08:08 2011 -0500

    Add csvlog_header GUC, other cleanup

    This patch adds a csvlog_header option which will start each CSV
    log file with a header which matches the GUC (and hence the format
    of the CSV log file generated).

    Numerous other whitespace clean-ups, removed build_default_csvlog_list(),
    since it wasn't actually necessary or useful.  Added an array which
    lists the text strings of the various CSVLOG options to simplify
    assign_csvlog_fields().

commit 6bd2b9f1d2bc3b166a3e5598ee590e25159c61a5
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Feb 11 11:16:17 2011 -0500

    Rename log_csv_fields GUC to csvlog_fields

    This patch renames the log_csv_fileds GUC to csvlog_fields, to better
    match the other csvlog_* options.

    Also cleaned up the CSV generation code a bit by moving the comma-adding
    code out of the switch() statement.

commit a281ca611e6181339e92b488c815e0cb8c1298d2
Merge: d8dddd1 183d3cf
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Feb 11 08:37:27 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit d8dddd1c425a4c320540769084ceeb7d23bc3662
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 14:02:05 2011 -0500

    Change log_csv_options listing to a table

    This patch changes the listing of field options available to
    log_csv_options into a table, which will hopefully both look
    better and be clearer.

commit f9851cdfaeb931f01c015f5651b72d16957c7114
Merge: 3e71e33 5ed45ac
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 13:26:17 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit 3e71e338a2b9352d730f59a989027e33d99bea50
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Jan 28 22:44:33 2011 -0500

    Cleanup log_csv_options patch

    Clean up of various function declarations to hopefully be correct
    and clean and matching PG conventions.  Also move TopMemoryContext
    usage to later, since the local variables don't need to be in
    TopMemoryContext.  Lastly, ensure that a comma is not produced
    after the last CSV field, and that one is produced if
    application_name is not the last field.

    Review by Itagaki Takahiro, thanks!

commit 1825def11badd661d219fa4c516f06e0ad423443
Merge: ff249ae 847e8c7
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 06:50:03 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit ff249aeac7216da623bf77840380d5e767f681fc
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 00:26:52 2011 -0500

    Add log_csv_fields GUC for CSV output & curr_role

    This patch adds a new GUC called 'log_csv_fields'.  This GUC allows
    the user to control the set of fields written to the CSV output as
    well as the order in which they are written.  The default set of
    fields remains those that were included in 9.0, to avoid breaking
    existing user configurations.

    In passing, update 'user_name' for log_line_prefix and log_csv_fields
    to mean 'session user' (which could be reset by a superuser with
    set session authorization), and add a 'role_name' option (%U) to
    log_line_prefix and log_csv_fields, to allow users to log the
    current role (as set by SET ROLE- not impacted by SECURITY DEFINER
    functions).

Attachment

Re: Add support for logging the current role

From

Stephen Frost

Date:

13 February 2011, 22:52:58

Andrew,

* Andrew Dunstan (andrew@dunslane.net) wrote:
> See the discussion back around the the 8.1 release (IIRC) when we
> added the HEADER option.

I recall seeing it and not agreeing with it then. :)

> >If I wanted something to throw away the first record of a file before
> >loading it, I'd use tail.
>
> The whole point of us having direct CSV import is to minimise the
> requirement for preprocessing.

Having a 'throw away 1 line' option is just silly to me, but documenting
it as "header" is worse.  Water under the bridge at this point though.

> That said, I think there's probably a good case for an option to use
> the header line as a column list. I know of at least one application
> I have written that could benefit from it. But that's work for 9.2
> or later.

I agree it's work for 9.2.  I could probably help with this if people
agree that it should be done.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Itagaki Takahiro

Date:

14 February 2011, 07:23:43

On Mon, Feb 14, 2011 at 11:51, Stephen Frost <sfrost@snowman.net> wrote:
>> It would be awfully nice if we could inject something into the csvlog
>> output format that would let client programs know which fields are
>> present.  This would be especially useful for people writing tools
>> that are intended to work on ANY PG installation, rather than just,
>> say, their own.  I'm not sure if there's a clean way to do that,
>> though.
>
> Added csvlog_header GUC to allow the user to ask for the header to be
> printed at the top of each log file.  If and when an option is added to
> PG's COPY to respect the header, this should resolve that issue.

We need to design csvlog_header more carefully. csvlog_header won't work
if log_filename is low-resolution, ex. log-per-day.  It's still useful when
a DBA reads the file manually, but documentation would better.
Or, should we skip writing headers when the open log file is not
empty? It works unless when csvlog_fields is modified before restart,
and also on logger's crash and restart, though it's a rare case.

A few comments and trivial issues:

* It might be my misunderstanding, but there was a short description for %U
for in log_line_prefix in postgresql.conf, right? Did you remove it in the
latest version?

* The long csvlog_fields default is a problem also in postgresql.conf,
but we have no solution for the issue... The current code is the best for now.

* In assign_csvlog_fields(), we need to cleanup memory and memory context
before return on error.
+         /* check if no option matched, and if so, return error */
+         if (curr_option == MAX_CSVLOG_OPTS)
+             return NULL;

* An added needless "return" should be removed.
*** 2139,2144 **** write_csvlog(ErrorData *edata)
--- 2379,2386 ----         write_pipe_chunks(buf.data, buf.len, LOG_DESTINATION_CSVLOG);
     pfree(buf.data);
+
+     return; }
 /*

--
Itagaki Takahiro

Re: Add support for logging the current role

From

Stephen Frost

Date:

14 February 2011, 10:30:24

Itagaki,

* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> We need to design csvlog_header more carefully. csvlog_header won't work
> if log_filename is low-resolution, ex. log-per-day.

This isn't any different a problem to the issue of someone changing the
csvlog_fields GUC but not checking if the log file already exists on
restart.  I've suggested a number of different options, but none of them
are terribly good, and I havn't heard anyone supporting any solution to
this issue.

> It's still useful when
> a DBA reads the file manually, but documentation would better.

Eh?  If you mean that we should add documentation to make users aware of
the possible issue of changing these values without making sure the log
file gets rotated- sure, I'd be happy to do that.

> Or, should we skip writing headers when the open log file is not
> empty?

This doesn't help the csvlog_fields issue, unfortunately.  I don't think
it'd be hard to implement this to help with the header issue, I'm just
not sure if it makes sense to do so when the actual list of fields could
change...

> * It might be my misunderstanding, but there was a short description for %U
> for in log_line_prefix in postgresql.conf, right? Did you remove it in the
> latest version?

No, and I don't see where I ever added it..  I've fixed it.

> * In assign_csvlog_fields(), we need to cleanup memory and memory context
> before return on error.
> +         /* check if no option matched, and if so, return error */
> +         if (curr_option == MAX_CSVLOG_OPTS)
> +             return NULL;

Fixed this and a couple of similar issues.

> * An added needless "return" should be removed.

Meh, I like explicit returns, but since it generated a hunk all by
itself, I'll clear it out.

Updated patch attached, git log below.

    Thanks,

        Stephen

commit 304e35ebb74f68da69163ed9dd1dd453b67181e7
Author: Stephen Frost <sfrost@snowman.net>
Date:   Mon Feb 14 09:26:03 2011 -0500

    csvlog_fields: fix leak, other cleanup

    Fix a couple of potential memory leaks in assign_csvlog_fields().
    Also added a few comments, removed an extra 'return;', and added
    %U to the sample postgresql.conf.

commit 592c256ffff4ffde77fc29ff28fdedd2c9f2dafd
Merge: 33639eb cebbaa1
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 13 21:11:44 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit 33639ebfe67b0dd58a0a89161e9f0d5237830ed4
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 13 21:08:08 2011 -0500

    Add csvlog_header GUC, other cleanup

    This patch adds a csvlog_header option which will start each CSV
    log file with a header which matches the GUC (and hence the format
    of the CSV log file generated).

    Numerous other whitespace clean-ups, removed build_default_csvlog_list(),
    since it wasn't actually necessary or useful.  Added an array which
    lists the text strings of the various CSVLOG options to simplify
    assign_csvlog_fields().

commit 6bd2b9f1d2bc3b166a3e5598ee590e25159c61a5
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Feb 11 11:16:17 2011 -0500

    Rename log_csv_fields GUC to csvlog_fields

    This patch renames the log_csv_fileds GUC to csvlog_fields, to better
    match the other csvlog_* options.

    Also cleaned up the CSV generation code a bit by moving the comma-adding
    code out of the switch() statement.

commit a281ca611e6181339e92b488c815e0cb8c1298d2
Merge: d8dddd1 183d3cf
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Feb 11 08:37:27 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit d8dddd1c425a4c320540769084ceeb7d23bc3662
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 14:02:05 2011 -0500

    Change log_csv_options listing to a table

    This patch changes the listing of field options available to
    log_csv_options into a table, which will hopefully both look
    better and be clearer.

commit f9851cdfaeb931f01c015f5651b72d16957c7114
Merge: 3e71e33 5ed45ac
Author: Stephen Frost <sfrost@snowman.net>
Date:   Sun Feb 6 13:26:17 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit 3e71e338a2b9352d730f59a989027e33d99bea50
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri Jan 28 22:44:33 2011 -0500

    Cleanup log_csv_options patch

    Clean up of various function declarations to hopefully be correct
    and clean and matching PG conventions.  Also move TopMemoryContext
    usage to later, since the local variables don't need to be in
    TopMemoryContext.  Lastly, ensure that a comma is not produced
    after the last CSV field, and that one is produced if
    application_name is not the last field.

    Review by Itagaki Takahiro, thanks!

commit 1825def11badd661d219fa4c516f06e0ad423443
Merge: ff249ae 847e8c7
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 06:50:03 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_csv_options

commit ff249aeac7216da623bf77840380d5e767f681fc
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Jan 19 00:26:52 2011 -0500

    Add log_csv_fields GUC for CSV output & curr_role

    This patch adds a new GUC called 'log_csv_fields'.  This GUC allows
    the user to control the set of fields written to the CSV output as
    well as the order in which they are written.  The default set of
    fields remains those that were included in 9.0, to avoid breaking
    existing user configurations.

    In passing, update 'user_name' for log_line_prefix and log_csv_fields
    to mean 'session user' (which could be reset by a superuser with
    set session authorization), and add a 'role_name' option (%U) to
    log_line_prefix and log_csv_fields, to allow users to log the
    current role (as set by SET ROLE- not impacted by SECURITY DEFINER
    functions).

Attachment

Re: Add support for logging the current role

From

Itagaki Takahiro

Date:

15 February 2011, 01:46:13

On Mon, Feb 14, 2011 at 23:30, Stephen Frost <sfrost@snowman.net> wrote:
> > * In assign_csvlog_fields(), we need to cleanup memory and memory context
> > before return on error.
> Fixed this and a couple of similar issues.

Not yet fixed. Switched memory context is not restored on error.

> Updated patch attached, git log below.

Now I mark the patch to "Ready for Committer",
because I don't have suggestions any more.

For reference, I note my previous questions. Some of them might be TODO
items, or might not. We can add the basic feature in 9.1, and improve it
9.2 or later versions.

* csvlog_fields and csvlog_header won't work with non-default log_filename when it doesn't contain seconds in the
format.They expect they can always open empty log files.

* The long default value for csvlog_fields leads long text line in postgresql.conf, SHOW ALL, pg_settings view, but
therewere no better alternative solutions in the past discussion.

* csvlog_fields is marked as PGC_POSTMASTER. It can protect mixed formats in a csv file on default log_filename, but
othersimilar GUC variables are usually marked AS PGC_SIGHUP.

-- 
Itagaki Takahiro

Re: Add support for logging the current role

From

Robert Haas

Date:

15 February 2011, 08:59:03

On Tue, Feb 15, 2011 at 12:46 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Mon, Feb 14, 2011 at 23:30, Stephen Frost <sfrost@snowman.net> wrote:
>> > * In assign_csvlog_fields(), we need to cleanup memory and memory context
>> > before return on error.
>> Fixed this and a couple of similar issues.
>
> Not yet fixed. Switched memory context is not restored on error.
>
>> Updated patch attached, git log below.
>
> Now I mark the patch to "Ready for Committer",
> because I don't have suggestions any more.
>
> For reference, I note my previous questions. Some of them might be TODO
> items, or might not. We can add the basic feature in 9.1, and improve it
> 9.2 or later versions.
>
> * csvlog_fields and csvlog_header won't work with non-default log_filename
>  when it doesn't contain seconds in the format. They expect they can always
>  open empty log files.
>
> * The long default value for csvlog_fields leads long text line in
>  postgresql.conf, SHOW ALL, pg_settings view, but there were no better
>  alternative solutions in the past discussion.
>
> * csvlog_fields is marked as PGC_POSTMASTER. It can protect mixed formats
>  in a csv file on default log_filename, but other similar GUC variables
>  are usually marked AS PGC_SIGHUP.

I think we should push this whole patch out to 9.2.  It seems to me
that there are significant unresolved design issues here which need
more time and thought than we can realistically give them now.  In
addition to the above, there is the problem of making the data
self-identifying, which I think is really, really important for
third-party tools.  I am not keen to push a half-baked solution into
the tree now that we will have to live with for years.  The payoff
(getting %U) seems quite out of proportion to the potential downsides
of making a change of this type at this late date.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Robert Haas

Date:

15 February 2011, 09:08:26

On Fri, Feb 11, 2011 at 6:20 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> I wrote:
>>> Patch attached.
>
> This time with src/backend/utils/misc/postgresql.conf.sample fixed.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

15 February 2011, 09:09:11

* Robert Haas (robertmhaas@gmail.com) wrote:
> The payoff
> (getting %U) seems quite out of proportion to the potential downsides
> of making a change of this type at this late date.

I'd be happy to go back to the original patch/idea of just the simple
addition of %U as an option for log_line_prefix.  I'd be quite
frustrated to not have *any* way to log the current role in 9.1.  I
don't think anyone is going to be too bent out of shape that we can't do
it with CSV initially, so long as we agree that we'll try and add that
for 9.2.

Pushing the CSV log changes to 9.2 would be fine with me, and I'd be
happy to continue working on it and the GUC changes I was suggesting to
address the long config line, and perhaps figure out a way to handle
changes on SIGHUP.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

15 February 2011, 09:43:31

* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
> On Mon, Feb 14, 2011 at 23:30, Stephen Frost <sfrost@snowman.net> wrote:
> > > * In assign_csvlog_fields(), we need to cleanup memory and memory context
> > > before return on error.
> > Fixed this and a couple of similar issues.
>
> Not yet fixed. Switched memory context is not restored on error.

Ugh, sorry about that, I should have realized that needed to be done.
Updated patch attached.

> > Updated patch attached, git log below.
>
> Now I mark the patch to "Ready for Committer",
> because I don't have suggestions any more.

Thanks.

> For reference, I note my previous questions. Some of them might be TODO
> items, or might not. We can add the basic feature in 9.1, and improve it
> 9.2 or later versions.

That's what I would have thought, ah well. :)

> * csvlog_fields and csvlog_header won't work with non-default log_filename
>   when it doesn't contain seconds in the format. They expect they can always
>   open empty log files.

Or that the user will deal with changes and header lines mid-file if
they decide to use a log_filename which causes it to happen.

> * The long default value for csvlog_fields leads long text line in
>   postgresql.conf, SHOW ALL, pg_settings view, but there were no better
>   alternative solutions in the past discussion.

Not without making GUCs able to be multi-line.  If people are interested
in this, I'll try to make it happen for 9.2.

> * csvlog_fields is marked as PGC_POSTMASTER. It can protect mixed formats
>   in a csv file on default log_filename, but other similar GUC variables
>   are usually marked AS PGC_SIGHUP.

The problem here is primairly that each backend does write_csvlog() and
there's no easy way to make sure that none of them write the new format
to the old file (or the old format to the new file) before a switch is
done.  I can try looking into this but I'm concerned the only solution
would be to introduce some amount of locking which could slow down the
overall logging process which might be unacceptable performance-wise.

Preventing CSV logs from appending to existing files could be pretty
easily done, provided we can agree on what to call the new files, but
that wouldn't change PGC_POSTMASTER.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

15 February 2011, 09:51:44

* Stephen Frost (sfrost@snowman.net) wrote:
> Ugh, sorry about that, I should have realized that needed to be done.
> Updated patch attached.

Errr, for real this time.

    Thanks,

        Stephen

commit 25e94dcb390f56502bc46e683b438c20d2dc74e0
Author: Stephen Frost <sfrost@snowman.net>
Date:   Tue Feb 15 08:50:17 2011 -0500

    assign_csvlog_fields() - reset context on error

    On error, we need to make sure to reset the memory context back
    to what it was when we entered.

Attachment

Re: Add support for logging the current role

From

Stephen Frost

Date:

15 February 2011, 11:26:29

* Stephen Frost (sfrost@snowman.net) wrote:
> I'd be happy to go back to the original patch/idea of just the simple
> addition of %U as an option for log_line_prefix.

Updated patch attached which just adds %U support to log_line_prefix.
Will work on adding CSV support for this in 9.2, along with associated
other issues regarding supporting variable CSV format output.

    Thanks,

        Stephen

commit c1b06c04af0c886c6ec27917368f3c674227ed2d
Author: Stephen Frost <sfrost@snowman.net>
Date:   Tue Feb 15 10:21:38 2011 -0500

    Add %U option to log_line_prefix

    This patch adds a %U option to log_line_prefix, to allow logging
    of the current role (previously not possible).  Also reworks %u
    a bit and adds documentation to clarify what each means.

Attachment

Re: Add support for logging the current role

From

Robert Haas

Date:

15 February 2011, 11:32:36

On Tue, Feb 15, 2011 at 10:26 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Stephen Frost (sfrost@snowman.net) wrote:
>> I'd be happy to go back to the original patch/idea of just the simple
>> addition of %U as an option for log_line_prefix.
>
> Updated patch attached which just adds %U support to log_line_prefix.
> Will work on adding CSV support for this in 9.2, along with associated
> other issues regarding supporting variable CSV format output.

Something along these lines would be OK with me (I haven't yet
validated every detail), but there were previous objections to adding
any new fields to log_line_prefix until we had a flexible CSV format.
I think that's raising the bar a bit too high, personally, but I don't
have the only vote around here...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Tom Lane

Date:

15 February 2011, 11:37:59

Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> The payoff
>> (getting %U) seems quite out of proportion to the potential downsides
>> of making a change of this type at this late date.

> I'd be happy to go back to the original patch/idea of just the simple
> addition of %U as an option for log_line_prefix.  I'd be quite
> frustrated to not have *any* way to log the current role in 9.1.  I
> don't think anyone is going to be too bent out of shape that we can't do
> it with CSV initially, so long as we agree that we'll try and add that
> for 9.2.

Given that this has been like this right along, I don't see why it's
all that urgent to force a half-baked solution into 9.1.  I'm also
concerned that if we do do that, you'll lose motivation to work on
cleaning it up for 9.2 ;-)
        regards, tom lane

Re: Add support for logging the current role

From

Robert Haas

Date:

15 February 2011, 11:45:13

On Tue, Feb 15, 2011 at 10:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> * Robert Haas (robertmhaas@gmail.com) wrote:
>>> The payoff
>>> (getting %U) seems quite out of proportion to the potential downsides
>>> of making a change of this type at this late date.
>
>> I'd be happy to go back to the original patch/idea of just the simple
>> addition of %U as an option for log_line_prefix.  I'd be quite
>> frustrated to not have *any* way to log the current role in 9.1.  I
>> don't think anyone is going to be too bent out of shape that we can't do
>> it with CSV initially, so long as we agree that we'll try and add that
>> for 9.2.
>
> Given that this has been like this right along, I don't see why it's
> all that urgent to force a half-baked solution into 9.1.  I'm also
> concerned that if we do do that, you'll lose motivation to work on
> cleaning it up for 9.2 ;-)

Trying to arm-twist people into working on A before we're willing to
give them B doesn't necessary serve us very well.  I'd rather leave
the problem of making the CSV format more flexible to someone who is
really motivated to work on *that problem*, whether that person ends
up being Stephen or not.

Just my $0.02.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Tom Lane

Date:

15 February 2011, 11:57:34

Robert Haas <robertmhaas@gmail.com> writes:
> Something along these lines would be OK with me (I haven't yet
> validated every detail), but there were previous objections to adding
> any new fields to log_line_prefix until we had a flexible CSV format.
> I think that's raising the bar a bit too high, personally, but I don't
> have the only vote around here...

I think I was the one objecting.  I don't necessarily say that we have
to have a "flexible" CSV format, but I do say that facilities that are
available in log_line_prefix and not in CSV logs are a bad thing.
        regards, tom lane

Re: Add support for logging the current role

From

Robert Haas

Date:

15 February 2011, 12:03:58

On Tue, Feb 15, 2011 at 10:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Something along these lines would be OK with me (I haven't yet
>> validated every detail), but there were previous objections to adding
>> any new fields to log_line_prefix until we had a flexible CSV format.
>> I think that's raising the bar a bit too high, personally, but I don't
>> have the only vote around here...
>
> I think I was the one objecting.  I don't necessarily say that we have
> to have a "flexible" CSV format, but I do say that facilities that are
> available in log_line_prefix and not in CSV logs are a bad thing.

Well, I guess the other option is to just add it to the format, full
stop.  But as someone pointed out previously, that's not a terribly
scalable solution, but perhaps it could be judged adequate for this
particular case.

While I generally agree with the principal, I also wonder if it might
be better to just add this field in log_line_prefix and wait for
someone to complain about that as other than a theoretical matter.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

15 February 2011, 12:04:01

Tom,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Given that this has been like this right along, I don't see why it's
> all that urgent to force a half-baked solution into 9.1.  I'm also
> concerned that if we do do that, you'll lose motivation to work on
> cleaning it up for 9.2 ;-)

The addition to log_line_prefix is hardly 'half-baked' as a solution, to
that problem (I just pulled the hunks from the rest of the patch as they
were completely independent, and tested them).  I've also gone and added
the csvlog_fields/csvlog_header patch to the 2011-Next commitfest. :P

I've also already started looking at changing syslogger to have it
figure out if it should be writing a header out or not.  If we can
decide what semantics we should have when the log file exists and we're
not planning to rotate it on startup, it won't be hard for me to
implement them (well, I hope).
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

15 February 2011, 12:13:16

* Robert Haas (robertmhaas@gmail.com) wrote:
> Well, I guess the other option is to just add it to the format, full
> stop.  But as someone pointed out previously, that's not a terribly
> scalable solution, but perhaps it could be judged adequate for this
> particular case.

Think I suggested that at one point.  I'm all for doing that on a major
version change like this one, but I think we already had some concerns
about that on this thread (Andrew maybe?).

> While I generally agree with the principal, I also wonder if it might
> be better to just add this field in log_line_prefix and wait for
> someone to complain about that as other than a theoretical matter.

I might be working against myself, but I'll complain right now about the
lack of any way to have a header on the CSV logs and that you don't get
to control what fields are logged.  That said, I'm not currently using
them either, so my vote doesn't count for much.  Of course, I'll also
complain about the lack of any way to get PG to respect the header,
forcing me to do fun things like:

for file in *results*; doHEADER=`head -1 $file`sed -e 's:""::g' < $file | \    psql -d beac -h sauron -c \    "\copy
my_table($HEADER) from STDIN with csv header"

done

on a regular basis.  How forcing me to do that rather than asking
someone else to use 'tail -n +2' makes sense is beyond me..
Thanks,
    Stephen

Re: Add support for logging the current role

From

Robert Haas

Date:

15 February 2011, 12:29:12

On Tue, Feb 15, 2011 at 11:13 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Well, I guess the other option is to just add it to the format, full
>> stop.  But as someone pointed out previously, that's not a terribly
>> scalable solution, but perhaps it could be judged adequate for this
>> particular case.
>
> Think I suggested that at one point.  I'm all for doing that on a major
> version change like this one, but I think we already had some concerns
> about that on this thread (Andrew maybe?).
>
>> While I generally agree with the principal, I also wonder if it might
>> be better to just add this field in log_line_prefix and wait for
>> someone to complain about that as other than a theoretical matter.
>
> I might be working against myself, but I'll complain right now about the
> lack of any way to have a header on the CSV logs and that you don't get
> to control what fields are logged.  That said, I'm not currently using
> them either, so my vote doesn't count for much.  Of course, I'll also
> complain about the lack of any way to get PG to respect the header,
> forcing me to do fun things like:
>
> for file in *results*; do
>        HEADER=`head -1 $file`
>        sed -e 's:""::g' < $file | \
>            psql -d beac -h sauron -c \
>            "\copy my_table ($HEADER) from STDIN with csv header"
> done
>
> on a regular basis.  How forcing me to do that rather than asking
> someone else to use 'tail -n +2' makes sense is beyond me..

It's not an either/or proposition.  We could certainly support header
on/off/ignore, with the new extensible COPY syntax.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

15 February 2011, 12:59:33

On 02/15/2011 11:13 AM, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Well, I guess the other option is to just add it to the format, full
>> stop.  But as someone pointed out previously, that's not a terribly
>> scalable solution, but perhaps it could be judged adequate for this
>> particular case.
> Think I suggested that at one point.  I'm all for doing that on a major
> version change like this one, but I think we already had some concerns
> about that on this thread (Andrew maybe?).

I could live with it for a release if I thought we had a clear path 
ahead, but I think there are some design issues that we need to think 
about before we start providing for header lines and variable formats in 
CSV logs, particularly w.r.t. log rotation etc. So I'm slightly nervous 
about going ahead with this right now.

>> While I generally agree with the principal, I also wonder if it might
>> be better to just add this field in log_line_prefix and wait for
>> someone to complain about that as other than a theoretical matter.
> I might be working against myself, but I'll complain right now about the
> lack of any way to have a header on the CSV logs and that you don't get
> to control what fields are logged.  That said, I'm not currently using
> them either, so my vote doesn't count for much.  Of course, I'll also
> complain about the lack of any way to get PG to respect the header,
> forcing me to do fun things like:
>
> for file in *results*; do
>     HEADER=`head -1 $file`
>     sed -e 's:""::g'<  $file | \
>         psql -d beac -h sauron -c \
>         "\copy my_table ($HEADER) from STDIN with csv header"
> done
>
> on a regular basis.  How forcing me to do that rather than asking
> someone else to use 'tail -n +2' makes sense is beyond me..
>

You don't really make your case any better by continuing this argument 
from years ago. I can tell you from experience that the CSV HEADER 
feature is distinctly useful as it is. If you want to add a mode that 
uses the header line as a column list on import, then make that case, 
and I'll support it in fact, but it's not an alternative to having the 
header ignored, which is a feature I would vigorously resist removing. 
(Incidentally, I think it won't be trivial - the COPY code expects to 
know the columns by the time it opens the file).

cheers

andrew

Re: Add support for logging the current role

From

Stephen Frost

Date:

15 February 2011, 14:02:53

* Andrew Dunstan (andrew@dunslane.net) wrote:
> On 02/15/2011 11:13 AM, Stephen Frost wrote:
> >Think I suggested that at one point.  I'm all for doing that on a major
> >version change like this one, but I think we already had some concerns
> >about that on this thread (Andrew maybe?).
>
> I could live with it for a release if I thought we had a clear path
> ahead, but I think there are some design issues that we need to
> think about before we start providing for header lines and variable
> formats in CSV logs, particularly w.r.t. log rotation etc. So I'm
> slightly nervous about going ahead with this right now.

I believe the suggestion that Robert and I were talking about above was
to just unilatterally change the CSV log file output format to include
current_role.  No header lines, no variable output format, etc.

I do think we can make header lines and variable output work, if we can
get agreement on what the semantics should be.

> You don't really make your case any better by continuing this
> argument from years ago. I can tell you from experience that the CSV
> HEADER feature is distinctly useful as it is. If you want to add a
> mode that uses the header line as a column list on import, then make
> that case, and I'll support it in fact, but it's not an alternative
> to having the header ignored, which is a feature I would vigorously
> resist removing.

I'm not really interested in removing it.  I guess I have a vain hope
that with arguing I'll convince someone to take up the mantle of
implementing the 'use header' option. :)  Not getting much traction
though, so I expect I'll work on it this summer.

> (Incidentally, I think it won't be trivial - the
> COPY code expects to know the columns by the time it opens the
> file).

Thanks for that insight, I'll take a look at how things work and see if
I can come up with a sensible proposal.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Robert Haas

Date:

16 February 2011, 17:24:42

On Tue, Feb 15, 2011 at 1:02 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Andrew Dunstan (andrew@dunslane.net) wrote:
>> On 02/15/2011 11:13 AM, Stephen Frost wrote:
>> >Think I suggested that at one point.  I'm all for doing that on a major
>> >version change like this one, but I think we already had some concerns
>> >about that on this thread (Andrew maybe?).
>>
>> I could live with it for a release if I thought we had a clear path
>> ahead, but I think there are some design issues that we need to
>> think about before we start providing for header lines and variable
>> formats in CSV logs, particularly w.r.t. log rotation etc. So I'm
>> slightly nervous about going ahead with this right now.
>
> I believe the suggestion that Robert and I were talking about above was
> to just unilatterally change the CSV log file output format to include
> current_role.  No header lines, no variable output format, etc.
>
> I do think we can make header lines and variable output work, if we can
> get agreement on what the semantics should be.

I think we're back to not having a consensus on a reasonable way to
proceed here.  Let's take this up again for 9.2.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

16 February 2011, 18:55:24


On 02/16/2011 04:24 PM, Robert Haas wrote:
> On Tue, Feb 15, 2011 at 1:02 PM, Stephen Frost<sfrost@snowman.net>  wrote:
>> * Andrew Dunstan (andrew@dunslane.net) wrote:
>>> On 02/15/2011 11:13 AM, Stephen Frost wrote:
>>>> Think I suggested that at one point.  I'm all for doing that on a major
>>>> version change like this one, but I think we already had some concerns
>>>> about that on this thread (Andrew maybe?).
>>> I could live with it for a release if I thought we had a clear path
>>> ahead, but I think there are some design issues that we need to
>>> think about before we start providing for header lines and variable
>>> formats in CSV logs, particularly w.r.t. log rotation etc. So I'm
>>> slightly nervous about going ahead with this right now.
>> I believe the suggestion that Robert and I were talking about above was
>> to just unilatterally change the CSV log file output format to include
>> current_role.  No header lines, no variable output format, etc.
>>
>> I do think we can make header lines and variable output work, if we can
>> get agreement on what the semantics should be.
> I think we're back to not having a consensus on a reasonable way to
> proceed here.  Let's take this up again for 9.2.
>

That's up to you. I can certainly live with what is suggested in 
Stephen's penultimate para above.

cheers

andrew

Re: Add support for logging the current role

From

Robert Haas

Date:

16 February 2011, 20:07:39

On Wed, Feb 16, 2011 at 5:55 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 02/16/2011 04:24 PM, Robert Haas wrote:
>>
>> On Tue, Feb 15, 2011 at 1:02 PM, Stephen Frost<sfrost@snowman.net>  wrote:
>>>
>>> * Andrew Dunstan (andrew@dunslane.net) wrote:
>>>>
>>>> On 02/15/2011 11:13 AM, Stephen Frost wrote:
>>>>>
>>>>> Think I suggested that at one point.  I'm all for doing that on a major
>>>>> version change like this one, but I think we already had some concerns
>>>>> about that on this thread (Andrew maybe?).
>>>>
>>>> I could live with it for a release if I thought we had a clear path
>>>> ahead, but I think there are some design issues that we need to
>>>> think about before we start providing for header lines and variable
>>>> formats in CSV logs, particularly w.r.t. log rotation etc. So I'm
>>>> slightly nervous about going ahead with this right now.
>>>
>>> I believe the suggestion that Robert and I were talking about above was
>>> to just unilatterally change the CSV log file output format to include
>>> current_role.  No header lines, no variable output format, etc.
>>>
>>> I do think we can make header lines and variable output work, if we can
>>> get agreement on what the semantics should be.
>>
>> I think we're back to not having a consensus on a reasonable way to
>> proceed here.  Let's take this up again for 9.2.
>>
>
> That's up to you. I can certainly live with what is suggested in Stephen's
> penultimate para above.

OK.  If no one objects further, Stephen and I will make that happen.
Otherwise: he's dead, Jim.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Tom Lane

Date:

16 February 2011, 20:43:01

Robert Haas <robertmhaas@gmail.com> writes:
>>> On Tue, Feb 15, 2011 at 1:02 PM, Stephen Frost<sfrost@snowman.net> �wrote:
>>>> I believe the suggestion that Robert and I were talking about above was
>>>> to just unilatterally change the CSV log file output format to include
>>>> current_role. �No header lines, no variable output format, etc.

> OK.  If no one objects further, Stephen and I will make that happen.
> Otherwise: he's dead, Jim.

I can't remember at the moment: have we changed the CSV format in any
releases since it was first created?  And if so, did anyone complain?

If there's precedent showing this isn't going to be a problem for CSV
users, I won't object.  Otherwise I think that we should try to have
just one flag day for them, not two.
        regards, tom lane

Re: Add support for logging the current role

From

Robert Haas

Date:

16 February 2011, 21:34:33

On Wed, Feb 16, 2011 at 7:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>>>> On Tue, Feb 15, 2011 at 1:02 PM, Stephen Frost<sfrost@snowman.net>  wrote:
>>>>> I believe the suggestion that Robert and I were talking about above was
>>>>> to just unilatterally change the CSV log file output format to include
>>>>> current_role.  No header lines, no variable output format, etc.
>
>> OK.  If no one objects further, Stephen and I will make that happen.
>> Otherwise: he's dead, Jim.
>
> I can't remember at the moment: have we changed the CSV format in any
> releases since it was first created?  And if so, did anyone complain?
>
> If there's precedent showing this isn't going to be a problem for CSV
> users, I won't object.  Otherwise I think that we should try to have
> just one flag day for them, not two.

CSV log files were introduced in 8.3.0 by commit
fd801f4faa8e0f00bc314b16549e3d8e8aa1b653.  There are several follow-on
commits making adjustments, but they all appear to be 8.3-vintage:

230e8962f3a47cae4729ad7c017410d28caf1370
3bf66d6f1c3a8d266c3e6ed939e763a001179faf
77c166ba6cf6fe8f7e9737b7fe1793d886dd5cf8

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

16 February 2011, 21:38:58

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I can't remember at the moment: have we changed the CSV format in any
> releases since it was first created?  And if so, did anyone complain?

It was changed between 8.4 and 9.0 (application_name was added).  I've
looked around a bit in the archives w/ google and havn't found a single
complaint.  Perhaps google is failing me, but it seems this isn't too
bad.  We've had CSV log output since 8.3, for reference, so it was
unchanged 8.3 -> 8.4, then changed between 8.4 -> 9.0.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Andrew Dunstan

Date:

16 February 2011, 21:49:02


On 02/16/2011 08:38 PM, Stephen Frost wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> I can't remember at the moment: have we changed the CSV format in any
>> releases since it was first created?  And if so, did anyone complain?
> It was changed between 8.4 and 9.0 (application_name was added).  I've
> looked around a bit in the archives w/ google and havn't found a single
> complaint.  Perhaps google is failing me, but it seems this isn't too
> bad.  We've had CSV log output since 8.3, for reference, so it was
> unchanged 8.3 ->  8.4, then changed between 8.4 ->  9.0.
>
>     

Frankly, compared with other issues that we've sometimes inflicted on 
people upgrading, this one  strikes me as fairly low grade.

cheers

andrew

Re: Add support for logging the current role

From

Stephen Frost

Date:

16 February 2011, 21:59:05

* Robert Haas (robertmhaas@gmail.com) wrote:
> CSV log files were introduced in 8.3.0 by commit
> fd801f4faa8e0f00bc314b16549e3d8e8aa1b653.  There are several follow-on
> commits making adjustments, but they all appear to be 8.3-vintage:
>
> 230e8962f3a47cae4729ad7c017410d28caf1370
> 3bf66d6f1c3a8d266c3e6ed939e763a001179faf
> 77c166ba6cf6fe8f7e9737b7fe1793d886dd5cf8

This list appears to miss out on
8217cfbd991856d25d73b0f7afcf43d99f90b653 ..?
Thanks,
    Stephen

Re: Add support for logging the current role

From

Robert Haas

Date:

16 February 2011, 22:02:22

On Wed, Feb 16, 2011 at 8:58 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> CSV log files were introduced in 8.3.0 by commit
>> fd801f4faa8e0f00bc314b16549e3d8e8aa1b653.  There are several follow-on
>> commits making adjustments, but they all appear to be 8.3-vintage:
>>
>> 230e8962f3a47cae4729ad7c017410d28caf1370
>> 3bf66d6f1c3a8d266c3e6ed939e763a001179faf
>> 77c166ba6cf6fe8f7e9737b7fe1793d886dd5cf8
>
> This list appears to miss out on
> 8217cfbd991856d25d73b0f7afcf43d99f90b653 ..?

Ah, so it does.  Sounds like you win.  Have we a patch implementing
the sounds-like-its-agreed change, then?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Stephen Frost

Date:

16 February 2011, 22:03:13

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Feb 16, 2011 at 8:58 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > This list appears to miss out on
> > 8217cfbd991856d25d73b0f7afcf43d99f90b653 ..?
>
> Ah, so it does.  Sounds like you win.  Have we a patch implementing
> the sounds-like-its-agreed change, then?

Working on it and expect to post it tonight.

(this is about 10x simpler than the previous patch, hah)
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

16 February 2011, 22:52:42

* Robert Haas (robertmhaas@gmail.com) wrote:
> Ah, so it does.  Sounds like you win.  Have we a patch implementing
> the sounds-like-its-agreed change, then?

Patch attached, rebased to current master.  Full git log:

    Thanks,

        Stephen

commit 47eebe20deb5da56ea6eb413ee80110887790440
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Feb 16 21:42:14 2011 -0500

    Add current role to csvlog output

    This patch adds the current role to the csvlog output.  It also slightly
    changes the user_name column to return the session user, if it's been
    changed from the login user, instead of the original login user.
    This is only possible through SET SESSION AUTHORIZATION, which is only
    allowed for superusers.  These changes allow a clear view of what
    privileges commands are being run as.

commit 7456d4fc98e6207b562dd0325dc09bbb1c915ae9
Merge: c1b06c0 9301698
Author: Stephen Frost <sfrost@snowman.net>
Date:   Wed Feb 16 21:03:59 2011 -0500

    Merge branch 'master' of git://git.postgresql.org/git/postgresql into log_role_basic

commit c1b06c04af0c886c6ec27917368f3c674227ed2d
Author: Stephen Frost <sfrost@snowman.net>
Date:   Tue Feb 15 10:21:38 2011 -0500

    Add %U option to log_line_prefix

    This patch adds a %U option to log_line_prefix, to allow logging
    of the current role (previously not possible).  Also reworks %u
    a bit and adds documentation to clarify what each means.

Attachment

Re: Add support for logging the current role

From

Robert Haas

Date:

17 February 2011, 11:43:26

On Wed, Feb 16, 2011 at 9:52 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Ah, so it does.  Sounds like you win.  Have we a patch implementing
>> the sounds-like-its-agreed change, then?
>
> Patch attached, rebased to current master.

Ugg, wait a minute.  This not only adds %U; it also changes the
behavior of %u, which I don't think we've agreed on.  Also, emitting
'none' when not SET ROLE has been done is pretty ugly.  I'm back to
thinking we need to push this out to 9.2 and take more time to think
about this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Tom Lane

Date:

17 February 2011, 11:50:49

Robert Haas <robertmhaas@gmail.com> writes:
> Ugg, wait a minute.  This not only adds %U; it also changes the
> behavior of %u, which I don't think we've agreed on.  Also, emitting
> 'none' when not SET ROLE has been done is pretty ugly.  I'm back to
> thinking we need to push this out to 9.2 and take more time to think
> about this.

Yeah, I thought what was supposed to be emitted was the value of
current_user, not SQL's weird definition of what SET ROLE means.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

17 February 2011, 11:51:40

* Robert Haas (robertmhaas@gmail.com) wrote:
> Ugg, wait a minute.  This not only adds %U; it also changes the
> behavior of %u, which I don't think we've agreed on.  Also, emitting
> 'none' when not SET ROLE has been done is pretty ugly.  I'm back to
> thinking we need to push this out to 9.2 and take more time to think
> about this.

As I explained in various commit logs and, as I recall, when I first
posted about it, the behavior change for %u could only come about when
someone used 'SET SESSION AUTHORIZATION', which requires superuser
privileges.  It makes more sense to me for 'user_name' to be equivilant
to 'SESSION USER', but it's really not that big a deal either way.

Guess I had foolishly thought that people were alright with it by lack
of any comments on it. :(  Does anyone else want to chime in on this?

I actually came across that problem because the documentation was poor
regarding exactly what that column meant.  If we actually want it to be
"the name that the user first used to authenticate to the system with",
then let's update the documentation accordingly and we can remove those
changes.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

17 February 2011, 11:58:19

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Ugg, wait a minute.  This not only adds %U; it also changes the
> > behavior of %u, which I don't think we've agreed on.  Also, emitting
> > 'none' when not SET ROLE has been done is pretty ugly.  I'm back to
> > thinking we need to push this out to 9.2 and take more time to think
> > about this.
>
> Yeah, I thought what was supposed to be emitted was the value of
> current_user, not SQL's weird definition of what SET ROLE means.

current_user uses GetUserNameFromId() and goes through the cache lookups
to get there.  I was using what show_role() returns (which is also what
'show role;' returns).  I'd be happy to make it emit an empty string
when 'none' is returned though.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Tom Lane

Date:

17 February 2011, 12:10:47

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Yeah, I thought what was supposed to be emitted was the value of
>> current_user, not SQL's weird definition of what SET ROLE means.

> current_user uses GetUserNameFromId() and goes through the cache lookups
> to get there.  I was using what show_role() returns (which is also what
> 'show role;' returns).  I'd be happy to make it emit an empty string
> when 'none' is returned though.

Well, that just doesn't seem useful to me in the real world.  If I were
using this, I would expect it to emit a real user name that matches the
currently applied permissions checking.  All the time.  "show role" does
what it does because the SQL standard says so, not because anybody
outside the standards committee thinks that's a sane definition.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

17 February 2011, 12:18:41

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Well, that just doesn't seem useful to me in the real world.  If I were
> using this, I would expect it to emit a real user name that matches the
> currently applied permissions checking.  All the time.

I wouldn't have ever thought to use %U w/o %u, to be honest.  Unless I'm
missing something though, this change would just be emitting what
show_session_authorization() returns when show_role() returns 'none'.
That's certainly fine by me.

> "show role" does
> what it does because the SQL standard says so, not because anybody
> outside the standards committee thinks that's a sane definition.

Guess it actually makes some sense to me.
Thanks,
    Stephen

Re: Add support for logging the current role

From

Stephen Frost

Date:

17 February 2011, 12:45:58

* Robert Haas (robertmhaas@gmail.com) wrote:
> Ugg, wait a minute.  This not only adds %U; it also changes the
> behavior of %u, which I don't think we've agreed on.  Also, emitting
> 'none' when not SET ROLE has been done is pretty ugly.  I'm back to
> thinking we need to push this out to 9.2 and take more time to think
> about this.

%u, user_name, etc changes reverted.

%U now always returns the role currently being used for permissions
checks, by using show_session_authorization() when show_role() returns
'none'.  Ditto for CSV updates.

git log below, re-based patch attached.  All regression tests passed,
tested with log_line_prefix and csvlog also, all looks good to me.

Robert, if you say this has to be punted to 9.2 again, I'm giving up. ;)

    Thanks,

        Stephen

Attachment

Re: Add support for logging the current role

From

Robert Haas

Date:

17 February 2011, 15:46:24

On Thu, Feb 17, 2011 at 11:45 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Robert, if you say this has to be punted to 9.2 again, I'm giving up. ;)

Frankly, this patch has already consumed more than its fair share of
my attention.  Having said that, I've just spent some more time on it.
 I tightened up both the code and the docs a bit.  I fixed
log_line_prefix so that it doesn't needlessly compute the value to be
used for %U when %U isn't used.  I fixed the CSV logging code to do
proper escaping.  Updated patch attached.

It seems there's at least one more thing to worry about here, which is
the overhead of this computation when CSV logging is in use.  If no
SET ROLE or SET SESSION AUTHORIZATION commands are in use, the code
will call show_role(), which will return "none".  We'll then strcmp()
that against "none" and decide to call show_session_authorization(),
which will call strtoul() to find the comma separator and then return
a pointer to the string that follows it.  Now, none of that is
enormously expensive, so maybe it's not worth worrying about, but
since logging can be a hotspot, I thought I'd mention it and solicit
an opinion on whether that's likely to be a problem in practice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

logrole-rmh.patch

Re: Add support for logging the current role

From

Josh Berkus

Date:

17 February 2011, 16:07:03

Robert,

> It seems there's at least one more thing to worry about here, which is
> the overhead of this computation when CSV logging is in use.  If no
> SET ROLE or SET SESSION AUTHORIZATION commands are in use, the code
> will call show_role(), which will return "none".  We'll then strcmp()
> that against "none" and decide to call show_session_authorization(),
> which will call strtoul() to find the comma separator and then return
> a pointer to the string that follows it.  Now, none of that is
> enormously expensive, so maybe it's not worth worrying about, but
> since logging can be a hotspot, I thought I'd mention it and solicit
> an opinion on whether that's likely to be a problem in practice.

That seems like enough to need a performance test.  No clear ideas here
on how we'd measure the overhead of that accurately, though.  Suggestions?

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com

Re: Add support for logging the current role

From

Tom Lane

Date:

17 February 2011, 17:53:41

Robert Haas <robertmhaas@gmail.com> writes:
> It seems there's at least one more thing to worry about here, which is
> the overhead of this computation when CSV logging is in use.  If no
> SET ROLE or SET SESSION AUTHORIZATION commands are in use, the code
> will call show_role(), which will return "none".  We'll then strcmp()
> that against "none" and decide to call show_session_authorization(),
> which will call strtoul() to find the comma separator and then return
> a pointer to the string that follows it.  Now, none of that is
> enormously expensive, so maybe it's not worth worrying about, but
> since logging can be a hotspot, I thought I'd mention it and solicit
> an opinion on whether that's likely to be a problem in practice.

Well, in the first place, going through two not-very-related APIs in
order to reverse-engineer what miscinit.c already knows is pretty silly
(not to mention full of possible bugs).  We ought to be looking at the
GetUserId state directly.

Now you will complain that elog.c mustn't try to map that OID back to
string form, which is true.  But IIRC, we used to keep the current
userid stored in both OID and string form.  The string form was removed
as unnecessary overhead, but maybe it'd be a good idea to put that back.

In short, add a bit of overhead at SetUserId time in order to make this
cheap (and accurate) in elog.c.
        regards, tom lane

Re: Add support for logging the current role

From

Stephen Frost

Date:

17 February 2011, 19:18:41

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > It seems there's at least one more thing to worry about here, which is
> > the overhead of this computation when CSV logging is in use.  If no
> > SET ROLE or SET SESSION AUTHORIZATION commands are in use, the code
> > will call show_role(), which will return "none".  We'll then strcmp()
> > that against "none" and decide to call show_session_authorization(),
> > which will call strtoul() to find the comma separator and then return
> > a pointer to the string that follows it.  Now, none of that is
> > enormously expensive, so maybe it's not worth worrying about, but
> > since logging can be a hotspot, I thought I'd mention it and solicit
> > an opinion on whether that's likely to be a problem in practice.
>
> Well, in the first place, going through two not-very-related APIs in
> order to reverse-engineer what miscinit.c already knows is pretty silly
> (not to mention full of possible bugs).  We ought to be looking at the
> GetUserId state directly.

GetUserId can end up being set in a number of places though, often in
places where we can't fail (SetUserIdAndSecContext has some nice
comments on this).

> Now you will complain that elog.c mustn't try to map that OID back to
> string form, which is true.  But IIRC, we used to keep the current
> userid stored in both OID and string form.  The string form was removed
> as unnecessary overhead, but maybe it'd be a good idea to put that back.

The OID and the string are kept in the role_string and
session_authorization_string GUCs respectively.  They're just not in a
terribly useful format, and because SetUserId() can change things w/o
the GUCs getting updated, there's a risk that they're wrong, which is
why show_role() does the stroul() dance to check if GetCurrentRoleId()
matches to what it stuffed into role_string.

> In short, add a bit of overhead at SetUserId time in order to make this
> cheap (and accurate) in elog.c.

We can't do the lookup in SetUserIDAndSecContext(), and I'm not
convinced we actually want to anyway, since that would end up returning
what the role is inside of security definer functions and the like.
We're already setting a variable in assign_session_authorization and
assign_role that has the information we need.  We could inspect
role_string ourselves (including the strcmp() and strtoul()) instead
of asking show_role() to do it for us but that doesn't strike me as all
*that* much of an improvement and goes around the API that at least
exists.

We could certainly have a second set of variables which are set by
assign_role/assign_session_authorization that are in a format we can
more easily use but what would that mean for the GUC variables..?  I
don't know that we'd want to keep them duplicating the data..  Would it
be possible to actually use a struct instead of a straight-up string
there?  Is there any particular reason we keep monkeying around with
storing the OID, superuser bit, role name, etc, as a string anyway..?
Thanks,
    Stephen

Re: Add support for logging the current role

From

Robert Haas

Date:

17 February 2011, 23:05:03

On Thu, Feb 17, 2011 at 4:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> It seems there's at least one more thing to worry about here, which is
>> the overhead of this computation when CSV logging is in use.  If no
>> SET ROLE or SET SESSION AUTHORIZATION commands are in use, the code
>> will call show_role(), which will return "none".  We'll then strcmp()
>> that against "none" and decide to call show_session_authorization(),
>> which will call strtoul() to find the comma separator and then return
>> a pointer to the string that follows it.  Now, none of that is
>> enormously expensive, so maybe it's not worth worrying about, but
>> since logging can be a hotspot, I thought I'd mention it and solicit
>> an opinion on whether that's likely to be a problem in practice.
>
> Well, in the first place, going through two not-very-related APIs in
> order to reverse-engineer what miscinit.c already knows is pretty silly
> (not to mention full of possible bugs).  We ought to be looking at the
> GetUserId state directly.
>
> Now you will complain that elog.c mustn't try to map that OID back to
> string form, which is true.  But IIRC, we used to keep the current
> userid stored in both OID and string form.  The string form was removed
> as unnecessary overhead, but maybe it'd be a good idea to put that back.
>
> In short, add a bit of overhead at SetUserId time in order to make this
> cheap (and accurate) in elog.c.

As Stephen says, I think this is utterly impractical; those routines
can't ever throw any kind of error.  I was mostly wondering whether we
ought to create a function show_explicitly_set_role() or somesuch that
would do basically the same thing as the proposed code, but try to
avoid the strcmp in the common case where no set role has been done.
I'm not very certain it's worth worrying about, though.
write_csvlog() is not a trivial function as it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Tom Lane

Date:

17 February 2011, 23:40:35

Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Feb 17, 2011 at 4:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In short, add a bit of overhead at SetUserId time in order to make this
>> cheap (and accurate) in elog.c.

> As Stephen says, I think this is utterly impractical; those routines
> can't ever throw any kind of error.

Why would they need to throw an error?  It'd be on the caller's head to
supply the role name along with OID.  We can keep the name in a static
buffer of size NAMEDATALEN, so don't tell me about palloc failures
either.

The logging design as it stands seems to me to be a Rube Goldberg device
that is probably going to have corner-case bugs quite aside from its
possible performance issues.
        regards, tom lane

Re: Add support for logging the current role

From

Robert Haas

Date:

18 February 2011, 09:32:43

On Thu, Feb 17, 2011 at 10:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Feb 17, 2011 at 4:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> In short, add a bit of overhead at SetUserId time in order to make this
>>> cheap (and accurate) in elog.c.
>
>> As Stephen says, I think this is utterly impractical; those routines
>> can't ever throw any kind of error.
>
> Why would they need to throw an error?  It'd be on the caller's head to
> supply the role name along with OID.  We can keep the name in a static
> buffer of size NAMEDATALEN, so don't tell me about palloc failures
> either.

OK, but there are not a small number of callers of that function, and
they don't necessarily have the correct info at hand.  For example,
you'd need to add prevUserAsText to TransactionStateData, which
doesn't seem appealing.

> The logging design as it stands seems to me to be a Rube Goldberg device
> that is probably going to have corner-case bugs quite aside from its
> possible performance issues.

I think you're overreacting.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add support for logging the current role

From

Bruce Momjian

Date:

11 March 2011, 10:00:06

Is this something for the next commit-fest?

---------------------------------------------------------------------------

Stephen Frost wrote:
-- Start of PGP signed section.
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> > > It seems there's at least one more thing to worry about here, which is
> > > the overhead of this computation when CSV logging is in use.  If no
> > > SET ROLE or SET SESSION AUTHORIZATION commands are in use, the code
> > > will call show_role(), which will return "none".  We'll then strcmp()
> > > that against "none" and decide to call show_session_authorization(),
> > > which will call strtoul() to find the comma separator and then return
> > > a pointer to the string that follows it.  Now, none of that is
> > > enormously expensive, so maybe it's not worth worrying about, but
> > > since logging can be a hotspot, I thought I'd mention it and solicit
> > > an opinion on whether that's likely to be a problem in practice.
> > 
> > Well, in the first place, going through two not-very-related APIs in
> > order to reverse-engineer what miscinit.c already knows is pretty silly
> > (not to mention full of possible bugs).  We ought to be looking at the
> > GetUserId state directly.
> 
> GetUserId can end up being set in a number of places though, often in
> places where we can't fail (SetUserIdAndSecContext has some nice
> comments on this).
> 
> > Now you will complain that elog.c mustn't try to map that OID back to
> > string form, which is true.  But IIRC, we used to keep the current
> > userid stored in both OID and string form.  The string form was removed
> > as unnecessary overhead, but maybe it'd be a good idea to put that back.
> 
> The OID and the string are kept in the role_string and
> session_authorization_string GUCs respectively.  They're just not in a
> terribly useful format, and because SetUserId() can change things w/o
> the GUCs getting updated, there's a risk that they're wrong, which is
> why show_role() does the stroul() dance to check if GetCurrentRoleId()
> matches to what it stuffed into role_string.
> 
> > In short, add a bit of overhead at SetUserId time in order to make this
> > cheap (and accurate) in elog.c.
> 
> We can't do the lookup in SetUserIDAndSecContext(), and I'm not
> convinced we actually want to anyway, since that would end up returning
> what the role is inside of security definer functions and the like.
> We're already setting a variable in assign_session_authorization and
> assign_role that has the information we need.  We could inspect
> role_string ourselves (including the strcmp() and strtoul()) instead
> of asking show_role() to do it for us but that doesn't strike me as all
> *that* much of an improvement and goes around the API that at least
> exists.
> 
> We could certainly have a second set of variables which are set by
> assign_role/assign_session_authorization that are in a format we can
> more easily use but what would that mean for the GUC variables..?  I
> don't know that we'd want to keep them duplicating the data..  Would it
> be possible to actually use a struct instead of a straight-up string
> there?  Is there any particular reason we keep monkeying around with
> storing the OID, superuser bit, role name, etc, as a string anyway..?
> 
>     Thanks,
> 
>         Stephen
-- End of PGP section, PGP failed!

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +

Re: Add support for logging the current role

From

Stephen Frost

Date:

11 March 2011, 10:21:57

* Bruce Momjian (bruce@momjian.us) wrote:
> Is this something for the next commit-fest?

I already moved it there..
Thanks,
    Stephen