Re: Auditing extension for PostgreSQL (Take 2) - Mailing list pgsql-hackers

From David Steele
Subject Re: Auditing extension for PostgreSQL (Take 2)
Date
Msg-id 55104204.80008@pgmasters.net
Whole thread Raw
In response to Re: Auditing extension for PostgreSQL (Take 2)  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Responses Re: Auditing extension for PostgreSQL (Take 2)
Re: Auditing extension for PostgreSQL (Take 2)
List pgsql-hackers
Thanks for the review, Abhijit.

On 3/23/15 1:31 AM, Abhijit Menon-Sen wrote:
> At 2015-02-24 11:22:41 -0500, david@pgmasters.net wrote:
>>
>> Patch v3 is attached.
>> +
>> +    /* Function execution */
>> +    LOG_MISC = (1 << 5),
>
> The comment above LOG_MISC should be changed.

Fixed.

> More fundamentally, this classification makes it easy to reuse LOGSTMT_*
> (and a nice job you've done of that, with just a few additional special
> cases), I don't think this level is quite enough for our needs. I think
> it should at least be possible to specifically log commands that affect
> privileges and roles.

I agree, but this turns out to be easier said than done.  In the prior
code for instance, CREATE ROLE was classified as USER, while ALTER ROLE
.. RENAME was classified as DDL.  This is because any rename gets the
command tag T_RenameStmt.  CreateCommandTag does return "ALTER ROLE",
but now we're in the realm of string-matching again which is not my
favorite thing.  Let me see if there is a clean way to get this
accomplished.  I've also felt this is the one thing I'd like to see
broken out.

> I'm fond of finer categorisation for DDL as well, but I could live with
> all DDL being lumped together.
>
> I'm experimenting with a few approaches to do this without reintroducing
> switch statements to test every command. That will require core changes,
> but I think we can find an acceptable arrangement. I'll post a proof of
> concept in a few days.

I also think finer-grained categorization would be best accomplished
with some core changes.  It seemed too late to get those in for 9.5 so I
decided to proceed with what I knew could be done reliably with the idea
to improve it with core changes going forward.

I look forward to your proof-of-concept.

>> + * Takes an AuditEvent and, if it log_check(), writes it to the audit
>> log.
>
> I don't think log_check is the most useful name, because this sentence
> doesn't tell me what the function may do. Similarly, I would prefer to
> have log_acl_check be renamed acl_grants_audit or similar. (These are
> all static functions anyway, I don't think a log_ prefix is needed.)

log_check() has become somewhat vestigial at this point since it is only
called from one place - I've been considering removing it and merging
into log_audit_event().  For the moment I've improved the comments.

I like acl_grants_audit() and agree that it's a clearer name.  I'll
incorporate that into the next version and apply the same scheme to the
other ACL functionsas well as do a general review of naming.

>> +    /* Free the column set */
>> +    bms_free(tmpSet);
>
> (An aside, really: there are lots of comments like this, which I don't
> think add anything to understanding the code, and should be removed.)

I generally feel like you can't have too many comments.  I think even
the less interesting/helpful comments help break the code into
functional sections for readability.

>> +        /*
>> +         * We don't have access to the parsetree here, so we have to generate
>> +         * the node type, object type, and command tag by decoding
>> +         * rte->requiredPerms and rte->relkind.
>> +         */
>> +        auditEvent.logStmtLevel = LOGSTMT_MOD;
>
> (I am also trying to find a way to avoid having to do this.)

That would be excellent.

>> +        /* Set object type based on relkind */
>> +        switch (class->relkind)
>> +        {
>> +            case RELKIND_RELATION:
>> +                utilityAuditEvent.objectType = OBJECT_TYPE_TABLE;
>> +                break;
>
> This occurs elsewhere too. But I suppose new relkinds are added less
> frequently than new commands.

Well, that's the hope at least.  I should mention that ALL statements
will be logged no matter what additional classification happens.  The
amount of information returned may not be ideal, but nothing is ever
excluded from logging (depending on the classes selected, of course).

> Again on a larger level, I'm not sure how I feel about _avoiding_ the
> use of event triggers for audit logging. Regardless of whether we use
> the deparse code (which I personally think is a good idea; Álvaro has
> been working on it, and it looks very nice) to log extra information,
> using the object access hook inevitably means we have to reimplement
> the identification/classification code here.
>
> In "old" pgaudit, I think that extra effort is justified by the need to
> be backwards compatible with pre-event trigger releases. In a 9.5-only
> version, I am not at all convinced that this makes sense.
>
> Thoughts?

I was nervous about basing pg_audit on code that I wasn't sure would be
committed (at the time).  Since pg_event_trigger_get_creation_commands()
is tied up with deparse, I honestly didn't feel like the triggers were
bringing much to the table.

That being said, I agree that the deparse code is very useful and now
looks certain to be committed for 9.5.

I have prepared a patch that brings event triggers and deparse back to
pg_audit based on the Alvaro's dev/deparse branch at
git://git.postgresql.org/git/2ndquadrant_bdr.git (commit 0447fc5).  I've
updated the unit tests accordingly.

I left in the OAT code for now.  It does add detail to one event that
the event triggers do not handle (creating PK indexes) and I feel that
it lends an element of safety in case something happens to the event
triggers.  OAT is also required for function calls so the code path
cannot be eliminated entirely.  I'm not committed to keeping the
redundant OAT code, but I'd rather not remove it until deparse is
committed to master.

I've been hesitant to post this patch as it will not work in master
(though it will compile), but I don't want to hold on to it any longer
since the end of the CF is nominally just weeks away.  If you want to
run the patch in master, you'll need to disable the
pg_audit_ddl_command_end trigger.

I've also addressed Fujii's concerns about logging parameters - this is
now an option.  The event stack has been formalized and
MemoryContextRegisterResetCallback() is used to cleanup the stack on errors.

Let me know what you think.

--
- David Steele
david@pgmasters.net

Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: barnacle (running CLOBBER_CACHE_RECURSIVELY) seems stuck since November
Next
From: Peter Geoghegan
Date:
Subject: Re: Abbreviated keys for Numeric