Re: pgaudit - an auditing extension for PostgreSQL - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: pgaudit - an auditing extension for PostgreSQL |
Date | |
Msg-id | 20150119032837.GC3062@tamriel.snowman.net Whole thread Raw |
In response to | Re: pgaudit - an auditing extension for PostgreSQL (Abhijit Menon-Sen <ams@2ndQuadrant.com>) |
Responses |
Re: pgaudit - an auditing extension for PostgreSQL
|
List | pgsql-hackers |
Abhijit, Apologies, I've been meaning to go through this for quite some time. * Abhijit Menon-Sen (ams@2ndQuadrant.com) wrote: > I've changed pgaudit to work as you suggested. Great, glad to see that. > A quick note on the implementation: pgaudit was already installing an > ExecutorCheckPerms_hook anyway; I adapted code from ExecRTECheckPerms > to check if the audit role has been granted any of the permissions > required for the operation. Sure, makes sense to me. > This means there are three ways to configure auditing: > > 1. GRANT … ON … TO audit, which logs any operations that correspond to > the granted permissions. I was thinking we would be able to configure which role this applies to, rather than hard-coding 'audit' as *the* role. Perhaps with a new GUC, or the existing 'pgaudit.roles' GUC could be used. Further, this can be extrapolated out to cover auditing of roles by checking for role membership in this role, see below. > 2. Set pgaudit.roles = 'r1, r2, …', which logs everything done by r1, > r2, and any of their descendants. If these roles were the 'audit' roles as considered under #1 then couldn't administrators control what pgaudit.roles provide today using role memberships granted to the roles listed? > 3. Set pgaudit.log = 'read, write, …', which logs any events in any of > the listed classes. Approach #1 addresses this too, doesn't it? SELECT vs. UPDATE vs. DELETE, etc? I'm not sure that this really adds much as it isn't nearly as granular as the GRANT-based approach since it applies to everything. One of the really big and interesting distinctions to consider between checking permissions granted to an 'audit' role vs. role membership is how DDL is handled. As currently implemented, role-level auditing is required to have DDL be logged, but you may very well want to log all DDL and no DML, for example. What would be really helpful is to develop a wiki to cover common use-cases and how to set up pgaudit for them. > (This is a small change from the earlier behaviour where, if a role was > listed in .roles, it was still subject to the .log setting. I find that > more useful in practice, but since we're discussing Stephen's proposal, > I implemented what he said.) Right- the idea is to provide a very granular way of specifying what is to be logged; much more than what the previous pair of GUCs provided. > The new pgaudit.c is attached here for review. Nothing else has changed > from the earlier submission; and everything is in the github repository > (github.com/2ndQuadrant/pgaudit). There's a few big questions we need to address with pgaudit to have it go into our repo. The first is what major version(s) we're targetting. My feeling is that we should strip down what goes into contrib to be 9.5 based. I certainly understand that you want to support earlier versions but I don't think it makes sense to try and carry that baggage in contrib when it won't ever be released with earlier versions. The second is the USE_DEPARSE_FUNCTIONS-related bits. I realize that there's a goal to get additional things into 9.5 from that branch but it doesn't make sense for the contrib pgaudit to include those components prior to them actually being in core. I'm also wondering if pieces of that are now out-of-date with where core is. If those parts are going to go into 9.5 soon (which looks like it may happen..) then hopefully we can just remove the #define's and clean up the code to use the new capabilities. Lastly, I'll echo a concern which Robert has raised which is- how do we make sure that new commands, etc, are covered? I don't particularly like how pgaudit will need to be kept in sync with what's in ProcessUtility (normal and slow). I'm actually a bit hopeful that we can work out a way for pgaudit to help with this through a regression test which loops over all objects and tests logging each of them. One of the important aspects of auditing is that what is requested to be audited is and exactly that- no more, no less. Reviewing the code makes me pretty nervous about that actually happening properly, but that's mostly due to the back-and-forth between different major versions and the #ifdef/#ifndef's. Additional comments in-line below. > enum LogClass { > LOG_NONE = 0, > > /* SELECT */ > LOG_READ = (1 << 0), > > /* INSERT, UPDATE, DELETE, TRUNCATE */ > LOG_WRITE = (1 << 1), > > /* GRANT, REVOKE, ALTER … */ > LOG_PRIVILEGE = (1 << 2), > > /* CREATE/DROP/ALTER ROLE */ > LOG_USER = (1 << 3), > > /* DDL: CREATE/DROP/ALTER */ > LOG_DEFINITION = (1 << 4), > > /* DDL: CREATE OPERATOR etc. */ > LOG_CONFIG = (1 << 5), > > /* VACUUM, REINDEX, ANALYZE */ > LOG_ADMIN = (1 << 6), > > /* Function execution */ > LOG_FUNCTION = (1 << 7), > > /* Absolutely everything; not available via pgaudit.log */ > LOG_ALL = ~(uint64)0 > }; I'd really like to see this additional information regarding what kind a command is codified in core. Ideally, we'd have a single place which tracks the kind of command and possibly other information which can then be addressed all at once when new commands are added. Also, we could allow more granularity by using the actual classes which we already have for objects. > /* > * This module collects AuditEvents from various sources (event > * triggers, and executor/utility hooks) and passes them to the > * log_audit_event() function. > * > * An AuditEvent represents an operation that potentially affects a > * single object. If an underlying command affects multiple objects, > * multiple AuditEvents must be created to represent it. > */ The above doesn't appear to be accurate (perhaps it once was?) as log_audit_event() only takes a single AuditEvent structure at a time and it's not a list. I'm not sure that needs to change, just pointing out that the comment appears to be inaccurate. > typedef struct { > NodeTag type; > const char *object_id; > const char *object_type; > const char *command_tag; > const char *command_text; > bool granted; > } AuditEvent; I'm inclined to suggest that the decision be made earlier on about if a given action should be logged, as the earlier we do that the less work we have to do and we could avoid having to pass in things like 'granted' to the log_audit_event function at all. Having the decision split with the 'granted' flag being set from the callback and then the 'should_be_logged' happening inside log_audit_event doesn't strike me as terribly sensible. > /* > * Returns the oid of the hardcoded "audit" role. > */ > > static Oid > audit_role_oid() > { > HeapTuple roleTup; > Oid oid = InvalidOid; > > roleTup = SearchSysCache1(AUTHNAME, PointerGetDatum("audit")); > if (HeapTupleIsValid(roleTup)) { > oid = HeapTupleGetOid(roleTup); > ReleaseSysCache(roleTup); > } > > return oid; > } The above could go away if we follow my suggestions above, of course. > /* Returns true if either pgaudit.roles or pgaudit.log is set. */ > > static inline bool > pgaudit_configured() > { > return (pgaudit_roles_str && *pgaudit_roles_str) || pgaudit_log != 0; > } This feels, to me at least, like it shouldn't be necessary. If the check to see if we should be logging the current action is moved up sufficiently, then we may be able to do away with this function and the checks associated with it- those would turn into "should we be logging this?" and, if the lists are empty, would return "no" pretty quickly. > /* > * Takes a role OID and returns true if the role is mentioned in > * pgaudit.roles or if it inherits from a role mentioned therein; > * returns false otherwise. > */ > > static bool > role_is_audited(Oid roleid) > { > List *roles; > ListCell *lt; > > if (!pgaudit_roles_str || !*pgaudit_roles_str) > return false; > > if (!SplitIdentifierString(pgaudit_roles_str, ',', &roles)) > return false; > > foreach(lt, roles) > { > char *name = (char *)lfirst(lt); > HeapTuple roleTup; > > roleTup = SearchSysCache1(AUTHNAME, PointerGetDatum(name)); > if (HeapTupleIsValid(roleTup)) > { > Oid parentrole = HeapTupleGetOid(roleTup); > > ReleaseSysCache(roleTup); > if (is_member_of_role(roleid, parentrole)) > return true; > } > } > > return false; > } The results here should be cached, as was discussed earlier, iirc. > /* > * Takes a role OID and an AuditEvent and returns true or false > * depending on whether the event should be logged according to the > * pgaudit.roles/log settings. If it returns true, it also fills in the > * name of the LogClass which it is to be logged under. > */ > > static bool > should_be_logged(Oid userid, AuditEvent *e, const char **classname) > { > enum LogClass class = LOG_NONE; > char *name; > > /* > * Look at the type of the command and decide what LogClass needs to > * be enabled for the command to be logged. > */ > > switch (e->type) > { > case T_SelectStmt: > name = "READ"; > class = LOG_READ; > break; [... long list of every type we have ...] I'm really not a fan of these big switch statements which attempt to cover everything which currently exists. If we don't come up with a way to make very sure that these areas are updated as things change, we're going to end up missing things. As mentioned up-thread, I'd really like to see a way to identify these types from core-based information instead. > /* > * Anything that's left out of the list above is just noise, > * and not very interesting from an auditing perspective. So > * there's intentionally no way to enable LOG_ALL. > */ This approach really worries me- what about some new type which is added? How can we know that it's just "noise"? Whatever is considered 'noise' should at least be explicitly listed, so we know we're covering everything. > /* > * Takes an AuditEvent and, if it should_be_logged(), writes it to the > * audit log. The AuditEvent is assumed to be completely filled in by > * the caller (unknown values must be set to "" so that they can be > * logged without error checking). > */ > > static void > log_audit_event(AuditEvent *e) > { So, clearly, this is an area which could use some improvement. Specifically, being able to write to an independent file instead of just using ereport(), supporting other output formats (JSON, maybe even a table..). Also, have you considered using a dynamic shared memory block to queue the logging messages to and then a background worker to write them out in-order to a file? It'd clearly be similar to our existing ereport() mechanism, but importantly you'd be able to send it to another file. Being able to log directly to syslog or other message queues would be great also. These are things which could be added later, of course. > /* > * Create AuditEvents for DML operations via executor permissions > * checks. We create an AuditEvent for each table in the list of > * RangeTableEntries from the query. > */ > > static void > log_executor_check_perms(Oid auditOid, List *rangeTabls, bool abort_on_violation) > { > ListCell *lr; > > foreach(lr, rangeTabls) > { > Oid relOid; > Relation rel; > AuditEvent e; > RangeTblEntry *rte = lfirst(lr); > char *relname; > const char *tag; > const char *reltype; > NodeTag type; > > /* We only care about tables, and can ignore subqueries etc. */ > if (rte->rtekind != RTE_RELATION) > continue; > > /* > * Get the fully-qualified name of the relation. > * > * User queries against catalog tables (e.g. "\dt") are logged > * here. Should we filter them out, as we do for functions in > * pg_catalog? > */ I'd probably say 'yes', regarding this, though it might be nice to have it be optional.. > relOid = rte->relid; > rel = relation_open(relOid, NoLock); > relname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(rel)), > RelationGetRelationName(rel)); > relation_close(rel, NoLock); > > /* > * We don't have access to the parsetree here, so we have to > * generate the node type, object type, and command tag by > * decoding rte->requiredPerms and rte->relkind. > */ This seems like it could be improved- have you considered suggesting a hook which is, perhaps, in a better place and could pass in this information instead of having to attempt to regenerate it? > /* > * If a role named "audit" exists, we check if it has been > * granted permission to perform the operation identified above. > * If so, we must log the event regardless of the static pgaudit > * settings. > */ This isn't quite what was intended. This isn't about what the audit role could do but rather auditing specifically what the role has access to. The idea is that the permission system is used as a proxy for configuration at the action+column level. As an example, if an UPDATE with a WHERE clause is executed and the audit user has SELECT rights on the table, then the UPDATE should be logged due to the WHERE clause. At least, that's been my thinking as it mirrors our permissions system but, as Robert said, it's an 'anti-permission' approach. On the flip side, if UPDATE is granted to the audit role, but an UPDATE with WHERE clause is run then we should log that- even if the audit role doesn't have SELECT. This all matches up (but is opposite of) our regular permission system. Deviating from that strikes me as a bad idea. One caveat on this is that I was thinking the permission would have to be explicitly granted- that is, grants to PUBLIC would not count. That may be more difficult to deal with though, but it'd certainly make auditing of functions easier as execute is granted to public by default there and would create a lot of noise otherwise. > log_utility_command(Node *parsetree, > const char *queryString, > ProcessUtilityContext context, > ParamListInfo params, > DestReceiver *dest, > char *completionTag) Similar thoughts here wrt the huge switch statement, etc. > /* > * Create AuditEvents for certain kinds of CREATE and ALTER statements, > * as detected by log_object_access() in lieu of event trigger support > * for them. > */ Just to comment on this, the object access framework feels like it should, perhaps, be reworked to be based on event triggers.. I'm nervous that we're ending up with so many different ways to deal with what's happening under ProcessUtility, and that we've now split that in half between what can be done with event triggers and what can't, with no way for modules to easily know which is which. > log_function_execution(Oid objectId) > { > HeapTuple proctup; > Form_pg_proc proc; > const char *name; > AuditEvent e; > > proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(objectId)); > if (!proctup) > elog(ERROR, "cache lookup failed for function %u", objectId); > proc = (Form_pg_proc) GETSTRUCT(proctup); > > /* > * Logging execution of all pg_catalog functions would > * make the log unusably noisy. > */ > > if (IsSystemNamespace(proc->pronamespace)) > { > ReleaseSysCache(proctup); > return; > } > > name = quote_qualified_identifier(get_namespace_name(proc->pronamespace), > NameStr(proc->proname)); > ReleaseSysCache(proctup); > > e.type = T_ExecuteStmt; > e.object_id = name; > e.object_type = "FUNCTION"; > e.command_tag = "EXECUTE"; > if (debug_query_string) > e.command_text = debug_query_string; > else > e.command_text = ""; > e.granted = false; > > log_audit_event(&e); > } This doesn't address overloaded functions, does it? It probably should include the arguments and types in 'name'. Thanks! Stephen
pgsql-hackers by date: