Thread: proposal: add columns created and altered to pg_proc and pg_class
Hello, this my proposal is very simple. It help to people who have to manage large or complex database system. Important data are date of creating and date of altering tables and stored procedures. These data cannot be modified by user, so implementation doesn't need any new statements. Notes, objections? Regards Pavel Stehule
Pavel Stehule <pavel.stehule@gmail.com> wrote: > Important data are date of creating and date of altering tables > and stored procedures. These data cannot be modified by user, so > implementation doesn't need any new statements. > > Notes, objections? This feature has been present in other database products I've used, and I occasionally miss it. PostgreSQL doesn't have stored procedures, but has functions. This doesn't sound like something to propose past feature freeze, so I assume you're asking about a potential 8.5 feature. -Kevin
Pavel Stehule <pavel.stehule@gmail.com> writes: > this my proposal is very simple. It help to people who have to manage > large or complex database system. Important data are date of creating > and date of altering tables and stored procedures. These data cannot > be modified by user, so implementation doesn't need any new > statements. ISTM anyone who thinks they need this actually need a full DDL log; or at least, if we give them this, they will be back next week asking for a full log. So it'd save a lot of work to tell them to just log their DDL to start with. Some obvious objections to the simple approach: - what if I want to know *who* made the change - what if I need to know about the change before last - what if I need to know about a DROP - what if I need to know about operators, operator classes, schemas, etc etc regards, tom lane
2009/4/13 Kevin Grittner <Kevin.Grittner@wicourts.gov>: > Pavel Stehule <pavel.stehule@gmail.com> wrote: >> Important data are date of creating and date of altering tables >> and stored procedures. These data cannot be modified by user, so >> implementation doesn't need any new statements. >> >> Notes, objections? > > This feature has been present in other database products I've used, > and I occasionally miss it. > > PostgreSQL doesn't have stored procedures, but has functions. > sure, I know :) > This doesn't sound like something to propose past feature freeze, so > I assume you're asking about a potential 8.5 feature. > yes, it's for 8.5 Pavel > -Kevin >
2009/4/13 Tom Lane <tgl@sss.pgh.pa.us>: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> this my proposal is very simple. It help to people who have to manage >> large or complex database system. Important data are date of creating >> and date of altering tables and stored procedures. These data cannot >> be modified by user, so implementation doesn't need any new >> statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week asking > for a full log. So it'd save a lot of work to tell them to just log > their DDL to start with. Yes, it is solution, but this method isn't too much practical. You have to do some grep and regular-expressions gaming for to get some info. It's similar info about autovacuum. > > Some obvious objections to the simple approach: > - what if I want to know *who* made the change it should be interesting, but I see two problems - a) user's should be dropped, b) lot of firms use some special no login user for creating objects. But it has sense. > - what if I need to know about the change before last it is out of relation databases. Should be solved via triggers on DDL statements. You hypothetical request going to much far - you should to store state before, DDL statement, . In this moment I don't would to create complete system like CVS. And I would not do it in future. When PostgreSQL will have some DDL statement triggers or some similar, then people will do own complete systems. For me, time of last change is basic property like owner, size, name ... > - what if I need to know about a DROP it's similar to creation time and modify time in file systems. When you need some special, you need some special techniques - watchers, ... > - what if I need to know about operators, operator classes, schemas, etc > etc + databases, all is important, and why not? All your objections are regular, but I propose some what I thing is most important, and other should be added later (step by step). > I am sure, so all these information should be taken from outer sources, and reason, why I should it, should be removed via some procedural or organisation rules. One sample from my current practise. I am working as database architect in large firm. I have full access to postgres on develop, deploy and preprod environment. But I haven't any special access on production. I am able to see logs on production. But when I would to check if some patches was applied, then I have to search in logs - and logs are parted and rotated - for me is important information about last change on tables or functions. Of course, there are other kind of objects, but I never needed this info. regards Pavel Stehule > regards, tom lane >
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> this my proposal is very simple. It help to people who have to >> manage large or complex database system. Important data are date of >> creating and date of altering tables and stored procedures. These >> data cannot be modified by user, so implementation doesn't need any >> new statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week > asking for a full log. So it'd save a lot of work to tell them to > just log their DDL to start with. > > Some obvious objections to the simple approach: > - what if I want to know *who* made the change > - what if I need to know about the change before last > - what if I need to know about a DROP > - what if I need to know about operators, operator classes, schemas, > etc etc Well, in a situation where you've got 80-some production databases and dozens of development databases (the number changes from day to day as now projects create code forks and other merge back in) it is occasionally useful to get simple information such as Pavel proposes from the system tables. I don't think that anyone would expect the system tables to track the complete history -- just save someone time tracking down the complete record when such simple information would suffice. In terms of value -- I was wishing I had it just last week -- it would have saved me a few minutes. It was probably two or three months prior to that I last wished for it. Definitely not huge from my perspective -- just an occasional convenience which some other DBMS products provide. -Kevin
On Mon, Apr 13, 2009 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> this my proposal is very simple. It help to people who have to manage >> large or complex database system. Important data are date of creating >> and date of altering tables and stored procedures. These data cannot >> be modified by user, so implementation doesn't need any new >> statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week asking > for a full log. So it'd save a lot of work to tell them to just log > their DDL to start with. DDL logs are good, but you generally can't keep them around forever, so it's helpful to have some basic information that occupies O(1) space. So based on that I'd respond to these objections as follows: > Some obvious objections to the simple approach: > - what if I want to know *who* made the change Fine, let's log the OID of the creator and of the person who made the last change, too (or else decide that the dependency problems are too thorny - we'd need to set this to NULL if a role is dropped - and don't). > - what if I need to know about the change before last Tough, you should have a DDL log. > - what if I need to know about a DROP Tough, you should have a DDL log. > - what if I need to know about operators, operator classes, schemas, etc > etc Fine, let's log this info for those too (or else decide they're too obscure and don't - pg_class and pg_proc are certainly the most interesting cases). In my applications, these requirements comes up frequently for user data and I've handled it by adding creation_time, creator_id, last_updated_time, last_updater_id columns to nearly every table that users can modify. It satisfies 90% of the auditing requirements for 10% of the work, and there's nothing to say that a more elaborate mechanism can't be built and used where necessary (which I have also done - but only for particularly critical data). ...Robert
On Mon, Apr 13, 2009 at 1:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> this my proposal is very simple. It help to people who have to manage >> large or complex database system. Important data are date of creating >> and date of altering tables and stored procedures. These data cannot >> be modified by user, so implementation doesn't need any new >> statements. > > ISTM anyone who thinks they need this actually need a full DDL log; while i agree with that, what i actually think will be useful is to have the DDL log in a separate file... because it is a security log and probably i will want different info than in a log to measure average performance -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157
>> - what if I need to know about operators, operator classes, schemas, etc >> etc > Fine, let's log this info for those too (or else decide they're too > obscure and don't - pg_class and pg_proc are certainly the most > interesting cases). I would suggest putting this info in a separate table, pg_change. It would have oid, catalog, user_changed, changed_on. That way we could simply keep the data for all objects which have an OID. This would also supposedly allow us to track drops if we wanted. We'd have to check on overhead of this, though, and maybe make it a GUC to track it. This would also be tremendously useful to suppliment replication systems. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > I would suggest putting this info in a separate table, pg_change. It > would have oid, catalog, user_changed, changed_on. That way we could > simply keep the data for all objects which have an OID. That makes more sense to me --- it would easily extend to all cases and would not impose any overhead (in the form of useless columns) for catalogs that you didn't want to track in a particular case. The main problem that would have to be considered is how to flush no-longer-useful entries (which of course entails deciding which those are). regards, tom lane
On Mon, Apr 13, 2009 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Josh Berkus <josh@agliodbs.com> writes: >> I would suggest putting this info in a separate table, pg_change. It >> would have oid, catalog, user_changed, changed_on. That way we could >> simply keep the data for all objects which have an OID. > > That makes more sense to me --- it would easily extend to all cases > and would not impose any overhead (in the form of useless columns) > for catalogs that you didn't want to track in a particular case. > > The main problem that would have to be considered is how to flush > no-longer-useful entries (which of course entails deciding which > those are). I kinda think that the only thing that's going to make sense here is to drop the pg_change entries when the object is dropped. Now, admittedly, that means you can't track drops. But otherwise, you have the potential for pg_change to get really big and full of cruft, and I don't think there's going to be an easy way to garbage collect it. I really like the basic design, though. ...Robert
2009/4/14 Josh Berkus <josh@agliodbs.com>: > >>> - what if I need to know about operators, operator classes, schemas, etc >>> etc >> >> Fine, let's log this info for those too (or else decide they're too >> obscure and don't - pg_class and pg_proc are certainly the most >> interesting cases). > > I would suggest putting this info in a separate table, pg_change. It would > have oid, catalog, user_changed, changed_on. That way we could simply keep > the data for all objects which have an OID. > > This would also supposedly allow us to track drops if we wanted. > > We'd have to check on overhead of this, though, and maybe make it a GUC to > track it. > > This would also be tremendously useful to suppliment replication systems. > I though about it too. But I am not sure, if this isn't too complicated solution for simple task. If I thing little bit more - main important is timestamp of last change. regards Pavel Stehule > -- > Josh Berkus > PostgreSQL Experts Inc. > www.pgexperts.com >
Pavel Stehule <pavel.stehule@gmail.com> wrote: > I though about it too. But I am not sure, if this isn't too > complicated solution for simple task. If I thing little bit more - > main important is timestamp of last change. Yeah, if it would be too heavy to add a timestamp column or two to pg_class and maybe one or two others, why is it better to add a whole new table to maintain in parallel -- with it's own primary key, foreign keys (or similar integrity enforcement mechanism), etc. Others apparently see a bigger advantage to this than I, but if it's not something I can just eyeball while I'm looking at the object definition, it isn't likely to save me much over going to other sources. Let's not over-engineer this. -Kevin
On Tue, Apr 14, 2009 at 10:27 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Pavel Stehule <pavel.stehule@gmail.com> wrote: >> I though about it too. But I am not sure, if this isn't too >> complicated solution for simple task. If I thing little bit more - >> main important is timestamp of last change. > > Yeah, if it would be too heavy to add a timestamp column or two to > pg_class and maybe one or two others, why is it better to add a whole > new table to maintain in parallel -- with it's own primary key, > foreign keys (or similar integrity enforcement mechanism), etc. Making pg_class and pg_proc tables larger hurts run-time performance, potentially. Making a separate table only slows down DDL operations, which are much less frequent. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Apr 14, 2009 at 10:27 AM, Kevin Grittner >> Yeah, if it would be too heavy to add a timestamp column or two to >> pg_class and maybe one or two others, why is it better to add a whole >> new table to maintain in parallel -- with it's own primary key, >> foreign keys (or similar integrity enforcement mechanism), etc. > Making pg_class and pg_proc tables larger hurts run-time performance, > potentially. Making a separate table only slows down DDL operations, > which are much less frequent. And even more to the point, adding columns to the core system tables means you pay the performance cost *even when not using the feature*. We normally expect that inessential features should avoid making a performance impact on those who have no use for them. regards, tom lane
Robert Haas <robertmhaas@gmail.com> wrote: > Making pg_class and pg_proc tables larger hurts run-time performance, > potentially. Making a separate table only slows down DDL operations, > which are much less frequent. Copying the pg_class table, with oids and indexes, with and without the addition of one timestamp column, the timestamp column caused the copy to be about 11.3% larger; so I see your point. I guess I didn't realize just how tight the pg_class table was. Given all that, I'm going to say that from my perspective I don't think the convenience of saving the information is worth the cost, with either approach. I understand it might mean more to others. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > the timestamp column caused the copy to be about 11.3% larger Grabbed the wrong numbers. It's really 2.5%, but still.... -Kevin
On Tue, Apr 14, 2009 at 2:13 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: >> the timestamp column caused the copy to be about 11.3% larger > > Grabbed the wrong numbers. It's really 2.5%, but still.... Well, that's why Tom doesn't want to add it to pg_class. But putting it in a separate table will have no impact on the speed of anything except DDL statements, and even then it won't require copying the whole table, so the performance impact will be pretty minimal, so I think it should be all right. ...Robert
Tom Lane wrote: > Pavel Stehule <pavel.stehule@gmail.com> writes: > > this my proposal is very simple. It help to people who have to manage > > large or complex database system. Important data are date of creating > > and date of altering tables and stored procedures. These data cannot > > be modified by user, so implementation doesn't need any new > > statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week asking > for a full log. So it'd save a lot of work to tell them to just log > their DDL to start with. > > Some obvious objections to the simple approach: > - what if I want to know *who* made the change > - what if I need to know about the change before last > - what if I need to know about a DROP > - what if I need to know about operators, operator classes, schemas, etc > etc How do you handle dump/restore? Is it preserved? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > How do you handle dump/restore? Is it preserved? I would think not. regards, tom lane