Thread: YADP - Yet another Dependency Patch
This one has been corrected to fit in with Toms recent changes, as well as the changes with command/ restructuring. Please accept or reject quickly, else risk conflicts.
Attachment
Rod Taylor <rbt@zort.ca> writes: > This one has been corrected to fit in with Toms recent changes, as well > as the changes with command/ restructuring. > Please accept or reject quickly, else risk conflicts. This is interesting but certainly far from ready for prime time. Some random comments, in no particular order: 1. I don't like the code that installs and removes ad-hoc dependencies from relations to type Oid. On its own terms it's wrong (if it were right, then you'd also need to be installing dependencies for Tid, Xid, and the types of the other system columns), but on a larger scale I don't see the point of expending cycles, disk space, and code complexity to record these dependencies. The system *cannot* work if you drop type Oid. So it'd seem to make more sense to wire in a notion that certain types, tables, etc are "pinned" and may never be dropped; then there's no need to create explicit dependencies on them. If you want an explicit representation of pinning in the pg_depends table, perhaps it would work to create a row claiming that "table 0 / Oid 0 / subid 0" depends on a pinned object. 2. Is it really necessary to treat pg_depends as a bootstrapped relation? That adds a lot of complexity, as you've evidently already found, and it does not seem necessary if you're going to load the system dependencies in a later step of the initdb process. You can just make the dependency-adding routines be no-ops in bootstrap mode; then create pg_depends as an ordinary system catalog; and finally load the entries post-bootstrap. 3. Isn't there a better way to find the initial dependencies? That SELECT is truly ugly, and more to the point is highly likely to break anytime someone rearranges the catalogs. I'd like to see it generated automatically (maybe using a tool like findoidjoins); or perhaps we could do the discovery of the columns to look at on-the-fly. 4. Do not use the parser's RESTRICT/CASCADE tokens as enumerated type values. They change value every time someone tweaks the grammar. (Yes, I know you copied from extant code; that code is on my hitlist.) Define your own enum type instead of creating a lot of bogus dependencies on parser/parser.h. 5. Avoid using heapscans on pg_depend; it's gonna be way too big for that to give acceptable performance. Make sure you have indexes available to match your searches, and use the systable_scan routines. 6. The tests on relation names in dependDelete, getObjectName are (a) slow and (b) not schema-aware. Can you make these into OID comparisons instead? 7. The namespaceIdGetNspname routine you added is unsafe (it would need to pstrdup the name to be sure it's still valid after releasing the syscache entry); but more to the point, it duplicates a routine already present in lsyscache.c (which is where this sort of utility generally belongs, anyway). 8. Aggregate code seems unaware that aggfinalfn is optional. I have to leave, but reserve the right to make more comments later ;-) In general though, this seems like a cool approach and definitely worth pursuing. Keep at it! regards, tom lane
[ copied to hackers ] > 1. I don't like the code that installs and removes ad-hoc dependencies > from relations to type Oid. On its own terms it's wrong (if it were ... > explicit representation of pinning in the pg_depends table, perhaps it > would work to create a row claiming that "table 0 / Oid 0 / subid 0" > depends on a pinned object. Yes, a pinned dependency makes much more sense. int4, bool, varchar, and name are in the same boat. I'll make it so dependCreate() will ignore adding any additional dependencies on pinned types (class 0, Oid 0, SubID 0) and dependDelete() will never allow deletion when that dependency exists. > 2. Is it really necessary to treat pg_depends as a bootstrapped > relation? That adds a lot of complexity, as you've evidently already > found, and it does not seem necessary if you're going to load the system > dependencies in a later step of the initdb process. You can just make > the dependency-adding routines be no-ops in bootstrap mode; then create > pg_depends as an ordinary system catalog; and finally load the entries > post-bootstrap. Ack.. <sound of hand hitting head>. All that work to avoid a simple if statement. Ahh well.. learning at it's finest :) > 3. Isn't there a better way to find the initial dependencies? That > SELECT is truly ugly, and more to the point is highly likely to break > anytime someone rearranges the catalogs. I'd like to see it generated > automatically (maybe using a tool like findoidjoins); or perhaps we > could do the discovery of the columns to look at on-the-fly. I'm not entirely sure how to approach this, but it does appear that findoidjoins would find all the relations. So... I could create a pg_ function which will find all oid joins, and call dependCreate() for each entry it finds. That way dependCreate will ignore anything that was pinned (see above) automagically. It would also make initdb quite slow, and would add a pg_ function that one should normally avoid during normal production. Then again, I suppose it could be used to recreate missing dependencies if a user was manually fiddling with that table. initdb would call SELECT pg_findSystemDepends(); or something. > 4. Do not use the parser's RESTRICT/CASCADE tokens as enumerated type > values. They change value every time someone tweaks the grammar. > (Yes, I know you copied from extant code; that code is on my hitlist.) > Define your own enum type instead of creating a lot of bogus > dependencies on parser/parser.h. All but one of those will go away once the functions are modified to accept the actual RESTRICT or CASCADE bit. That was going to be step 2 of the process but I suppose I could do it now, along with a rather large regression test. The only place that RESTRICT will be used is dependDelete(); Nowhere else will care. It'll simply pass on what was given to it by the calling function from utility.c or a cascading dependDelete. Of course, gram.y will be littered with the 'opt_restrictcascade' tag. The RESTRICT usage is more of a current placeholder. I've marked the includes as /* FOR RESTRICT */ for that reason, make them easy to remove later. > 6. The tests on relation names in dependDelete, getObjectName are (a) > slow and (b) not schema-aware. Can you make these into OID comparisons > instead? Ahh yes. Good point.
Rod Taylor wrote: > [ copied to hackers ] > > > 1. I don't like the code that installs and removes ad-hoc > dependencies > > from relations to type Oid. On its own terms it's wrong (if it were Looks good to me. I realize this is a huge chunk of code. The only ultra-minor thing I saw was the use of dashes for comment blocks when not needed: /* ---- * word * ---- */ We use dashes like this only for comments that shouldn't be reformatted by pgindent. The one thing I would like to know is what things does it track, and what does it not track? Does it complete any TODO items, or do we save that for later? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Yes, I've removed all of those. I've submitted a list of stuff it tracked earlier, but will do so again with the next patch. Basically, anything simple and straightforward ;) It doesn't manage dependencies of function code, view internals, default internals as I don't know how to find navigate an arbitrary parse tree for that information. Some function code isn't even available (C), so its next to impossible. No, it doesn't directly tick off any todo item other than the first in dependency. Make pg_depend table ;) -- Rod Taylor Your eyes are weary from staring at the CRT. You feel sleepy. Notice how restful it is to watch the cursor blink. Close your eyes. The opinions stated above are yours. You cannot imagine why you ever felt otherwise. ----- Original Message ----- From: "Bruce Momjian" <pgman@candle.pha.pa.us> To: "Rod Taylor" <rbt@zort.ca> Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; <pgsql-patches@postgresql.org>; "Hackers List" <pgsql-hackers@postgresql.org> Sent: Tuesday, April 16, 2002 1:15 AM Subject: Re: [HACKERS] [PATCHES] YADP - Yet another Dependency Patch > Rod Taylor wrote: > > [ copied to hackers ] > > > > > 1. I don't like the code that installs and removes ad-hoc > > dependencies > > > from relations to type Oid. On its own terms it's wrong (if it were > > Looks good to me. I realize this is a huge chunk of code. The only > ultra-minor thing I saw was the use of dashes for comment blocks when > not needed: > > /* ---- > * word > * ---- > */ > > We use dashes like this only for comments that shouldn't be reformatted > by pgindent. > > The one thing I would like to know is what things does it track, and > what does it not track? Does it complete any TODO items, or do we save > that for later? > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 >
Rod Taylor wrote: > Yes, I've removed all of those. > > I've submitted a list of stuff it tracked earlier, but will do so > again with the next patch. Basically, anything simple and > straightforward ;) Thanks for an updated list of tracked items. I know you posted it earlier, but some are complaining recently that patches are coming in too fast or with too long of a gap between discussion and patch arrival, and they are getting confused trying to track where we are going. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026