Thread: Re: extension facility (was: revised hstore patch)
On Tue, Jul 21, 2009 at 8:56 PM, Robert Haas<robertmhaas@gmail.com> wrote:
> A decent module infrastructure is probably not going to fix this
> problem unless it links with -ldwiw. There are really only two
> options here:
>
> - Keep the old version around for compatibility and add a new version
> that isn't compatible, plus provide a migration path from the old
> version to the new version.
>
> - Make the new version read the format written by the old version.

On Wed, Jul 22, 2009 at 1:40 PM, Dimitri Fontaine<dfontaine@hi-media.com> wrote:
> I beg to differ. The way for a decent *extension* facility to handle the case
> is by providing an upgrade function which accepts two arguments: old and new
> version of the module. Then the module author is able to run custom code
> from within the module upgrade transaction, where migrating on disk data
> representation is entirely possible. pg_depend would have to allow for easy
> finding of a given datatype column I guess.

Technically I suppose you're right, but I don't think that's going to work very well in practice. The whole point of binary upgrade is that you have a really big database where dump and reload is not practical. The numbers we've seen for pg_migrator make copy mode look WAY slower than link mode, and that's without doing any transformation on the data.

If you make the new code read the old on-disk format, you can upgrade your data a bit at a time if you want to. You can, for example, write a script to do a no-op update on a few thousand tuples every hour until they're all updated. That's really important for big datasets.

If you keep an old and a new version of the datatype, you can't upgrade a tuple at a time, but you can at least upgrade one column at a time, which is still better than a kick in the head.

If you make the extension-upgrade facility rewrite everything, you have to do your entire cluster in one shot. That will work for some people, but not for all. And unless you ship both versions of hstore with either PG 8.4 or PG 8.5, you're going to need the conversion to be done inside pg_migrator, which introduces a whole new level of complexity that I think we'd be better off without.

...Robert
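To make the "upgrade a bit at a time" option concrete, here is a minimal sketch in Python. It is a toy in-memory model, not PostgreSQL or hstore code: the names `StoredTuple`, `decode`, `encode_new`, and `upgrade_batch`, as well as both serialization formats, are invented for illustration. The only idea taken from the message above is that the new code can read both formats, so old values can be rewritten in small batches.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical format tags; real on-disk formats are not modeled here.
OLD_FORMAT = 1
NEW_FORMAT = 2


@dataclass
class StoredTuple:
    format_version: int
    payload: str  # old: "k=v;k2=v2", new: "k\tv\nk2\tv2" (both invented)


def decode(t: StoredTuple) -> Dict[str, str]:
    """The new code reads either format, so unconverted data stays usable."""
    if t.format_version == OLD_FORMAT:
        pair_sep, kv_sep = ";", "="
    else:
        pair_sep, kv_sep = "\n", "\t"
    if not t.payload:
        return {}
    return dict(item.split(kv_sep, 1) for item in t.payload.split(pair_sep))


def encode_new(values: Dict[str, str]) -> StoredTuple:
    """Writes always produce the new format."""
    return StoredTuple(NEW_FORMAT, "\n".join(f"{k}\t{v}" for k, v in values.items()))


def upgrade_batch(table: List[StoredTuple], batch_size: int) -> int:
    """Rewrite at most batch_size old-format tuples; return how many were rewritten.

    This plays the role of the periodic no-op update: the logical value is
    unchanged, but rewriting it stores it in the new format.
    """
    rewritten = 0
    for i, t in enumerate(table):
        if rewritten >= batch_size:
            break
        if t.format_version == OLD_FORMAT:
            table[i] = encode_new(decode(t))
            rewritten += 1
    return rewritten
```

Calling `upgrade_batch(table, n)` repeatedly (for example from a scheduled job) converts the table gradually, and `decode` returns the same values at every point along the way.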
On Jul 22, 2009, at 1:11 PM, Robert Haas wrote:

> If you keep an old and a new version of the datatype, you can't
> upgrade a tuple at a time, but you can at least upgrade one column at
> a time, which is still better than a kick in the head.

And as long as you're willing to deprecate how far back you'll go in doing such updates, thus keeping the maintenance of your code reasonable over time.

> If you make the extension-upgrade facility rewrite everything, you
> have to do your entire cluster in one shot. That will work for some
> people, but not for all. And unless you ship both versions of hstore
> with either PG 8.4 or PG 8.5, you're going to need the conversion to
> be done inside pg_migrator, which introduces a whole new level of
> complexity that I think we'd be better off without.

Well, it depends. If there could be some sort of defined interface that pg_migrator could call to migrate any data type (this issue applies mainly to types, yes?), then an extension author just needs to implement that interface. No?

Best,

David
On Jul 23, 2009, at 2:44 AM, "David E. Wheeler" <david@kineticode.com> wrote:

> On Jul 22, 2009, at 1:11 PM, Robert Haas wrote:
>
>> If you keep an old and a new version of the datatype, you can't
>> upgrade a tuple at a time, but you can at least upgrade one column at
>> a time, which is still better than a kick in the head.
>
> And as long as you're willing to deprecate how far back you'll go in
> doing such updates, thus keeping the maintenance of your code
> reasonable over time.

Of course.

>
>> If you make the extension-upgrade facility rewrite everything, you
>> have to do your entire cluster in one shot. That will work for some
>> people, but not for all. And unless you ship both versions of hstore
>> with either PG 8.4 or PG 8.5, you're going to need the conversion to
>> be done inside pg_migrator, which introduces a whole new level of
>> complexity that I think we'd be better off without.
>
> Well, it depends. If there could be some sort of defined interface
> that pg_migrator could call to migrate any data type (this issue
> applies mainly to types, yes?), then an extension author just needs
> to implement that interface. No?

Yes... but "if" and "just" can paper over a good deal of complexity, and it's not clear to me that there's any compensating advantage.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Jul 23, 2009, at 2:44 AM, "David E. Wheeler" <david@kineticode.com>
> wrote:
>>
>> Well, it depends. If there could be some sort of defined interface that
>> pg_migrator could call to migrate any data type (this issue applies
>> mainly to types, yes?), then an extension author just needs to implement
>> that interface. No?
>
> Yes... but "if" and "just" can paper over a good deal of complexity, and
> it's not clear to me that there's any compensating advantage.

Well there's already an API for this in the extension design:

  create extension foo ... upgrade function upgrade_foo(old version, new version)

So pg_migrator would have to look at the previous cluster to see which version of the module was there, and at the new cluster to see which is installed, and run the function accordingly... All the burden is then on the extension's author.

Regards,
-- 
dim
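The flow described above (compare the module version in the old cluster with the one installed in the new cluster, then call the extension's upgrade function with both) could be sketched roughly as follows. This is a hypothetical illustration in Python, not pg_migrator code and not a real PostgreSQL API: `plan_upgrades`, the dictionaries standing in for each cluster's installed versions, and the body of `upgrade_foo` are all assumptions. Only the name `upgrade_foo` and its (old version, new version) arguments come from the proposal.

```python
from typing import Callable, Dict, List

# An upgrade function takes (old_version, new_version) and, in this toy
# model, returns a description of the work it would perform.
UpgradeFn = Callable[[str, str], str]


def upgrade_foo(old_version: str, new_version: str) -> str:
    """Stand-in for the extension author's custom migration code."""
    return f"foo: migrate on-disk data {old_version} -> {new_version}"


def plan_upgrades(
    old_cluster: Dict[str, str],
    new_cluster: Dict[str, str],
    upgrade_functions: Dict[str, UpgradeFn],
) -> List[str]:
    """Call each extension's upgrade function where the versions differ."""
    actions = []
    for name, old_version in sorted(old_cluster.items()):
        new_version = new_cluster.get(name)
        if new_version is None:
            raise LookupError(f"extension {name!r} is not installed in the new cluster")
        if new_version == old_version:
            continue  # same version on both sides, nothing to migrate
        actions.append(upgrade_functions[name](old_version, new_version))
    return actions
```

For example, `plan_upgrades({"foo": "1.0"}, {"foo": "1.1"}, {"foo": upgrade_foo})` calls `upgrade_foo("1.0", "1.1")`. The sketch leaves open the hard part raised earlier in the thread, namely what the upgrade function has to do to the on-disk data and how long that takes on a large cluster.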
On Jul 23, 2009, at 4:08, Robert Haas <robertmhaas@gmail.com> wrote:

> Yes... but "if" and "just" can paper over a good deal of complexity,
> and it's not clear to me that there's any compensating advantage.

It seems reasonable not to worry about this issue in the first rev, or at least not to let it stop development of other features, so that it has time to gel via discussion over time.

Best,

David
On Thu, Jul 23, 2009 at 11:05 AM, David E. Wheeler<david@kineticode.com> wrote:
> On Jul 23, 2009, at 4:08, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> Yes... but "if" and "just" can paper over a good deal of complexity, and
>> it's not clear to me that there's any compensating advantage.
>
> It seems reasonable not to worry about this issue in the first rev, or at
> least not to let it stop development of other features, so that it has time
> to gel via discussion over time.

Yes, I still think the most fundamental issue here is getting to the point where pg_dump dumps the right thing. The central aspect of that is a system for keeping track of which objects are part of an extension using pg_depend.

...Robert