Thread: Oid registry
This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.

The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid, and of course currently there is no way of knowing for sure what it is unless it's a builtin type. So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:

    CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);

I'm not sure what objects we would need this for other than types, but CREATE TYPE would be a good start anyway.

Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.

Thoughts?

If there is general agreement I want to get working on this fairly soon.

cheers

andrew
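For concreteness, a minimal sketch of what the C side of such a reservation might look like. The header name, the constant names, and the OID values are all invented for illustration; nothing like this exists today.

    #include "postgres.h"

    /* Imaginary src/include/catalog/ext_oids.h, generated from the registry */
    #define HSTOREOID        4090        /* hypothetical reserved type Oid */
    #define HSTOREARRAYOID   4091        /* hypothetical reserved array Oid */

    /*
     * A cooperating extension could then test a column's type without any
     * catalog lookup at all:
     */
    static bool
    type_is_hstore(Oid typid)
    {
        return typid == HSTOREOID;
    }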
Andrew Dunstan <andrew@dunslane.net> writes:
> ... So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:
> CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);

Well, of course, it's not clear how "reserved" an OID can be if you provide SQL syntax that allows any Joe Blow to create a type with that OID. The possibilities for security holes are interesting to say the least, especially if applications blindly assume that oid 123456 *must* be the type they think it is.

> Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.

Likewise, "would never be allocated" and "we'll provide syntax that does it trivially" seem a bit at odds.

Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

> If there is general agreement I want to get working on this fairly soon.

Turning this into something actually useful seems to me to be a bit harder than meets the eye. I'm not opposed to it in principle, but the management aspect seems much more difficult than the technical aspect.

regards, tom lane
On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>
> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,

Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
> On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
>> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>>
>> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,
>
> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?

Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

cheers

andrew
On 09/24/2012 09:24 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> ... So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:
>> CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);
>
> Well, of course, it's not clear how "reserved" an OID can be if you provide SQL syntax that allows any Joe Blow to create a type with that OID. The possibilities for security holes are interesting to say the least, especially if applications blindly assume that oid 123456 *must* be the type they think it is.

I think this would probably be a superuser-only facility.

>> Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.
>
> Likewise, "would never be allocated" and "we'll provide syntax that does it trivially" seem a bit at odds.
>
> Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

We seem to have a fair bit of leeway between the top numbered Oid as reported by unused_oids (4332) and 16384. I would expect a private range of a few hundred to be more than adequate, and a range for registered Oids of a couple of thousand to last for many years. I'm expecting the use of this over quite a few years to be numbered in tens, not thousands.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> On 09/24/2012 09:24 PM, Tom Lane wrote:
>> Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

> We seem to have a fair bit of leeway between the top numbered Oid as reported by unused_oids (4332) and 16384.

It's not nearly that simple. Per transam.h:

 *	OIDs 1-9999 are reserved for manual assignment (see the files
 *	in src/include/catalog/).
 *
 *	OIDS 10000-16383 are reserved for assignment during initdb
 *	using the OID generator.  (We start the generator at 10000.)
 *
 *	OIDs beginning at 16384 are assigned from the OID generator
 *	during normal multiuser operation.  (We force the generator up to
 *	16384 as soon as we are in normal operation.)

pg_database.datlastsysoid is 12907 as of HEAD, so we're using about 2900 OIDs that are assigned during initdb. We should expect continued growth in that number. For comparison, it was 10791 in 8.1, which was the first version following this assignment scheme --- so the number of auto-assigned OIDs has more than tripled in the last seven years. And there's not room for another tripling. So I think it's entirely likely that we'll have to reduce FirstBootstrapObjectId to a smaller number in the foreseeable future, or else increase FirstNormalObjectId, which would be a pg_upgrade breaking move. Meanwhile, we do continue to eat manually-assigned OIDs at a nontrivial pace as well.

So, yeah, we could reserve a couple hundred OIDs for a scheme like this and (probably) not regret it later. But a couple thousand would scare me ... and I'm not exactly convinced that a couple hundred is enough, if there's any demand out there at all.

I suggest we could consider handing out reserved OIDs one or two at a time using the current manual assignment process, *without* any notion of a reserved block of OIDs. That is, when Joe Developer needs a couple of OIDs for joes_whizzy_datatype, he comes to us and we say "sure, you can have 4333 and 4334", which we then memorialize in a text file in include/catalog/. This doesn't preclude doing something differently later of course, but it avoids setting aside OID address space that is likely to be underutilized and fairly desperately needed at some future time.

As you might guess from this, I'm not enamored of the "reserve some private OID space" part at all. There's not been any demand for that that I've noticed. In pretty much any development context, you could use OIDs up around 4 billion for such purposes and not have any problem at all --- it's only trying to *guarantee* they're not used anywhere out in the field that is problematic.

regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes:
> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?

> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

Um ... that is an entirely unconvincing use case. We would not put any code into core that knows specifically about a non-core datatype, or at least I sure hope we'd not do such a klugy thing. It would be far better to invent an extensibility scheme (plug-in of some sort) for record_to_json.

My recollection of the PGCon discussion is that people wanted to allow client-side code to hard-wire type OIDs for add-on types, in more or less the same way that things like JDBC know that "25" is "text". That's not unreasonable, since the alternatives are complicated and not all that reliable. I'm not sure the usage applies to anything except data types though, so at least for starters I'd only want to accept reservations of OIDs for data types.

regards, tom lane
On Mon, Sep 24, 2012 at 8:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
>
>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> Um ... that is an entirely unconvincing use case. We would not put any code into core that knows specifically about a non-core datatype, or at least I sure hope we'd not do such a klugy thing. It would be far better to invent an extensibility scheme (plug-in of some sort) for record_to_json.

I don't think (and don't hope) the core should know about external data types, but I have certainly seen a lot of use cases where an external project wants to know about another external data type. Say, if plv8 wants to convert hstore into a javascript object. Users are free to define such a function that accepts hstore as an argument, but how does plv8 know the input is actually hstore? Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached. I can tell similar stories around array creation, which needs the element type oid. The core code can do some special stuff by using pre-defined type or function oid macros, and I guess it is reasonable for external developers to hope to do such things.

Thanks,
--
Hitoshi Harada
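A minimal sketch of the lookup-and-cache workaround Hitoshi describes, for backend code. It assumes the backend's TypenameGetTypid() helper (which searches the current search_path); the cached value is exactly the part that can go stale.

    #include "postgres.h"
    #include "catalog/namespace.h"

    static Oid cached_hstore_oid = InvalidOid;

    /* Resolve "hstore" by name once, then reuse the cached Oid. */
    static Oid
    get_hstore_oid(void)
    {
        if (!OidIsValid(cached_hstore_oid))
            cached_hstore_oid = TypenameGetTypid("hstore");
        return cached_hstore_oid;   /* InvalidOid if hstore isn't installed */
    }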
On Mon, Sep 24, 2012 at 8:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So, yeah, we could reserve a couple hundred OIDs for a scheme like this and (probably) not regret it later. But a couple thousand would scare me ... and I'm not exactly convinced that a couple hundred is enough, if there's any demand out there at all.

I think some kind of way to compose extension objects -- this includes and goes beyond just the things in pg_type, but operators and so on -- will have great impact on Postgres' power without making the community effort unscalable. I think PostGIS is the largest success by this metric -- although certain requirements spill back into pgsql-hackers, it's a pretty huge island of functionality out there that neatly progresses on its own without coordination. That's fantastic.

I am fairly certain that if some form of in-line extensions were supported, we would see people building smaller extensions that use many little types (such as composites) relying on other extensions to do things or manage certain tables, increasing convenience overall.

Things like the JSON data type (in spite of my own opinion that it should be added) are a highly useful concession. Yet, it is still a concession that extensions simply cannot be as close to good as a core data type when it comes to the ability to compose. The gap between pre-JSON-in-the-standard-library in Python, Ruby, et al and post-JSON-in-stdlib was much smaller. Even for statically typed languages -- like Java -- the only central registries that exist are mostly for the convenience of distribution and deployment, not as the means of composition, and I think that is a good thing.

However, an IANA-style OID registry I find pretty un-compelling. Is there no way we can use symbols, the type system, and invalidation messages to a client to do this? I feel like there is a way we can, and it is probably worth it to minimize coordination among extension authors and enable smaller extensions.

If reserved OID ranges for extensions are to become a thing, I think the right scope would be to presume that these extensions are not bundled with Postgres, but practically, whoever uses that space is probably going to have to be the kind of person who would correspond with -hackers. Ad-hoc composition among small or live-and-die-fast user-space libraries (that are written in trusted languages, for example) is out, which is kind of a bummer for what I think plv8 can enable.

--
fdr
On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>> On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
>>> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>>>
>>> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,
>>
>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
>
> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

I think we're missing something in the discussion here.

Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine. Why would we need to hardcode it?

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 09/25/2012 04:26 AM, Andrew Dunstan wrote:
> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>> On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
>>> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>>>
>>> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,
>>
>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
>
> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

During the earlier to_json() discussions I already proposed a solution to this - try a cast to the JSON type before falling back to outputting the text version. As I understand it, this is how typecasts are supposed to be used. In addition to being "how PostgreSQL is intended to work" it also allows for users to override it and have their own version of json representations for types which have no standard json format.

Just for the record - I'm not opposed to having a PostgreSQL OID Registry, though I think that for anything in the backend the lookup by name should be fast enough.
On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> I think we're missing something in the discussion here.
>
> Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine.

Sure, but how do you know the type named "hstore" is what you know as hstore? We don't have a global consensus that a specific type name means exactly what everyone thinks it means.

Thanks,
--
Hitoshi Harada
On 25.09.2012 12:19, Hitoshi Harada wrote:
> On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?

That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.

>> I think we're missing something in the discussion here.
>>
>> Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine.
>
> Sure, but how do you know the type named "hstore" is what you know as hstore? We don't have a global consensus that a specific type name means exactly what everyone thinks it means.

If you create a type called "hstore" that isn't the one in contrib/hstore, you're just asking for trouble. Having a central registry of assigned oid numbers doesn't really make that problem go away either; you could still assign one of the registered oids to a different datatype if you wanted to. So whether you rely on the oid or the type name, your code needs to behave sanely even if the datatype doesn't behave the way you'd expect. At a minimum, there need to be sanity checks so that you don't crash, and give a meaningful error message. Beyond that, I think it's the user's responsibility not to confuse things by using well-known datatype names like "hstore" for something else.

Here's another thought: imagine that someone creates a datatype that's more or less compatible with contrib/hstore, but the internal implementation is different. Would you call that hstore? Would you use the same assigned oid for it? If you have some other extension that knows about hstore, and treats it specially, would you want the new datatype to be treated specially too? And what about domains? If you have code that knows the oid of hstore and gives it special treatment, should domains over hstore also be given special treatment?

- Heikki
On 09/25/2012 06:13 AM, Heikki Linnakangas wrote:
> On 25.09.2012 12:19, Hitoshi Harada wrote:
>> On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>>>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?

No, the difficulty (or at least the first difficulty) is in having the code recognize that it has an hstore at all. The code picks apart the record field by field at run time and takes various actions depending on the field's type. For any type it doesn't recognize it just outputs the value as a string, and so that's what it does with hstore. Mostly this is the right thing but in the hstore case it's rather sad.

> That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.
>
>> I think we're missing something in the discussion here.
>>
>> Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine.
>
>> Sure, but how do you know the type named "hstore" is what you know as hstore? We don't have a global consensus that a specific type name means exactly what everyone thinks it means.
>
> If you create a type called "hstore" that isn't the one in contrib/hstore, you're just asking for trouble. Having a central registry of assigned oid numbers doesn't really make that problem go away either; you could still assign one of the registered oids to a different datatype if you wanted to.

That's true to some extent, but names are worse, since they are schema qualified.

cheers

andrew
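To make the per-field dispatch Andrew describes concrete, here is a rough classification sketch (not the actual json.c code); the built-in type Oids are real constants from pg_type.h, while HSTOREOID is the hypothetical reserved constant that does not exist today.

    #include "postgres.h"
    #include "catalog/pg_type.h"

    /* How should a record-to-json converter render a field? (sketch) */
    typedef enum
    {
        EMIT_BARE_LITERAL,      /* numbers, booleans */
        EMIT_VERBATIM_JSON,     /* value is already json */
        EMIT_QUOTED_STRING      /* everything else */
    } JsonEmitKind;

    static JsonEmitKind
    classify_field(Oid typid)
    {
        switch (typid)
        {
            case BOOLOID:
            case INT2OID:
            case INT4OID:
            case INT8OID:
            case NUMERICOID:
                return EMIT_BARE_LITERAL;
            case JSONOID:
                return EMIT_VERBATIM_JSON;
            default:
                /*
                 * hstore lands here today, because the code has no reliable
                 * way to know the field is an hstore.  A reserved HSTOREOID
                 * (hypothetical) would allow another case arm above.
                 */
                return EMIT_QUOTED_STRING;
        }
    }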
On 09/24/2012 11:39 PM, Tom Lane wrote:
> My recollection of the PGCon discussion is that people wanted to allow client-side code to hard-wire type OIDs for add-on types, in more or less the same way that things like JDBC know that "25" is "text". That's not unreasonable, since the alternatives are complicated and not all that reliable. I'm not sure the usage applies to anything except data types though, so at least for starters I'd only want to accept reservations of OIDs for data types.

Yes, I certainly think that's a sufficient case.

And just to be clear, I don't care about the "private range" suggestion. I was just reporting that as it came up in the discussion and at least wasn't immediately shouted down. But I'm happy to abandon it at least for now. If there is ever actual demand for it we could revisit the matter.

Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.

Regarding the use of pre-allocated Oids, unless we provide some facility to use them when creating types, extension authors will be reduced to using the pretty ugly tricks I had to use when backporting JSON for binary upgrade-ability. This used some of pg_upgrade's tricks to force use of particular Oids, but I don't think this should be recommended practice. The suggestion I made was based on something you suggested (albeit with some caveats) in the recent discussion of pg_upgrade with extensions.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.

No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.

regards, tom lane
On 09/25/2012 10:23 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>
> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.

OK, point taken.

cheers

andrew
Excerpts from Hitoshi Harada's message of mar sep 25 02:11:14 -0300 2012:
> Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached.

Hm, couldn't you just register an invalidation callback with CacheRegisterSyscacheCallback()?

--
Álvaro Herrera
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
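A sketch of Alvaro's suggestion, for code running in the backend: register a callback on the pg_type syscache and simply forget the cached Oid whenever any pg_type entry is invalidated. The inval.h/syscache.h calls are real, but treat the details (including the _PG_init placement) as an outline only, reusing the names from the earlier caching sketch.

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/inval.h"
    #include "utils/syscache.h"

    static Oid cached_hstore_oid = InvalidOid;

    /* Called whenever a pg_type syscache entry is invalidated. */
    static void
    hstore_oid_inval(Datum arg, int cacheid, uint32 hashvalue)
    {
        cached_hstore_oid = InvalidOid;     /* re-resolve on next use */
    }

    void
    _PG_init(void)
    {
        CacheRegisterSyscacheCallback(TYPEOID, hstore_oid_inval, (Datum) 0);
    }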
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Excerpts from Hitoshi Harada's message of mar sep 25 02:11:14 -0300 2012:
>> Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached.

> Hm, couldn't you just register an invalidation callback with CacheRegisterSyscacheCallback()?

Yeah, if the code that needs the value is in the backend; but if it is, the cost of looking the value up again probably isn't that much either. The need for a cache is much greater in client-side code.

The bigger issues in my mind have to do with how you look up the type in the first place (considering schema search path and so forth) and how you know that "hstore" is what you think it is.

I recall some discussion of assigning a UUID to each datatype, which might alleviate the second part of that at least. Does nothing for malicious impersonation of a datatype of course, but should prevent accidental collisions.

regards, tom lane
On 09/25/2012 12:14 PM, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Excerpts from Hitoshi Harada's message of mar sep 25 02:11:14 -0300 2012:
>>> Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached.
>> Hm, couldn't you just register an invalidation callback with CacheRegisterSyscacheCallback()?
>
> Yeah, if the code that needs the value is in the backend; but if it is, the cost of looking the value up again probably isn't that much either. The need for a cache is much greater in client-side code.
>
> The bigger issues in my mind have to do with how you look up the type in the first place (considering schema search path and so forth) and how you know that "hstore" is what you think it is.
>
> I recall some discussion of assigning a UUID to each datatype, which might alleviate the second part of that at least. Does nothing for malicious impersonation of a datatype of course, but should prevent accidental collisions.

One nice thing about that idea (if I understand it correctly) is that we could use it for existing types. The horse has long bolted on Oids for hstore, because it has lots of existing deployments with unknown Oids.

cheers

andrew
On Tue, Sep 25, 2012 at 09:18:30AM -0400, Andrew Dunstan wrote:
>> I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?
>
> No, the difficulty (or at least the first difficulty) is in having the code recognize that it has an hstore at all. The code picks apart the record field by field at run time and takes various actions depending on the field's type. For any type it doesn't recognize it just outputs the value as a string, and so that's what it does with hstore. Mostly this is the right thing but in the hstore case it's rather sad.

Is there a particular reason to special-case hstore though? Wouldn't it be sufficient to simply look for a cast from the given type oid to json and use that if present? If you don't like casts, allow other extensions to define a function to_json_hook(yourtype) -> json which is used. Hardcoding IDs in extensions just doesn't seem right somehow...

(Hmm, I see someone else in the thread pointed this out too.)

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts.
  -- Arthur Schopenhauer
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>
> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId

If that's the only problem, there's an easy way out: register downward from the top of the Oid range to some fixed constant.

Now it seems the problem here is that we both want extensions to be managed as disconnected from PostgreSQL as possible, and at the same time we want to be able to trust them from within the backend, with PostgreSQL being the only trusted authority. I'm not security oriented enough to devise a working scheme here, but it makes me think about some "shared secret" or GnuPG signature. Is it possible to automatically verify that an extension's ID card (containing some type's OID, name, etc.) has been signed with PostgreSQL's own private key, without ever having to ship that key in the backend code nor binary?

To me it's all about how to set up a distributed network of trust…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
On 9/24/12 10:26 PM, Andrew Dunstan wrote:
> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

I think the fix there is that the type supplies a "to JSON" function as part of its type definition (just like input and output functions). "to XML" would also be useful. Most types wouldn't need them and default to string output.
On 9/24/12 11:39 PM, Tom Lane wrote:
> My recollection of the PGCon discussion is that people wanted to allow client-side code to hard-wire type OIDs for add-on types, in more or less the same way that things like JDBC know that "25" is "text".

If I write a custom uint type and want to explain to JDBC that it is number-like, I don't think it's very attractive to have to register a hard-wired OID for it. Not only would there be the burden of maintaining this OID database; JDBC and every other driver would in turn have to maintain its own database of all custom types in the world.

I would rather imagine a system where the type communicates its properties to the client using some sort of label system, like "number", "string", "pair of number" (point), "pair of string" (hstore). Or something with more or less detail. Maybe something like that already exists (maybe somewhere in thrift, protocol buffers, etc.?).
On 9/25/12 1:11 AM, Hitoshi Harada wrote:
> Say, if plv8 wants to convert hstore into a javascript object. Users are free to define such a function that accepts hstore as an argument, but how does plv8 know the input is actually hstore?

That's what the proposed transforms feature would be for: https://commitfest.postgresql.org/action/patch_view?id=879
Peter Eisentraut <peter_e@gmx.net> writes:
> On 9/24/12 10:26 PM, Andrew Dunstan wrote:
>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

> I think the fix there is that the type supplies a "to JSON" function as part of its type definition (just like input and output functions). "to XML" would also be useful. Most types wouldn't need them and default to string output.

Yes ... but I really don't want to go down the path of treating those as new type properties; it doesn't scale. (And please don't tell me that JSON is the last word in container types and there will never be requests for any more of these.)

Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.

regards, tom lane
On 9/25/12 6:13 AM, Heikki Linnakangas wrote:
> That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.

That just depends on how you want to direct the dependency arrows between these things. Either you make it a third extension, or you put it into one or the other, for example, if you feel that json support is "integral" to the hstore functionality.
On 9/25/12 9:18 AM, Andrew Dunstan wrote:
> No, the difficulty (or at least the first difficulty) is in having the code recognize that it has an hstore at all. The code picks apart the record field by field at run time and takes various actions depending on the field's type. For any type it doesn't recognize it just outputs the value as a string, and so that's what it does with hstore. Mostly this is the right thing but in the hstore case it's rather sad.

But if you have a cast defined from hstore to json, then it is unambiguous, because the pg_cast entry binds the types together by OID.
On 9/25/12 5:58 PM, Tom Lane wrote:
> Yes ... but I really don't want to go down the path of treating those as new type properties; it doesn't scale. (And please don't tell me that JSON is the last word in container types and there will never be requests for any more of these.)

Yeah, I didn't like that part either, but we only add one every five years or so.

> Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.

Sounds attractive, but there might be some problems in the details. For example, you can't cast scalar values to valid json values, because a valid json value can only be a dictionary or an array. If we had a flag of some kind saying "cast from foo to json, but only when part of a larger json serialization, not by itself", then it might work.
Peter Eisentraut <peter_e@gmx.net> writes:
> On 9/25/12 5:58 PM, Tom Lane wrote:
>> Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.

> Sounds attractive, but there might be some problems in the details. For example, you can't cast scalar values to valid json values, because a valid json value can only be a dictionary or an array. If we had a flag of some kind saying "cast from foo to json, but only when part of a larger json serialization, not by itself", then it might work.

Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it. If there's some reason that's not what to use, I guess we could add another optional argument to cast support functions; but that interface is already rather overloaded.

regards, tom lane
On 09/25/2012 12:13 PM, Heikki Linnakangas wrote:
> On 25.09.2012 12:19, Hitoshi Harada wrote:
>> On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>>>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?
>
> That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.

I would make the CAST part of both extensions and try to do it so that if the "other" type does not exist then the cast will not be created; the extension that comes later will be the one creating it.

The most sensible approach would be to make it into a 3rd extension which depends on both json and hstore being installed:

    CREATE EXTENSION hstore_to_json_cast;
On 09/26/2012 12:06 AM, Peter Eisentraut wrote:
> On 9/25/12 5:58 PM, Tom Lane wrote:
>> Yes ... but I really don't want to go down the path of treating those as new type properties; it doesn't scale. (And please don't tell me that JSON is the last word in container types and there will never be requests for any more of these.)
>
> Yeah, I didn't like that part either, but we only add one every five years or so.
>
>> Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.
>
> Sounds attractive, but there might be some problems in the details. For example, you can't cast scalar values to valid json values, because a valid json value can only be a dictionary or an array.

Nope. A json _value_ can be anything you can put in those dicts or arrays. The "dictionary or array" rule was just the definition of something called "a json generator", which was defined to return an array or object/dict. The json type itself can already hold all the scalar values; the input conversion functions happily accept these and generate the corresponding json values.

This was all hashed through when I woke up to this too late in the 9.2 dev cycle and proposed the CAST and to_json functions, and also a single to_json for any json generation instead of array_to_json and row_to_json.

So currently we both do and don't have this "json is an array or dict" behaviour: we can't convert anything else directly to json using a to_json function, but we can do it through the input/output functions, and thus we can output scalar-valued json values:

    jt=# select '1'::json, '"1"'::json;
     json | json
    ------+------
     1    | "1"
    (1 row)

IIRC the reasoning was something about the to_json functions being generators and type io functions not being that :P

> If we had a flag of some kind saying "cast from foo to json, but only when part of a larger json serialization, not by itself", then it might work.
On Tue, 2012-09-25 at 18:22 -0400, Tom Lane wrote:
> Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it.

I had thought about this some time ago, but it's clearer to think of casts as associating two types, versus transforms associating a type and a language. JSON and XML tend to be types.
On Mon, 2012-09-24 at 21:18 -0700, Daniel Farina wrote:
> The gap between pre-JSON-in-the-standard-library in Python, Ruby, et al and post-JSON-in-stdlib was much smaller.

Except in Python they renamed the thing.
On 09/25/2012 08:35 PM, Peter Eisentraut wrote:
> On Tue, 2012-09-25 at 18:22 -0400, Tom Lane wrote:
>> Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it.
>
> I had thought about this some time ago, but it's clearer to think of casts as associating two types, versus transforms associating a type and a language. JSON and XML tend to be types.

OK, I think this solves the object_to_json problem after all - we'll look for a cast to json and if it's not there use the string form of the object. Nice.

That still leaves the other uses for having well-known Oids (or possibly UUIDs) for non-builtin types (e.g. so drivers don't have to look them up in the catalogs, or to avoid issues when types are added to core).

cheers

andrew
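Roughly the check Andrew is agreeing to, sketched against the existing cast machinery; find_coercion_pathway() and JSONOID are real, but treat this as an outline rather than the eventual patch.

    #include "postgres.h"
    #include "catalog/pg_type.h"
    #include "parser/parse_coerce.h"

    /*
     * Does this field type have a cast to json we could call?  If not,
     * record_to_json keeps quoting the type's text output as before.
     */
    static bool
    field_has_json_cast(Oid field_typid, Oid *castfunc)
    {
        CoercionPathType path;

        path = find_coercion_pathway(JSONOID, field_typid,
                                     COERCION_EXPLICIT, castfunc);
        return (path == COERCION_PATH_FUNC && OidIsValid(*castfunc));
    }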
On Tue, Sep 25, 2012 at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>
> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.

I'm not sure how much anyone cares, but just in case anyone does... it would be mighty useful to EnterpriseDB to have a range of OIDs that are guaranteed not to be assigned to anyone else, because we're maintaining a fork of PostgreSQL that regularly merges with the mainline. We're actually likely to get crunched in our fork well before PostgreSQL itself does. There are enough other forks of PostgreSQL out there that there may be other people who are in a similar situation, though I am not sure how much we want to cater to people doing such things. That having been said, I can't really see how it would be practical anyway unless we raise the 16384 lower limit for user-assigned OIDs considerably. And I'm not sure how to do that without breaking pg_upgrade.

I am somewhat opposed to the idea of an OID registry. I can't see why the space of things people want to reserve OIDs for would be only tens rather than thousands. There are surely plenty of extensions that would like to depend on specific other extensions, or on core. If we legislate that hard-coded OIDs are the way to do that, I think we're going to end up with a lot of 'em.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 09/26/2012 02:48 AM, Andrew Dunstan wrote:
> On 09/25/2012 08:35 PM, Peter Eisentraut wrote:
>> On Tue, 2012-09-25 at 18:22 -0400, Tom Lane wrote:
>>> Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it.
>> I had thought about this some time ago, but it's clearer to think of casts as associating two types, versus transforms associating a type and a language. JSON and XML tend to be types.
>
> OK, I think this solves the object_to_json problem after all - we'll look for a cast to json and if it's not there use the string form of the object. Nice.

This is solved then :)

Would it be possible to also use the cast mechanism to do anyarray-to-json casts as a parallel spelling for array_to_json(), and a record-to-json cast for row_to_json()?

BTW, is anybody currently working on also going the opposite way, that is loading rows/records from json?

> That still leaves the other uses for having well-known Oids (or possibly UUIDs) for non-builtin types (e.g. so drivers don't have to look them up in the catalogs, or to avoid issues when types are added to core).

One way to solve this would be to pass non-system oids to clients as their names; this would need a change in the protocol. Or we could make it so that the server sends a special record with typename->typeoid mappings at the first use of a non-system type.
I'm also implementing an extension where direct access to non-fixed OIDs (i.e. no catalog cache lookup by name) would be very helpful. I spent some time thinking about a workaround that makes an OID registry unnecessary. How about the following?

1. Add a new varlena column to the pg_proc catalog table, say 'ext_types', containing a C array of OIDs.

2. Let each extension declare requirements like the following in its configuration files:
   "I expect <some type's name> type at the 0-th position of the 'ext_types' array."
   "I expect <other type's name> type at the 1-st position of the 'ext_types' array."
   etc.

3. Ensure that the CREATE EXTENSION command reads all these type names, finds the appropriate OIDs in pg_type and puts them at the appropriate positions in the 'ext_types' column for each function of the new extension (while in-core functions would have it set to NULL of course).

4. Implement a macro to get the 0-th, 1-st, etc. value from pg_proc.ext_types (via FMGR).

Is there any serious circumstance I forgot, or does it seem to be e.g. too invasive?

Kind regards,
Tony H.

On 09/25/2012 12:59 AM, Andrew Dunstan wrote:
> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>
> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid, and of course currently there is no way of knowing for sure what it is unless it's a builtin type. So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:
>
>     CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);
>
> I'm not sure what objects we would need this for other than types, but CREATE TYPE would be a good start anyway.
>
> Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.
>
> Thoughts?
>
> If there is general agreement I want to get working on this fairly soon.
>
> cheers
>
> andrew
On 09/25/2012 08:56 PM, Robert Haas wrote:
> On Tue, Sep 25, 2012 at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.
>
> I'm not sure how much anyone cares, but just in case anyone does... it would be mighty useful to EnterpriseDB to have a range of OIDs that are guaranteed not to be assigned to anyone else, because we're maintaining a fork of PostgreSQL that regularly merges with the mainline. We're actually likely to get crunched in our fork well before PostgreSQL itself does. There are enough other forks of PostgreSQL out there that there may be other people who are in a similar situation, though I am not sure how much we want to cater to people doing such things. That having been said, I can't really see how it would be practical anyway unless we raise the 16384 lower limit for user-assigned OIDs considerably. And I'm not sure how to do that without breaking pg_upgrade.

How many would EDB need for it to be useful?

> I am somewhat opposed to the idea of an OID registry. I can't see why the space of things people want to reserve OIDs for would be only tens rather than thousands. There are surely plenty of extensions that would like to depend on specific other extensions, or on core. If we legislate that hard-coded OIDs are the way to do that, I think we're going to end up with a lot of 'em.

Maybe you have a better feel than I do for how many live addon types are out there. Maybe there are hundreds or thousands, but if there are I am blissfully unaware of them.

I did like the alternative idea upthread of UUIDs for types, which would give them a virtually unlimited space.

cheers

andrew
Antonin Houska <antonin.houska@gmail.com> writes:
> I'm also implementing an extension where direct access to non-fixed OIDs (i.e. no catalog cache lookup by name) would be very helpful. I spent some time thinking about a workaround that makes an OID registry unnecessary. How about the following?
> 1. Add a new varlena column to the pg_proc catalog table, say 'ext_types', containing a C array of OIDs.
> 2. Let each extension declare requirements like the following in its configuration files:
> "I expect <some type's name> type at the 0-th position of the 'ext_types' array."
> "I expect <other type's name> type at the 1-st position of the 'ext_types' array."
> etc.

I think this just begs the question: how do you specify <some type> and how do you know that whatever was found is what you want?

Beyond that, nothing in what you said can't be done today by a function that does type name lookups and caches the results internally. And I'd just as soon not burden the function-call infrastructure with more overhead to support something only a small fraction of functions would need.

Another point is that server-internal features don't help client-side code, which is actually where most of the pain is AFAICT. We aren't providing any infrastructure that helps clients interpret PQftype() values for non-core types.

regards, tom lane
On 9/25/12 8:48 PM, Andrew Dunstan wrote:
> That still leaves the other uses for having well-known Oids (or possibly UUIDs) for non-builtin types (e.g. so drivers don't have to look them up in the catalogs, or to avoid issues when types are added to core).

Here is a problem: If I write an "hstore-ng" extension, I have two options:

1. "Steal" the OID/UUID assigned to hstore so my extension integrates nicely with all the drivers.

2. Don't steal the OID/UUID assigned to hstore and have a very low chance that my extension will ever be supported nicely by a lot of drivers.

Neither option is very nice.
Peter Eisentraut <peter_e@gmx.net> writes:
> Here is a problem: If I write an "hstore-ng" extension, I have two

In the patch for Finer Extension Dependencies, the offer is that you have the hstore-ng extension provide the 'hstore' feature.

  https://commitfest.postgresql.org/action/patch_view?id=727

Now, that says nothing about the type's OID nor how to trust it. In my proposal we would have a pg_features catalog containing just keywords used in control files, so that you can change your mind as far as dependencies are concerned from one release of your extension to the next (adding or removing some, splitting the extension in parts or joining them again, etc).

Those feature entries do not exist yet, and they are a very specific set of OIDs, so we could maybe provision a large number of them here and refuse to assign them to untrusted sources.

Again, the complex part of that problem, to me, is not about how to manage those numbers (running a registry, adding syntax support, etc.) as much as how to manage a distributed network of trust. When applying for an OID or a feature identifier, the registry team could maybe GPG-sign "something" with the PostgreSQL private key, the backend only having the public key embedded in its code and binary. IIRC that's enough to then validate the feature name/oid (with the signature, a new control file parameter) and allow it to get installed in the registry-reserved range.

We then need a way to ask for the list of objects provided by the extension providing that feature (it's been signed, it's now trusted), something that we already know how to do (joining pg_depend etc). The result of that query would be cached on the client side, because we're not trying to change the way pg_type OID assignment ranges work.

We could maybe specialize all this down to pg_type and distribute a list of reserved OIDs there too, but it seems like that ship has already sailed (nice pg_upgrade chicken-and-egg problem).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
On Tue, Sep 25, 2012 at 5:36 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Mon, 2012-09-24 at 21:18 -0700, Daniel Farina wrote:
>> The gap between
>> pre-JSON-in-the-standard-library in Python, Ruby, et al and
>> post-JSON-in-stdlib was much smaller.
>
> Except in Python they renamed the thing.

By 'smaller' I mean that what a library gains from being promoted to the Python stdlib is comparatively smaller than what data types and operators gain from being promoted to core in Postgres, and the main reason for that is that people can compose on top of Postgres-core functionality, but those extensions themselves are not very useful in further composition. The extensibility at the first level is great.

There exists an entirely livable alternative-future/present where Python never promoted the (simple)json library into the stdlib. The same could not be said for Postgres extensions as they exist today.

--
fdr
On Wed, Sep 26, 2012 at 10:08 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> How many would EDB need for it to be useful?

Looks like we currently have about 160 hard-coded OIDs in our fork that are not in PostgreSQL, across all catalogs. Actually, there are probably some things we could do to reduce that number, which might be the smarter way to go. But taken at face value I suppose that means we would need a reservation of several hundred to allow for future growth.

>> I am somewhat opposed to the idea of an OID registry. I can't see why
>> the space of things people want to reserve OIDs for would be only tens
>> rather than thousands. There are surely plenty of extensions that
>> would like to depend on specific other extensions, or on core. If we
>> legislate that hard-coded OIDs are the way to do that, I think we're
>> going to end up with a lot of 'em.
>
> Maybe you have a better feel than I do for how many live addon types are out
> there. Maybe there are hundreds or thousands, but if there are I am
> blissfully unaware of them.

Well, that's a fair point. There probably aren't. But then again, the proposed registry wouldn't only cover live projects. We'd probably have a decent number of people say: I can't do what I want unless I have an OID. And then the project becomes dormant or obsolescent, but we still have the OID reservation and there's not really any politically feasible way of recovering the address space. I can't help thinking that it sounds a little bit like what IANA does, and the problems they face, except with 2^16 OIDs instead of 2^32 IP addresses. Admittedly there should be a lot less demand for type OIDs than for IP addresses, but the amount of address space we can allocate without pain is also much smaller.

> I did like the alternative idea upthread of UUIDs for types which would give
> them a virtually unlimited space.

Yeah, me too. That doesn't require a centralized authority (hence, no debates here about whether a given extension is important enough to merit an allocation of a given size), doesn't move us further in the direction of exposing the database's internal identifiers as a concept that users have to care about, and provides essentially infinite address space. There's more engineering work involved, but sometimes more engineering work means a better result.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
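To illustrate the kind of driver-side engineering that implies, a client could resolve the stable identifier to the local OID once per connection and cache the result; the libpq sketch below assumes a hypothetical pg_type.typuuid column, which does not exist today, and resolve_type_oid() is an invented helper:

    #include <stdlib.h>
    #include <libpq-fe.h>

    /*
     * Map a stable type UUID to this server's pg_type OID.  "typuuid" is an
     * imaginary catalog column used only for illustration; today a driver
     * would have to key on typname/nspname instead.
     */
    Oid
    resolve_type_oid(PGconn *conn, const char *type_uuid)
    {
        const char *params[1] = { type_uuid };
        PGresult   *res;
        Oid         result = InvalidOid;

        res = PQexecParams(conn,
                           "SELECT oid FROM pg_type WHERE typuuid = $1",
                           1, NULL, params, NULL, NULL, 0);
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
            result = (Oid) strtoul(PQgetvalue(res, 0, 0), NULL, 10);
        PQclear(res);
        return result;   /* worth caching per connection, keyed by the UUID */
    }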
> > I did like the alternative idea upthread of UUIDs for types which
> > would give them a virtually unlimited space.
>
> Yeah, me too. That doesn't require a centralized authority (hence, no
> debates here about whether a given extension is important enough to merit
> an allocation of a given size), doesn't move us further in the direction of
> exposing the database's internal identifiers as a concept that users have to
> care about, and provides essentially infinite address space. There's more
> engineering work involved but sometimes more engineering work means a
> better result.

Random thought from the sideline... Git is able to provide assurances as to content because it creates a hash. Now, for a function, PostgreSQL could hash the catalog entry (including the function body) and return that as proof that said function is the same as one installed in some other database or published publicly. I do not know enough about the other objects to know whether something similar is possible, but maybe this will spark someone else's thoughts.

David J.
<p dir="ltr"><br /> On Sep 27, 2012 9:52 AM, "Robert Haas" <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>>wrote:<br /> ><br /> > On Wed, Sep 26, 2012 at 10:08AM, Andrew Dunstan <<a href="mailto:andrew@dunslane.net">andrew@dunslane.net</a>> wrote:<br /> > > How manywould EDB need for it to be useful?<br /> ><br /> > Looks like we currently have about 160 hard-coded OIDs in ourfork<br /> > that are not in PostgreSQL, across all catalogs. Actually, there are<br /> > probably some thingswe could do to reduce that number, which might be<br /> > the smarter way to go. But taken at face value I supposethat means<br /> > we would need a reservation of several hundred to allow for future<br /> > growth.<p dir="ltr">I'mnot sure that's a way we really want to go down. How do we define which third party vendors would get to reserveoids? And how many? And under what other potential terms?<p dir="ltr">Seems like we'd set ourselves up for endlessdiscussions and bike shedding... <br /><br /><p dir="ltr">> >> I am somewhat opposed to the idea of an OIDregistry. I can't see why<br /> > >> the space of things people want to reserve OIDs for would be only tens<br/> > >> rather than thousands. There are surely plenty of extensions that<br /> > >> would liketo depend on specific other extensions, or on core. If we<br /> > >> legislate that hard-coded OIDs are theway to do that, I think we're<br /> > >> going to end up with a lot of 'em.<br /> > ><br /> > > Maybeyou have a better feel than I do for how many live addon types are out<br /> > > there. Maybe there are hundredsor thousands, but if there are I am<br /> > > blissfully unaware of them.<br /> ><br /> > Well, that'sa fair point. There probably aren't. But then again<p dir="ltr">There are probably more than we know. Many many internalthings at different organizations. <br /><p dir="ltr">> the proposed registry wouldn't only cover live projects. We'd<br /> > probably have a decent number of people say: I can't do what I want<br /> > unless I have anOID. And then the project becomes dormant or<br /> > obsolescent but we still have the OID reservation and there'snot<br /> > really any politically feasible way of recovering the address space.<br /> > I can't help thinkingthat it sounds a little bit like what IANA does,<br /> > and the problems they face, except with 2^16 OIDs insteadof 2^32 IP<br /> > addresses. Admittedly there should be a lot less demand for type OIDs<br /> > than for IPaddresses, but the amount of address space we can allocate<br /> > without pain is also much smaller.<p dir="ltr">Yeah,I think that would rapidly turn into a pain point. <br /><br /><p dir="ltr">> > I did like the alternativeidea upthread of UUIDs for types which would give<br /> > > them a virtually unlimited space.<br /> ><br/> > Yeah, me too. That doesn't require a centralized authority (hence, no<br /> > debates here about whethera given extension is important enough to<br /> > merit an allocation of a given size), doesn't move us furtherin the<br /> > direction of exposing the database's internal identifiers as a concept<br /> > that users haveto care about, ad provides essentially infinite<br /> > address space. There's more engineering work involved butsometimes<br /> > more engineering work means a better result.<br /><p dir="ltr">Yeah, seems like it would work muchbetter long term. <p dir="ltr">/Magnus
On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I'm not sure that's a way we really want to go down. How do we define which
> third party vendors would get to reserve oids? And how many? And under what
> other potential terms?
>
> Seems like we'd set ourselves up for endless discussions and bike
> shedding...

Not really. I'm only proposing that it would be nice to have a block of OIDs that core agrees not to assign for any other purpose, not that we dole out specific ones to specific companies. There's no reason why, for example, EnterpriseDB's fork can't use OIDs from the same reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster Data's fork - those are all distinct projects. All might need private OIDs but they can all come from the same range because the code bases don't mingle.

That having been said, we've gotten this far without having any terrible trouble about this, so maybe it's not worth worrying about. It's a nice-to-have, not a big deal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
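Concretely, such a shared reservation might amount to little more than a documented range that core promises never to assign, which a fork could then carve up however it likes; the bounds below are invented purely for illustration and are not an actual allocation:

    #include "postgres.h"

    /*
     * Hypothetical only: a block of OIDs core agrees never to assign, from
     * which any fork may hand-allocate its own catalog entries.  The numbers
     * are made up for illustration.
     */
    #define FORK_RESERVED_OID_FIRST   ((Oid) 9000)
    #define FORK_RESERVED_OID_LAST    ((Oid) 9499)

    #define IsForkReservedOid(oid) \
        ((oid) >= FORK_RESERVED_OID_FIRST && (oid) <= FORK_RESERVED_OID_LAST)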
On Thu, Sep 27, 2012 at 10:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> I'm not sure that's a way we really want to go down. How do we define which
>> third party vendors would get to reserve oids? And how many? And under what
>> other potential terms?
>>
>> Seems like we'd set ourselves up for endless discussions and bike
>> shedding...
>
> Not really. I'm only proposing that it would be nice to have a block
> of OIDs that core agrees not to assign for any other purpose, not that
> we dole out specific ones to specific companies. There's no reason

Ah, ok. In that case I agree, that wouldn't be open for a lot of bikeshedding.

> why, for example, EnterpriseDB's fork can't use OIDs from the same
> reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster
> Data's fork - those are all distinct projects. All might need private
> OIDs but they can all come from the same range because the code bases
> don't mingle.
>
> That having been said, we've gotten this far without having any
> terrible trouble about this, so maybe it's not worth worrying about.
> It's a nice-to-have, not a big deal.

Yeah, there's got to be a whole lot of other very much more complicated things you have to do with each new major version :)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
On Thu, Sep 27, 2012 at 04:02:45PM -0400, Robert Haas wrote:
> On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
> > I'm not sure that's a way we really want to go down. How do we define which
> > third party vendors would get to reserve oids? And how many? And under what
> > other potential terms?
> >
> > Seems like we'd set ourselves up for endless discussions and bike
> > shedding...
>
> Not really. I'm only proposing that it would be nice to have a block
> of OIDs that core agrees not to assign for any other purpose, not that
> we dole out specific ones to specific companies. There's no reason
> why, for example, EnterpriseDB's fork can't use OIDs from the same
> reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster
> Data's fork - those are all distinct projects. All might need private
> OIDs but they can all come from the same range because the code bases
> don't mingle.
>
> That having been said, we've gotten this far without having any
> terrible trouble about this, so maybe it's not worth worrying about.
> It's a nice-to-have, not a big deal.

Interesting idea, but if pluggable data types started using that reserved range, it could conflict with XC- or EDB-reserved oids, making those data types unusable in those forks. Maybe we need two reserved ranges.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +