Thread: Oid registry
This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.

The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid, and of course currently there is no way of knowing for sure what it is unless it's a builtin type. So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:

    CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);

I'm not sure what objects we would need this for other than types, but CREATE TYPE would be a good start anyway.

Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.

Thoughts?

If there is general agreement I want to get working on this fairly soon.

cheers

andrew
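For concreteness, a minimal sketch of what the C side of such a reservation might look like. The header name, the constant names, and the OID values are all invented for illustration; nothing like this exists today.

    #include "postgres.h"

    /* Imaginary src/include/catalog/ext_oids.h, generated from the registry */
    #define HSTOREOID        4090        /* hypothetical reserved type Oid */
    #define HSTOREARRAYOID   4091        /* hypothetical reserved array Oid */

    /*
     * A cooperating extension could then test a column's type without any
     * catalog lookup at all:
     */
    static bool
    type_is_hstore(Oid typid)
    {
        return typid == HSTOREOID;
    }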
Andrew Dunstan <andrew@dunslane.net> writes:
> ... So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:
> CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);

Well, of course, it's not clear how "reserved" an OID can be if you provide SQL syntax that allows any Joe Blow to create a type with that OID. The possibilities for security holes are interesting to say the least, especially if applications blindly assume that oid 123456 *must* be the type they think it is.

> Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.

Likewise, "would never be allocated" and "we'll provide syntax that does it trivially" seem a bit at odds.

Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

> If there is general agreement I want to get working on this fairly soon.

Turning this into something actually useful seems to me to be a bit harder than meets the eye. I'm not opposed to it in principle, but the management aspect seems much more difficult than the technical aspect.

regards, tom lane
On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>
> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,

Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
> On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
>> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>>
>> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,
>
> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?

Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

cheers

andrew
On 09/24/2012 09:24 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> ... So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:
>> CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);
>
> Well, of course, it's not clear how "reserved" an OID can be if you provide SQL syntax that allows any Joe Blow to create a type with that OID. The possibilities for security holes are interesting to say the least, especially if applications blindly assume that oid 123456 *must* be the type they think it is.

I think this would probably be a superuser-only facility.

>> Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.
>
> Likewise, "would never be allocated" and "we'll provide syntax that does it trivially" seem a bit at odds.
>
> Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

We seem to have a fair bit of leeway between the top numbered Oid as reported by unused_oids (4332) and 16384. I would expect a private range of a few hundred to be more than adequate, and a range for registered Oids of a couple of thousand to last for many years. I'm expecting the use of this over quite a few years to be numbered in tens, not thousands.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> On 09/24/2012 09:24 PM, Tom Lane wrote:
>> Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

> We seem to have a fair bit of leeway between the top numbered Oid as reported by unused_oids (4332) and 16384.

It's not nearly that simple. Per transam.h:

 *	OIDs 1-9999 are reserved for manual assignment (see the files
 *	in src/include/catalog/).
 *
 *	OIDS 10000-16383 are reserved for assignment during initdb
 *	using the OID generator.  (We start the generator at 10000.)
 *
 *	OIDs beginning at 16384 are assigned from the OID generator
 *	during normal multiuser operation.  (We force the generator up to
 *	16384 as soon as we are in normal operation.)

pg_database.datlastsysoid is 12907 as of HEAD, so we're using about 2900 OIDs that are assigned during initdb. We should expect continued growth in that number. For comparison, it was 10791 in 8.1, which was the first version following this assignment scheme --- so the number of auto-assigned OIDs has more than tripled in the last seven years. And there's not room for another tripling. So I think it's entirely likely that we'll have to reduce FirstBootstrapObjectId to a smaller number in the foreseeable future, or else increase FirstNormalObjectId, which would be a pg_upgrade breaking move. Meanwhile, we do continue to eat manually-assigned OIDs at a nontrivial pace as well.

So, yeah, we could reserve a couple hundred OIDs for a scheme like this and (probably) not regret it later. But a couple thousand would scare me ... and I'm not exactly convinced that a couple hundred is enough, if there's any demand out there at all.

I suggest we could consider handing out reserved OIDs one or two at a time using the current manual assignment process, *without* any notion of a reserved block of OIDs. That is, when Joe Developer needs a couple of OIDs for joes_whizzy_datatype, he comes to us and we say "sure, you can have 4333 and 4334", which we then memorialize in a text file in include/catalog/. This doesn't preclude doing something differently later of course, but it avoids setting aside OID address space that is likely to be underutilized and fairly desperately needed at some future time.

As you might guess from this, I'm not enamored of the "reserve some private OID space" part at all. There's not been any demand for that that I've noticed. In pretty much any development context, you could use OIDs up around 4 billion for such purposes and not have any problem at all --- it's only trying to *guarantee* they're not used anywhere out in the field that is problematic.

regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes:
> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?

> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

Um ... that is an entirely unconvincing use case. We would not put any code into core that knows specifically about a non-core datatype, or at least I sure hope we'd not do such a klugy thing. It would be far better to invent an extensibility scheme (plug-in of some sort) for record_to_json.

My recollection of the PGCon discussion is that people wanted to allow client-side code to hard-wire type OIDs for add-on types, in more or less the same way that things like JDBC know that "25" is "text". That's not unreasonable, since the alternatives are complicated and not all that reliable. I'm not sure the usage applies to anything except data types though, so at least for starters I'd only want to accept reservations of OIDs for data types.

regards, tom lane
On Mon, Sep 24, 2012 at 8:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
>
>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> Um ... that is an entirely unconvincing use case. We would not put any code into core that knows specifically about a non-core datatype, or at least I sure hope we'd not do such a klugy thing. It would be far better to invent an extensibility scheme (plug-in of some sort) for record_to_json.

I don't think (and don't hope) the core should know about external data types, but I have certainly seen a lot of use cases where an external project wants to know about another external data type. Say, if plv8 wants to convert hstore into a javascript object. Users are free to define such a function that accepts hstore as an argument, but how does plv8 know the input is actually hstore? Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached. I can tell similar stories around array creation, which needs the element type oid. The core code can do some special stuff by using pre-defined type or function oid macros, and I guess it is reasonable for external developers to hope to do such things.

Thanks,
--
Hitoshi Harada
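A minimal sketch of the lookup-and-cache workaround Hitoshi describes, for backend code. It assumes the backend's TypenameGetTypid() helper (which searches the current search_path); the cached value is exactly the part that can go stale.

    #include "postgres.h"
    #include "catalog/namespace.h"

    static Oid cached_hstore_oid = InvalidOid;

    /* Resolve "hstore" by name once, then reuse the cached Oid. */
    static Oid
    get_hstore_oid(void)
    {
        if (!OidIsValid(cached_hstore_oid))
            cached_hstore_oid = TypenameGetTypid("hstore");
        return cached_hstore_oid;   /* InvalidOid if hstore isn't installed */
    }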
On Mon, Sep 24, 2012 at 8:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So, yeah, we could reserve a couple hundred OIDs for a scheme like this and (probably) not regret it later. But a couple thousand would scare me ... and I'm not exactly convinced that a couple hundred is enough, if there's any demand out there at all.

I think some kind of way to compose extension objects -- this includes and goes beyond just the things in pg_type, but operators and so on -- will have great impact on Postgres' power without making the community effort unscalable. I think PostGIS is the largest success by this metric -- although certain requirements spill back into pgsql-hackers, it's a pretty huge island of functionality out there that neatly progresses on its own without coordination. That's fantastic.

I am fairly certain that if some form of in-line extensions were supported, we would see people building smaller extensions that use many little types (such as composites) relying on other extensions to do things or manage certain tables, increasing convenience overall.

Things like the JSON data type (in spite of my own opinion that it should be added) are a highly useful concession. Yet, it is still a concession that extensions simply cannot be as close to good as a core data type when it comes to the ability to compose. The gap between pre-JSON-in-the-standard-library in Python, Ruby, et al and post-JSON-in-stdlib was much smaller. Even for statically typed languages -- like Java -- the only central registries that exist are mostly for the convenience of distribution and deployment, not as the means of composition, and I think that is a good thing.

However, an IANA-style OID registry I find pretty un-compelling. Is there no way we can use symbols, the type system, and invalidation messages to a client to do this? I feel like there is a way we can, and it is probably worth it to minimize coordination among extension authors and enable smaller extensions.

If reserved OID ranges for extensions are to become a thing, I think the right scope would be to presume that these extensions are not bundled with Postgres, but practically, whoever uses that space is probably going to have to be the kind of person who would correspond with -hackers. Ad-hoc composition among small or live-and-die-fast user-space libraries (that are written in trusted languages, for example) is out, which is kind of a bummer for what I think plv8 can enable.

--
fdr
On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>> On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
>>> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>>>
>>> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,
>>
>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
>
> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

I think we're missing something in the discussion here.

Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine. Why would we need to hardcode it?

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 09/25/2012 04:26 AM, Andrew Dunstan wrote:
> On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
>> On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
>>> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>>>
>>> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid,
>>
>> Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
>
> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

During the earlier to_json() discussions I already proposed a solution to this - try a cast to the JSON type before falling back to outputting the text version. As I understand it, this is how typecasts are supposed to be used. In addition to being "how PostgreSQL is intended to work" it also allows for users to override it and have their own version of json representations for types which have no standard json format.

Just for the record - I'm not opposed to having a PostgreSQL OID Registry, though I think that for anything in the backend the lookup by name should be fast enough.
On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> I think we're missing something in the discussion here.
>
> Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine.

Sure, but how do you know the type named "hstore" is what you know as hstore? We don't have a global consensus that a specific type name means exactly what everyone thinks it means.

Thanks,
--
Hitoshi Harada
On 25.09.2012 12:19, Hitoshi Harada wrote:
> On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?

That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.

>> I think we're missing something in the discussion here.
>>
>> Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine.
>
> Sure, but how do you know the type named "hstore" is what you know as hstore? We don't have a global consensus that a specific type name means exactly what everyone thinks it means.

If you create a type called "hstore" that isn't the one in contrib/hstore, you're just asking for trouble. Having a central registry of assigned oid numbers doesn't really make that problem go away either; you could still assign one of the registered oids to a different datatype if you wanted to. So whether you rely on the oid or the type name, your code needs to behave sanely even if the datatype doesn't behave the way you'd expect. At a minimum, there need to be sanity checks so that you don't crash, and give a meaningful error message. Beyond that, I think it's the user's responsibility not to confuse things by using well-known datatype names like "hstore" for something else.

Here's another thought: imagine that someone creates a datatype that's more or less compatible with contrib/hstore, but the internal implementation is different. Would you call that hstore? Would you use the same assigned oid for it? If you have some other extension that knows about hstore, and treats it specially, would you want the new datatype to be treated specially too? And what about domains? If you have code that knows the oid of hstore and gives it special treatment, should domains over hstore also be given special treatment?

- Heikki
On 09/25/2012 06:13 AM, Heikki Linnakangas wrote:
> On 25.09.2012 12:19, Hitoshi Harada wrote:
>> On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>>>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?

No, the difficulty (or at least the first difficulty) is in having the code recognize that it has an hstore at all. The code picks apart the record field by field at run time and takes various actions depending on the field's type. For any type it doesn't recognize it just outputs the value as a string, and so that's what it does with hstore. Mostly this is the right thing but in the hstore case it's rather sad.

> That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.
>
>> I think we're missing something in the discussion here.
>>
>> Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine.
>
>> Sure, but how do you know the type named "hstore" is what you know as hstore? We don't have a global consensus that a specific type name means exactly what everyone thinks it means.
>
> If you create a type called "hstore" that isn't the one in contrib/hstore, you're just asking for trouble. Having a central registry of assigned oid numbers doesn't really make that problem go away either; you could still assign one of the registered oids to a different datatype if you wanted to.

That's true to some extent, but names are worse, since they are schema qualified.

cheers

andrew
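To make the per-field dispatch Andrew describes concrete, here is a rough classification sketch (not the actual json.c code); the built-in type Oids are real constants from pg_type.h, while HSTOREOID is the hypothetical reserved constant that does not exist today.

    #include "postgres.h"
    #include "catalog/pg_type.h"

    /* How should a record-to-json converter render a field? (sketch) */
    typedef enum
    {
        EMIT_BARE_LITERAL,      /* numbers, booleans */
        EMIT_VERBATIM_JSON,     /* value is already json */
        EMIT_QUOTED_STRING      /* everything else */
    } JsonEmitKind;

    static JsonEmitKind
    classify_field(Oid typid)
    {
        switch (typid)
        {
            case BOOLOID:
            case INT2OID:
            case INT4OID:
            case INT8OID:
            case NUMERICOID:
                return EMIT_BARE_LITERAL;
            case JSONOID:
                return EMIT_VERBATIM_JSON;
            default:
                /*
                 * hstore lands here today, because the code has no reliable
                 * way to know the field is an hstore.  A reserved HSTOREOID
                 * (hypothetical) would allow another case arm above.
                 */
                return EMIT_QUOTED_STRING;
        }
    }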
On 09/24/2012 11:39 PM, Tom Lane wrote:
> My recollection of the PGCon discussion is that people wanted to allow client-side code to hard-wire type OIDs for add-on types, in more or less the same way that things like JDBC know that "25" is "text". That's not unreasonable, since the alternatives are complicated and not all that reliable. I'm not sure the usage applies to anything except data types though, so at least for starters I'd only want to accept reservations of OIDs for data types.

Yes, I certainly think that's a sufficient case.

And just to be clear, I don't care about the "private range" suggestion. I was just reporting that as it came up in the discussion and at least wasn't immediately shouted down. But I'm happy to abandon it at least for now. If there is ever actual demand for it we could revisit the matter.

Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.

Regarding the use of pre-allocated Oids, unless we provide some facility to use them when creating types, extension authors will be reduced to using the pretty ugly tricks I had to use when backporting JSON for binary upgrade-ability. This used some of pg_upgrade's tricks to force use of particular Oids, but I don't think this should be recommended practice. The suggestion I made was based on something you suggested (albeit with some caveats) in the recent discussion of pg_upgrade with extensions.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.

No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.

regards, tom lane
On 09/25/2012 10:23 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>
> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.

OK, point taken.

cheers

andrew
Excerpts from Hitoshi Harada's message of mar sep 25 02:11:14 -0300 2012:
> Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached.

Hm, couldn't you just register an invalidation callback with CacheRegisterSyscacheCallback()?

--
Álvaro Herrera
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
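A sketch of Alvaro's suggestion, for code running in the backend: register a callback on the pg_type syscache and simply forget the cached Oid whenever any pg_type entry is invalidated. The inval.h/syscache.h calls are real, but treat the details (including the _PG_init placement) as an outline only, reusing the names from the earlier caching sketch.

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/inval.h"
    #include "utils/syscache.h"

    static Oid cached_hstore_oid = InvalidOid;

    /* Called whenever a pg_type syscache entry is invalidated. */
    static void
    hstore_oid_inval(Datum arg, int cacheid, uint32 hashvalue)
    {
        cached_hstore_oid = InvalidOid;     /* re-resolve on next use */
    }

    void
    _PG_init(void)
    {
        CacheRegisterSyscacheCallback(TYPEOID, hstore_oid_inval, (Datum) 0);
    }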
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Excerpts from Hitoshi Harada's message of mar sep 25 02:11:14 -0300 2012:
>> Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached.

> Hm, couldn't you just register an invalidation callback with CacheRegisterSyscacheCallback()?

Yeah, if the code that needs the value is in the backend; but if it is, the cost of looking the value up again probably isn't that much either. The need for a cache is much greater in client-side code.

The bigger issues in my mind have to do with how you look up the type in the first place (considering schema search path and so forth) and how you know that "hstore" is what you think it is.

I recall some discussion of assigning a UUID to each datatype, which might alleviate the second part of that at least. Does nothing for malicious impersonation of a datatype of course, but should prevent accidental collisions.

regards, tom lane
On 09/25/2012 12:14 PM, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Excerpts from Hitoshi Harada's message of mar sep 25 02:11:14 -0300 2012:
>>> Of course you can look up the type name by consulting SysCache and see if the type name is "hstore", but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached.
>> Hm, couldn't you just register an invalidation callback with CacheRegisterSyscacheCallback()?
>
> Yeah, if the code that needs the value is in the backend; but if it is, the cost of looking the value up again probably isn't that much either. The need for a cache is much greater in client-side code.
>
> The bigger issues in my mind have to do with how you look up the type in the first place (considering schema search path and so forth) and how you know that "hstore" is what you think it is.
>
> I recall some discussion of assigning a UUID to each datatype, which might alleviate the second part of that at least. Does nothing for malicious impersonation of a datatype of course, but should prevent accidental collisions.

One nice thing about that idea (if I understand it correctly) is that we could use it for existing types. The horse has long bolted on Oids for hstore, because it has lots of existing deployments with unknown Oids.

cheers

andrew
On Tue, Sep 25, 2012 at 09:18:30AM -0400, Andrew Dunstan wrote:
>> I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?
>
> No, the difficulty (or at least the first difficulty) is in having the code recognize that it has an hstore at all. The code picks apart the record field by field at run time and takes various actions depending on the field's type. For any type it doesn't recognize it just outputs the value as a string, and so that's what it does with hstore. Mostly this is the right thing but in the hstore case it's rather sad.

Is there a particular reason to special-case hstore though? Wouldn't it be sufficient to simply look for a cast from the given type oid to json and use that if present? If you don't like casts, allow other extensions to define a function to_json_hook(yourtype) -> json which is used. Hardcoding IDs in extensions just doesn't seem right somehow...

(Hmm, I see someone else in the thread pointed this out too.)

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts.
  -- Arthur Schopenhauer
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>
> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId

If that's the only problem, there's an easy way out: register downward from the top of the Oid range to some fixed constant.

Now it seems the problem here is that we both want extensions to be managed as disconnected from PostgreSQL as possible, and at the same time we want to be able to trust them from within the backend, with PostgreSQL being the only trusted authority. I'm not security oriented enough to devise a working scheme here, but it makes me think about some "shared secret" or GnuPG signature. Is it possible to automatically verify that an extension's ID card (containing some type's OID, name, etc.) has been signed with PostgreSQL's own private key, without ever having to ship that key in the backend code nor binary?

To me it's all about how to set up a distributed network of trust…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
On 9/24/12 10:26 PM, Andrew Dunstan wrote:
> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

I think the fix there is that the type supplies a "to JSON" function as part of its type definition (just like input and output functions). "to XML" would also be useful. Most types wouldn't need them and default to string output.
On 9/24/12 11:39 PM, Tom Lane wrote:
> My recollection of the PGCon discussion is that people wanted to allow client-side code to hard-wire type OIDs for add-on types, in more or less the same way that things like JDBC know that "25" is "text".

If I write a custom uint type and want to explain to JDBC that it is number-like, I don't think it's very attractive to have to register a hard-wired OID for it. Not only would there be the burden of maintaining this OID database; JDBC and every other driver would in turn have to maintain its own database of all custom types in the world.

I would rather imagine a system where the type communicates its properties to the client using some sort of label system, like "number", "string", "pair of number" (point), "pair of string" (hstore). Or something with more or less detail. Maybe something like that already exists (maybe somewhere in thrift, protocol buffers, etc.?).
On 9/25/12 1:11 AM, Hitoshi Harada wrote:
> Say, if plv8 wants to convert hstore into a javascript object. Users are free to define such a function that accepts hstore as an argument, but how does plv8 know the input is actually hstore?

That's what the proposed transforms feature would be for: https://commitfest.postgresql.org/action/patch_view?id=879
Peter Eisentraut <peter_e@gmx.net> writes:
> On 9/24/12 10:26 PM, Andrew Dunstan wrote:
>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

> I think the fix there is that the type supplies a "to JSON" function as part of its type definition (just like input and output functions). "to XML" would also be useful. Most types wouldn't need them and default to string output.

Yes ... but I really don't want to go down the path of treating those as new type properties; it doesn't scale. (And please don't tell me that JSON is the last word in container types and there will never be requests for any more of these.)

Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.

regards, tom lane
On 9/25/12 6:13 AM, Heikki Linnakangas wrote:
> That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.

That just depends on how you want to direct the dependency arrows between these things. Either you make it a third extension, or you put it into one or the other, for example, if you feel that json support is "integral" to the hstore functionality.
On 9/25/12 9:18 AM, Andrew Dunstan wrote:
> No, the difficulty (or at least the first difficulty) is in having the code recognize that it has an hstore at all. The code picks apart the record field by field at run time and takes various actions depending on the field's type. For any type it doesn't recognize it just outputs the value as a string, and so that's what it does with hstore. Mostly this is the right thing but in the hstore case it's rather sad.

But if you have a cast defined from hstore to json, then it is unambiguous, because the pg_cast entry binds the types together by OID.
On 9/25/12 5:58 PM, Tom Lane wrote:
> Yes ... but I really don't want to go down the path of treating those as new type properties; it doesn't scale. (And please don't tell me that JSON is the last word in container types and there will never be requests for any more of these.)

Yeah, I didn't like that part either, but we only add one every five years or so.

> Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.

Sounds attractive, but there might be some problems in the details. For example, you can't cast scalar values to valid json values, because a valid json value can only be a dictionary or an array. If we had a flag of some kind saying "cast from foo to json, but only when part of a larger json serialization, not by itself", then it might work.
Peter Eisentraut <peter_e@gmx.net> writes:
> On 9/25/12 5:58 PM, Tom Lane wrote:
>> Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.

> Sounds attractive, but there might be some problems in the details. For example, you can't cast scalar values to valid json values, because a valid json value can only be a dictionary or an array. If we had a flag of some kind saying "cast from foo to json, but only when part of a larger json serialization, not by itself", then it might work.

Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it. If there's some reason that's not what to use, I guess we could add another optional argument to cast support functions; but that interface is already rather overloaded.

regards, tom lane
On 09/25/2012 12:13 PM, Heikki Linnakangas wrote:
> On 25.09.2012 12:19, Hitoshi Harada wrote:
>> On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> On 24 September 2012 21:26, Andrew Dunstan <andrew@dunslane.net> wrote:
>>>> Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.
>
> I'm not at all familiar with record_to_json or the json datatype, but wouldn't it be appropriate to create a cast from hstore to json to handle that case?
>
> That brings us to another question: should the cast be part of the hstore extension, or json? (json is built-in, but imagine for a moment that it was an extension, too, so that there was a choice). IIRC someone started a discussion on that recently on pgsql-hackers, but I don't think there was any conclusion on that.

I would make the CAST part of both extensions and try to do it so that if the "other" type does not exist then the cast will not be created; the extension that comes later will be the one creating it.

The most sensible approach would be to make it into a 3rd extension which depends on both json and hstore being installed:

    CREATE EXTENSION hstore_to_json_cast;
On 09/26/2012 12:06 AM, Peter Eisentraut wrote:
> On 9/25/12 5:58 PM, Tom Lane wrote:
>> Yes ... but I really don't want to go down the path of treating those as new type properties; it doesn't scale. (And please don't tell me that JSON is the last word in container types and there will never be requests for any more of these.)
>
> Yeah, I didn't like that part either, but we only add one every five years or so.
>
>> Can we define these functions as being the cast-from-foo-to-json and cast-from-foo-to-xml functions? That would let us use the existing cast infrastructure to manage them.
>
> Sounds attractive, but there might be some problems in the details. For example, you can't cast scalar values to valid json values, because a valid json value can only be a dictionary or an array.

Nope. A json _value_ can be anything you can put in those dicts or arrays. The "dictionary or array" rule was just the definition of something called "a json generator", which was defined to return an array or object/dict. The json type itself can already hold all the scalar values; the input conversion functions happily accept these and generate the corresponding json values.

This was all hashed through when I woke up to this too late in the 9.2 dev cycle and proposed the CAST and to_json functions, and also a single to_json for any json generation instead of array_to_json and row_to_json.

So currently we both do and don't have this "json is an array or dict" behaviour: we can't convert anything else directly to json using a to_json function, but we can do it through the input/output functions, and thus we can output scalar-valued json values:

    jt=# select '1'::json, '"1"'::json;
     json | json
    ------+------
     1    | "1"
    (1 row)

IIRC the reasoning was something about the to_json functions being generators and type io functions not being that :P

> If we had a flag of some kind saying "cast from foo to json, but only when part of a larger json serialization, not by itself", then it might work.
On Tue, 2012-09-25 at 18:22 -0400, Tom Lane wrote:
> Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it.

I had thought about this some time ago, but it's clearer to think of casts as associating two types, versus transforms associating a type and a language. JSON and XML tend to be types.
On Mon, 2012-09-24 at 21:18 -0700, Daniel Farina wrote:
> The gap between pre-JSON-in-the-standard-library in Python, Ruby, et al and post-JSON-in-stdlib was much smaller.

Except in Python they renamed the thing.
On 09/25/2012 08:35 PM, Peter Eisentraut wrote:
> On Tue, 2012-09-25 at 18:22 -0400, Tom Lane wrote:
>> Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it.
>
> I had thought about this some time ago, but it's clearer to think of casts as associating two types, versus transforms associating a type and a language. JSON and XML tend to be types.

OK, I think this solves the object_to_json problem after all - we'll look for a cast to json and if it's not there use the string form of the object. Nice.

That still leaves the other uses for having well-known Oids (or possibly UUIDs) for non-builtin types (e.g. so drivers don't have to look them up in the catalogs, or to avoid issues when types are added to core).

cheers

andrew
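Roughly the check Andrew is agreeing to, sketched against the existing cast machinery; find_coercion_pathway() and JSONOID are real, but treat this as an outline rather than the eventual patch.

    #include "postgres.h"
    #include "catalog/pg_type.h"
    #include "parser/parse_coerce.h"

    /*
     * Does this field type have a cast to json we could call?  If not,
     * record_to_json keeps quoting the type's text output as before.
     */
    static bool
    field_has_json_cast(Oid field_typid, Oid *castfunc)
    {
        CoercionPathType path;

        path = find_coercion_pathway(JSONOID, field_typid,
                                     COERCION_EXPLICIT, castfunc);
        return (path == COERCION_PATH_FUNC && OidIsValid(*castfunc));
    }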
On Tue, Sep 25, 2012 at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>
> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.

I'm not sure how much anyone cares, but just in case anyone does... it would be mighty useful to EnterpriseDB to have a range of OIDs that are guaranteed not to be assigned to anyone else, because we're maintaining a fork of PostgreSQL that regularly merges with the mainline. We're actually likely to get crunched in our fork well before PostgreSQL itself does. There are enough other forks of PostgreSQL out there that there may be other people who are in a similar situation, though I am not sure how much we want to cater to people doing such things. That having been said, I can't really see how it would be practical anyway unless we raise the 16384 lower limit for user-assigned OIDs considerably. And I'm not sure how to do that without breaking pg_upgrade.

I am somewhat opposed to the idea of an OID registry. I can't see why the space of things people want to reserve OIDs for would be only tens rather than thousands. There are surely plenty of extensions that would like to depend on specific other extensions, or on core. If we legislate that hard-coded OIDs are the way to do that, I think we're going to end up with a lot of 'em.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 09/26/2012 02:48 AM, Andrew Dunstan wrote:
> On 09/25/2012 08:35 PM, Peter Eisentraut wrote:
>> On Tue, 2012-09-25 at 18:22 -0400, Tom Lane wrote:
>>> Actually, after reading another message you sent, I thought you were going to respond that your proposed transforms feature would cover it.
>> I had thought about this some time ago, but it's clearer to think of casts as associating two types, versus transforms associating a type and a language. JSON and XML tend to be types.
>
> OK, I think this solves the object_to_json problem after all - we'll look for a cast to json and if it's not there use the string form of the object. Nice.

This is solved then :)

Would it be possible to also use the cast mechanism to do anyarray-to-json casts as a parallel spelling for array_to_json(), and a record-to-json cast for row_to_json()?

BTW, is anybody currently working on also going the opposite way, that is loading rows/records from json?

> That still leaves the other uses for having well-known Oids (or possibly UUIDs) for non-builtin types (e.g. so drivers don't have to look them up in the catalogs, or to avoid issues when types are added to core).

One way to solve this would be to pass non-system oids to clients as their names; this would need a change in the protocol. Or we could make it so that the server sends a special record with typename->typeoid mappings at the first use of a non-system type.
I'm also implementing an extension where direct access to non-fixed OIDs (i.e. no catalog cache lookup by name) would be very helpful. I spent some time thinking about a workaround that makes an OID registry unnecessary. How about the following?

1. Add a new varlena column to the pg_proc catalog table, say 'ext_types', containing a C array of OIDs.

2. Let each extension declare requirements like the following in its configuration files:
   "I expect <some type's name> type at the 0-th position of the 'ext_types' array."
   "I expect <other type's name> type at the 1-st position of the 'ext_types' array."
   etc.

3. Ensure that the CREATE EXTENSION command reads all these type names, finds the appropriate OIDs in pg_type and puts them at the appropriate positions in the 'ext_types' column for each function of the new extension (while in-core functions would have it set to NULL of course).

4. Implement a macro to get the 0-th, 1-st, etc. value from pg_proc.ext_types (via FMGR).

Is there any serious circumstance I forgot, or does it seem to be e.g. too invasive?

Kind regards,
Tony H.

On 09/25/2012 12:59 AM, Andrew Dunstan wrote:
> This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry.
>
> The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid, and of course currently there is no way of knowing for sure what it is unless it's a builtin type. So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:
>
>     CREATE TYPE newtype ( .... ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);
>
> I'm not sure what objects we would need this for other than types, but CREATE TYPE would be a good start anyway.
>
> Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.
>
> Thoughts?
>
> If there is general agreement I want to get working on this fairly soon.
>
> cheers
>
> andrew
On 09/25/2012 08:56 PM, Robert Haas wrote:
> On Tue, Sep 25, 2012 at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> Given your previous comments, perhaps we could just start handing out Oids (if there is any demand) numbered, say, 9000 and up. That should keep us well clear of any existing use.
>> No, I think you missed my point entirely: handing out OIDs at the top of the manual assignment range is approximately the worst possible scenario. I foresee having to someday move FirstBootstrapObjectId down to 9000, or 8000, or even less, to cope with growth of the auto-assigned OID set. So we need to keep manually assigned OIDs reasonably compact near the bottom of the range, and it doesn't matter at all whether such OIDs are used internally or reserved for external developers. Nor do I see a need for such reserved OIDs to "look different" from internally-used OIDs. Reserved is reserved.
>
> I'm not sure how much anyone cares, but just in case anyone does... it would be mighty useful to EnterpriseDB to have a range of OIDs that are guaranteed not to be assigned to anyone else, because we're maintaining a fork of PostgreSQL that regularly merges with the mainline. We're actually likely to get crunched in our fork well before PostgreSQL itself does. There are enough other forks of PostgreSQL out there that there may be other people who are in a similar situation, though I am not sure how much we want to cater to people doing such things. That having been said, I can't really see how it would be practical anyway unless we raise the 16384 lower limit for user-assigned OIDs considerably. And I'm not sure how to do that without breaking pg_upgrade.

How many would EDB need for it to be useful?

> I am somewhat opposed to the idea of an OID registry. I can't see why the space of things people want to reserve OIDs for would be only tens rather than thousands. There are surely plenty of extensions that would like to depend on specific other extensions, or on core. If we legislate that hard-coded OIDs are the way to do that, I think we're going to end up with a lot of 'em.

Maybe you have a better feel than I do for how many live addon types are out there. Maybe there are hundreds or thousands, but if there are I am blissfully unaware of them.

I did like the alternative idea upthread of UUIDs for types, which would give them a virtually unlimited space.

cheers

andrew
Antonin Houska <antonin.houska@gmail.com> writes:
> I'm also implementing an extension where direct access to non-fixed OIDs (i.e. no catalog cache lookup by name) would be very helpful. I spent some time thinking about a workaround that makes an OID registry unnecessary. How about the following?
> 1. Add a new varlena column to the pg_proc catalog table, say 'ext_types', containing a C array of OIDs.
> 2. Let each extension declare requirements like the following in its configuration files:
> "I expect <some type's name> type at the 0-th position of the 'ext_types' array."
> "I expect <other type's name> type at the 1-st position of the 'ext_types' array."
> etc.

I think this just begs the question: how do you specify <some type> and how do you know that whatever was found is what you want?

Beyond that, nothing in what you said can't be done today by a function that does type name lookups and caches the results internally. And I'd just as soon not burden the function-call infrastructure with more overhead to support something only a small fraction of functions would need.

Another point is that server-internal features don't help client-side code, which is actually where most of the pain is AFAICT. We aren't providing any infrastructure that helps clients interpret PQftype() values for non-core types.

regards, tom lane
On 9/25/12 8:48 PM, Andrew Dunstan wrote:
> That still leaves the other uses for having well-known Oids (or possibly UUIDs) for non-builtin types (e.g. so drivers don't have to look them up in the catalogs, or to avoid issues when types are added to core).

Here is a problem: If I write an "hstore-ng" extension, I have two options:

1. "Steal" the OID/UUID assigned to hstore so my extension integrates nicely with all the drivers.

2. Don't steal the OID/UUID assigned to hstore and have a very low chance that my extension will ever be supported nicely by a lot of drivers.

Neither option is very nice.
Peter Eisentraut <peter_e@gmx.net> writes:
> Here is a problem: If I write an "hstore-ng" extension, I have two

In the patch for Finer Extension Dependencies, the offer is that you have the hstore-ng extension provide the 'hstore' feature.

  https://commitfest.postgresql.org/action/patch_view?id=727

Now, that says nothing about the type's OID nor how to trust it. In my proposal we would have a pg_features catalog containing just keywords used in control files, so that you can change your mind as far as dependencies are concerned from one release of your extension to the next (adding or removing some, splitting the extension in parts or joining them again, etc).

Those feature entries do not exist yet, and they are a very specific set of OIDs, so we could maybe provision a large number of them here and refuse to assign them to untrusted sources.

Again, the complex part of that problem, to me, is not about how to manage those numbers (running a registry, adding syntax support, etc.) as much as how to manage a distributed network of trust. When applying for an OID or a feature identifier, the registry team could maybe GPG-sign "something" with the PostgreSQL private key, the backend only having the public key embedded in its code and binary. IIRC that's enough to then validate the feature name/oid (with the signature, a new control file parameter) and allow it to get installed in the registry-reserved range.

We then need a way to ask for the list of objects provided by the extension providing that feature (it's been signed, it's now trusted), something that we already know how to do (joining pg_depend etc). The result of that query would be cached on the client side, because we're not trying to change the way pg_type OID assignment ranges work.

We could maybe specialize all this down to pg_type and distribute a list of reserved OIDs there too, but it seems like that ship has already sailed (nice pg_upgrade chicken-and-egg problem).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
On Tue, Sep 25, 2012 at 5:36 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Mon, 2012-09-24 at 21:18 -0700, Daniel Farina wrote:
>> The gap between
>> pre-JSON-in-the-standard-library in Python, Ruby, et al and
>> post-JSON-in-stdlib was much smaller.
>
> Except in Python they renamed the thing.

By 'smaller' I mean that what a library gains from being promoted to the Python stdlib is comparatively smaller than what data types and operators gain from being promoted to core in Postgres, and the main reason for that is that people can compose on top of Postgres-core functionality, but those extensions themselves are not very useful in further composition. The extensibility at the first level is great.

There exists an entirely livable alternative-future/present where Python never promoted the (simple)json library into the stdlib. The same could not be said for Postgres extensions as they exist today.

--
fdr
On Wed, Sep 26, 2012 at 10:08 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> How many would EDB need for it to be useful?

Looks like we currently have about 160 hard-coded OIDs in our fork that are not in PostgreSQL, across all catalogs. Actually, there are probably some things we could do to reduce that number, which might be the smarter way to go. But taken at face value I suppose that means we would need a reservation of several hundred to allow for future growth.

>> I am somewhat opposed to the idea of an OID registry. I can't see why
>> the space of things people want to reserve OIDs for would be only tens
>> rather than thousands. There are surely plenty of extensions that
>> would like to depend on specific other extensions, or on core. If we
>> legislate that hard-coded OIDs are the way to do that, I think we're
>> going to end up with a lot of 'em.
>
> Maybe you have a better feel than I do for how many live addon types are out
> there. Maybe there are hundreds or thousands, but if there are I am
> blissfully unaware of them.

Well, that's a fair point. There probably aren't. But then again, the proposed registry wouldn't only cover live projects. We'd probably have a decent number of people say: I can't do what I want unless I have an OID. And then the project becomes dormant or obsolescent, but we still have the OID reservation and there's not really any politically feasible way of recovering the address space. I can't help thinking that it sounds a little bit like what IANA does, and the problems they face, except with 2^16 OIDs instead of 2^32 IP addresses. Admittedly there should be a lot less demand for type OIDs than for IP addresses, but the amount of address space we can allocate without pain is also much smaller.

> I did like the alternative idea upthread of UUIDs for types which would give
> them a virtually unlimited space.

Yeah, me too. That doesn't require a centralized authority (hence, no debates here about whether a given extension is important enough to merit an allocation of a given size), doesn't move us further in the direction of exposing the database's internal identifiers as a concept that users have to care about, and provides essentially infinite address space. There's more engineering work involved, but sometimes more engineering work means a better result.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
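To illustrate the kind of driver-side engineering that implies, a client could resolve the stable identifier to the local OID once per connection and cache the result; the libpq sketch below assumes a hypothetical pg_type.typuuid column, which does not exist today, and resolve_type_oid() is an invented helper:

    #include <stdlib.h>
    #include <libpq-fe.h>

    /*
     * Map a stable type UUID to this server's pg_type OID.  "typuuid" is an
     * imaginary catalog column used only for illustration; today a driver
     * would have to key on typname/nspname instead.
     */
    Oid
    resolve_type_oid(PGconn *conn, const char *type_uuid)
    {
        const char *params[1] = { type_uuid };
        PGresult   *res;
        Oid         result = InvalidOid;

        res = PQexecParams(conn,
                           "SELECT oid FROM pg_type WHERE typuuid = $1",
                           1, NULL, params, NULL, NULL, 0);
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
            result = (Oid) strtoul(PQgetvalue(res, 0, 0), NULL, 10);
        PQclear(res);
        return result;   /* worth caching per connection, keyed by the UUID */
    }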
> > I did like the alternative idea upthread of UUIDs for types which
> > would give them a virtually unlimited space.
>
> Yeah, me too. That doesn't require a centralized authority (hence, no
> debates here about whether a given extension is important enough to merit
> an allocation of a given size), doesn't move us further in the direction of
> exposing the database's internal identifiers as a concept that users have to
> care about, and provides essentially infinite address space. There's more
> engineering work involved but sometimes more engineering work means a
> better result.

Random thought from the sideline... Git is able to provide assurances as to content because it creates a hash. Now, for a function, PostgreSQL could hash the catalog entry (including the function body) and return that as proof that said function is the same as one installed in some other database or published publicly. I do not know enough about the other objects to know whether something similar is possible, but maybe this will spark someone else's thoughts.

David J.
<p dir="ltr"><br /> On Sep 27, 2012 9:52 AM, "Robert Haas" <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>>wrote:<br /> ><br /> > On Wed, Sep 26, 2012 at 10:08AM, Andrew Dunstan <<a href="mailto:andrew@dunslane.net">andrew@dunslane.net</a>> wrote:<br /> > > How manywould EDB need for it to be useful?<br /> ><br /> > Looks like we currently have about 160 hard-coded OIDs in ourfork<br /> > that are not in PostgreSQL, across all catalogs. Actually, there are<br /> > probably some thingswe could do to reduce that number, which might be<br /> > the smarter way to go. But taken at face value I supposethat means<br /> > we would need a reservation of several hundred to allow for future<br /> > growth.<p dir="ltr">I'mnot sure that's a way we really want to go down. How do we define which third party vendors would get to reserveoids? And how many? And under what other potential terms?<p dir="ltr">Seems like we'd set ourselves up for endlessdiscussions and bike shedding... <br /><br /><p dir="ltr">> >> I am somewhat opposed to the idea of an OIDregistry. I can't see why<br /> > >> the space of things people want to reserve OIDs for would be only tens<br/> > >> rather than thousands. There are surely plenty of extensions that<br /> > >> would liketo depend on specific other extensions, or on core. If we<br /> > >> legislate that hard-coded OIDs are theway to do that, I think we're<br /> > >> going to end up with a lot of 'em.<br /> > ><br /> > > Maybeyou have a better feel than I do for how many live addon types are out<br /> > > there. Maybe there are hundredsor thousands, but if there are I am<br /> > > blissfully unaware of them.<br /> ><br /> > Well, that'sa fair point. There probably aren't. But then again<p dir="ltr">There are probably more than we know. Many many internalthings at different organizations. <br /><p dir="ltr">> the proposed registry wouldn't only cover live projects. We'd<br /> > probably have a decent number of people say: I can't do what I want<br /> > unless I have anOID. And then the project becomes dormant or<br /> > obsolescent but we still have the OID reservation and there'snot<br /> > really any politically feasible way of recovering the address space.<br /> > I can't help thinkingthat it sounds a little bit like what IANA does,<br /> > and the problems they face, except with 2^16 OIDs insteadof 2^32 IP<br /> > addresses. Admittedly there should be a lot less demand for type OIDs<br /> > than for IPaddresses, but the amount of address space we can allocate<br /> > without pain is also much smaller.<p dir="ltr">Yeah,I think that would rapidly turn into a pain point. <br /><br /><p dir="ltr">> > I did like the alternativeidea upthread of UUIDs for types which would give<br /> > > them a virtually unlimited space.<br /> ><br/> > Yeah, me too. That doesn't require a centralized authority (hence, no<br /> > debates here about whethera given extension is important enough to<br /> > merit an allocation of a given size), doesn't move us furtherin the<br /> > direction of exposing the database's internal identifiers as a concept<br /> > that users haveto care about, ad provides essentially infinite<br /> > address space. There's more engineering work involved butsometimes<br /> > more engineering work means a better result.<br /><p dir="ltr">Yeah, seems like it would work muchbetter long term. <p dir="ltr">/Magnus
On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I'm not sure that's a way we really want to go down. How do we define which
> third party vendors would get to reserve oids? And how many? And under what
> other potential terms?
>
> Seems like we'd set ourselves up for endless discussions and bike
> shedding...

Not really. I'm only proposing that it would be nice to have a block of OIDs that core agrees not to assign for any other purpose, not that we dole out specific ones to specific companies. There's no reason why, for example, EnterpriseDB's fork can't use OIDs from the same reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster Data's fork - those are all distinct projects. All might need private OIDs but they can all come from the same range because the code bases don't mingle.

That having been said, we've gotten this far without having any terrible trouble about this, so maybe it's not worth worrying about. It's a nice-to-have, not a big deal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
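Concretely, such a shared reservation might amount to little more than a documented range that core promises never to assign, which a fork could then carve up however it likes; the bounds below are invented purely for illustration and are not an actual allocation:

    #include "postgres.h"

    /*
     * Hypothetical only: a block of OIDs core agrees never to assign, from
     * which any fork may hand-allocate its own catalog entries.  The numbers
     * are made up for illustration.
     */
    #define FORK_RESERVED_OID_FIRST   ((Oid) 9000)
    #define FORK_RESERVED_OID_LAST    ((Oid) 9499)

    #define IsForkReservedOid(oid) \
        ((oid) >= FORK_RESERVED_OID_FIRST && (oid) <= FORK_RESERVED_OID_LAST)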
On Thu, Sep 27, 2012 at 10:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> I'm not sure that's a way we really want to go down. How do we define which
>> third party vendors would get to reserve oids? And how many? And under what
>> other potential terms?
>>
>> Seems like we'd set ourselves up for endless discussions and bike
>> shedding...
>
> Not really. I'm only proposing that it would be nice to have a block
> of OIDs that core agrees not to assign for any other purpose, not that
> we dole out specific ones to specific companies. There's no reason

Ah, ok. In that case I agree, that wouldn't be open for a lot of bikeshedding.

> why, for example, EnterpriseDB's fork can't use OIDs from the same
> reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster
> Data's fork - those are all distinct projects. All might need private
> OIDs but they can all come from the same range because the code bases
> don't mingle.
>
> That having been said, we've gotten this far without having any
> terrible trouble about this, so maybe it's not worth worrying about.
> It's a nice-to-have, not a big deal.

Yeah, there's got to be a whole lot of other very much more complicated things you have to do with each new major version :)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
On Thu, Sep 27, 2012 at 04:02:45PM -0400, Robert Haas wrote:
> On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
> > I'm not sure that's a way we really want to go down. How do we define which
> > third party vendors would get to reserve oids? And how many? And under what
> > other potential terms?
> >
> > Seems like we'd set ourselves up for endless discussions and bike
> > shedding...
>
> Not really. I'm only proposing that it would be nice to have a block
> of OIDs that core agrees not to assign for any other purpose, not that
> we dole out specific ones to specific companies. There's no reason
> why, for example, EnterpriseDB's fork can't use OIDs from the same
> reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster
> Data's fork - those are all distinct projects. All might need private
> OIDs but they can all come from the same range because the code bases
> don't mingle.
>
> That having been said, we've gotten this far without having any
> terrible trouble about this, so maybe it's not worth worrying about.
> It's a nice-to-have, not a big deal.

Interesting idea, but if pluggable data types started using that reserved range, it could conflict with XC- or EDB-reserved oids, making those data types unusable in those forks. Maybe we need two reserved ranges.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +