Thread: inet type/merge joins

inet type/merge joins

From
Alex Pilosov
Date:
Hi,

I noticed that postgresql won't ever use a merge join when dealing with
inet types (for t1.ip=t2.ip, where merge is definitely the best method).
Delving into pg_operator, it appears that = for inet datatype is not
created with any left_sort_op/right_sort_op and thus won't merge won't be
possible. To me, this looks like an oversight, as (at least for inet
type), network_eq should be able to properly sort the data.

Anyone up to modify the catalog accordingly for 7.2?

(For cidr datatype this doesn't exactly apply, I think, because of all the
trickery with netmasks, network_eq may not properly sort the cidr data. I
need to think more about this)


Thanks
-alex




Re: inet type/merge joins

From
Alex Pilosov
Date:
Hate to reply to my own email, but now that I started thinking about
this, there are few more things about network types in catalog that should
be fixed:

a) inet and cidr should be hashable (they are memset'd on input, and the
only way values are equal is for their binary representations to be
equal). All necessary stuff appears to be already in pg_amop. 

b) both inet and cidr should be merge-joinable for operator = using
network_lt as sorting function. 

I'm also thinking that <<, <<=, etc functions must be using an index, but
I'm still trying to understand the way pg_operator, pg_amop, pg_amproc all
fit together...


On Sat, 9 Jun 2001, Alex Pilosov wrote:

> Hi,
> 
> I noticed that postgresql won't ever use a merge join when dealing with
> inet types (for t1.ip=t2.ip, where merge is definitely the best method).
> Delving into pg_operator, it appears that = for inet datatype is not
> created with any left_sort_op/right_sort_op and thus won't merge won't be
> possible. To me, this looks like an oversight, as (at least for inet
> type), network_eq should be able to properly sort the data.
> 
> Anyone up to modify the catalog accordingly for 7.2?
> 
> (For cidr datatype this doesn't exactly apply, I think, because of all the
> trickery with netmasks, network_eq may not properly sort the cidr data. I
> need to think more about this)
> 
> 
> Thanks
> -alex
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
> 
> 




Re: inet type/merge joins

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
> a) inet and cidr should be hashable

That's not safe.  The critical requirement for hashability (in its
present implementation) is that two values with distinct bit patterns
must never compare as equal.  This is not true for inet/cidr, since
the comparison function doesn't pay attention to the 'type' field
(inet vs. cidr).

At some point it'd be nice to use type-specific hash functions for
hashjoin --- we support 'em for hash indexes, and I don't see why the
join mechanism should not use the same functions.  With that, it'd be
possible to overcome this problem by making a hash function that has
the same blind spots as the comparison function ...

But you're right that the network types should be mergejoinable;
anything that supports btree indexes ought to be mergejoinable.
I see a couple of similar omissions now that I look.  Will fix.

> I'm also thinking that <<, <<=, etc functions must be using an index, but
> I'm still trying to understand the way pg_operator, pg_amop, pg_amproc all
> fit together...

I think you'd be better off to hack these into the "special index
operator" stuff in optimizer/path/indxpath.c.  Adding to pg_amop
would only be appropriate for something that you could define in a
datatype-independent fashion.
        regards, tom lane


Re: inet type/merge joins

From
Alex Pilosov
Date:
On Sun, 10 Jun 2001, Tom Lane wrote:

> Alex Pilosov <alex@pilosoft.com> writes:
> > a) inet and cidr should be hashable
> 
> That's not safe.  The critical requirement for hashability (in its
> present implementation) is that two values with distinct bit patterns
> must never compare as equal.  This is not true for inet/cidr, since
> the comparison function doesn't pay attention to the 'type' field
> (inet vs. cidr).
Oops. I forgot about that field. 

> At some point it'd be nice to use type-specific hash functions for
> hashjoin --- we support 'em for hash indexes, and I don't see why the
> join mechanism should not use the same functions.  With that, it'd be
> possible to overcome this problem by making a hash function that has
> the same blind spots as the comparison function ...

I guess the idea is to add a hash function to a catalog of a type
definition is a good idea?

> But you're right that the network types should be mergejoinable;
> anything that supports btree indexes ought to be mergejoinable.
> I see a couple of similar omissions now that I look.  Will fix.
> 
> > I'm also thinking that <<, <<=, etc functions must be using an index, but
> > I'm still trying to understand the way pg_operator, pg_amop, pg_amproc all
> > fit together...
> 
> I think you'd be better off to hack these into the "special index
> operator" stuff in optimizer/path/indxpath.c.  Adding to pg_amop
> would only be appropriate for something that you could define in a
> datatype-independent fashion.
Thanks for pointer, I'll try to have a patch later :)



Re: inet type/merge joins

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
>> At some point it'd be nice to use type-specific hash functions for
>> hashjoin --- we support 'em for hash indexes, and I don't see why the
>> join mechanism should not use the same functions.  With that, it'd be
>> possible to overcome this problem by making a hash function that has
>> the same blind spots as the comparison function ...

> I guess the idea is to add a hash function to a catalog of a type
> definition is a good idea?

Either that, or replace the oprcanhash boolean field of pg_operator
by an OID field that links to the appropriate hash function.  Not sure
offhand which is better.
        regards, tom lane